by computerex 4 hours ago

I have had a bad experience with neuralwatt GLM 5.2. Seems like they may be using a quantized version of the model.

scottcha 3 hours ago

Hi, I'm the CTO of neuralwatt and would love to hear your feedback on your experience. Feel free to email me at scott@neuralwatt.com. Also, for GLM 5.2 we run the FP8 quantization at 1M context, which is a common deployment target.