by sroerick 4 hours ago
I'm using synthetic.new and Neuralwatt with pi, and it's good and also cheap.
I have had a bad experience with Neuralwatt's GLM 5.2. It seems like they may be using a quantized version of the model.
Hi, I'm the CTO of Neuralwatt. I would love to hear your feedback on what your experience was. Feel free to email me at scott@neuralwatt.com. Also, for GLM 5.2 we run the FP8 quantization at 1M context, which is a common deployment target.