by rockwotj 3 hours ago

I have also found DeepSeek Flash beating Pro in some of my own internal evals for tasklet.ai. It’s really surprising and I don’t understand it.

freakynit 2 hours ago

Same. It's rare, but I have observed it twice to date.

Some blog post I read a few weeks back said that DSV4Flash at xHigh effort beats even the Pro model at xHigh effort.

xbmcuser 41 minutes ago

Maybe they distilled Claude for the Flash version and not for the other, hence the better tool use and programming benchmarks.

onoesworkacct 2 hours ago

The rumour is that it's trained on Opus, but who knows

rockwotj 2 hours ago

Oh, of course, all DeepSeek and GLM models are. Multiple people have seen GLM self-report that it is Claude, which makes it super obvious.

I think the surprising thing is that I expected Flash to be a pure distillation and strictly worse in quality, but clearly it’s more nuanced than that.

kennywinker 2 hours ago