HN via remix.js for vilnius.js

by danslo 7 hours ago

It reads like an ad.

Secondly these are "just" IDORs, arguably the easiest class of vulnerabilities.

Thirdly it compares to GPT 5.5 and Opus 4.8.

No, we don't have Mythos at home.

vlian2088 7 hours ago

>Thirdly it compares to GPT 5.5

mythos is <10% ahead of gpt 5.5 on all benchmarks, which it gains by being several times the size of opus. had it been economical to provide, it would've been released to the public on day one instead of the marketing circus those effective altruism clowns had exhibited. admitting that it costs >1000% to run inference on a <10% better model would've been very damning.

oa335 4 hours ago

> it costs >1000% to run inference

do you have a source for this claim? i thought LLM providers earn high margins from inference (charged by token). is this no longer the case?

vlian2088 4 hours ago

if a $6,000,000 cabinet can generate 10,000 tokens/s of Opus but only 1,000 tokens/s of Mythos, then each Mythos token costs 1000% of what an Opus token costs to serve, no matter the markup.
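
To make that arithmetic concrete, here is a rough sketch in Python. The cabinet price and throughput numbers are the hypothetical ones from the comment above, not measured data, and the three-year amortization window is an extra assumption added purely for illustration.

```python
# Hypothetical figures taken from the comment above; none of these are measured.
CABINET_COST_USD = 6_000_000
OPUS_TOKENS_PER_S = 10_000
MYTHOS_TOKENS_PER_S = 1_000

# Assumption for illustration only: hardware amortized over 3 years of nonstop use.
AMORTIZATION_SECONDS = 3 * 365 * 24 * 3600


def hardware_cost_per_million_tokens(tokens_per_s: float) -> float:
    """Amortized hardware cost (USD) to generate one million tokens."""
    total_tokens = tokens_per_s * AMORTIZATION_SECONDS
    return CABINET_COST_USD / total_tokens * 1_000_000


opus_cost = hardware_cost_per_million_tokens(OPUS_TOKENS_PER_S)
mythos_cost = hardware_cost_per_million_tokens(MYTHOS_TOKENS_PER_S)

# The ratio depends only on the throughput gap, not on price or amortization.
cost_ratio = mythos_cost / opus_cost
print(f"Mythos costs about {cost_ratio:.0f}x as much per token as Opus")
```

The cabinet price and the amortization window cancel out of the ratio, which is the point of "no matter the markup": under these assumed throughputs the per-token hardware cost differs by the same factor as the throughput.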

no one has a source, because no one knows closed model parameter counts. we have only heuristics which strongly indicate that Mythos is simply a big fucking model that any other lab could make an equivalent of.

3836293648 4 hours ago

This was just theorised. The leaked OpenAI financials suggest otherwise (because of shady naming of losses).

The only ones who seem to profit are the ones running smaller Chinese models. Even NVIDIA seems to have to "reinvest" their profits into sponsoring companies to buy their cards now.

InsideOutSanta 6 hours ago

In my experience, GLM 5.2 is extremely good at finding vulnerabilities, and more importantly, unlike Opus, I've never seen it refuse a command. It genuinely is a very strong model for finding and fixing vulnerabilities.

nozzlegear 4 hours ago

More importantly, unlike Mythos and Fable, you can actually use GLM 5.2! It's not just marketingware that got its founder in hot water with the government.

NitpickLawyer 6 hours ago

> Thirdly it compares to GPT 5.5 and Opus 4.8.

> No, we don't have Mythos at home.

That's still useful. To paraphrase the kids these days, GLM5.2 is in the room with us, today. Mythos is not. And for us in the EU, it's even more complicated, as Mythos might be with us in the room one day, and go poof the next day, on the whims of political entities that we have 0 control over.

Knowing where open, accessible, local models are is important. We know they're behind. But there comes a time when "good enough" is useful. Even if they're "just IDORs" today, and even if they're behind SotA today.

As someone else said above, GLM5.2 (and other models in the same tier like kimi, dsv4, etc) are slowly becoming "good enough" to assist in automated repo preparation work (download, install, test, edit, re-test, etc). And that translates into RL traces ready to be trained into the next generations. That might be more important than being x% behind on benchmarks.

sanid 6 hours ago

Technically we don't have Mythos at all? You guys have access. This tells me we have Opus at home (open weights).

jimbob45 6 hours ago

Yeah, they straight up say that their criteria are narrow and primarily important for their specific use case. Never let rationality cause your pitchfork to be cast away though!