HN via remix.js for vilnius.js

> [...] beating Claude Code (32%) at roughly $0.17 per vulnerability found

Claude Code is an agent harness, not an LLM.

Claude is a brand (or group of LLMs), not an LLM.

Yes, and the article author is fully aware of that. Thank you for pointing out this small mistake though.

It looks like the author is specifically avoiding the model's name, because the results are really weird.

  Opus 4.8/4.7 scored 28%

  Opus 4.6 scored 37%

So the author thought, let's not get into that and just write "Claude".

happycube 4 hours ago

Not weird at all, given the variance in Opus' quality over the last few months.

Wild guess: I wouldn't be surprised if Opus 4.6 was run quantized for a while, and 4.7/4.8 have QAT (quantization-aware training) for that nerfed size.

raincole 24 minutes ago

Where is the weird part?

andriy_koval 4 hours ago

Many people think Opus 4.6 was the best.

tills13 5 hours ago

It costs nothing to not be pedantic.

alienbaby 4 hours ago

Possibly, nothing other than accuracy

croemer 2 hours ago

The dollar amount is meaningless without comparison - and no other model has a price tag. Sloppy article.

Onavo 6 hours ago

Claude Code is the only way to get access to the actual amortized cost of running a Claude-scale model. The consumer non-enterprise API is extremely expensive (with increasing marginal costs for the user and fat profit margins for Anthropic). If you want to approximate a state-level attacker's cost, where they can run the model on their own hardware, Claude Code is probably the best guess at the amortized cost.