by EagnaIonat 3 days ago

LLMs don't see words. They see tokens, which is why previously they had a hard time counting the r's.
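At the character level the count is trivial; a minimal Python sketch (the token split shown is purely illustrative — the actual split depends on the tokenizer and vocabulary):

```python
# Character-level counting: trivial once you operate on letters, not tokens.
word = "strawberry"
print(word.count("r"))  # 3

# An LLM instead sees subword tokens. A hypothetical BPE-style split might
# look like this -- the letters are all there, just not exposed individually:
tokens = ["str", "aw", "berry"]
print("".join(tokens) == word)  # True
```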

You can certainly demonstrate this mathematically, and if you give that proof to an LLM it will produce the correct answer. That's also a prompting technique, btw, for improving the accuracy of an LLM's results.

Although more recent LLMs have built-in approaches to answering this question.

Gemma4 for example:

-----

The user is asking for the number of 'r's in the word "strawberry".

* S

* T

* *R* (1)

* A

* W

* B

* E

* *R* (2)

* *R* (3)

* Y

There are 3 'r's.

-----

GPT-OSS-120b, for example:

-----

The user asks: "How many r's in strawberry." This is a straightforward counting problem: count the letter 'r' in the word "strawberry". The word "strawberry" contains letters: s t r a w b e r r y. Count r's: there are three 'r's? Let's check: s(1) t(2) r(3) a(4) w(5) b(6) e(7) r(8) r(9) y(10). Actually the word is "strawberry". Let's write out: s, t, r, a, w, b, e, r, r, y. So there are three r's: at positions 3, 8, 9. Yes, three r's. So answer: 3.

But we may also note that "r" appears three times.

Thus answer: 3.

We can provide a short answer.

Thus final: There are three r's in "strawberry".

-----
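The position-by-position check in the transcript above can be reproduced directly (positions are 1-based, matching the transcript's s(1) t(2) r(3) … numbering):

```python
word = "strawberry"
# 1-based positions of every 'r', mirroring the transcript's check
positions = [i for i, ch in enumerate(word, start=1) if ch == "r"]
print(positions)       # [3, 8, 9]
print(len(positions))  # 3
```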

kang 3 days ago | [-7 more]

Doubt if you can make a dumb model smart by feeding it proofs

EagnaIonat 3 days ago | [-6 more]
Tade0 3 days ago | [-5 more]

Sounds like a great way to fill up the context before you even start.

falcor84 3 days ago | [-4 more]

Yes, what's your point? That is literally what it does - it adds relevant knowledge to the prompt before generating a response, in order to ground it more effectively.

Tade0 3 days ago | [-3 more]

My point is that this doesn't scale. You want the LLM to have knowledge embedded in its weights, not prompted in.

EagnaIonat 3 days ago | [-2 more]

It scales fine if done correctly.

Even with the weights the extra context allows it to move to the correct space.

Much the same as humans there are terms that are meaningless without knowing the context.

kang 3 days ago | [-1 more]

Would it be possible to make GPT3 from GPT2 just by prompting? It doesn't work/scale

EagnaIonat 2 days ago | [-0 more]

Bit of a straw-man there.