by bee_rider 7 hours ago
Naive question, but what is Gemini?
I wonder if a lot of these models are large language models that have had image recognition and generation tools bolted on? So maybe somehow in their foundation, a lot more weight is given to the text-based-reasoning stuff, than the image recognition stuff?
Go watch some of the more recent Google developer, Google AI, and Google deepmind videos, they're all separate channels at YouTube but try to catch some from the last 6 months with some of these explanatory topics on the developer side that are philosophical/ mathematical enough to explain this to you without going into the gritty details and should answer your question