Interpolation and generalization are two completely different constructs. Interpolation is when you have two data points and estimate where a hypothetical third point would fall between them. Generalization is when you have a distribution that describes a particular sample and you apply it, with some transformation (e.g. a margin of error, a confidence interval, a p-value), to a population the sample is representative of.
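To make the distinction concrete, here is a toy sketch in Python (all numbers made up, purely for illustration):

    import statistics as st

    # Interpolation: two known points, estimate a third between them.
    x0, y0 = 0.0, 10.0
    x1, y1 = 2.0, 14.0
    x = 1.0
    y = y0 + (y1 - y0) * (x - x0) / (x1 - x0)  # linear interpolation -> 12.0

    # Generalization: describe a sample, then project to the population
    # with a margin of error (95% CI, normal approximation).
    sample = [9.8, 10.4, 10.1, 9.9, 10.3]
    mean = st.mean(sample)
    sem = st.stdev(sample) / len(sample) ** 0.5
    ci = (mean - 1.96 * sem, mean + 1.96 * sem)  # a claim about the population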
Interpolation is a much narrower construct than generalization. LLMs are fundamentally much closer to curve fitting (where interpolation is king) than they are to hypothesis testing (where samples are used to describe populations), though they certainly do something akin to the latter too.
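To illustrate what I mean by curve fitting, a quick sketch (again, made-up data): fit a polynomial, and predictions inside the fitted range (interpolation) behave very differently from predictions far outside it (extrapolation).

    import numpy as np

    xs = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
    ys = np.array([1.1, 2.9, 9.2, 19.1, 33.0])
    poly = np.poly1d(np.polyfit(xs, ys, deg=2))  # fit a quadratic

    print(poly(2.5))   # interpolation: inside the data range, usually sane
    print(poly(40.0))  # extrapolation: far outside it, far less trustworthy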
The bias I am talking about is not bias in the training data, but bias in the curve fitting itself, probably because of maladjusted weights, parameters, etc. And since there are billions of them, I am very skeptical they can all be adjusted correctly.
I assumed you were speaking by analogy, as LLMs do not work by interpolation or anything resembling it. Diffusion models, maybe you can make that argument. But GPT-derived inference is fundamentally different: it works via model building and next-token prediction, which is not interpolative.
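To be concrete about what next-token prediction means, here is a caricature of the last step of inference (toy vocabulary and logits; real models differ in every detail): the model outputs a score per token over a discrete vocabulary, and the next token is sampled from a softmax over those scores, not interpolated between neighboring points.

    import numpy as np

    vocab = ["the", "cat", "sat", "mat"]      # toy vocabulary
    logits = np.array([2.0, 0.5, 1.0, -1.0])  # hypothetical model output

    probs = np.exp(logits - logits.max())     # softmax over the vocabulary
    probs /= probs.sum()
    rng = np.random.default_rng(0)
    next_token = vocab[rng.choice(len(vocab), p=probs)]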
As for bias, I don’t see the distinction you are making. Biases in the training data produce biases in the weights. That’s where the biases come from: over-fitting (or sometimes correct fitting) of the training data. You don’t end up with biases at random.
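A toy version of that mechanism (made-up data): fit a line to a sample that over-represents an offset subgroup, and the skew shows up directly in the fitted parameters.

    import numpy as np

    rng = np.random.default_rng(1)
    x = rng.uniform(0, 10, 100)
    y = 2 * x + rng.normal(0, 0.5, 100)  # true relationship: y = 2x
    y[:80] += 3.0                        # 80% of the sample is an offset subgroup

    slope, intercept = np.polyfit(x, y, deg=1)
    print(slope, intercept)  # slope ~2, intercept ~2.4: the sample's skew
                             # ends up baked into the fitted parameters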