by aspenmartin 6 hours ago

Data curation is important and expensive, and frontier labs can afford to do it right. Natural data isn't the limitation; we are already literally out of tokens. It doesn't matter how much you poison things, it's not going to stop the progress train.