by tiahura 4 hours ago

Seemingly irrelevant data can help because model weights are entangled: an update driven by one kind of data also shifts parameters that other tasks depend on.