by russdill 3 hours ago
This is more of an "Are all the windows closed upstairs?" kind of exchange:
"The windows upstairs..."
"...are all closed except for the bedroom window"
The first part of the response takes a couple of seconds to play back, but a small model can start streaming it within a few tens of milliseconds. Currently I just cut the small model's response off at whatever point buys roughly enough playback time to spin up the larger model.
But all responses spin up both models.
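A rough sketch of that cutoff, not the actual code: it assumes playback time can be estimated from a fixed speaking rate in words per second, and that the larger model's spin-up time is known ahead of time. The function name, parameters, and default rate are all made up for illustration.

```python
import math
from typing import Tuple


def split_for_handoff(
    draft: str,
    spinup_seconds: float,
    words_per_second: float = 3.0,  # assumed speaking rate, not a measured value
) -> Tuple[str, str]:
    """Split the small model's draft at the point where the spoken prefix
    lasts about as long as the larger model needs to spin up.

    Returns (prefix_to_speak, discarded_remainder).
    """
    words = draft.split()
    # Number of words whose playback covers the spin-up time.
    needed = max(0, math.ceil(spinup_seconds * words_per_second))
    # A short draft may not cover the spin-up time; speak all of it in that case.
    cut = min(len(words), needed)
    return " ".join(words[:cut]), " ".join(words[cut:])


if __name__ == "__main__":
    prefix, rest = split_for_handoff(
        "The windows upstairs are all closed", spinup_seconds=1.0
    )
    print(prefix)  # spoken immediately from the small model
    print(rest)    # dropped; the larger model supplies the real continuation
```

Splitting on word counts is crude. A real version would more likely cut on a phrase boundary, or measure the duration of the synthesized audio.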