by russdill 3 hours ago

This is more of an "Are all the windows closed upstairs?" situation:

"The windows upstairs..."

"...are all closed except for the bedroom window"

The first portion of the response takes a couple of seconds to play, but a small model can start streaming it within a few tens of milliseconds. Currently I just cut the small model's response off at whatever point buys roughly enough playback time to spin up the larger model.

But all responses spin up both models.
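
A minimal sketch of that cutoff, assuming the playback time of the opening can be estimated from a fixed speaking rate. The names `split_for_handoff`, `spinup_s`, and `words_per_second`, and the 2.5 words/second figure, are hypothetical and not taken from the actual setup; a real system would more likely measure the synthesized audio duration directly.

```python
import math


def split_for_handoff(small_text, spinup_s, words_per_second=2.5):
    """Cut the small model's text at the first word boundary whose
    estimated playback time covers the large model's spin-up time.

    Returns (prefix, covered_seconds). All parameters are illustrative
    assumptions: spinup_s is the expected large-model startup latency,
    words_per_second is an assumed text-to-speech speaking rate.
    """
    words = small_text.split()
    # Number of words needed so that playback lasts at least spinup_s.
    needed = math.ceil(spinup_s * words_per_second)
    # Speak at least one word, and never more than the small model produced.
    k = min(len(words), max(1, needed))
    return " ".join(words[:k]), k / words_per_second


prefix, covered = split_for_handoff(
    "The windows upstairs are all closed", spinup_s=1.0
)
print(prefix)   # The windows upstairs
print(covered)  # 1.2
```

With a 1.0 s spin-up and the assumed rate, three words are enough to cover the gap, so the larger model would take over after "The windows upstairs". If the small model's text is too short to cover the spin-up, the whole text is used and the remaining gap is simply not covered.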