I think it makes sense to have a separate RTOS thread/task listening to, parsing, and acting on BM1397 responses. Getting good hashes out with low latency seems important!
Oh yes, you want to get those out immediately. Would definitely work with threads / tasks / listeners (whatever fits the framework or programming language used). On the other hand, I don't think the main task needs to do much while the ASIC is hashing, no?
In that case, it would be possible to do the ASIC comms on the main thread and fetch new block templates from the pool periodically or on a background thread. I'd suggest whatever performs best.
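
For what it's worth, here is a rough sketch of the dedicated-listener idea from the top of the thread. It is written in Python only to keep it short and runnable on a desktop; on real firmware this would presumably be an RTOS task blocking on a queue or UART read. Everything in it is an assumption for illustration: `asic_listener`, `run_demo`, the `(job_id, nonce)` tuple layout and the `None` stop sentinel are hypothetical names and are not the actual BM1397 response format.

```python
import queue
import threading


def asic_listener(responses, submit):
    """Block on ASIC responses and forward each one as soon as it arrives."""
    while True:
        resp = responses.get()      # blocks until the (simulated) ASIC replies
        if resp is None:            # hypothetical sentinel: stop the listener
            break
        job_id, nonce = resp        # hypothetical layout, not the real BM1397 frame
        submit(job_id, nonce)       # hand off right away to keep latency low


def run_demo():
    responses = queue.Queue()       # stands in for the UART receive path
    submitted = []                  # stands in for shares sent to the pool

    listener = threading.Thread(
        target=asic_listener,
        args=(responses, lambda job_id, nonce: submitted.append((job_id, nonce))),
    )
    listener.start()

    # The main thread stays free for other work (e.g. fetching new templates).
    # Here it just simulates two responses arriving from the ASIC.
    responses.put((1, 0xDEADBEEF))
    responses.put((2, 0x12345678))
    responses.put(None)             # ask the listener to exit
    listener.join()
    return submitted


if __name__ == "__main__":
    for job_id, nonce in run_demo():
        print(f"job {job_id}: nonce {nonce:#010x}")
```

The alternative layout (ASIC comms on the main thread, template fetching in the background) would just swap the roles: the blocking `get()` loop runs on the main thread and the pool polling moves into the spawned thread.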