Interesting, it seems the fix was simply to remove the ", is_blocking=True" parameter I had on both cl.enqueue_ commands in my init, and to re-enable the call to self.commandQueue.finish().
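
For anyone hitting the same thing, here is a minimal sketch of that pattern, not my actual miner code. It assumes PyOpenCL with cl.enqueue_copy as the transfer call (the exact cl.enqueue_ function may differ in your version), and the function name, buffer names and sizes are made up for illustration. The local "queue" plays the role of self.commandQueue.

```python
def nonblocking_roundtrip(values):
    """Copy values to the device and back using non-blocking transfers.

    Hypothetical helper for illustration only. The imports are done lazily
    so the module still loads on a machine without OpenCL installed.
    """
    import numpy as np
    import pyopencl as cl

    # Pick a device without prompting (assumes at least one OpenCL platform).
    ctx = cl.create_some_context(interactive=False)
    queue = cl.CommandQueue(ctx)  # stands in for self.commandQueue

    host_in = np.asarray(values, dtype=np.float32)
    host_out = np.empty_like(host_in)
    dev_buf = cl.Buffer(ctx, cl.mem_flags.READ_WRITE, size=host_in.nbytes)

    # Both transfers are queued without blocking the host thread.
    # The default queue is in-order, so the read runs after the write.
    cl.enqueue_copy(queue, dev_buf, host_in, is_blocking=False)
    cl.enqueue_copy(queue, host_out, dev_buf, is_blocking=False)

    # One explicit synchronization point instead of blocking on every call.
    # host_out must not be read before this returns.
    queue.finish()
    return host_out
```

The point is just that the host waits once, at finish(), rather than at each enqueue call.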
Edit: Another strange observation: Phoenix seems faster (it is quicker at the highest rate) without verbose = true. Does that make any sense?
Dia