CPU 20-25%

OK, thanks. Could you try release 1.5.1 (available on GitHub)?
I changed the number of threads per block to 128 and halved the default number of blocks per grid.
I would like to know whether, on your configuration, this improves performance, makes no difference, or makes it worse.
Thank you.

Edit: changed the link.