You can run 2 commands in 2 windows for 1 card. Just try it, and if you get an error on any command, let me know.
I use the P100 video card remotely; I do not have a normal video card.
And how can the hardware be used if it is already occupied by another process?

You will just get a driver error.
In short, my guess is that the GPU launch settings inside your program are the defaults and are not using the full power of the GPU. As I said, bitcrack uses the B, T and P switches, where you set the compute threads and processes; with these the speed can increase by more than 10x. Study the bitcrack code. I hope you succeed in getting more real speed.
Here we go again. Each jump requires 1 modular inversion, because the jump distance is not known in advance: it is calculated from the remainder of dividing the PointX coordinate by DPmodulo, and this is repeated after every jump in every thread. Now count how many inversions you need to do, and please read up on this mathematical function. It takes a lot of processor time, since a single modular inversion needs many operations on 256-bit numbers. Do not compare the Pollard algorithm with VanitySearch (and bitcrack), where 1 inversion is done for 1024 keys using Montgomery's trick, because the step is known in advance: it is a delta from 0 to -512 and from 0 to +512. Check out the GPU code.

Accordingly, the speeds can differ by tens of times!
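
To show what Montgomery's trick buys you, here is a minimal Python sketch. It is not the actual VanitySearch or bitcrack GPU code, just the general idea: a whole batch of field elements is inverted with a single modular inversion plus about 3 multiplications per element. The names `P`, `mod_inverse` and `batch_inverse` are my own, and I am assuming the secp256k1 field prime and nonzero inputs.

```python
# Field prime of secp256k1 (assumed here; any prime modulus works the same way).
P = 2**256 - 2**32 - 977


def mod_inverse(a):
    """One modular inversion via Fermat's little theorem (a must be nonzero mod P)."""
    return pow(a, P - 2, P)


def batch_inverse(values):
    """Montgomery's trick: invert every element of `values` modulo P
    using exactly one call to mod_inverse. All values must be nonzero mod P."""
    n = len(values)

    # prefix[i] = values[0] * values[1] * ... * values[i-1]  (mod P)
    prefix = [1] * (n + 1)
    for i, v in enumerate(values):
        prefix[i + 1] = prefix[i] * v % P

    # The only inversion: the inverse of the product of all values.
    acc = mod_inverse(prefix[n])

    # Walk backwards, peeling off one factor at a time.
    out = [0] * n
    for i in range(n - 1, -1, -1):
        out[i] = acc * prefix[i] % P   # = 1 / values[i]
        acc = acc * values[i] % P      # = 1 / (values[0] * ... * values[i-1])
    return out


if __name__ == "__main__":
    batch = list(range(1, 1025))       # 1024 illustrative nonzero values
    inverses = batch_inverse(batch)
    assert all(v * w % P == 1 for v, w in zip(batch, inverses))
```

With a batch of 1024 values this costs 1 inversion instead of 1024. It only works when all the values to invert are known up front, as with fixed deltas. Within a single kangaroo walk the next jump depends on the previous point, so the inversions of that walk cannot be collected into one batch like this.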
You are talking about the code, which is Pollard's kangaroo, and you are right in your own way. I am talking about how the code runs on the hardware. The first kangaroo was by pinkachunka at bitcrack: his stats were 100 bits in 3 days, while yours show 90 bits in 8 days. He uses the full power of the GPU, meaning its full memory, by using the bitcrack switches. He uses 15.9 GB during a run, while your RAM usage is 5.5 GB. Maybe you need to expand the table size, or add switches that match the hardware structure, to get the maximum result in less time.