Did some more testing with different intensity settings. My previous test was wrong for ccminer.
MSI 1070 ARMOR OC - driver 388.71
Overclock settings: 75% TDP, +180 core clock (around 1800 MHz), +250 memory clock (4250 MHz)
ccminer KlausT 8.17 CUDA 9.1: 1210 kH/s
Palgin Neoscrypt: 1240 kH/s
So there seems to be a difference of about 2.5%. Is this normal for the 1070, or is there more room somewhere?
Software is still very stable and running well on default intensity settings.
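
For anyone who wants to check the gap between the two hashrates above, here is a quick sketch of the arithmetic in Python. The function and variable names are my own for illustration and are not part of either miner; the only inputs are the two kH/s figures from my test.

```python
def percent_gain(baseline_khs, other_khs):
    """Relative gain of other_khs over baseline_khs, in percent."""
    return (other_khs - baseline_khs) / baseline_khs * 100.0

# Hashrates measured in the test above (kH/s)
ccminer_khs = 1210.0
palgin_khs = 1240.0

gain = percent_gain(ccminer_khs, palgin_khs)
print(f"Palgin vs ccminer: {gain:.2f}% faster")  # prints "Palgin vs ccminer: 2.48% faster"
```

This uses ccminer as the baseline. With a gap this small, run-to-run variation could account for part of it, so averaging over longer runs would give a firmer number.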