I forced P0 via Precision XOC using the KBoost button, which you can see highlighted in the benchmarks that use your algo.
I didn't do this until halfway through because I wanted to make sure yours was at P0 when it started, and I wasn't as concerned with the others.
As for the flags, I'm not sure, but I know that when I started benching with your algo, P0 was definitely enabled and the core clock was pegged at 2100 MHz. It dropped only slightly under load and settled at the 2050 MHz shown. That said, pretty much the same clock is reached across the board even with P0 not enabled.
Remember as well that I'm using a 1080 Ti, not a 1070, for these benches. I'm not sure I can hold P0 under full load with +400-500 mem without fragging instantly or losing significant performance. On all three of my cards (FTW3 air, SC2 Air and SC2 Hybrid), I can get a max of about +115-120 core and +300 mem, maybe stretching to +350, but for no net gain.
You should just check the P-state with "nvidia-smi -q -d PERFORMANCE" while mining with ccminer or my fork to be sure.
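If you want to watch this on a multi-card rig without reading the full report, here is a rough Python sketch that asks nvidia-smi for the P-state and clocks in CSV form and lists the cards that are not in P0. The function names are made up for illustration; the only things taken from nvidia-smi itself are the --query-gpu fields (index, pstate, clocks.sm, clocks.mem) and --format=csv,noheader. It assumes nvidia-smi is on the PATH and does nothing more than read what the driver reports.

```python
import subprocess

# Query fields understood by nvidia-smi's --query-gpu option.
QUERY = "index,pstate,clocks.sm,clocks.mem"


def parse_pstates(csv_text):
    """Parse 'index, pstate, sm clock, mem clock' CSV rows into dicts."""
    rows = []
    for line in csv_text.strip().splitlines():
        parts = [p.strip() for p in line.split(",")]
        if len(parts) != 4:
            continue  # skip anything that is not a 4-column data row
        index, pstate, sm_clock, mem_clock = parts
        rows.append({"index": int(index), "pstate": pstate,
                     "sm": sm_clock, "mem": mem_clock})
    return rows


def gpus_not_in_p0(rows):
    """Return the indices of GPUs whose reported P-state is not P0."""
    return [r["index"] for r in rows if r["pstate"] != "P0"]


def query_nvidia_smi():
    """Run nvidia-smi once and return the parsed rows."""
    result = subprocess.run(
        ["nvidia-smi", f"--query-gpu={QUERY}", "--format=csv,noheader"],
        capture_output=True, text=True, check=True,
    )
    return parse_pstates(result.stdout)


if __name__ == "__main__":
    try:
        rows = query_nvidia_smi()
    except (FileNotFoundError, subprocess.CalledProcessError) as err:
        print(f"could not query nvidia-smi: {err}")
    else:
        for r in rows:
            print(f"GPU {r['index']}: {r['pstate']}  core {r['sm']}  mem {r['mem']}")
        print("not in P0:", gpus_not_in_p0(rows))
```

Run it in a second terminal while the miner is hashing, since the P-state at idle tells you nothing about what the card does under load.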
Also, the problem is not with the core clocks (2100 is already very high) but with the memory clocks, which drop in the P2 state. User SCSI2 also has a similar EVGA 1080 Ti SC; you can
check his settings for core and mem clocks and see the hashrate: his rig with 9 cards shows 17+ MH/s, about 1.9 MH/s per card.
Here are my current settings on that rig. nvidia-smi reports P2 on all 9 GPUs, so I will try to force P0 and see what that gives. All cards are the original SC Black.