Could you give me the config you use? I blind-benched it myself and found the speed to be the same. Maybe the regression only shows up in some specific cases?
I've just re-benched b6 and b7 and got almost the same speed, so I need your config and a confirmation that the regression you observe is between b6 and b7.
As a desperate attempt, I did a formal review of the differences between the b6 and b7 code and reverted all of them for Vega.
If that doesn't fix it, I'm running out of ideas.
Online is the 0.33b12 GPU with that fix, plus an extra little optim for all cards. I benched it at about +0.2%, within a range of -0.3% to +0.7%, so it should increase the hashrate, but it's so subtle that I'm not even sure the gain is positive.

edit: re-released with the optim restricted to smaller cards
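For anyone who wants to check a gain this small on their own rig, here is a minimal Python sketch of the arithmetic behind a figure like "+0.2% within -0.3% to +0.7%". The function names and the hashrate values are made up for illustration; they are not measurements from the miner. The idea is simply to express each run of the new build as a percent change against the mean of the old build, then look at whether the range crosses zero.

```python
from statistics import mean


def relative_gains(baseline, candidate):
    """Percent change of each candidate run against the mean baseline hashrate."""
    base = mean(baseline)
    return [100.0 * (h - base) / base for h in candidate]


def summarize(gains):
    """Mean, min and max of the percent changes, plus whether the range crosses zero."""
    lo, hi = min(gains), max(gains)
    return {
        "mean": mean(gains),
        "min": lo,
        "max": hi,
        # If the range includes 0, the runs alone cannot tell gain from noise.
        "crosses_zero": lo < 0.0 < hi,
    }


if __name__ == "__main__":
    # Hypothetical hashrates in H/s, chosen only to illustrate the calculation.
    old_build = [1000.0, 1000.0, 1000.0]
    new_build = [997.0, 1002.0, 1007.0]
    print(summarize(relative_gains(old_build, new_build)))
```

When the range crosses zero like this, more runs (or longer ones) are needed before calling it a real gain or a real regression.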
Overdriven config for my VEGA 64:
Name=Vega64Perf
GPU_P0=852;800;0
GPU_P1=991;900;0
GPU_P2=1084;900;0
GPU_P3=1138;900;0
GPU_P4=1150;900;0
GPU_P5=1202;900;0
GPU_P6=1212;905;0
GPU_P7=1408;915
Mem_P0=167;800;0
Mem_P1=500;800;0
Mem_P2=800;900;0
Mem_P3=1100;905
Fan_Min=3500
Fan_Max=4900
Fan_Target=70
Fan_Acoustic=2400
Power_Temp=80
Power_Target=0
Overdriven config for my VEGA 56:
Name=Vega56Perf
GPU_P0=852;800;0
GPU_P1=991;899;0
GPU_P2=1084;899;0
GPU_P3=1138;899;0
GPU_P4=1150;899;0
GPU_P5=1202;899;0
GPU_P6=1212;905;0
GPU_P7=1407;915
Mem_P0=167;800;0
Mem_P1=500;800;0
Mem_P2=800;899;0
Mem_P3=930;905
Fan_Min=3500
Fan_Max=4900
Fan_Target=70
Fan_Acoustic=2400
Power_Temp=80
Power_Target=0
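To see at a glance where the two profiles above differ, a small Python sketch like the following can help. It assumes each `GPU_Pn` / `Mem_Pn` value is `clock;voltage` with an optional trailing flag (the layout looks like an OverdriveNTool profile, but that is an assumption, as are the MHz and mV units). The trailing flag is kept as an opaque string rather than interpreted. `parse_profile` and `differences` are hypothetical helper names, not part of any tool.

```python
def parse_profile(text):
    """Parse KEY=VALUE lines; split P-state values into clock, voltage, flag."""
    profile = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or "=" not in line:
            continue
        key, value = line.split("=", 1)
        if key.startswith(("GPU_P", "Mem_P")):
            fields = value.split(";")
            profile[key] = {
                "clock_mhz": int(fields[0]),    # assumed unit: MHz
                "voltage_mv": int(fields[1]),   # assumed unit: mV
                # Third field is kept as-is; its meaning is not interpreted here.
                "flag": fields[2] if len(fields) > 2 else None,
            }
        else:
            profile[key] = value
    return profile


def differences(a, b):
    """Keys whose values differ between two parsed profiles."""
    keys = sorted(set(a) | set(b))
    return {k: (a.get(k), b.get(k)) for k in keys if a.get(k) != b.get(k)}


if __name__ == "__main__":
    # A few lines copied from the two profiles above.
    vega64 = parse_profile("Name=Vega64Perf\nGPU_P7=1408;915\nMem_P3=1100;905")
    vega56 = parse_profile("Name=Vega56Perf\nGPU_P7=1407;915\nMem_P3=930;905")
    for key, (left, right) in differences(vega64, vega56).items():
        print(key, left, right)
```

Feeding it the full profiles works the same way; only the lines that differ are printed.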
JCE config:
[
{ "mode" : "GPU", "worksize" : 16, "alpha" : 64, "beta" : 16, "gamma" : 4, "delta" : 4, "epsilon" : 4, "zeta" : 4, "index" : 0, "multi_hash":1920 },
{ "mode" : "GPU", "worksize" : 16, "alpha" : 64, "beta" : 16, "gamma" : 4, "delta" : 4, "epsilon" : 4, "zeta" : 4, "index" : 0, "multi_hash":1920 },
{ "mode" : "GPU", "worksize" : 16, "alpha" : 64, "beta" : 16, "gamma" : 4, "delta" : 4, "epsilon" : 4, "zeta" : 4, "index" : 1, "multi_hash":1888 },
{ "mode" : "GPU", "worksize" : 16, "alpha" : 64, "beta" : 16, "gamma" : 4, "delta" : 4, "epsilon" : 4, "zeta" : 4, "index" : 1, "multi_hash":1888 },
]
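As a quick sanity check on a thread list like the one above, the following Python sketch counts threads and sums `multi_hash` per GPU. It assumes `index` is the GPU device number and that entries sharing an index are separate threads on the same card. Because the pasted list ends with a trailing comma, which Python's strict JSON parser rejects, the sketch strips it first. `load_threads` and `per_gpu` are hypothetical helper names, not part of the miner.

```python
import json
import re
from collections import defaultdict


def load_threads(text):
    """Parse the thread list, tolerating a comma right before ']' or '}'."""
    cleaned = re.sub(r",\s*([\]}])", r"\1", text)
    return json.loads(cleaned)


def per_gpu(threads):
    """Thread count and total multi_hash per GPU index (assumed device number)."""
    totals = defaultdict(lambda: {"threads": 0, "multi_hash": 0})
    for t in threads:
        if t.get("mode") != "GPU":
            continue
        entry = totals[t["index"]]
        entry["threads"] += 1
        entry["multi_hash"] += t["multi_hash"]
    return dict(totals)


if __name__ == "__main__":
    # Same index / multi_hash values as the config above, other keys omitted.
    sample = """[
    { "mode" : "GPU", "index" : 0, "multi_hash":1920 },
    { "mode" : "GPU", "index" : 0, "multi_hash":1920 },
    { "mode" : "GPU", "index" : 1, "multi_hash":1888 },
    { "mode" : "GPU", "index" : 1, "multi_hash":1888 },
    ]"""
    for index, info in sorted(per_gpu(load_threads(sample)).items()):
        print(index, info)
```

With the values from the config above this gives two threads per card, with totals of 3840 for index 0 and 3776 for index 1.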
JCE start bat:
jce_cn_gpu_miner64.exe -o conceal.herominers.com:10361 -u ccx7KNbs8JQM3DL3HENXSCfqnQ3idKdAibZtrQDxCGZvJAutF8CdUihjUvmVyJ2f3VLnXhkrGnitZD15CnZPcNob5cxVVqER68+98eb09ea1d13e11a1ec94a3758d1c9669d2bb27b7344b04556f4a99eb7c92ad7.150000 -p x -c current_config.txt --any --variation 11 --mport 57866 --no-warmup