In case you're still looking for solutions, here is my current setup for ethash, using TRM (autotuned):
amdmemtweak settings for all 8 GPUs: --RAS 30 --RCDRD 14 --RCDWR 6 --RC 44 --RP 12 --REF 15600 --RRDL 4
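If you'd rather script those timings than retype them, here is a minimal Python sketch that assembles the command line from the values above. It only builds and prints the command; it does not run anything. The binary name `amdmemtweak` being on your PATH, the helper name `build_command`, and the idea that leaving out a GPU selector applies the timings to every card are my assumptions, so check them against your own build before using it.

```python
import shlex

# Timing values copied from the settings line above (same for all 8 GPUs).
TIMINGS = {
    "RAS": 30,
    "RCDRD": 14,
    "RCDWR": 6,
    "RC": 44,
    "RP": 12,
    "REF": 15600,
    "RRDL": 4,
}


def build_command(timings, binary="amdmemtweak"):
    """Return the argument list for a single amdmemtweak call.

    `binary` is an assumed name/path; adjust it to wherever the tool lives.
    """
    args = [binary]
    for name, value in timings.items():
        # Each timing becomes a "--NAME value" pair, in the order listed.
        args += [f"--{name}", str(value)]
    return args


if __name__ == "__main__":
    # Print the command so it can be reviewed before being run by hand.
    print(shlex.join(build_command(TIMINGS)))
```

Keeping the timings in one dict makes it easy to change a single value (say, REF) and regenerate the full command without typos.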
http://i.imgur.com/JzTlll9m.png
The bottom right pane shows the profiles of 2 GPUs. The first profile can generally be run at 800mV for 47.5MH/s, and is *slightly* more efficient, but not by much. The second is for one of the GPUs doing 50.7MH/s. The only thing you should have to change from it is the mclk.p3 voltage setting: I need anywhere between 806 and 837mV across 7 GPUs doing ~50.7MH/s. Both profiles use cclk p1 / mclk p3 (although in Windows you may need to use mclk.p2 @ 1028MHz for the more efficient profile, to take advantage of the lower socclk; I'm not certain).
Note this is slightly different from what I mentioned earlier. I was using core p0 (850/800) before TRM, and possibly before amdmemtweak (which likely would have resulted in 44MH/s). With TRM + amdmemtweak, it appears the core needs to run at ~1000MHz to take (nearly) full advantage of the memory bandwidth at 1107MHz with adjusted timings (and get to 50+MH/s), so I use core p1 @ 1000MHz + mem p3 @ 1107MHz.