Thanks!
Using: -k phatk DEVICE=0 VECTORS BFI_INT FASTLOOP=false WORKSIZE=256 AGGRESSION=13 on a single 5830, I get 312 MH/s.

I have always found phatk to be slower. Why did you choose to use it? Also, why WORKSIZE=256? I have also found 128 to be faster on the 58xx and 69xx series.
I am using poclbm at the moment and getting a 0.2% stale rate (though not with BTCMine right now).
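
In case it helps when comparing numbers: I'm assuming stale rate here means stale shares divided by all submitted shares (accepted plus stale). A rough sketch of that arithmetic follows. The function name and the share counts are made up for illustration and are not taken from any miner's output.

```python
def stale_rate_percent(accepted: int, stale: int) -> float:
    """Return stale shares as a percentage of all submitted shares.

    Assumes stale rate = stale / (accepted + stale); some pools or
    miners may define it differently.
    """
    total = accepted + stale
    if total == 0:
        # No shares submitted yet, so the rate is undefined; report 0.
        return 0.0
    return 100.0 * stale / total


# Hypothetical counts: 998 accepted and 2 stale shares.
rate = stale_rate_percent(998, 2)
print(f"{rate:.1f}% stale")
```

Running both kernels for the same length of time and comparing this figure alongside MH/s gives a fairer comparison than raw hash rate alone.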