If you underclock your video RAM, you'll see improvements using a worksize of 256, in my experience.
My settings:
GPU: 1065 MHz
RAM: 300 MHz
VCC: 1.11
phoenix.py -k phatk VECTORS BFI_INT FASTLOOP=false AGGRESSION=12 WORKSIZE=256
Yields me 235 Mh/s on a Radeon HD 5770.
lol, I'm already doing a little better than 235, but I guess that's just the difference in cards.

Thanks for the tip, I'll let you know if I can get it up and running. I tried raising the GPU clock and lowering the memory clock at stock voltage but could not get anywhere near 1000 MHz for the GPU or 300 MHz for the memory.
That's 235 Mh/s for $90 @ 108W, which is a good speed for my card (5770).
According to https://en.bitcoin.it/wiki/Mining_hardware_comparison, you should be able to achieve around 392 Mh/s with the proper settings on a 5850.
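To put those numbers side by side, here is a rough sketch of the arithmetic in Python. It only uses the figures quoted in this thread (235 Mh/s, $90, 108W for the 5770, and the wiki's 392 Mh/s for the 5850). The function and variable names are made up for illustration, and no power or price figure is assumed for the 5850 since none was given here.

```python
# Rough efficiency arithmetic using only the figures quoted in this thread.
# Function and variable names are illustrative, not from any mining tool.

def mhs_per_watt(hashrate_mhs, watts):
    """Hash rate delivered per watt of power draw."""
    return hashrate_mhs / watts

def mhs_per_dollar(hashrate_mhs, price_usd):
    """Hash rate delivered per dollar of card price."""
    return hashrate_mhs / price_usd

# Radeon HD 5770 figures as posted above.
hd5770_mhs = 235.0
hd5770_watts = 108.0
hd5770_price = 90.0

# Figure the wiki lists for a 5850 with proper settings (as quoted above).
hd5850_wiki_mhs = 392.0

print("5770: %.2f Mh/s per watt" % mhs_per_watt(hd5770_mhs, hd5770_watts))
print("5770: %.2f Mh/s per dollar" % mhs_per_dollar(hd5770_mhs, hd5770_price))
print("5850 (wiki) vs 5770: %.2fx" % (hd5850_wiki_mhs / hd5770_mhs))
```

Plug in your own card's hash rate, wattage and price to compare it against these.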
If you're running Linux, check out the AMDOverdriveCtrl project on SourceForge; it lets you underclock and such.