I had some trouble getting this to work with SDK 2.4, but it's finally running.
On my HD6950 I get ~343 Mhash/s;
with diapolo's 11-07-17 it's slightly better, ~345 Mhash/s.
I tested some combinations, but both run best with BFI_INT WORKSIZE=128 VECTORS AGGRESSION=11.
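
To put "slightly better" in numbers, here is a quick sketch of the relative difference between the two rates quoted above. The function and variable names are made up for illustration, and the inputs are the approximate figures from this post, so treat the result as a rough estimate only.

```python
def relative_gain_percent(baseline_mhash, candidate_mhash):
    """Return how much faster candidate is than baseline, in percent."""
    return (candidate_mhash - baseline_mhash) / baseline_mhash * 100.0


# Approximate figures quoted in the post (HD6950, same settings for both runs).
this_kernel = 343.0      # ~Mhash/s
diapolo_110717 = 345.0   # ~Mhash/s, diapolo's 11-07-17

gain = relative_gain_percent(this_kernel, diapolo_110717)
print(f"diapolo's 11-07-17 is about {gain:.2f}% faster")
```

That works out to roughly 0.6%, which is probably within the normal run-to-run variation, so a longer test would be needed to say which one really wins.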