Quick update:
The beta is just about ready, and I'm seeing good results on my end. I will probably be contacting the people who responded in the coming days to set up some trials.
Here are the features I have included so far:
gpu and cpu mining
multiple gpu support
long poll
multiple threads per gpu
various timeout parameters: getwork timeout, error timeout
tandem mining.
There is also no perceived performance difference between mining in tandem and mining in parallel, at least with my 1 gpu and multiple cpu miners.
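For anyone wondering how the getwork timeout and error timeout might fit together, here is a minimal Python sketch. It is not the miner's actual code: the names `TimeoutConfig`, `fetch_work`, `request`, and `max_attempts` are all made up for illustration, and it assumes the getwork timeout bounds a single request while the error timeout is the pause before retrying after a failure.

```python
import time
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class TimeoutConfig:
    # Hypothetical field names; the post only mentions "getwork timeout"
    # and "error timeout" without saying how they are configured.
    getwork_timeout: float = 10.0  # assumed: seconds allowed for one getwork request
    error_timeout: float = 5.0     # assumed: seconds to pause after a failed request


def fetch_work(request: Callable[[float], dict],
               config: TimeoutConfig,
               max_attempts: int = 3,
               sleep: Callable[[float], None] = time.sleep) -> Optional[dict]:
    """Ask for work up to max_attempts times, pausing between failures.

    `request` is a placeholder for whatever performs the getwork call;
    it receives the per-request timeout and raises on failure.
    """
    for attempt in range(max_attempts):
        try:
            return request(config.getwork_timeout)
        except (TimeoutError, ConnectionError):
            # Back off before the next attempt, but not after the last one.
            if attempt + 1 < max_attempts:
                sleep(config.error_timeout)
    return None
```

Passing `sleep` in as a parameter is only there so the retry logic can be exercised in a test without actually waiting.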
I've also put some work into unit / integration testing and architecting the application, which I think has some merit.
All that being said, I would say my mining speed is comparable to that of the other top miners.
Also, I have not focused on some of the more gpu / kernel specific compilation parameters (LOOPS, BFI_INT, etc), but I'm sure those will be easy to add.
Again, if you're interested in beta testing, contact me.