Thanks theymos! That's exactly what I was wondering. How did you do that? Was it with the 'bitcointools' that gavinandresen mentioned?
I'm also surprised that there's so... few!!
I count a grand total of SIX auto-payments from the CUDA client.. that is, six payments to the mandatory address of 5.00 BTC.
Each block that the CUDA client found should have sent 5 to that specific address, so this would imply that in the entire closed-source life of the CUDA client, on all the machines it ran on, it found six blocks.
What am I missing? Puddinpop?
I guess there's no magic here. There's probably more to the story than what you found, but the simple fact is that I have gotten my CUDA version to pump ~7700 Kh/s on my system. Puddinpop's version is slightly more optimized, but his kernel workers run multiple hashes in a single call, which makes the system very unresponsive, and the amount of data he's moving in and out of the card makes the whole thing slower.
But even at 7700 Kh/s I haven't generated a single block yet (yes, it does work on a test network), and the last 3 blocks I got were instead found by another machine running at under 3000 Kh/s. Even at less than half the speed, randomness has its way of giving an edge to that one.
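
To put a rough number on that kind of luck, here is a minimal sketch. It assumes both machines were hashing over the same period at steady rates, so each block found by either one goes to a given machine with probability equal to its share of the combined hash rate. The function names are made up for illustration, and 3000 Kh/s is used as an upper bound for the slower machine.

```python
def share_of_blocks(rate_khs: float, other_rate_khs: float) -> float:
    """Chance that any one block found by the pair comes from the first machine."""
    return rate_khs / (rate_khs + other_rate_khs)


def prob_all_from_one(n_blocks: int, rate_khs: float, other_rate_khs: float) -> float:
    """Chance that all n_blocks found by the pair come from the first machine.

    Assumes blocks are found independently of each other.
    """
    return share_of_blocks(rate_khs, other_rate_khs) ** n_blocks


if __name__ == "__main__":
    slow_khs = 3000.0  # "under 3000Kh/s" in the post, so this is an upper bound
    fast_khs = 7700.0  # "~7700Kh/s" in the post
    p = prob_all_from_one(3, slow_khs, fast_khs)
    print(f"Chance the slower machine finds all 3 blocks: {p:.1%}")
```

Under these assumptions the streak is unlikely but far from impossible, which is about what you'd expect from a process this random.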

If 10 of you were using puddinpop's version for a couple of weeks, doing 10M each, you'd expect to get no more than ~10 blocks combined. But again, you could have generated a lot more, I'm just saying...
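
For anyone who wants to redo that estimate with their own numbers, here is a rough sketch. It relies on the usual approximation that a block at difficulty D takes about D * 2^32 hashes on average. It reads "10M each" as 10 Mhash/s per machine, which is an assumption, and the difficulty value is a placeholder rather than the real network difficulty, so plug in the actual figure for the period in question.

```python
import math

# Approximate average number of hashes needed per block at difficulty 1.
HASHES_PER_DIFFICULTY = 2 ** 32


def expected_blocks(hashrate_hs: float, seconds: float, difficulty: float) -> float:
    """Average number of blocks found at a steady hash rate (hashes per second)."""
    return hashrate_hs * seconds / (difficulty * HASHES_PER_DIFFICULTY)


def prob_no_block(hashrate_hs: float, seconds: float, difficulty: float) -> float:
    """Chance of finding nothing at all, treating block finds as a Poisson process."""
    return math.exp(-expected_blocks(hashrate_hs, seconds, difficulty))


if __name__ == "__main__":
    two_weeks = 14 * 24 * 3600      # "a couple of weeks", in seconds
    combined_rate = 10 * 10e6       # assumption: 10 machines at 10 Mhash/s each
    difficulty = 2000.0             # hypothetical placeholder, NOT the real difficulty

    blocks = expected_blocks(combined_rate, two_weeks, difficulty)
    print(f"Expected blocks, all machines combined: {blocks:.1f}")

    single_rate = 7700e3            # one machine at ~7700 Kh/s
    miss = prob_no_block(single_rate, two_weeks, difficulty)
    print(f"Chance one 7700 Kh/s machine finds nothing in two weeks: {miss:.1%}")
```

The expected count scales linearly with hash rate and time and inversely with difficulty, so a handful of blocks across a small group of GPUs is not surprising.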