1) Use a 64-bit Linux; then you should be able to use more than 4 GB at once. I recently had someone autotune scrypt-jane (N=32768) with -L 1 and -L 2 on a Tesla M2090.
He sent me his results via Skype, e.g. with -L 2:
GPU #0: maximum total warps (BxW): 324
1.83 khash/s with configuration X71x2
You might check to see what s/he achieves with -L 3 and -L 4. The Titan continues to see gains for me up to -L 4, so it might be worth checking if they still have access to the Tesla.
2) Vista, Windows 7, Windows 8 tend to have huge issues with large memory allocations on the card (WDDM driver model restrictions).
Noted. I had hoped this might not be the case.

I will look into building for Linux. One question on this: would you see any disadvantage to using a USB-bootable distro for testing?
3) I don't have access to such hardware myself, so my testing opportunities are zero, essentially.
No worries here. Like I said, it's just curiosity and speculation. I'm pretty satisfied with the improvements despite the OS and driver limitations you have no control over. I'm still seeing nearly a 45% increase from commit 92 to commit 111.

Why is cudaminer crashing when I try to mine off of middlecoin.com?
I mine there and it isn't crashing.
The FAQ for middlecoin seems to indicate that submitted stales count toward proof of work. Does cudaminer currently submit stales?