Interesting: on fugue256, the GTX 780 Ti gives 232 MHash/s and clearly beats the R9 290X, which manages only 157 MHash/s.
Yes, we've done a midstate optimization, whereas the OpenCL code does the full hashing on the GPU. If they optimize too, your GTX 780 Ti is toast.
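For anyone wondering what "midstate" means here, a rough sketch of the general idea follows. It is not our actual kernel code: SHA-256 from Python's hashlib stands in for Fugue-256 (which hashlib does not provide), and the 76-byte prefix, the 4-byte little-endian nonce and all function names are made up for illustration. The point is only that the part of the input that does not depend on the nonce is absorbed once, and each nonce then resumes from that saved state.

```python
import hashlib
import struct


def hash_full(prefix: bytes, nonce: int) -> bytes:
    """Naive path: rehash the whole message for every nonce."""
    return hashlib.sha256(prefix + struct.pack("<I", nonce)).digest()


def make_midstate(prefix: bytes):
    """Absorb the nonce-independent prefix once and keep the hash state."""
    return hashlib.sha256(prefix)


def hash_from_midstate(midstate, nonce: int) -> bytes:
    """Fast path: clone the saved state and absorb only the nonce."""
    h = midstate.copy()  # copy() leaves the saved state untouched
    h.update(struct.pack("<I", nonce))
    return h.digest()


# Hypothetical 76-byte header prefix; the nonce supplies the last 4 bytes.
prefix = bytes(range(76))
midstate = make_midstate(prefix)

# Both paths must produce identical digests for every nonce.
for nonce in range(4):
    assert hash_from_midstate(midstate, nonce) == hash_full(prefix, nonce)
```

How much this saves depends on the hash function and on how much of the input is fixed, so don't read a speedup figure into the sketch.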
BTW, I am only getting 175 MHash/s per 780 Ti on Linux (550 MHash/s on my rig of 3), but I have no OC options there.
Christian