It's a different GPU architecture. Unfortunately I can't test a GTX 970 for comparison right now (EWBF gives about 280-290 sol/s). I'll try the miner as soon as possible and post the results.
That would be very helpful. If it also doesn't scale on a 970, then it must be the transition between host and GPU code. I have some ideas for how to overlap them more.