What do you think would be the most cost effective way to generate a higher number of Mkeys/s?
It is hard to talk about "cost effectiveness" when all you want to do is generate a vanity address. In other words, even if you spend a single satoshi, that is still a high cost!
In any case, a more effective way is to use a strong GPU instead of a CPU and run the search locally. And of course, just like miners, you could build a rig of GPUs to do the search so that you get a higher "hashrate". Finally, the ultimate way is to build an ASIC machine designed to perform these specific calculations.
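To get a feel for what a higher key rate buys you, here is a rough back-of-the-envelope sketch in Python. It assumes each chosen character of the prefix is equally likely to be any of the 58 Base58 characters, which is only an approximation (real difficulty depends on the exact prefix). The function name `expected_seconds` and the rates in the loop are hypothetical placeholders, not benchmarks of any real hardware.

```python
BASE58_ALPHABET_SIZE = 58


def expected_seconds(prefix_len: int, mkeys_per_sec: float) -> float:
    """Rough expected search time for a vanity prefix.

    Assumption: each of the prefix_len chosen characters matches with
    probability 1/58, so the expected number of keys to try is 58**prefix_len.
    """
    expected_attempts = BASE58_ALPHABET_SIZE ** prefix_len
    keys_per_sec = mkeys_per_sec * 1_000_000
    return expected_attempts / keys_per_sec


# Hypothetical rates in Mkeys/s, chosen only to show the scaling.
for rate in (1, 100, 1000):
    hours = expected_seconds(6, rate) / 3600
    print(f"{rate:>5} Mkeys/s -> about {hours:.3f} hours for 6 chosen characters")
```

The time scales inversely with the rate, so 100x more Mkeys/s means 100x less waiting, while every extra character in the prefix multiplies the work by about 58.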