An AMD 9590 at 4.43 GHz/core (i.e., lame, conservative clocking) gives ~280 H/s on 8 cores with minerd; bytecoind gets about 270 H/s. Performance scales reasonably as cores are added or removed, and core temperatures are not an issue (a push-pull liquid cooler keeps mine at 36C under full load).
Results with an old Intel i7 2600K show that performance does not scale beyond 4 threads: I observed about 145 H/s at 4 threads, but only about 100 H/s at 8 threads. In other words, less is more. This may or may not apply to later-model Intel CPUs, so experimentation is suggested.
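To make the scaling point concrete, here is a rough Python sketch that works out H/s per thread from the numbers quoted above and picks the thread count with the highest total. The function names (`per_thread`, `best_thread_count`) are just made up for illustration; plug in your own measurements.

```python
# Hashrates in H/s keyed by thread count, taken from the figures in this post.
measurements = {
    "AMD 9590 (minerd)": {8: 280},
    "Intel i7 2600K": {4: 145, 8: 100},
}

def per_thread(rates):
    """Return H/s per thread for each measured thread count."""
    return {threads: hs / threads for threads, hs in rates.items()}

def best_thread_count(rates):
    """Return the thread count that gave the highest total H/s."""
    return max(rates, key=rates.get)

for cpu, rates in measurements.items():
    for threads, share in sorted(per_thread(rates).items()):
        print(f"{cpu}: {threads} threads -> {share:.2f} H/s per thread")
    print(f"{cpu}: best measured setting is {best_thread_count(rates)} threads")
```

On the i7 figures this comes out to 36.25 H/s per thread at 4 threads versus 12.5 at 8, which is why the lower thread count wins there.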
While I _strongly_ suggest using only devices that implement the AES-NI instructions, that old non-AES-accelerated Intel thing has yanked a few blocks, so it's not like it is completely useless.
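If you are not sure whether a box has AES-NI, one quick way to check on x86 Linux is to look for the `aes` flag in `/proc/cpuinfo`. A minimal sketch, assuming Linux and the usual `flags` line format; `has_aes_ni` is a hypothetical helper name, and other operating systems need a different method:

```python
def has_aes_ni(cpuinfo_text):
    """Return True if a 'flags' line in /proc/cpuinfo-style text lists 'aes'."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        # Only look at the CPU feature list, not e.g. the model name.
        if sep and key.strip() == "flags" and "aes" in value.split():
            return True
    return False

if __name__ == "__main__":
    try:
        with open("/proc/cpuinfo") as f:
            print("AES-NI reported:", has_aes_ni(f.read()))
    except OSError:
        # Assumption: no /proc/cpuinfo means this is not a Linux system.
        print("Could not read /proc/cpuinfo on this system")
```

The check only tells you what the kernel reports for the CPU; whether your miner build actually uses the instructions is a separate question.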