I published a new release (1.6).
No new features, just a performance increase (16% on GPU, 50% on CPU on my hardware).
The performance increase is mainly due to better ECC calculations (many thanks to arulbero).
The GPU benefits less because it has no SIMD instructions to speed up SHA, so most of its time goes to hashing and much less to the ECC calculations.
On my PC:
VanitySearch -stop -u -t 1 1tryme --> 1.2 MKeys/s
my ecc library --> 2.0 MKeys/s (17 M Public keys/s)
Now (Intel(R) Xeon(R) CPU E3-1505M v6 @ 3.00GHz):
VanitySearch -stop -u -t 1 1tryme --> 2.078 MKeys/s
VanitySearch -stop -t 1 1tryme --> 2.771 MKeys/s
VanitySearch -stop -t 8 1tryme --> 10.758 MKeys/s
EDIT:
Search: 1Happpppy
Difficulty: 51529903411245
Base Key:89D6DCD4B58447BB26F7FAFC99C12612B4ADB97E8A0CC5133253E3CB74B6734E
Number of CPU thread: 6
GPU: GPU #0 Quadro M2200 (8x128 cores) Grid(64x128)
98.840 MK/s (GPU 88.068 MK/s) (2^31.39) [P 0.01%][50.00% in 4.3d]
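As a rough sanity check of the "[P 0.01%][50.00% in 4.3d]" figures above, here is a minimal Python sketch. It assumes every tested key matches independently with probability 1/difficulty, so the chance of at least one hit after n keys is about 1 - exp(-n/difficulty). This is only an approximation and not necessarily how VanitySearch computes its own estimate; the function names are made up for illustration.

```python
import math

def keys_for_probability(difficulty, p):
    """Keys to test to reach hit probability p (exponential approximation)."""
    return -difficulty * math.log(1.0 - p)

def probability_after(keys, difficulty):
    """Probability of at least one hit after testing `keys` keys."""
    # -expm1(-x) equals 1 - exp(-x) but stays accurate for tiny x
    return -math.expm1(-keys / difficulty)

# Values copied from the output above
difficulty = 51529903411245
rate = 98.840e6              # keys per second (98.840 MK/s)
keys_done = 2 ** 31.39       # the (2^31.39) counter

days_to_half = keys_for_probability(difficulty, 0.5) / rate / 86400
p_so_far = probability_after(keys_done, difficulty)

print(f"50% after about {days_to_half:.2f} days")
print(f"probability so far: {p_so_far * 100:.4f}%")
```

With the numbers above this gives roughly 4.2 days for 50% and a probability so far of about 0.005%, in line with the displayed 4.3d and the rounded 0.01%. The key rate fluctuates during a run, so a small gap between this estimate and the displayed one is expected.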
For comparison, BitCrack on the same GPU:
./cuBitCrack -b 128 -t 256 -p 256 1FshYsUh3mqgsG29XpZ23eLjWV8Ur3VwH
Quadro M2200 568/4038MB | 1 target
61.75 MKey/s (807,927,808 total) [00:00:21]
