Board: Altcoin Discussion
Re: [ANN] MemoryCoin | CPU Only | Very Deflationary
by AnonyMint on 18/08/2013, 05:45:00 UTC

It doesn't need to be linear, because the FLOPS cost in GPUs is so much lower than in a CPU system.

It appears to me that what happens in a GPU (and this is also why Intel's hyperthreading is faster than just 4 hardware cores) is that, when there are many logical threads, a thread blocking on main memory latency is not a factor, because some other thread whose main memory access has already been loaded into cache can run instead. Thus the GPU is always able to achieve its 200+ GB/s main memory throughput, because the latency is masked by the high probability that at least one of the numerous threads is ready to run.
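As a back-of-envelope sketch of how much concurrency that masking takes, Little's law says requests in flight = throughput × latency. Only the 200 GB/s figure comes from the paragraph above; the 400 ns latency and 128-byte transaction size below are assumptions picked for illustration, not measurements of any particular card.

```python
def requests_in_flight(bandwidth_bytes_per_s: int,
                       latency_ns: int,
                       bytes_per_request: int) -> int:
    """Little's law: outstanding requests needed to keep the memory bus busy."""
    # Bytes that must be "in the pipe" at any instant to sustain the bandwidth.
    bytes_in_flight = bandwidth_bytes_per_s * latency_ns // 1_000_000_000
    # Round up to whole requests (ceiling division on integers).
    return -(-bytes_in_flight // bytes_per_request)


# 200 GB/s is from the post; 400 ns and 128 bytes are ASSUMED values.
needed = requests_in_flight(200_000_000_000, 400, 128)
print(needed)
```

Under those assumed numbers, several hundred memory requests have to be outstanding at once, which is in the range a GPU can supply with thousands of threads and far more than a handful of CPU hardware threads can.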


So it's not just that the improvements from adding cores are not linear . . . it's that they appear to be converging to an upper value, suggesting that they're hitting a limitation other than cycles per second. I'm assuming that's related to memory access, either cache memory size or main memory access. I'm also assuming that a GPU will hit that limit too, and that any increased performance from a wider memory bus, speed or cache would be offset by the slower cores.

I am not sure until I run some tests, but my strong belief (based on good analysis) is that you are hitting memory latency in your CPU with the 2MB Scrypt, but the GPU will not be. Thus it is going to toast your CPU. This is not a minor problem; rather, I conjecture it is a total failure to get what you intended. The reason I suspect this is that the GPU can run 6 × 1024 MB ÷ 2 MB = 3072 threads and thus entirely mask away the latency, whereas you do nothing to mask the latency on the CPU, as far as I can see in your code.
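The thread count is just a division of GPU memory by the per-thread scratchpad, and it works out to 3072. A minimal sketch, assuming a 6 GB card and that nothing else occupies GPU memory (the function name is made up for illustration):

```python
def max_resident_threads(gpu_memory_mb: int, scratchpad_mb: int) -> int:
    """How many per-thread scratchpads fit in GPU memory at once."""
    # Integer division: a partial scratchpad cannot host a thread.
    return gpu_memory_mb // scratchpad_mb


# ASSUMED: 6 GB of GPU memory, all of it available for scratchpads.
threads = max_resident_threads(6 * 1024, 2)
print(threads)  # 3072
```

This is an upper bound on resident scratchpads only; how many threads the hardware actually schedules at once depends on the card.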

You make other good points and I don't have the time to give them the attention they deserve immediately. I'm sure there are improvements to be made in the hashing algorithm to make it more GPU resistant, but I'm more concerned with having one that is 'good enough' rather than a perfect one that might fork the coin or cause loss of momentum.

I think yours is worse than Litecoin's by a factor of 10 or 100, i.e. the CPU would be 100 to 1000 times slower than the GPU, since with Litecoin the CPU is already 10 times slower than the GPU! At least Litecoin stays within L2 cache, and it is not just the memory bandwidth but the TLB and L2 cache reloads, which run 100+ cycles, that kill you.

You appear to have a serious fail here, but I would need to build a CPU miner to test my hypothesis.


I am going to avoid talking about your holistic design because we have some disagreement,

Yes, there's a lot we disagree on. Have you considered implementing your ideas in a new coin? That's what I did when I decided all the other coins had it wrong!

Yes I am, but as I said, if I can help you and at the same time benefit from having my idea tested earlier, then it is win-win for both of us, in spite of our disagreement on orthogonal features of MemoryCoin. There is no need for me to comment further about your grants. Best is you try it, then we all learn from the result. I wouldn't copy your grants, nor your 2% deflationary coin, so there is no competition coming from me against MemoryCoin on those features. I am delighted you introduced a coin with continuous inflation (albeit only 2%) and not ridiculously high as in Inflatacoin, so if you get adoption then it opens the door for a coin with a slightly higher rate of inflation.

If instead you don't want to share synergy or simply have other priorities demanding your attention, then that is fine too.