Board: Mining (Altcoins)
Topic: Re: VectorDash - Rent GPUs to AI researchers. $7.68 per 1080ti per day
Posted by HansHagberg on 10/07/2018, 15:24:54 UTC
Quote from: Octominer
Hey hey, Octominer here. We manufacture and sell specialised mining hardware for crypto miners.

We are currently developing our next-gen riser-free motherboard, which will also target AI and rendering workloads rather than mining alone.

I would love to hear feedback from you guys on what kind of motherboard you think would be ideal for Vectordash or AI/rendering rigs. We are currently considering a riser-free motherboard with x16 slots based on either the AMD Threadripper X399 platform or the AMD EPYC server CPU platform. The advantage of Threadripper is that it is cheaper and its CPU clock speeds are much higher. The advantage of EPYC is that it has 128 PCIe lanes compared to Threadripper's 64, plus more cores. That makes it possible to run 7 GPUs at full x16 speed on the EPYC board, while the Threadripper board can support at most 3 GPUs at full x16 speed, or 7 GPUs at x8.

I would love to hear some real-life feedback on x16 vs x8 PCIe speed for rendering and AI tasks. From what I understand, x8 is sufficient for OTOY's OctaneRender, with only a 1-2% performance difference between x8 and x16. Does anybody know what the x8 vs x16 PCIe difference looks like for AI tasks and Vectordash?


I hope to start verifying the performance difference between x16 and x8 on real AI workloads within a few weeks.
I'm also looking for boards with plenty of well-spaced slots at full bandwidth, and I'm not alone.
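
The part of a training step that PCIe width directly affects is the host-to-device copy, so a useful first test is to measure that in isolation before timing full workloads. A minimal sketch, assuming PyTorch with a CUDA GPU (the function name and buffer sizes are mine, purely illustrative):

Code:
# Measure host-to-device copy bandwidth, the component of a training
# step that PCIe x8 vs x16 actually limits. Illustrative sketch.
import time
import torch

def h2d_bandwidth_gib_s(size_mib=256, iters=20):
    # Pinned host memory is required to reach the full PCIe transfer rate.
    src = torch.empty(size_mib * 1024 * 1024, dtype=torch.uint8, pin_memory=True)
    dst = torch.empty_like(src, device="cuda")
    torch.cuda.synchronize()
    start = time.perf_counter()
    for _ in range(iters):
        dst.copy_(src, non_blocking=True)
    torch.cuda.synchronize()
    elapsed = time.perf_counter() - start
    return size_mib * iters / 1024 / elapsed  # GiB/s

if __name__ == "__main__":
    print(f"host-to-device: {h2d_bandwidth_gib_s():.1f} GiB/s")

For large pinned transfers, PCIe 3.0 tops out around 12-13 GB/s at x16 and roughly half that at x8. Whether the halving shows up in end-to-end training time depends on how well the framework overlaps copies with compute, which is exactly what the real-workload tests should reveal.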

I suggest Threadripper plus PCIe switches (e.g. PLX), similar to what was done on some X99 boards in the past.
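
For context on why lane count drives the platform choice, here is a back-of-envelope lane budget for the configurations Octominer listed (GPU counts and lane totals from their post, not verified specs; a PLX switch changes this picture by letting several x16 slots share fewer upstream lanes):

Code:
# Back-of-envelope PCIe lane budget for the configurations discussed above.
# Lane totals and GPU counts come from the post, not from verified specs.
PLATFORM_LANES = {"Threadripper X399": 64, "EPYC": 128}

CONFIGS = [
    ("EPYC", 7, 16),               # 7 GPUs at x16
    ("Threadripper X399", 3, 16),  # 3 GPUs at x16
    ("Threadripper X399", 7, 8),   # 7 GPUs at x8
]

for platform, gpus, width in CONFIGS:
    used = gpus * width
    spare = PLATFORM_LANES[platform] - used
    print(f"{platform}: {gpus} GPUs at x{width} -> {used} lanes used, "
          f"{spare} spare for NVMe, NIC, etc.")

Note that 64 lanes would naively fit 4 GPUs at x16; the limit of 3 quoted above implies roughly 16 lanes are held back for chipset, storage, and networking.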
Another important factor for AI use cases is provisioning for a large amount of system RAM: typically you want about 2x the total memory on the GPUs, so 7 GPUs with 11 GB each (77 GB total) calls for roughly 154 GB of system RAM.
Storage is another important parameter, so PCIe bandwidth has to be reserved for it as well. If the models can't be fed data from storage as fast as the GPUs process it, all that compute is wasted. A rough sizing sketch for both points follows.
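
To make both of those concrete, here is a rough sizing calculation for a 7x 1080 Ti rig; every throughput figure is an illustrative assumption, not a measurement:

Code:
# Rough RAM and storage sizing for a 7x 1080 Ti rig.
# The per-GPU throughput and sample size are assumptions for illustration.
gpus = 7
gpu_mem_gb = 11                  # 1080 Ti memory
samples_per_sec_per_gpu = 200    # assumed model throughput
bytes_per_sample = 150 * 1024    # assumed ~150 KiB per training sample

# System RAM: the 2x-GPU-memory rule of thumb from above.
ram_gb = 2 * gpus * gpu_mem_gb
print(f"system RAM target: {ram_gb} GB")

# Storage: sustained read bandwidth needed to keep every GPU fed.
read_bw = gpus * samples_per_sec_per_gpu * bytes_per_sample
print(f"sustained read bandwidth: {read_bw / 1024**2:.0f} MiB/s")

Even these conservative numbers land around 200 MiB/s of sustained reads. Sequential, that is easy; random access over millions of small files is not, which is why lanes reserved for NVMe storage matter.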