Hey guys, can anyone point me in the right direction for C++ with GPU optimization and CUDA integration/implementation?
C++ alone won't help you; the GPU doesn't care about any fancy object-oriented programming. Besides, C++ isn't a magical unicorn that solves everything, and it's not a perfect language. Anyone who has spent more than 20 years on it will tell you they're still learning it. If they don't, they're either lying or unaware that there's always something new to learn about it, which makes them really bad developers.
To learn what you want, it all really depends on your background. Try this:
1. Run a hello-world program that prints from the GPU (see the first sketch after this list).
2. Decompose everything that just happened before you delve into hacking advanced cryptographic problems:
a. Understand what the host code did, and draw the line between your C++ (or whatever) code and the GPU code.
b. Understand what the GPU did: dump the CUBIN, check the SASS ops, learn about the registers, shared memory, and constant memory; basically, read the manual (see the second sketch after this list).
c. Port your hello-world host program to a different programming language (maybe Python), but load and run the exact same kernel as the first time. Notice how it has nothing to do with any C++ this time around, since you're just loading a CUBIN onto the GPU (see the third sketch after this list).
d. Now you're a pro - apply for lead tech jobs at emerging de-fi startups. Just kidding. You're on your way to fighting compiler bugs, chasing unexpected behaviour, spending sleepless nights scrolling through NVIDIA forums, and profiling a dozen variations of an algorithm to see which one runs faster.
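To make step 1 concrete, here's a minimal sketch of a hello world that prints from the GPU, with comments marking the host/device line from step 2a. The file name (hello.cu), the kernel name, and the launch shape are just illustrative choices; compile with something like `nvcc hello.cu -o hello`.

```cpp
#include <cstdio>

// Device code: this function runs on the GPU. extern "C" keeps the
// kernel's name unmangled, which makes it easy to find inside the
// CUBIN later (steps 2b and 2c).
extern "C" __global__ void hello()
{
    printf("Hello from GPU thread %d\n", threadIdx.x);
}

int main()
{
    // Host code: this runs on the CPU. The <<<blocks, threads>>>
    // launch syntax is exactly where your C++ ends and the GPU begins.
    hello<<<1, 4>>>();

    // Kernel launches are asynchronous; synchronize so the GPU's
    // printf buffer is flushed before the process exits.
    cudaDeviceSynchronize();
    return 0;
}
```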
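For step 2b, the CUDA toolkit already ships the tools you need. A rough sequence (the sm_86 architecture flag is an assumption, match it to your GPU):

```
# Compile straight to a CUBIN for one specific GPU architecture.
nvcc -arch=sm_86 -cubin hello.cu -o hello.cubin

# Disassemble it to SASS, the GPU's actual machine code.
cuobjdump -sass hello.cubin

# Per-kernel register, shared, and constant memory usage.
cuobjdump -res-usage hello.cubin
```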
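And for step 2c, the point is that the CUBIN is just a binary blob you load through the driver API; the same call sequence exists from Python via bindings like pycuda or cuda-python. Here's a sketch against the driver API in plain C++ (link with -lcuda; error checking omitted, and "hello.cubin"/"hello" are the names from the sketches above):

```cpp
#include <cuda.h>   // CUDA driver API, not the runtime

int main()
{
    cuInit(0);

    CUdevice dev;
    cuDeviceGet(&dev, 0);

    CUcontext ctx;
    cuCtxCreate(&ctx, 0, dev);

    // Load the binary produced by `nvcc -cubin`. No C++ in sight:
    // the GPU only ever sees this blob.
    CUmodule mod;
    cuModuleLoad(&mod, "hello.cubin");

    // Works because the kernel was declared extern "C"; otherwise
    // you'd need the mangled name here.
    CUfunction fn;
    cuModuleGetFunction(&fn, mod, "hello");

    // 1 block of 4 threads, no shared memory, default stream,
    // no kernel arguments.
    cuLaunchKernel(fn, 1, 1, 1, 4, 1, 1, 0, nullptr, nullptr, nullptr);
    cuCtxSynchronize();

    cuModuleUnload(mod);
    cuCtxDestroy(ctx);
    return 0;
}
```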
Good luck.
Even this is super advanced for me, but I'll take the opportunity and brute-force an implementation tonight. Thank you for the suggestion.
You mentioned it "really depends on your background". Did you mean coding background? If so, absolutely none. I have a medical background, specifically as a paramedic. A dead person is also known as a "code", so the only "coding" I ever did was on dead people, and ironically, this project may be the death of me.