ah-- no interest?

It requires no hardware modification. I was trying to compile cgminer for my Galaxy S4 and managed to do it, but the kernel wouldn't be accepted, so I modified the scrypt kernel code. The good news is what you see above: a higher hashrate. You might even need to tune it down (like I did) to reach a good hash/power ratio. The bad news is that the modification is specific to each configuration, so the kernel code I have only fits mine. Bear in mind that cgminer can do a lot of things and can be configured to suit any GPU settings, but making the scrypt kernel concentrate on one specific configuration can yield higher hashrates. So here's what I did:
- edit scrypt130511.cl
- remove looping structures as much as possible (unroll them by hand)
- use hardcoded constants as much as possible
- minimize arithmetic operations (addition, subtraction, multiplication, and division) by calculating the exact values from hardcoded settings (thread concurrency, worksize, gap)
- minimize calls to custom functions by inlining their bodies into the calling function
- save!
- remove all scrypt130511......bin files, because these are the compiled kernels that cgminer uses for the current settings, and they have to be rebuilt from the edited source
- start cgminer
The readability and maintainability of the code are of course affected, but we don't need those here. We want the kernel to be as efficient as possible!
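To make the steps above more concrete, here is a minimal sketch in plain C (not the actual scrypt130511.cl code) of what that kind of specialization looks like. Everything in it is an assumption for illustration: the function names, the tiny mixing round, the slot formula, and the settings (thread concurrency 8192, gap 2, 2 rounds) are all made up. The generic versions take their settings at run time, loop, and call a helper. The fixed versions have the loop unrolled, the helper inlined, and the division and modulo replaced by shifts and masks worked out from the hardcoded settings. Whether this actually helps depends on your compiler and device, since some compilers already do much of it on their own.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical settings, for illustration only; substitute your own. */
#define CONCURRENCY 8192u  /* assumed thread concurrency (power of two) */
#define GAP         2u     /* assumed lookup gap (power of two) */
#define ROUNDS      2u     /* assumed fixed number of mixing rounds */

/* ---- Generic style: helper function, run-time parameters, a loop ---- */
static uint32_t rotl32(uint32_t x, unsigned n)
{
    return (x << n) | (x >> (32u - n));
}

static void mix_generic(uint32_t x[4], unsigned rounds)
{
    for (unsigned r = 0; r < rounds; r++) {
        x[1] ^= rotl32(x[0] + x[3], 7);
        x[2] ^= rotl32(x[1] + x[0], 9);
        x[3] ^= rotl32(x[2] + x[1], 13);
        x[0] ^= rotl32(x[3] + x[2], 18);
    }
}

static uint32_t slot_generic(uint32_t gid, uint32_t i,
                             uint32_t concurrency, uint32_t gap)
{
    return (i / gap) * concurrency + (gid % concurrency);
}

/* ---- Specialized style: no loop, no helper calls, no parameters ---- */
static void mix_fixed(uint32_t x[4])
{
    uint32_t a = x[0], b = x[1], c = x[2], d = x[3], t;

    /* round 1, rotations written out in place */
    t = a + d; b ^= (t << 7)  | (t >> 25);
    t = b + a; c ^= (t << 9)  | (t >> 23);
    t = c + b; d ^= (t << 13) | (t >> 19);
    t = d + c; a ^= (t << 18) | (t >> 14);
    /* round 2 */
    t = a + d; b ^= (t << 7)  | (t >> 25);
    t = b + a; c ^= (t << 9)  | (t >> 23);
    t = c + b; d ^= (t << 13) | (t >> 19);
    t = d + c; a ^= (t << 18) | (t >> 14);

    x[0] = a; x[1] = b; x[2] = c; x[3] = d;
}

static uint32_t slot_fixed(uint32_t gid, uint32_t i)
{
    /* i / 2 -> i >> 1, * 8192 -> << 13, % 8192 -> & 8191 */
    return ((i >> 1) << 13) | (gid & 8191u);
}

/* Checks that both styles agree; returns the number of cases compared. */
int check_equivalence(void)
{
    int checked = 0;
    for (uint32_t gid = 0; gid < 20000u; gid += 37u) {
        for (uint32_t i = 0; i < 1024u; i += 5u) {
            uint32_t g[4] = { gid, i, 0x9e3779b9u, gid ^ i };
            uint32_t f[4] = { g[0], g[1], g[2], g[3] };
            mix_generic(g, ROUNDS);
            mix_fixed(f);
            assert(g[0] == f[0] && g[1] == f[1]);
            assert(g[2] == f[2] && g[3] == f[3]);
            assert(slot_generic(gid, i, CONCURRENCY, GAP) == slot_fixed(gid, i));
            checked++;
        }
    }
    return checked;
}
```

Calling check_equivalence() from a main() is a cheap way to confirm that a hand-specialized version still computes the same thing as the generic one before you start timing it. Note that the fixed versions are only valid for the exact settings they were written for, which is the trade-off described above.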