Do you also know if you want to check if a algo is memory limited, you can go into GPUZ and check out the MCU (memory controller unit) and see the load on it?
I think this is wrong. Although I primarily mine using Linux, I have a Windoze box that I use for testing cards. GPU-z appears to show only external bus bandwidth use (to the GDDR), and not the utilization of the bandwidth between the controller and core. In practical terms, a miner kernel may be using 200GB/s of memory bandwidth, but a significant percentage of it can be from the L2 cache. The collision counter tables in SA5 would be an example of this.
Do you have a source for this hypothesis? In all memory restricted algos that correlates to MCU usage. Pretty sure it pertains to any sort of memory overload, bandwidth or bus width...