... it seems you all don't get it:
You can't escape the difficulty bomb! Do your own research or cry later!

This has nothing to do with the diff bomb. It has to do with the hardware architecture of the cards. It started now because the DAG size has passed 2GB. Simplified, you can think of part of the DAG as now physically ending up on a different IC. This makes the memory controller switch the IC it reads from more often. The larger the DAG, the bigger the part of it that ends up on a different IC, and the more switching the controller has to do.
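To put rough numbers on that simplified picture, here is a minimal sketch. It assumes a single hard boundary at 2 GiB and uniformly random reads across the DAG; both are simplifications for illustration, not measured behaviour of the cards. The function names are made up for this example.

```python
GIB = 2**30

def fraction_beyond_boundary(dag_bytes, boundary_bytes=2 * GIB):
    """Share of the DAG that lies above the boundary (0.0 if it fits below it)."""
    if dag_bytes <= 0:
        raise ValueError("dag_bytes must be positive")
    return max(0, dag_bytes - boundary_bytes) / dag_bytes

def switch_probability(dag_bytes, boundary_bytes=2 * GIB):
    """Chance that two consecutive reads fall on opposite sides of the boundary.

    Assumption: reads are independent and uniformly distributed over the DAG.
    """
    p = fraction_beyond_boundary(dag_bytes, boundary_bytes)
    return 2 * p * (1 - p)
```

Under these assumptions the switch probability is zero while the DAG fits below the boundary and rises with every epoch after that, which would match a slowdown that gets worse with each new DAG.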
Polaris (and Tonga) both have 4 channels, with 2 GDDR5 chips per channel, making a total of 8 chips. Each chip does 32-byte burst transfers, so 2 chips provide a single 64-byte cache line. The memory layout switches channels every 256 bytes (4 cache lines).
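The layout described above can be sketched as a simple address-to-channel mapping. The sizes come from the post; the plain round-robin channel order is an assumption (the real controller may hash or swizzle address bits), and the function names are hypothetical.

```python
CACHE_LINE_BYTES = 64       # 2 chips x 32-byte burst, per the post
CHANNEL_STRIDE_BYTES = 256  # layout switches channels every 256 bytes, per the post
NUM_CHANNELS = 4            # Polaris / Tonga, per the post

def channel_for_address(addr):
    """Channel serving a byte address, ASSUMING plain round-robin interleaving."""
    return (addr // CHANNEL_STRIDE_BYTES) % NUM_CHANNELS

def cache_line_in_stride(addr):
    """Which of the 4 cache lines inside its 256-byte stride the address falls in."""
    return (addr % CHANNEL_STRIDE_BYTES) // CACHE_LINE_BYTES
```

With this mapping, bytes 0 to 255 sit on channel 0, bytes 256 to 511 on channel 1, and the pattern wraps back to channel 0 at byte 1024.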
http://developer.amd.com/tools-and-sdks/opencl-zone/amd-accelerated-parallel-processing-app-sdk/opencl-optimization-guide/#50401334_pgfId-472173
AMD docs say the cards use a direct-mapped cache, which means TLB thrashing can't be the problem since there is no TLB. It sounds a lot like the Pitcairn performance issues as the memory working set grows beyond 1GB (except the issue starts at 2GB with GCN3 devices). I haven't had much time for coding over the past few months, but hopefully I'll have some time over the summer to figure out what's really going on here.
Could this have anything to do with the memory straps? Maybe with stock straps it doesn't slow down with every new DAG.