I hope this guide will answer many or all of your questions about mining difficulty.
I was a miner for more than 2 years. I mined with CPUs, GPUs and ASICs and ran a lot of tests, trying to set up my rigs as close to optimal as possible. I started from zero, with no programming skills and no knowledge of cryptocoins. I learned a lot, but many times I felt the need for proper beginner guides. There is so much info about mining crypto, and very few useful, complete guides for beginners. Many miners don't understand essential notions and get less than they could out of their equipment and the money they spent. I'm just trying to be helpful, because I was one of them. Let's begin with a simple but too often ignored notion.
BTC MINING DIFFICULTY.
These notions can be applied to any PoW algo.
A miner is the owner of an account on a mining pool, to which he points his workers to mine a coin and get rewarded for the work he does.
A worker is an individual mining machine: an ASIC for BTC and many other coins, a rig or a PC for many others. If the miner uses a proxy (software running on a PC on the same local network as the workers), then the pool sees the proxy as one single powerful worker and doesn't see the actual individual machines. In this case, "worker" means the proxy, which has the combined HR of the machines pointed to it. The proxy is like a middleman between the local ASICs and the pool; it gathers the small shares (work) from the ASICs and sends the pool fewer, bigger shares in their place. Using a proxy is better for the pool and for the miners. The reason is that each pool is a server with CPUs and so on, and like any machine it has limited compute power and limited compute cycles. Small shares are sent to the pool more often; big shares are sent more rarely. So it's easier for the pool to verify fewer, bigger shares than many small ones that can choke it.
This is what many attacks on small pools with small coins do: the attacker chokes the pool with many small fake shares, like a DoS attack, and the pool can't process the good ones.
The difficulty adjustment does the "same thing" as the proxy. It forces the worker to send fewer big shares rather than many small ones. The reward for the miner is (approximately) the same, but the pool's server works less. The pool can also drop incoming shares if there are too many to process in time. These are the rejected shares.
The mining difficulty for a worker (ASIC, individual machine, proxy) mining on a BTC pool (or any other SHA-256 coin) is set as a power of 2, and the pool's algorithm usually adjusts it to a value near the worker's hashrate (HR) in GH/s, or double that value. This is the autodiff function of the pool.
The autodiff process: the pool makes guesses using lower and higher diffs in order to establish the best diff for the worker. It checks the number of shares received from the worker in a period and adjusts the difficulty until it reaches a target, like 1 share every 10 seconds. This whole process costs you shares. You can find this target yourself for a specific pool, as an exercise: let the ASIC mine on the pool for 24h or more with autodiff; open the web interface of the ASIC, take the Accepted shares and the Elapsed time (in seconds or minutes), divide one by the other and you get the "number of shares/minute" or "one share every x seconds". This is the pool's target for diff.
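The exercise above comes down to one division. Here is a minimal sketch in Python, assuming you copy the two numbers by hand from the ASIC's web interface; the function name and the sample reading are made up for illustration and do not come from a real machine.

```python
def share_interval_seconds(accepted_shares: int, elapsed_seconds: float) -> float:
    """Average time between accepted shares, in seconds."""
    if accepted_shares <= 0:
        raise ValueError("need at least one accepted share")
    return elapsed_seconds / accepted_shares


# Hypothetical reading after 24 hours of mining with autodiff.
accepted = 8640
elapsed = 24 * 3600  # Elapsed time converted to seconds

interval = share_interval_seconds(accepted, elapsed)
shares_per_minute = 60 / interval
print(f"one share every {interval:.1f} s ({shares_per_minute:.1f} shares/minute)")
```

The longer the ASIC has been running, the closer this average gets to the pool's real target, because the first minutes of autodiff guessing weigh less in the total.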
Also, many pools support custom diff, that is, a fixed difficulty the miner can set for each worker. The pool also has some rules for this custom diff, to limit the number of small shares it receives and to make an attack less likely.
Autodiff solves the problem for the pool, but custom diff has advantages too, especially for the miners. So many pools allow custom diff in order to attract miners.
From my tests, a properly set diff can give a miner 0-2% more rewards than a wrongly set one. Maybe more. This applies to every minable PoW coin. Custom diff is very useful for rentals or when you switch pools often: when you rent a rig/ASIC to mine for you, you do it for a limited time, usually 3 to 24 hours. Then you switch to another, and another... This means pool connection-disconnection-connection-.... If you don't use a custom diff to tell the pool the best difficulty at the start, it will start the autodiff process, which will cost you shares.
From my research, for coins with a 2-minute block time, like many cryptonight (CN) coins, the best target is 1 share every 15 seconds; for CN we calculate diff = HR x target_time; e.g. diff = 950 H/s x 15 s = 14250.
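The CN formula is easy to wrap in a small helper. This is only a sketch: the function name is hypothetical, and the 15-second default is just the target suggested above, not a rule of any particular pool.

```python
def cn_custom_diff(hashrate_hs: float, target_seconds: float = 15.0) -> int:
    """Fixed diff at which a worker finds about one share per target_seconds."""
    return round(hashrate_hs * target_seconds)


# The example from the text: 950 H/s x 15 s
print(cn_custom_diff(950))  # 14250
```

Check the pool's own limits (minimum and maximum fixed diff) before using the result.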
For BTC mining, the target is around 1 share every 3-10 seconds for the majority of pools. In this case (SHA-256 coins), diff is a power of 2, and usually it's around the value of the hashrate in GH/s or double that value. For a 28 TH/s machine (28000 GH/s), you have d=2^15=32768 or d=2^16=65536.
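The rule of thumb for SHA-256 can be checked with a few lines of Python. This is a sketch under two assumptions: that one share at difficulty 1 stands for roughly 2^32 hashes (the usual convention on Bitcoin pools), and that "near the hashrate in GH/s" means the first power of 2 at or above it. The function names are mine, and this is not the algorithm of any specific pool.

```python
import math

HASHES_PER_DIFF1_SHARE = 2 ** 32  # approximate work behind one difficulty-1 share


def power_of_two_diffs(hashrate_ghs: float) -> tuple:
    """First power of 2 at or above the hashrate in GH/s, and its double."""
    n = math.ceil(math.log2(hashrate_ghs))
    return 2 ** n, 2 ** (n + 1)


def seconds_per_share(hashrate_ghs: float, diff: int) -> float:
    """Expected average time between shares at the given difficulty."""
    hashes_per_second = hashrate_ghs * 1e9
    return diff * HASHES_PER_DIFF1_SHARE / hashes_per_second


low, high = power_of_two_diffs(28000)  # the 28 TH/s machine from the text
for d in (low, high):
    print(f"d={d}: one share about every {seconds_per_share(28000, d):.1f} s")
```

For the 28 TH/s example this gives 32768 and 65536, with roughly one share every 5 and 10 seconds, which falls inside the 3-10 second range mentioned above.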
The spikes and drops in HR that you see on the pool are bigger with a bigger diff, and the graph is closer to a straight line with a smaller diff. They represent the pool's approximation of your HR, based on the shares it gets. Smaller shares (shares with small diff) are sent more often, so the pool gets more info in a given period and the graph has more dots on it; the curve is smoother...
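A rough way to put numbers on this: if we assume shares arrive at random like a Poisson process (a standard statistical model, not something a pool states), then an estimate built from N expected shares wobbles by about 1/sqrt(N). The sketch below uses that assumption plus the usual 2^32 hashes per difficulty-1 share; the 10-minute window and the function names are hypothetical, and real pools average over their own windows.

```python
import math

HASHES_PER_DIFF1_SHARE = 2 ** 32  # approximate work behind one difficulty-1 share


def expected_shares(hashrate_ghs: float, diff: int, window_seconds: float) -> float:
    """Average number of shares the pool receives in one window."""
    return hashrate_ghs * 1e9 * window_seconds / (diff * HASHES_PER_DIFF1_SHARE)


def relative_noise(hashrate_ghs: float, diff: int, window_seconds: float) -> float:
    """Typical (1 sigma) relative error of a hashrate estimated from share counts."""
    return 1 / math.sqrt(expected_shares(hashrate_ghs, diff, window_seconds))


for diff in (8192, 32768, 131072):
    noise = relative_noise(28000, diff, window_seconds=600)
    print(f"diff {diff}: about +/-{noise * 100:.0f}% over a 10-minute window")
```

Each time the diff is multiplied by 4, the wobble on the graph roughly doubles, while the average stays the same.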

WOW, that's a long explanation

Thank you for reading all of this. I hope it was useful to you and that you learned something from it.
If you want to show me your appreciation and motivate me to write more useful stuff, please consider a donation. Thanks!
BTC: 1PQHFKx4iFSJzB5CWAEVSd4bJAKwF38fLj
==========
This is a list of some BTC pools and their rules for setting the difficulty. The custom diff can be set by entering the formula below in the password field, like d=65536.
f2pool.com - fd=2^n (n=19-25)
poolin.com - autodiff (target 2^n, approx. double the HR)
pool.btc.com - d=2^n
viabtc.com - d=HR*1000 (wrong)
slushpool.com - d=2^n (>128)
kano.is - autodiff (min. 442, starting at 4098, target 18 shares per minute)
Here are some useful links, if you want to investigate more:
https://bitcointalk.org/index.php?topic=274023.0
Here are some of my old tests in CN algo:
Test fix diff RIG LOKI miner.rocks:
diff 50.000 - start at 23:00-11:00 / 8307903092 - 8450153092 = 142250000 shares > 3292,8 H/s +1,97%
diff 100.001 - start at 11:15-23:15 / 8450153092 - 8589654487 = 139501395 shares > 3229,2 H/s
Test vardiff-fix diff AMD LOKI miner.rocks:
505486003-486059580 vardiff = 19426423 in 10h > 539,6
485975580-466424580 21000 = 19551000 in 10h > 543,1 +0,65%
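The H/s figures in these tests are the difference between two readings divided by the elapsed seconds. A short sketch to redo the arithmetic, assuming the long numbers are the pool's running counter for the worker and that the test windows were exactly 12 and 10 hours; the function name is mine.

```python
def average_hashrate(counter_start: int, counter_end: int, hours: float) -> float:
    """Average H/s credited by the pool between two counter readings."""
    return (counter_end - counter_start) / (hours * 3600)


# RIG test: fixed diff 50.000 vs fixed diff 100.001, 12 hours each
rig_50k = average_hashrate(8307903092, 8450153092, 12)
rig_100k = average_hashrate(8450153092, 8589654487, 12)
rig_gain = (rig_50k / rig_100k - 1) * 100
print(f"RIG: {rig_50k:.1f} vs {rig_100k:.1f} H/s, gain {rig_gain:.2f}%")

# AMD test: vardiff vs fixed diff 21000, 10 hours each
amd_vardiff = average_hashrate(486059580, 505486003, 10)
amd_fixed = average_hashrate(466424580, 485975580, 10)
amd_gain = (amd_fixed / amd_vardiff - 1) * 100
print(f"AMD: {amd_fixed:.1f} vs {amd_vardiff:.1f} H/s, gain {amd_gain:.2f}%")
```

The results should land very close to the figures quoted above; a small difference in the last digit of a percentage can come from rounding the H/s values before dividing.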