Now here's where I got confused. Miners have to guess this number to find a compatible hash for the header... but how do they get to change the nonce in the header, since it's hashed along with the rest of the block header data? I thought that would give a different hash. I'm just completely confused here, because the whole guessing thing seems like magic to me. Or does the hardware get to do it all?
The nonce is a header field that miners are free to change so that the hash comes out different; anything that changes the hash input (the block header) changes the resulting block hash. Since hashing the exact same input always gives the exact same hash, every attempt needs a nonce that has not been tried yet, and the simplest way to ensure that is to increment it on every iteration. Otherwise, reusing the same nonce with the same header fields would just recompute a hash you already have. Not all implementations add 1 to the nonce, and certain clients may choose to skip nonces, but the result should be that the nonces are distinct as long as the rest of the block header is the same.
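To make the "guessing" less magical, here is a minimal Python sketch of the loop. It assumes Bitcoin's double SHA-256 over an 80-byte header whose last 4 bytes are the little-endian nonce. The names `block_hash` and `mine`, the all-zero 76-byte prefix, and the toy target are made up for illustration and are not taken from any real client.

```python
import hashlib
import struct


def block_hash(header_prefix: bytes, nonce: int) -> bytes:
    """Double SHA-256 of the 76-byte prefix plus a 4-byte little-endian nonce."""
    header = header_prefix + struct.pack("<I", nonce)
    return hashlib.sha256(hashlib.sha256(header).digest()).digest()


def mine(header_prefix: bytes, target: int, max_nonce: int = 2**32):
    """Try nonces 0, 1, 2, ... until the hash is at or below the target."""
    for nonce in range(max_nonce):
        digest = block_hash(header_prefix, nonce)
        # The digest is compared to the target as a little-endian 256-bit integer.
        if int.from_bytes(digest, "little") <= target:
            return nonce, digest
    return None  # nonce space exhausted; a real miner would change another field


# Hypothetical inputs: an all-zero prefix and a very easy target
# (roughly 1 in 256 hashes qualifies, so this finishes almost instantly).
prefix = bytes(76)
easy_target = 1 << 248
result = mine(prefix, easy_target)
```

Nothing is being "guessed" in a clever way here: the loop simply tries one distinct nonce after another, and the hardware's only job is to run this loop as fast as possible.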
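An interesting note: SHA256 processes its input in chunks of 64 bytes, so the 80-byte header splits into [nVersion, PrevHash, first 28 bytes of merkle root] + [last 4 bytes of merkle root, timestamp, difficulty, nonce and padding]. Since the nonce sits in the second chunk, the intermediate state after the first chunk stays the same across attempts. Hence, miners only need to repeatedly hash the second chunk to find a nonce which yields a hash at or below the target.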
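The chunk split can be sketched in Python as well. This is only an illustration: `hashlib` does not expose the raw SHA-256 midstate, so the sketch uses the hash object's `copy()` method as a stand-in for caching the work done on the first 64 bytes. The header bytes and the helper name `hash_with_midstate` are hypothetical.

```python
import hashlib
import struct

# Hypothetical 76 bytes of header data (everything except the nonce).
header76 = bytes(range(76))

# First 64-byte chunk: nVersion, PrevHash, first 28 bytes of the merkle root.
first_chunk = header76[:64]
# Start of the second chunk: last 4 merkle root bytes, timestamp, difficulty.
tail12 = header76[64:]

# Absorb the first chunk once; this object plays the role of the cached state.
midstate = hashlib.sha256(first_chunk)


def hash_with_midstate(state, tail: bytes, nonce: int) -> bytes:
    """Finish the first SHA-256 from the cached state, then hash the result again."""
    h = state.copy()  # leave the cached state untouched for the next nonce
    h.update(tail + struct.pack("<I", nonce))  # hashlib adds the padding itself
    return hashlib.sha256(h.digest()).digest()


# Check that the shortcut matches hashing the full 80-byte header directly.
for nonce in range(3):
    full_header = header76 + struct.pack("<I", nonce)
    direct = hashlib.sha256(hashlib.sha256(full_header).digest()).digest()
    assert hash_with_midstate(midstate, tail12, nonce) == direct
```

Only the 16 bytes after the first chunk are fed in per attempt, which is the saving described above; dedicated mining hardware applies the same idea at the level of the SHA-256 compression function.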