In Xilinx FPGAs the PLLs are very picky about the multiplier, e.g. the multiplied (VCO) clock must be between 800 MHz and 1600 MHz. With a 100 MHz input you would never use a multiplier of 2 and a divider of 1 to get 200 MHz, because the multiplied clock would only be 200 MHz. You would use a multiplier of 10 and a divider of 5, which puts the multiplied clock at 1000 MHz, inside the valid range. It might be worth sweeping the multiplier range to see if certain values perform better.
After I get a heat sink mounted I'll do some testing like that.
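For anyone wanting to try that sweep, here is a rough Python sketch that lists the multiplier/divider pairs worth testing. It assumes the 800 to 1600 MHz window quoted above and plain integer multipliers and dividers. The function name and the `max_mult` / `max_div` limits are made up for illustration, and the real limits depend on the device family and speed grade, so check the datasheet before trusting the numbers.

```python
def valid_pll_settings(f_in_mhz, f_out_mhz,
                       vco_min_mhz=800.0, vco_max_mhz=1600.0,
                       max_mult=64, max_div=128):
    """List (multiplier, divider, vco_mhz) combinations that produce f_out_mhz
    while keeping the multiplied clock inside the allowed window.

    The default window and the max_mult / max_div limits are assumptions,
    not values taken from any particular datasheet.
    """
    settings = []
    for mult in range(1, max_mult + 1):
        vco = f_in_mhz * mult  # multiplied clock before the output divider
        if not (vco_min_mhz <= vco <= vco_max_mhz):
            continue  # multiplied clock out of range, PLL would not be valid
        for div in range(1, max_div + 1):
            if abs(vco / div - f_out_mhz) < 1e-9:
                settings.append((mult, div, vco))
    return settings


if __name__ == "__main__":
    # 100 MHz in, 200 MHz out: candidates to sweep
    for mult, div, vco in valid_pll_settings(100.0, 200.0):
        print(f"mult={mult:2d} div={div:2d} multiplied clock={vco:.0f} MHz")
```

For 100 MHz in and 200 MHz out this skips the multiplier 2 / divider 1 case and includes multiplier 10 / divider 5, along with the other pairs that land in the window, which gives a short list of settings to compare on real hardware.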