1. He makes one block containing his own transaction to the merchant and sends this block to the merchant.
2. He immediately creates a second block with a double-spend transaction and sends it to the other peers.
3. The first block will be orphaned with high probability, and so will the merchant's tx.
If the merchant accepts at that point, then it's 0-conf security; a sensible merchant will wait for additional confirmations. I tried to explain to you why what you say doesn't make sense: it's easier to double-spend by offering a higher fee reward to everyone at once.
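To make the "wait for additional confirmations" point concrete, here is a rough sketch of a merchant-side check. It is only an illustration under my own assumptions: the `Block` type, the function names, and the default threshold of 6 are hypothetical and not taken from the PoA paper or any client.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Block:
    height: int
    txids: frozenset  # ids of the transactions included in this block


def confirmations(best_chain, txid):
    """Count confirmations of txid on the merchant's current best chain.

    Returns 0 if the transaction is not on that chain, e.g. because the
    block that carried it was orphaned.
    """
    if not best_chain:
        return 0
    tip_height = best_chain[-1].height
    for block in best_chain:
        if txid in block.txids:
            return tip_height - block.height + 1
    return 0


def merchant_accepts(best_chain, txid, required_confirmations=6):
    # required_confirmations=6 is an assumed policy value, not a protocol rule.
    return confirmations(best_chain, txid) >= required_confirmations


# The block handed to the merchant pays him, but the network adopts a rival block.
attacker_view = [Block(9, frozenset()), Block(10, frozenset({"pay_merchant"}))]
network_view = [Block(9, frozenset()), Block(10, frozenset({"double_spend"}))]

print(merchant_accepts(attacker_view, "pay_merchant"))  # not enough depth yet
print(confirmations(network_view, "pay_merchant"))      # tx is gone after the reorg
```

With a policy like this, the block sent privately to the merchant never reaches the required depth once it is orphaned, so the payment is simply never accepted.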
What if the last stakeholder sends several valid blocks? For example, he connects to all peers and sends them different but valid blocks (the "cost" of this kind of action is comparatively low).
How can they choose the "right" one?
If all these different valid blocks don't offer any particular reward to specific addresses, then it's just a DoS attack, as discussed in section 5.1 of the PoA paper. If the attacker does wish to mix things up by incentivizing different participants to work on different forks, then I tried to explain to you how to accomplish this attack more easily in Bitcoin.
If you disagree or don't understand something specific in my reply, then please refer to the text that I wrote and specify why you think that it's wrong, or what's unclear about it.