The Fistful of Bitcoins paper [1] by Meiklejohn et al. has some nice address-clustering heuristics, namely:
HEURISTIC 1. If two (or more) addresses are inputs to the same transaction, they are controlled by the same user; i.e., for any transaction t, all pk ∈ inputs(t) are controlled by the same user.
HEURISTIC 2. The one-time change address is controlled by the same user as the input addresses; i.e., for any transaction t, the controller of inputs(t) also controls the one-time change address pk ∈ outputs(t) (if such an address exists).
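To make the question concrete, here is a minimal Python sketch of how I understand the two heuristics, using a union-find structure to merge addresses into clusters. Everything here is an assumption for illustration: the transaction format (dicts with `inputs`, `outputs`, and `coinbase` keys, given in chain order) and the names `UnionFind`, `one_time_change`, and `cluster_addresses` are hypothetical, and the conditions used to detect a one-time change address are my paraphrase of the paper's definition, without any of its later refinements.

```python
from collections import defaultdict


class UnionFind:
    """Disjoint-set structure over address strings."""

    def __init__(self):
        self.parent = {}

    def find(self, x):
        self.parent.setdefault(x, x)
        root = x
        while self.parent[root] != root:
            root = self.parent[root]
        # Path compression: point every node on the walk at the root.
        while self.parent[x] != root:
            self.parent[x], x = root, self.parent[x]
        return root

    def union(self, a, b):
        ra, rb = self.find(a), self.find(b)
        if ra != rb:
            self.parent[rb] = ra


def one_time_change(tx, seen):
    """Return the one-time change address of tx, or None.

    Assumed conditions (my reading of the paper): the transaction is not
    a coin generation, no output address is also an input (no self-change),
    and exactly one output address has never appeared before.
    """
    if tx["coinbase"]:
        return None
    outputs = set(tx["outputs"])
    if outputs & set(tx["inputs"]):
        return None
    fresh = [addr for addr in outputs if addr not in seen]
    return fresh[0] if len(fresh) == 1 else None


def cluster_addresses(transactions):
    """Cluster addresses; transactions must be in chain order."""
    uf = UnionFind()
    seen = set()  # addresses that appeared in earlier transactions
    for tx in transactions:
        inputs, outputs = tx["inputs"], tx["outputs"]
        for addr in inputs + outputs:
            uf.find(addr)  # register every address, even singletons
        # Heuristic 1: all input addresses share one controller.
        for addr in inputs[1:]:
            uf.union(inputs[0], addr)
        # Heuristic 2: the one-time change address joins the inputs' cluster.
        change = one_time_change(tx, seen)
        if change is not None and inputs:
            uf.union(inputs[0], change)
        seen.update(inputs)
        seen.update(outputs)
    clusters = defaultdict(set)
    for addr in list(uf.parent):
        clusters[uf.find(addr)].add(addr)
    return list(clusters.values())
```

With a transaction that spends from addresses A and B and pays a previously seen address M plus a fresh address C, this groups A, B, and C together and leaves M in its own cluster.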
Are there any others that are particularly effective?
Is it possible to bootstrap my clustering? For instance, if some people have already associated a certain cluster with Kraken or Satoshi Dice, is that information publicly available?
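[1] S. Meiklejohn et al., "A Fistful of Bitcoins: Characterizing Payments Among Men with No Names", IMC 2013. https://cseweb.ucsd.edu/~smeiklejohn/files/imc13.pdf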