Post
Topic
Board Development & Technical Discussion
Topic OP
Messing with queries
by
Hepatizon
on 19/07/2010, 05:56:44 UTC
The Bitcoin network accepts blocks as valid if enough of the nodes accept the block as valid.  The inherent security in this is that an attacker trying to manipulate which blocks are accepted as valid or not would need computing power approaching 50% of the entire network.  But it is infeasible to actually query all or even most of the entire network (or at least it will be once the Bitcoin network becomes larger) so in reality each node must take a sample of the network.  If the sample is large enough and randomly distributed over the entire network, there is no problem.  But if a given node preferentially connects to a smaller subset of nodes, then that node is vulnerable - an attacker need only compromise that smaller subset.  Obviously it is in the interests of each node operator to ensure that his node is truly unbiased in determining which node to get block data from.

So, how does the Bitcoin client determine which nodes to connect to?  I haven't gone through the source thoroughly enough to figure it out yet, but as far as I can tell it either asks the IRC channel who is online and queries those nodes, or sends a request to IRC directly and then gets a response from an arbitrary node?  I think?

I have two main concerns:
1.  Is there any 'first-responder' bias in determining which node to ask for block data from?  If there is, then a malicious user with access to a very low-latency connection could disrupt the network by responding extremely quickly to any request with some form of false data - perhaps simply by saying that no new block has been hashed yet - and get disproportionate influence over the consensus this way.  If HighSpeedCheater can supply information so fast that he responds first 99% of the time to your query, then you need to do 68-69 queries just to get 50/50 odds that you reach a single node that isn't a HighSpeedCheater sockpuppet - and that node itself might be fooled by HighSpeedCheater and unwittingly be repeating false information.  Probably unrealistic for an individual to pull this off, but I can see a big, widely distributed system like a botnet or Google being able to do it.  Even then, nodes could get around it just by checking everything 140x as much as they used to.

2.  Can a 'broadcaster' node influence which nodes it responds to?  For example, could Alice set up her system so that every time Bob asks for the most recent block chain, she is the one (or disproportionately more likely to be the one) to answer him?  If so, then Bob is basically at Alice's mercy unless he can figure out which addresses Alice is using faster than she can create new sockpuppets.

There's also the possibility that the entire transaction chain could fork if the latency between any two approximately equal computing groups becomes larger than the average block-creation time.  Suppose that there is roughly the same about of computing power dedicated to hashing Bitcoin blocks in two groups, say the US and Russia.  If - and I can only see this happening if the time between new blocks is very short - it takes more time for a US node to ask a Russian node what the longest block chain is (and vice versa) than it takes for the Russian (or US) Bitcoin network to generate a new block, then chains could develop local to Russia and the US.  Suppose a node in the US and Russia successfully hash the next block at about the same time, call them blocks 1001us and 1001ru.  The network resolves the discrepancy by seeing whether 1002us or 1002ru gets hashed first.  But since there's a delay in getting information from one country to the other, every node in the US that checks a Russian node for their longest chain length gets out-of-date information, so it will see an older, shorter chain than the chain it gets from local, lower latency nodes.  And the same thing happens to the Russians, with the end effect that American nodes keep working on the branch containing 1001us because they can only see an older, shorter version of the 10001ru branch, while the Russians work on theirs for the same reason.  As long as the chain grows faster than the speed of information, neither chain will ever dominate the other.