Re: Sia: Siacoin (scn) and Siastock (stk)

Quote from: astor on September 02, 2014, 09:24:39 PM

Just read your whitepaper. I am impressed by the guts, but not by your solution.

Maybe some of these issues have been discussed in this long, long thread, but:

- Your quorum is not a quorum when put in a tree!
- Assuming that attackers are randomly distributed is nonsense (sorry). The sybil attack is as old as time.
- Thus your binomal distribution calculations are simply wrong, because the assumptions are wrong (sybil attack).
- The whitepaper does not describe what the consensus tree looks like. Judging from the calculations in section 10, it seems like it could be a binary tree.
- A honest party cannot prove dishonesty with a properly programmed attacker using a simulated storage and controlling the (attacked) random generator (section 10).
- The number of messages required to keep consensus is very high.
- Did I mention sybil attacks? There is no protection!
- Section 6, what the cost of fetching a block from another participant needs to be is completely wrong in the whitepaper. It is off by 5 orders of magnitude! (by 100.000x!). This p0wns the consensus protocol, or this will indeed be an incredibly expensive storage platform.
- You allow changing the voter set in the consensus protocol without having consensus it seems.
- The proof of storage algorithm is way too naive when much, better algorithms exist.
- Nothing in the protocol requires a node from EVER SERVING DATA!
- Who is going to "fine" misbehaving nodes? The only thing you will see are invalid quorums where data is held hostage.

This is like a swiss cheese! Who reviewed this?

Alright lets go through this bit by bit. First off, the whitepaper hasn't been updated since May, so there are a few things I'll say here that are probably in conflict with what's actually written in the whitepaper. But most of the problems you bring up are not issues, you've merely misunderstood how we've addressed the problem.

First, and most important, the sybil attacks:

We don't assume that attackers are randomly distrubuted. We randomly distribute nodes as they enter the network. I actually don't see where this is discussed in the whitepaper- though it was in a previous version I may have forgotten to add it to this one. When a sibling joins the network, they replace a random existing sibing. The sibling that got replaced goes into the new quorum. This ensures that attackers are randomly distributed throughout the network - you have no control over where you are added when you join.

This is one of the reasons you have to pay a fee when joining the network. It makes it expensive to join multiple times and pick only the random slots that favor you. A large attacker with lots of resources can get lots of rerolls, but each reroll will be expensive and they need exponetentially more rerolls as they try to stack a particular quorum with a higher and higher percentage of corrupt nodes. Does this make sense? The attacker needs the down payments to keep rerolling, but they also need enough storage once they decide to stay in a quorum, because they will be performing storage proofs.

====

What does the consensus tree look like? It'll probably be a binary tree, but that hasn't actually been decided. Could be 4 children per parent, or more. But probably just a binary tree.

====

For section 10, I'm not exaclty sure what problem you are discussing, but if an attacker creates a dishonest block, then an honest host inside of the attackers quorum will be able to prove that the block is dishonest (by showing a snapshot, and the previous honest blocks). An honest host can prove that a particular block is dishonest once a dishonest block has been created.

====

The number of messages required to keep consensus is high: yes unfortunately. At 128 nodes per quorum, assuming ~4kb per heartbeat, each participant will be doing on average somewhere around 512kb of bandwidth, and in attack cases could be doing up to 64mb of bandwidth per block. These are both bounded in ways that make consensus possible, but it's pretty expensive. I'm expecting the block rate to end up at around 1 per day, so participating nodes would be looking at consuming around 15mb of bandwidth per month, per 64GB of storage they are contributing. (The 64GB is a flexible number, we may up it to 256GB or something larger to reduce the total bandwidth the network is consuming).

Attack conditions are pretty severe, but we have a set of (undiscussed) approaches to the problem. DOS attacks are probably Sia's biggest weakness right now.

====

Section 6 has a dated way of doing proof of storage. Today, we use merkle trees + a saved-hash structure. It ends up adding a total of about 700 bytes per message to the consensus algorithm. Each heartbeat will have ~1kb of mandatory stuff, leaving 3kb for optional stuff. 3kb may end up not being enough, but remember that's 3kb _per sibling_, _per quorum_, which means that each quorum has a total of around 350kb per block that can be used for transactions, and there are going to be many quorums.

====

"You allow hanging the voter set in the consensus protocol without having consensus"

I'm not sure what you mean by this.

====

Nothing in the protocol requires a node from ever serving data:

this is something that isn't discussed in the current whitepaper. There will be monetary incentives for serving data, and barring that, there will be a proof-of-data-transaction, which is a complicated and expensive process but forces a node to either serve data or be kicked. This currently isn't a high priority, I want to make sure the storage stuff all works before we move onto things like bandwidth, cost, and actually uploading files to users.

====

Fining misbehaving nodes:

a node that misbehaves will be kicked from the network, and have it will not recieve it's initial "down payment" as a refund. This is how nodes are fined. If a node gracefully leaves the network (there are a few cleanup things it needs to do - not fully discussed yet).

====

I realize that a lot of the details are left out. We're planning on releasing another paper in the next few months, namely an explicit specification for single-quorum consensus. (IE what the alpha v2 is going to look like). There's a lot of work left to do. Things that we are consciously leaving out for now:

+ People downloading wallets
+ Storing files that the network can't automatically repair (to defeat storage pool concerns)
+ The explicit way that things are priced
+ The explicit way that nodes in different quorums can look each other up

These are all important, and will be fully and formally defined eventually, but we're focusing on the more core problems first.

The things that we care about most right now:

+ Maintaining consensus in a way that is fully fault-tolerant
+ Preventing sybil attacks
+ preventing random number manipulation
+ secure proof of storage
+ transaction malleability
+ scalability

It's going to be much easier for me to discuss Sia with you if we focus on one problem at a time. I think the one you are most worried about is Sybil attacks, so lets focus on that until you are convinced that Sia is secure against them. (or until I am convinced that I've missed something important). Or if you want to focus on something else specifically (like total bandwidth consumption), we can do that. But just 1 problem at a time so the discussion can be more focused.