Re: Block chain size/storage and slow downloads for new users

Thank you for the clarifications.

Quote from: DeathAndTaxes on June 26, 2014, 04:07:56 PM

(...) At some point client A starts sending you blocks X to Z (...)

... and at that point my client would tell client A to stop sending those blocks and to send me a bunch of others instead.

Quote from: DeathAndTaxes on June 26, 2014, 04:07:56 PM

It doesn't validate all of them. It is done to ensure there has been no database corruption (possibly during the prior close due to a power failure). It only checks a limited number of the most recent blocks. You can from the config file adjust how many blocks to check and how detailed of a check to perform. You can even set this to zero blocks if you like.

How about marking the stored block chain as good after the client exits properly, then looking for that mark at startup and running the check only if the mark is missing?
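
To make the idea concrete, here is a rough Python sketch of what I mean. It is only an illustration: the marker file name and both function names are made up by me, and I am not claiming the client works this way or should be implemented like this.

```python
import os

# Hypothetical marker file name, invented for this illustration.
MARKER_NAME = "clean_shutdown.marker"


def mark_clean_shutdown(datadir):
    """Call as the very last step of a proper exit."""
    path = os.path.join(datadir, MARKER_NAME)
    with open(path, "w") as f:
        f.write("ok\n")
        f.flush()
        os.fsync(f.fileno())  # ask the OS to push the marker to disk


def needs_integrity_check(datadir):
    """Call at startup. Returns True if the block check should run."""
    path = os.path.join(datadir, MARKER_NAME)
    if os.path.exists(path):
        # Remove the marker right away, so that a crash during this
        # session leaves no marker behind for the next startup.
        os.remove(path)
        return False
    return True
```

The obvious weakness is that the mark only proves the client exited cleanly. It says nothing about damage done to the files afterwards, so a periodic full check might still be wanted.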

Quote from: DeathAndTaxes on June 26, 2014, 04:07:56 PM

There is a cache. It is called the UTXO. Blocks are only used to create and update the UTXO. All validation of new txns and blocks is done against the UTXO. Once a block is written to the disk, other than for responding to block requests from other peers (or updating the UTXO in a reorg), they aren't used by your client.

Good to know. But why is it then accessing the disk so much? I am using the latest client. True, the speed went waaaaaaaaaaay up from the 0.6.x I was using before, but it still touches the disk quite a bit.
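
Just to check that I understand the UTXO explanation correctly, here is how I picture it as a toy Python sketch. The data layout (a dict keyed by txid and output index, transactions as plain dicts) and the function name are my own simplifications, not the client's actual structures, and there is no script or signature checking at all.

```python
def apply_block(utxo, block):
    """Return a new UTXO set with the block's transactions applied.

    utxo  -- dict mapping (txid, output_index) -> value
    block -- list of transactions; each is a dict with keys
             "txid", "inputs" (list of (txid, index)) and
             "outputs" (list of values)
    Raises ValueError if an input is not in the UTXO set.
    """
    new_utxo = dict(utxo)  # work on a copy so a bad block changes nothing
    for tx in block:
        for outpoint in tx["inputs"]:
            if outpoint not in new_utxo:
                raise ValueError(f"missing or already spent: {outpoint}")
            del new_utxo[outpoint]  # a spent output leaves the set
        for index, value in enumerate(tx["outputs"]):
            new_utxo[(tx["txid"], index)] = value  # a new output joins it
    return new_utxo
```

If that picture is right, validating a new transaction only needs lookups in this set, never a scan through the old blocks, which matches what you wrote.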

Quote from: DeathAndTaxes on June 26, 2014, 04:07:56 PM

Older blocks are not needed except to provide blocks to peers who are bootstrapping. Saying you prefer large files over small files in all cases is a dubious request.

Well, it might seem dubious in your eyes, but you cannot be sure that there isn't a valid reason on my side for the request. It would be nice if I could control that stuff.

Quote from: DeathAndTaxes on June 26, 2014, 04:07:56 PM

Reinventing the wheel? The chainstate is stored in leveldb which is accepted as an incredibly lightweight and very fast key pair database. It is doubtful you would design an alternative custom database with similar functionality that outperforms leveldb. Also even if you could would the development time be worth reinventing the wheel rather than improving the actual client?

Good point. Thank you for pointing this out. One other question I would ask is "how about improving leveldb so others can benefit from it as well?".

Quote from: DeathAndTaxes on June 26, 2014, 04:07:56 PM

The coinbase txns of all blocks represent <0.003% of the blockchain. The size is already limited by general limits on the size of ScriptSigs for all transactions. Seems a dubious use case.

Good to know. Thank you. I did not do much statistics but

Quote from: DeathAndTaxes on June 26, 2014, 04:07:56 PM

That cache is called the UTXO (the chainstate folder you dislike so much). Blocks are used to build the UTXO in a trustless manner. They aren't used to process or validate new blocks and transactions. The raw blocks are just used to bootstrap new nodes so they too can build the UTXO in a trustless manner.

Maybe I might start liking the UTXO and stuff and disliking Windows instead (the client is running on Windows). I now remember that Windows does a pretty crappy job of managing the disk and the files on it. Once I find the time, I will try to run the thing on (modern) Linux and see what happens.

Now I am starting to think that Windows is a pretty crappy system for a task like this. I do know its disk cache sucks a lot. I also remembered that I recently realized the Windows filesystem sucks a lot too, compared with Linux filesystems. (Have you ever tried to defragment NTFS and make sure that it really is defragmented? I tried that recently, found it pretty much impossible, and concluded that NTFS sucks.) Now I think that the leveldb guys and you Bitcoin guys did an admirable job of forcing stupid Windows to behave under such a load (20+ GB of growing data + 0.5 GB of heavily updated data).