I want to answer questions like "how many transactions with a lower input value than 1 BTC have ever been completed after 1.1.2013" and similar.
You basically want to parse the blockchain into a database like MySQL/Postgres so that you can answer that question by executing a query against it.
For example, I parsed it into a database called Datomic and that query looks like this:
http://i.imgur.com/tQcuIN8.png
The blocks are stored in the blkXXXXX.dat files found in the ~/.bitcoin directory.
Each blkdat file is a sequence of block data where each block can be parsed like this:
- Magic bytes: uint32-le
- Byte-count of the block: uint32-le
- Block: BlockCodec
The file may be padded at the end with null-bytes. You also need to know that blkdat files are not a validated blockchain, just the blocks that bitcoind received over the network. The easiest thing to do is just add all blocks to the database and assume that the longest chain of `latestBlock->prevBlock->prevBlock->...->genesisBlock` is the one you want to query.
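Here is a rough Python sketch of that framing, just to make the layout concrete. It only splits a blkdat stream into raw block payloads; decoding each payload (the BlockCodec step) is left out. The function name `iter_raw_blocks` is my own, and the check assumes mainnet magic bytes, so adjust it for testnet.

```python
import struct

# Mainnet magic is f9 be b4 d9 on disk, which reads as this uint32-le value.
MAINNET_MAGIC = 0xD9B4BEF9


def iter_raw_blocks(stream):
    """Yield the raw bytes of each block found in a blkdat stream."""
    while True:
        header = stream.read(8)
        # Stop at end of file or at the null-byte padding at the tail.
        if len(header) < 8 or header[:4] == b"\x00\x00\x00\x00":
            return
        magic, size = struct.unpack("<II", header)
        if magic != MAINNET_MAGIC:
            raise ValueError(f"unexpected magic bytes: {magic:#010x}")
        block = stream.read(size)
        if len(block) < size:
            return  # truncated block, e.g. the file is still being written
        yield block
```

You would call it with a file opened in binary mode, e.g. `with open(path, "rb") as f: for raw in iter_raw_blocks(f): ...`, and hand each `raw` payload to your block parser.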
Database schema looks something like this:
- Block has many Txns
- Block has one PreviousBlock
- Txn has many TxOuts
- Txn has many TxIns
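To make that schema concrete, here is a minimal sketch using Python's built-in sqlite3 rather than the Datomic setup I actually used. All table and column names are made up for illustration, values are stored in satoshis, and it assumes you only loaded blocks from the chain you care about. The query counts transactions whose summed input value is below a limit and whose block is at or after a cutoff time. Coinbase transactions fall out naturally because their inputs don't reference a real TxOut.

```python
import sqlite3
from datetime import datetime, timezone

SCHEMA = """
CREATE TABLE block (hash TEXT PRIMARY KEY, prev_hash TEXT, time INTEGER);
CREATE TABLE txn   (hash TEXT PRIMARY KEY, block_hash TEXT REFERENCES block(hash));
CREATE TABLE txout (txn_hash TEXT REFERENCES txn(hash), idx INTEGER, value INTEGER);
CREATE TABLE txin  (txn_hash TEXT REFERENCES txn(hash), prev_txn_hash TEXT, prev_idx INTEGER);
"""

COUNT_QUERY = """
SELECT COUNT(*) FROM (
    SELECT txin.txn_hash
    FROM txin
    JOIN txn   ON txn.hash = txin.txn_hash
    JOIN block ON block.hash = txn.block_hash
    -- An input's value is the value of the output it spends.
    JOIN txout ON txout.txn_hash = txin.prev_txn_hash
              AND txout.idx = txin.prev_idx
    WHERE block.time >= ?
    GROUP BY txin.txn_hash
    HAVING SUM(txout.value) < ?
)
"""

SATOSHIS_PER_BTC = 100_000_000


def create_schema(conn):
    """Create the four tables in an empty database."""
    conn.executescript(SCHEMA)


def count_small_input_txns(conn, since, max_input_satoshis):
    """Count txns in blocks at/after `since` with total input value below the limit."""
    since_ts = int(since.timestamp())
    row = conn.execute(COUNT_QUERY, (since_ts, max_input_satoshis)).fetchone()
    return row[0]
```

The question from the original post then becomes `count_small_input_txns(conn, datetime(2013, 1, 1, tzinfo=timezone.utc), SATOSHIS_PER_BTC)`.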
The pseudocode for it all is pretty simple and parsing is straightforward.
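For instance, picking the chain to query could look roughly like the sketch below. It follows the "longest chain back to genesis" shortcut from this answer, measured by block count, which is a simplification and not real validation. The input shape is an assumption of mine: a dict mapping each block's hash to its previous block's hash, with `None` for the genesis block.

```python
def longest_chain(prev_of):
    """Return block ids from genesis to the tip of the longest chain.

    prev_of maps block id -> previous block id (None for the genesis block).
    Blocks whose ancestry does not reach genesis are ignored.
    """
    height = {}  # block id -> blocks from genesis, or None if disconnected
    for start in prev_of:
        # Walk back until genesis, a block with a known height, or a missing parent.
        path = []
        cur = start
        while cur is not None and cur in prev_of and cur not in height:
            path.append(cur)
            cur = prev_of[cur]
        if cur is None:
            base = 0  # walked past genesis
        elif cur in height:
            base = height[cur]
        else:
            base = None  # parent was never stored, so this branch is orphaned
        for block_id in reversed(path):
            base = None if base is None else base + 1
            height[block_id] = base

    connected = [b for b, h in height.items() if h is not None]
    if not connected:
        return []
    tip = max(connected, key=height.get)

    # Follow prevBlock links from the tip back to genesis.
    chain = []
    while tip is not None:
        chain.append(tip)
        tip = prev_of[tip]
    chain.reverse()
    return chain
```

The loop is iterative on purpose, since a recursive walk over hundreds of thousands of blocks would hit Python's recursion limit.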
Word of warning: I went so far as to get a local blockexplorer working (https://github.com/danneu/chaingun), but of course I wandered off into the dragon lair of reimplementing blockchain validation and ran out of free time. It became evident that I was in fact climbing a large mountain when I passed by Jon Krakauer's corpse frozen in the snow. At which point I realized I wasn't climbing a mountain at all but slowly sliding down one slippery slope. And when I reached the bottom and turned back around, I realized it wasn't even a slope but instead a heap of my own yak shavings.