Board Bitcoin Technical Support
Merits 1 from 1 user
Re: Database corrupt on re-start [Was: Bitcoin-qt cannot read the database, closing]
by tfeagle on 19/01/2018, 23:12:53 UTC
⭐ Merited by vapourminer (1)
The root causes are usually out of our hands. In many cases it is hardware failure. Bitcoin Core does a lot of disk I/O and uses quite a bit of memory and processing power, so hardware issues (typically memory or disk issues) are often the root cause of such problems. I suggest that you run some hardware diagnostics on your machine and see if anything turns up.


I'm relatively sure (let's call it 80% certainty) that some database corruption events stem from the way Bitcoin-Qt handles an interrupted incoming bitstream (i.e. downloading blocks) when the interruption is not flagged by hardware or another lower-level layer.

Boxes which connect different telecom technologies to a local ethernet network (DSL to ethernet, coax to ethernet, FiOS to ethernet, etc., sometimes called "broadband modems") are an ongoing headache.  The OSI model describes a network stack as seven layers, more or less.  This is the "ivory tower" perspective.  In reality, vendors who make little boxes/modems will merge and/or divide existing layers whenever they wish.  In an ivory tower implementation, there are diagnostic hardware/code hooks to detect issues with the data stream and somehow mitigate the possibility of bad data.  In a low-cost box with fewer or merged layers, some specific types of bad data are knowingly passed northwards (upwards from the PHY layers to the processing and application layers).  Some types of bad data are too expensive to identify and flag at lower levels.

A brute force experiment (to detect the absence of flags) is easy to design, annoying to execute.  Simply inject noise into the DSL signal on the outside of the modem while downloading the blockchain.  (Be careful...some ISPs may be angered by this.  Also, use a protective circuit to prevent damage to the modem or to ISP hardware.)  The last log I posted (above) from a corrupt database was generated in this manner.  It took less than three hours, with only a small amount of noise injection.

If I am correct, then yes.  This would clearly be a network/hardware failure, stemming from design omissions.  Or maybe a management failure, stemming from the corporate practice of giving as little value as possible to buyers of equipment.  If so, then the application software is completely innocent.  Nevertheless, it is unlikely that all modem vendors will upgrade their products in order to prevent blockchain corruption.  Even if this were practical, any dysfunctional equipment which is already in service will continue to pass bad data, and annoy us, for decades to come.  The choices are: fix the problem(s) at the application level, or post-process and repair the database, or require that users endure corruption events.  Also, as blockchains grow, it is likely that users' overhead expenses (download time, offline time, network loading during download, extra systems with a hot spare blockchain, etc.) will also grow.  Will corruption events increase in frequency as blockchains grow?  Difficult to predict.
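
To make the "fix at the application level" option concrete, here is a minimal sketch of an end-to-end check: hash every payload on arrival and refuse to store anything whose digest does not match the one the sender advertised.  The function names (digest, flip_bit, accept) are hypothetical, and flip_bit only simulates line noise.  This illustrates the idea, and is not a statement about what Bitcoin-Qt does internally.

```python
import hashlib


def digest(payload: bytes) -> bytes:
    """Double SHA-256 of the payload (hypothetical helper for this sketch)."""
    return hashlib.sha256(hashlib.sha256(payload).digest()).digest()


def flip_bit(payload: bytes, bit_index: int) -> bytes:
    """Simulate line noise by inverting one bit of the payload."""
    damaged = bytearray(payload)
    damaged[bit_index // 8] ^= 1 << (bit_index % 8)
    return bytes(damaged)


def accept(payload: bytes, expected: bytes) -> bool:
    """Accept the payload for storage only if its digest matches."""
    return digest(payload) == expected
```

With a check like this, a single inverted bit anywhere in the payload is caught before it reaches the disk, no matter what the modem failed to flag.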

To developers: I would recommend creating an offline, stand-alone point tool (or possibly several tools) to examine the idle database on the user's hard drive and provide a block-by-block diagnosis.  My opinion: embedding diagnostic tools within an application package is generally a bad idea.  Also, some sort of block editor would be useful.  I should not need to download 150 GB of data whenever I wish to delete and replace one specific block.
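
As a rough idea of what such a point tool might look like, here is a sketch that walks the records in one raw block file and reports, block by block, whether each 80-byte header still satisfies its own proof-of-work target.  Assumptions, all mine: the file is a mainnet blk*.dat file stored unobfuscated as magic bytes, a 4-byte little-endian length, then the block; trailing zero bytes are padding; scan_blocks and bits_to_target are made-up names.  It only checks headers, so damage inside transaction data would need a merkle root check on top of this.

```python
import hashlib
import struct

MAINNET_MAGIC = bytes.fromhex("f9beb4d9")  # assumption: mainnet block file
HEADER_LEN = 80


def dsha256(data: bytes) -> bytes:
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()


def bits_to_target(bits: int) -> int:
    """Expand the compact 'bits' field of a header into the full target."""
    exponent = bits >> 24
    mantissa = bits & 0x007FFFFF
    if exponent <= 3:
        return mantissa >> (8 * (3 - exponent))
    return mantissa << (8 * (exponent - 3))


def scan_blocks(raw: bytes):
    """Yield (offset, block_hash_hex_or_None, status) for each record."""
    pos = 0
    while pos + 8 <= len(raw):
        if raw[pos:pos + 4] != MAINNET_MAGIC:
            if not any(raw[pos:]):
                return  # only zero bytes remain: treated as padding
            yield (pos, None, "bad magic")
            return
        (size,) = struct.unpack_from("<I", raw, pos + 4)
        start = pos + 8
        if size < HEADER_LEN or start + size > len(raw):
            yield (pos, None, "bad length")
            return
        header = raw[start:start + HEADER_LEN]
        block_hash = dsha256(header)
        (bits,) = struct.unpack_from("<I", header, 72)
        # The header hash, read as a little-endian integer, must not
        # exceed the target encoded in the header itself.
        ok = int.from_bytes(block_hash, "little") <= bits_to_target(bits)
        status = "ok" if ok else "header fails proof of work"
        yield (pos, block_hash[::-1].hex(), status)
        pos = start + size
```

To use it, read one block file into memory with the node shut down, pass the bytes to scan_blocks, and print each tuple.  Any record that does not report "ok" gives a file offset, which is the starting point a block editor would need.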

Sorry for the rant.  Can you guess that I once designed telecom hardware?