The bug appears to have been introduced in version 2.7.2 of OpenEthereum, on which the subsequent 3.0 update was also based.
Though the 2.7 release was marked as stable, since June the community has been reporting that the client occasionally freezes, requiring a manual hard restart of the node. The issue appears randomly “one or three times a month,” and the software gives no notification that it has malfunctioned. Some users have described the release as “useless” and “broken for node operators.”
The developers seem to have traced the issue to a subtle bug in thread concurrency, the mechanism used to process tasks in parallel. In this specific case, the software seems to be entering a deadlock, a condition in which two threads are left waiting forever because each needs shared data that the other has locked.
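To make the failure mode concrete, here is a minimal sketch of a two-thread deadlock. It is not the OpenEthereum bug itself, and it is written in Python for brevity rather than in Rust, the language of the client. The names `run_workers`, `lock_a` and `lock_b` are hypothetical, and the timeout exists only so the demonstration terminates; a real deadlocked thread would wait indefinitely.

```python
import threading


def run_workers(consistent_order, timeout=0.5):
    """Run two threads that each need two locks; report whether they got both."""
    lock_a, lock_b = threading.Lock(), threading.Lock()
    barrier = threading.Barrier(2)  # used only to force the bad interleaving
    results = []
    results_guard = threading.Lock()

    def worker(name, first, second):
        with first:
            if not consistent_order:
                # Wait until both threads hold their first lock.
                barrier.wait()
            # A real deadlock has no timeout; this one lets the demo finish.
            acquired = second.acquire(timeout=timeout)
            if acquired:
                second.release()
            if not consistent_order:
                # Keep holding the first lock until the peer's attempt is over.
                barrier.wait()
        with results_guard:
            results.append(f"{name}: {'ok' if acquired else 'stuck'}")

    if consistent_order:
        # Both threads take lock_a first, then lock_b: no circular wait.
        orders = [(lock_a, lock_b), (lock_a, lock_b)]
    else:
        # Opposite orders: each thread holds what the other one needs.
        orders = [(lock_a, lock_b), (lock_b, lock_a)]

    threads = [
        threading.Thread(target=worker, args=(f"t{i + 1}", first, second))
        for i, (first, second) in enumerate(orders)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return sorted(results)


if __name__ == "__main__":
    print(run_workers(consistent_order=False))
    print(run_workers(consistent_order=True))
```

When the two threads take the locks in opposite orders, neither can proceed. Acquiring locks in one agreed order is a common way to rule out this particular pattern, though it may not be the fix the OpenEthereum developers needed.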
OpenEthereum decided to simply scrap the 2.7 release due to this and other “heisenbugs” that are extremely difficult to reproduce and thus fix.
The new 3.1 iteration, based on the last truly stable 2.5 version, is set to be released in mid-September ahead of the Berlin hard fork.
Until that happens, however, operators who downloaded the new version are left with the extremely disruptive task of downgrading.
Liam Aharon, a developer at infrastructure firm BlockNative, highlighted on Twitter that downgrading requires a complete resync of the blockchain, “which for some node configurations will take months.”
The bug affects about 50% of current Parity nodes and all nodes branded as OpenEthereum, which together account for about 12% of the entire network, according to Ethernodes data.
The OpenEthereum team is said to be working on a conversion process that would help nodes avoid the costly re-synchronization.
Some criticism was leveled at the team for marking a deeply buggy release as “stable,” an error that propagated into all subsequent releases. Others questioned the soundness of the multi-client approach, citing Satoshi’s view that multiple implementations of the same blockchain node would inevitably lead to issues.
Proponents of the multi-client approach believe that it prevents bugs in one implementation from bringing down the whole network, and the OpenEthereum bug appears to be exactly that type of scenario.