TransWikia.com

Synchronisation failed, dropping peer; err="retrieved hash chain is invalid"; message loop

Ethereum Asked on January 22, 2021

I have a private Clique proof-of-authority chain.

I have updated geth on all signer nodes (currently 3) to at least version 1.8.16-stable. I have also updated the other node, the one that reports the error, to v1.8.16.

This is how I run geth; I have also tried it without the --syncmode fast flag.

geth --syncmode fast --cache=1024 --shh --datadir $DATADIR/private --rpcaddr 127.0.0.1 --rpc --rpcport 8545 --rpccorsdomain="*" --networkid 12345 --rpcapi admin,eth,net,web3,debug,personal,shh


This is the error I get on multiple nodes that are connected to the network:

########## BAD BLOCK #########
Chain config: {ChainID: 23422 Homestead: 1 DAO: <nil> DAOSupport: false EIP150: 2 EIP155: 3 EIP158: 3 Byzantium: 4 Constantinople: <nil> Engine: clique}

Number: 1260001
Hash: 0x659e96f35e1fa1c39fc3b8370a336f78787e482aef44e56bbe6dd9e10bb06bdc


Error: recently signed
##############################

WARN [10-05|15:49:57.694] Synchronisation failed, dropping peer    peer=ae57fcb24c19102e err="retrieved hash chain is invalid"
INFO [10-05|15:49:57.694] message loop                             peer=ae57fcb24c19102e err=EOF
ERROR[10-05|15:50:07.707]
########## BAD BLOCK #########
Chain config: {ChainID: 23422 Homestead: 1 DAO: <nil> DAOSupport: false EIP150: 2 EIP155: 3 EIP158: 3 Byzantium: 4 Constantinople: <nil> Engine: clique}

Number: 1260001
Hash: 0x659e96f35e1fa1c39fc3b8370a336f78787e482aef44e56bbe6dd9e10bb06bdc

  • I rewound the chain to an earlier block with debug.setHead("0x124F80") (block 1200000), but it did not help.

  • I also removed my chaindata with geth removedb and synced from the start, which did not help either.


Possible Solution:

  • Should I rewind the signer nodes' chain data with debug.setHead() to a point before the error occurred? This might only be a short-term fix, since the same error could occur again.

=> Opened issue.

One Answer

The problem was that the signers did not recognize that a past block had become invalid.
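For context, the "recently signed" error in the log comes from the Clique rule that a signer may seal only one out of every floor(N / 2) + 1 consecutive blocks, where N is the number of authorized signers. The following is a simplified Python model of that check, not geth's actual code; the function names and the signer labels "A" and "B" are placeholders chosen for illustration.

```python
def signing_limit(signer_count: int) -> int:
    """Window size: a signer may seal one block out of this many consecutive blocks."""
    return signer_count // 2 + 1


def recently_signed(signer: str, number: int, recents: dict, signer_count: int) -> bool:
    """Return True if `signer` sealed a block too recently to seal block `number`.

    `recents` maps block numbers to the signer that sealed them
    (a simplified stand-in for the snapshot's recent-signer list).
    """
    limit = signing_limit(signer_count)
    return any(
        who == signer and seen > number - limit
        for seen, who in recents.items()
    )


if __name__ == "__main__":
    # With 3 signers the window is 2, so no signer may seal two blocks in a row.
    recents = {1260000: "A"}
    print(recently_signed("A", 1260001, recents, 3))
    print(recently_signed("B", 1260001, recents, 3))
```

With three signers, as in this network, the window is two blocks, so a block sealed by the same signer as its parent is rejected by nodes that follow the rule.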

As a solution, I first updated geth on the signer nodes to v1.8.16 or above. Then I rewound the chain on all signer nodes to the faulty snapshot block (epoch transition) using debug.setHead("0x124F80") (block 1200000). debug.setHead takes the block number as a hexadecimal string, so convert the decimal value to hexadecimal first, and don't forget to add 0x at the beginning.
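The decimal-to-hexadecimal conversion can also be done locally. Below is a minimal Python sketch; the helper names set_head_arg and last_checkpoint are made up for illustration, and the epoch length of 30000 blocks is Clique's default, which is an assumption here because the genesis file of a private chain may configure a different value.

```python
# Hypothetical helpers for preparing a debug.setHead call; not part of geth.

CLIQUE_DEFAULT_EPOCH = 30000  # assumption: the chain uses Clique's default epoch length


def set_head_arg(block_number: int) -> str:
    """Format a decimal block number as a 0x-prefixed hexadecimal string."""
    if block_number < 0:
        raise ValueError("block number must be non-negative")
    return hex(block_number)


def last_checkpoint(block_number: int, epoch: int = CLIQUE_DEFAULT_EPOCH) -> int:
    """Return the closest epoch transition block at or before block_number."""
    return block_number - block_number % epoch


if __name__ == "__main__":
    print(set_head_arg(1200000))     # same value as 0x124F80, in lower case
    print(last_checkpoint(1200000))  # 1200000 is a multiple of 30000
```

Under the default-epoch assumption, block 1200000 is itself an epoch boundary (1200000 = 40 × 30000), which matches the block used in the debug.setHead call above.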


From @karalabe's answer on the opened issue:

There was a bug in one Geth release (v1.8.14/v1.8.15) that violated the Clique consensus spec, causing some signers to create blocks when they weren't allowed to (epoch transition). All previous and subsequent version of Geth (apart from the faulty one) correctly rejected those blocks, hence why you couldn't sync a new node to your already mined chain.

A node however does not re-validate blocks when you update it, so even though you updated your signers, they were oblivious to the fact that a faulty block was already in their chain. When you rewound the chain, the signers had to re-mine the faulty segment, correcting the issue.

This should most definitely not happen again, as long as you don't use the faulty version of Geth. Any version equal or above to v1.8.16 should work just fine.
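To check a fleet of nodes against that version rule, something like the following Python sketch could be used. It only inspects a version string such as the one geth prints; the function names are hypothetical, and the set of faulty releases is taken directly from the answer quoted above.

```python
import re

# Releases named as faulty in the quoted answer.
FAULTY_VERSIONS = {(1, 8, 14), (1, 8, 15)}


def parse_geth_version(text: str) -> tuple:
    """Extract (major, minor, patch) from a string such as '1.8.16-stable'."""
    match = re.search(r"(\d+)\.(\d+)\.(\d+)", text)
    if match is None:
        raise ValueError(f"no version number found in {text!r}")
    return tuple(int(part) for part in match.groups())


def is_faulty(text: str) -> bool:
    """Return True if the version string names one of the faulty releases."""
    return parse_geth_version(text) in FAULTY_VERSIONS


if __name__ == "__main__":
    for version in ("1.8.13-stable", "1.8.15-stable", "Version: 1.8.16-stable"):
        print(version, "faulty" if is_faulty(version) else "ok")
```

Keep in mind that upgrading alone is not enough for a signer that already holds the faulty block; as described above, its chain still has to be rewound.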

Correct answer by alper on January 22, 2021
