How Ledger manages Ethereum Hard Forks
Main takeaways:
- Last year, we saw the introduction of two major updates to the Ethereum blockchain: the Berlin and London forks. How have such events impacted us at Ledger?
- One of these updates raised a big problem for the entire Ethereum ecosystem. How did we handle the crisis? What did we learn from it? What about NFT support? How do we adapt to handle these new transactions?
- At Ledger, we aim to provide our clients with best-in-class services. That is why we are constantly looking to improve our solutions, stay up to date with the latest innovations and resist any shake-up to this ecosystem!
Hard Forks: keeping our clients on the right side of the chain
The year 2021 was wild for the Ethereum protocol. It saw two major updates, each introducing what we call a hard fork in the underlying chain.
A hard fork occurs when the code of a blockchain changes so much that it is no longer backward-compatible with earlier blocks. The blockchain then splits in two, where the original is unaffected, and the new fork follows the new set of rules.
Feel free to check out this Ledger Academy article for more details about forks.
This means that all Ethereum client implementations (in short, ETH node implementations) must include the protocol changes before the update. If you run an outdated node, you will end up on a parallel chain that branches off at the so-called hard-fork block, along with all the other members of the network who have not upgraded. If the update goes ahead, this parallel chain eventually dies out.
Obviously, at Ledger we want our clients to stay on the right side of the chain, so these updates are critical to the service we provide. Here is how we do it.
What the fork?
Last year, two forks occurred on the Ethereum Mainnet:
- The Berlin fork on the Ethereum Mainnet at block 12,244,000, on April 15, 2021.
- The London fork on the Ethereum Mainnet at block 12,965,000, on August 5, 2021.
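To make the notion of a hard-fork block concrete, here is a minimal sketch of the check a node has to pass: at a given block height, it must support every fork that is already active. The block heights come from the list above; the function names `active_forks` and `is_node_ready` are hypothetical and only serve the illustration.

```python
# Mainnet activation heights quoted in the list above.
FORK_BLOCKS = {
    "Berlin": 12_244_000,
    "London": 12_965_000,
}


def active_forks(block_number: int) -> list[str]:
    """Return the forks (among the two above) already active at this height."""
    return [name for name, height in FORK_BLOCKS.items() if block_number >= height]


def is_node_ready(supported_forks: set[str], block_number: int) -> bool:
    """A node follows the upgraded chain only if it supports every active fork."""
    return set(active_forks(block_number)) <= supported_forks


if __name__ == "__main__":
    print(active_forks(12_243_999))                # no fork active yet
    print(active_forks(12_244_000))                # Berlin activates at its fork block
    print(is_node_ready({"Berlin"}, 12_965_000))   # a Berlin-only node is stuck at London
```

A node for which `is_node_ready` returns False at the fork block is exactly the outdated node described earlier: it keeps following the old rules and drifts onto the parallel chain.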
These forks greatly modified the economics of the Ethereum ecosystem, making ETH partially deflationary (like BTC). We won’t go into details here, as this is already well covered in the literature. If you want to dig deeper, you can also check out the documentation about EIP 1559.
Also, Ethereum is first and foremost a protocol that is implemented by a variety of clients. Each client has its own set of advantages and disadvantages, and nodes should be picked according to the desired goal. Since we will be talking a lot about them, here they are:
- open-ethereum: an implementation focused on performance, efficient indexing and querying, and mining.
- go-ethereum: the reference implementation, with a focus on correctness of the specifications.
If you don’t know what an Ethereum client is, or you want more details, you can read the official Ethereum documentation.
In this article, we will also talk about Ethereum Virtual Machines (EVMs). An Ethereum Virtual Machine (EVM) is a computation engine that acts like a decentralized computer with millions of executable projects. It carries out all kinds of actions on the blockchain.
If you want to know more about EVMs, there are many good articles out there.
The anatomy and limitations of a transaction
To better understand the difference between these two node implementations, let’s start with what an Ethereum transaction really is.
No surprises here, the Ethereum blockchain is a singly linked list of blocks, with each block containing a set of transactions. When a block is “played” (when a node of the network adds a block to its state), all of its transactions are executed in order, each one of them modifying the state of the chain.
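As a rough illustration of this model, the sketch below links blocks through their parent hash and “plays” a block by applying its transfers in order. It is a toy under stated assumptions: SHA-256 over JSON stands in for Ethereum's real hashing and encoding, only plain transfers are modelled, and the names `Transfer`, `Block` and `play_block` are hypothetical.

```python
import hashlib
import json
from dataclasses import dataclass, field


@dataclass
class Transfer:
    sender: str
    recipient: str
    value: int


@dataclass
class Block:
    number: int
    parent_hash: str  # the "link" of the singly linked list
    transactions: list[Transfer] = field(default_factory=list)

    def hash(self) -> str:
        # Assumption: SHA-256 over JSON, a stand-in for the real block hash.
        payload = json.dumps(
            [self.number, self.parent_hash,
             [[t.sender, t.recipient, t.value] for t in self.transactions]]
        )
        return hashlib.sha256(payload.encode()).hexdigest()


def play_block(state: dict[str, int], block: Block) -> dict[str, int]:
    """Apply every transaction of the block, in order, to a copy of the state."""
    new_state = dict(state)
    for tx in block.transactions:
        if new_state.get(tx.sender, 0) < tx.value:
            raise ValueError(f"insufficient balance for {tx.sender}")
        new_state[tx.sender] -= tx.value
        new_state[tx.recipient] = new_state.get(tx.recipient, 0) + tx.value
    return new_state


if __name__ == "__main__":
    genesis = Block(number=0, parent_hash="0" * 64)
    block_1 = Block(1, genesis.hash(),
                    [Transfer("alice", "bob", 30), Transfer("bob", "carol", 10)])
    print(play_block({"alice": 100}, block_1))
```

Order matters here: the second transfer only succeeds because the first one has already credited bob.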
A transaction can be:
- A fund transfer from one address to another. This case is simple: the modified state concerns the two accounts’ balances only.
- A smart contract method call. This case is more complicated.
Smart contracts can be seen as special accounts that own data on the chain (the “contract state”) and expose a set of methods compiled to EVM bytecode. Every time someone interacts with a contract through a transaction, that bytecode is executed to read or modify the contract state. Each elementary instruction (or opcode) may trigger a state update, an event, data storage, etc., and has its own gas cost.
In particular, a transaction to a contract (i.e. a transaction that executes code) may trigger a call to another contract in some cases. The calling contract interacts with the other contract like a normal transaction would, but the call happens entirely inside the original transaction. This is what we call an internal transaction.
The key thing to understand is that you can receive funds through a contract without directly interacting with it. A good example of this is a marketplace contract. If you sell something, you need to be paid when the buyer decides to issue the transaction, even though its initial target is the marketplace contract address.
In conclusion, you need to retrieve these internal transactions somehow to have full accountability (having an event for each balance change). Unfortunately, there is no way in the current web3 API standard to fetch these internal transactions. They are completely invisible from a user’s point of view, and this is the case for most Ethereum wallets.
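The marketplace example can be sketched in a few lines. This is a toy model, not a real contract: the `Call` record, the `execute_purchase` function and the flat fee are all hypothetical. It only shows why looking at the top-level transaction alone gives the wrong balances.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Call:
    sender: str
    recipient: str
    value: int
    depth: int  # 0 = the transaction itself, > 0 = internal call


def execute_purchase(buyer: str, marketplace: str, seller: str,
                     price: int, fee: int) -> list[Call]:
    """Hypothetical marketplace: keeps a fee and forwards the rest to the seller."""
    top_level = Call(buyer, marketplace, price, depth=0)
    payout = Call(marketplace, seller, price - fee, depth=1)  # internal transaction
    return [top_level, payout]


def balance_deltas(calls: list[Call]) -> dict[str, int]:
    """Sum the balance change of every account touched by the given calls."""
    deltas: dict[str, int] = {}
    for call in calls:
        deltas[call.sender] = deltas.get(call.sender, 0) - call.value
        deltas[call.recipient] = deltas.get(call.recipient, 0) + call.value
    return deltas


if __name__ == "__main__":
    calls = execute_purchase("buyer", "marketplace", "seller", price=100, fee=5)
    visible = [c for c in calls if c.depth == 0]
    print("top-level only:", balance_deltas(visible))
    print("with internals:", balance_deltas(calls))
```

With only the top-level call, the seller never appears and the marketplace seems to keep the full price. Including the internal call gives the seller their 95 and leaves the marketplace with its fee of 5.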
Hard forks and client migration
At the beginning of 2021, we extensively used open-ethereum as our default node. open-ethereum was very useful for achieving full accountability because it maintains an index on transaction call trees (exposed through the trace_transaction and trace_block RPCs). So it was easy for us to index each operation for each account and show clients the real reason their funds moved. So far so good!
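As an illustration, here is a minimal sketch of how the flat list returned by trace_transaction can be turned into internal transfers. The sample payload is made up (the addresses are placeholders) and trimmed to the fields used here. The helper names are hypothetical and this is not our production indexer.

```python
def trace_request(tx_hash: str, request_id: int = 1) -> dict:
    """JSON-RPC payload for open-ethereum's trace_transaction."""
    return {"jsonrpc": "2.0", "id": request_id,
            "method": "trace_transaction", "params": [tx_hash]}


def internal_transfers(traces: list[dict]) -> list[tuple[str, str, int]]:
    """Extract (from, to, value_in_wei) for value-carrying internal calls."""
    transfers = []
    for trace in traces:
        if trace.get("type") != "call" or "error" in trace:
            continue  # skip contract creations, self-destructs and failed calls
        if not trace.get("traceAddress"):
            continue  # an empty traceAddress is the top-level transaction itself
        action = trace["action"]
        value = int(action.get("value", "0x0"), 16)
        if action.get("callType") == "call" and value > 0:
            transfers.append((action["from"], action["to"], value))
    return transfers


# Made-up sample shaped like a trace_transaction result (placeholder addresses).
SAMPLE_TRACES = [
    {"type": "call", "subtraces": 1, "traceAddress": [],
     "action": {"callType": "call", "from": "0xbuyer", "to": "0xmarket", "value": "0x64"}},
    {"type": "call", "subtraces": 0, "traceAddress": [0],
     "action": {"callType": "call", "from": "0xmarket", "to": "0xseller", "value": "0x5f"}},
]

if __name__ == "__main__":
    print(trace_request("0x..."))
    print(internal_transfers(SAMPLE_TRACES))
```

Because the node already stores the traces in an index, a single request per transaction (or per block with trace_block) is enough, and no re-execution is needed on the client side.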
But then, in April, came the Berlin fork. It went badly: the team behind open-ethereum already had plans to move to another project (Erigon, another performance-oriented Ethereum node), and the Berlin-fork-compatible version just didn’t do the job. No new blocks could be processed! This situation was critical for us, since open-ethereum is used a lot by our users.
Luckily, within a couple of days, a patch was issued, and we could resume operations. Kudos to our firefighters, since we were one of the first crypto companies to do so.
This issue should now be resolved. Thank you for your patience! If you're still facing any kinds of issues, don't hesitate to try and clear Ledger Live's cache via Settings > Help > Clear cache
— Ledger Support (@Ledger_Support) April 15, 2021
We then decided to look at other implementations to mitigate risks. We began by adapting our codebase and operation indexing strategy to use the go-ethereum node instead.
With go-ethereum, internal calls are currently not available directly via RPC. The only way to retrieve the call chain is to fully replay transactions and their execution on the EVM using debug_traceTransaction (or debug_traceBlock). We can then extract the transactions that happened behind the scenes and add them to our indices. This operation is very costly, for obvious reasons.
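A minimal sketch of this approach, assuming go-ethereum's built-in callTracer is requested: the node replays the transaction and returns a nested call tree, which we flatten depth-first. The sample frame is made up (placeholder addresses, trimmed fields) and the helper names are hypothetical.

```python
from typing import Iterator, Optional


def trace_request(tx_hash: str, request_id: int = 1) -> dict:
    """JSON-RPC payload asking go-ethereum to replay a transaction with callTracer."""
    return {"jsonrpc": "2.0", "id": request_id,
            "method": "debug_traceTransaction",
            "params": [tx_hash, {"tracer": "callTracer"}]}


def flatten_calls(frame: dict, depth: int = 0) -> Iterator[tuple[int, str, Optional[str], int]]:
    """Yield (depth, from, to, value_in_wei) for each successful frame, depth-first."""
    if "error" in frame:
        return  # a failed frame reverts its whole subtree, so skip it entirely
    # Some frame types (e.g. STATICCALL) carry no value field.
    value = int(frame.get("value", "0x0"), 16)
    yield depth, frame["from"], frame.get("to"), value
    for child in frame.get("calls", []):
        yield from flatten_calls(child, depth + 1)


# Made-up sample shaped like a callTracer result (placeholder addresses).
SAMPLE_FRAME = {
    "type": "CALL", "from": "0xbuyer", "to": "0xmarket", "value": "0x64",
    "calls": [
        {"type": "CALL", "from": "0xmarket", "to": "0xseller", "value": "0x5f"},
        {"type": "STATICCALL", "from": "0xmarket", "to": "0xoracle"},
    ],
}

if __name__ == "__main__":
    for row in flatten_calls(SAMPLE_FRAME):
        print(row)
```

The parsing itself is cheap. The cost sits on the node side, where every traced transaction has to be re-executed on the EVM before the tree can be returned.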
Fast forward to July: by then we had implemented this logic using go-ethereum calls, and the second update went fine. No fire, normal operations, yay!
What about NFT support?
To have full NFT support, we needed to do what we call a re-indexing of the Ethereum blockchain to include past NFT transfer events; in other words, to rebuild our indices completely. This means we have to replay every transaction on a local EVM as we index, and this makes the indexing process extremely slow: our indexing speed was 8 to 10 blocks/second (for blocks with traces), and 40 to 50 blocks/second without them.
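To get a feel for what these rates mean, here is a back-of-the-envelope estimate. The chain height of 14,000,000 blocks is an assumption chosen for illustration, not a figure from our indexer, and the calculation pessimistically applies the traced rate to every block.

```python
SECONDS_PER_DAY = 86_400

# Assumption: an illustrative chain height, not a measured value.
CHAIN_HEIGHT = 14_000_000


def reindex_days(total_blocks: int, blocks_per_second: float) -> float:
    """Days needed to index `total_blocks` at a constant rate."""
    return total_blocks / blocks_per_second / SECONDS_PER_DAY


if __name__ == "__main__":
    for label, rate in [("with traces, slow", 8), ("with traces, fast", 10),
                        ("without traces, slow", 40), ("without traces, fast", 50)]:
        print(f"{label:>22}: {reindex_days(CHAIN_HEIGHT, rate):.1f} days")
```

Under these assumptions, a full re-index takes roughly 16 to 20 days with traces, against 3 to 4 days without them.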
Tracing transactions had just become the most significant bottleneck. To solve this problem, we investigated and found its source in the JS engine used by go-ethereum to implement the tracer.
Indeed, since it was a debug interface, offering a scriptable engine was a clever move. The most common cases (including getting the call tree) are preloaded with aliases. Our first step was to change the JS code into a custom tracer fetching only the data we needed. We also looked at making changes in the go-ethereum codebase directly, but at the same time, geth 1.16 was released with a complete rewrite of the tracer in Go.
Once we upgraded to this release, we observed a significant improvement in tracing speed! However, it was still slower than open-ethereum, for instance.
Providing best-in-class services for our users
Internal transactions are key to achieving full accountability on Ethereum-like blockchains. Tracing nested contract calls is a missing part of web3, and an area to improve in order to have a sound and workable system.
At Ledger, we aim to provide our clients with best-in-class services, and the full history is a fundamental component of it. This is why we are constantly looking to improve our solutions, stay up to date with the latest innovations and resist any shake-up this ecosystem can encounter!
Update 12/09/2022: If you landed here because you need information about why the Ethereum POW fork will not be handled by Ledger Live after the Merge, please read this article.