Erigon 2 - three upgrades

Apr 12, 2022

Two flavours of Erigon - going back to Alpha

Presently, Erigon is available to users in two main “flavours”:

Beta releases. These have been published roughly every week, though sometimes less frequently. The main objective of these releases has been to stabilise the code (by fixing issues) while continuing to add missing features. It is assumed that subsequent versions are compatible with one another, so that users can keep upgrading without having to rebuild their databases.
Cutting-edge devel branch. No promises are made about the compatibility of subsequent versions (and there are no versions per se, apart from commit hashes). This enables the development team to move quickly on implementing support for “The Merge” (the upcoming Ethereum transition to Proof of Stake), as well as a series of upgrades currently named “Erigon 2”.

Very soon, the available “flavours” will change. Erigon will continue to have beta releases, though they are likely to become less frequent. This beta version should remain operational up until “The Merge”, but it will not allow users to make the transition.

Instead, Erigon will start publishing Alpha versions again, with the aim of taking a cut of the devel branch and gradually stabilising it into a beta version. These alpha versions should be upgradeable from one to the next (should the data model change, automatic data migration will be provided), but it will not be possible to migrate the database from the current “Beta” to the new “Alpha”.

The new Alpha versions will mainly contain the following improvements:

Support for “The Merge”, the transition to Proof of Stake. The implementation is not yet finalised, but any further updates will be cherry-picked from the devel branch into Alpha releases.
Embeddable RPC daemon. Until now, there was no option to run the RPC daemon in the same OS (Operating System) process as Erigon, although one could get good performance by running it on the same computer in “shared memory” mode. However, the design of “The Merge” requires the EL (Execution Layer, of which Erigon is one of the implementations) to expose JSON RPC to the CL (Consensus Layer), so it no longer makes sense to run Erigon without an RPC daemon. It is still possible to run multiple RPC daemons per Erigon node, one of which can be embedded.
Implementation of BSC. There is an implementation of Parlia (the BSC consensus engine) in Erigon (but not in Beta releases), which appears to work. However, it is not easy to support heavy chains like Ethereum Mainnet, BSC, and Bor (part of Polygon), because any potentially disruptive change needs to be tested using a full resync, which is currently very time-consuming for BSC and Bor. In fact, for this reason, the implementation of Bor is still not functional. These testing issues will hopefully be addressed in the third upgrade of Erigon 2 (see below), which may enable better support for heavy chains.
First upgrade (Upgrade 1) of Erigon 2. The upgrades of Erigon 2 are the main topic of this post and are described below.

Goal of Erigon 2

Erigon 2 is a modification of Erigon 1 (currently available) aimed primarily at solving the two problems described below and, secondarily, at resolving any issues that arise from solving the primary problems.

Snapshot sync (bootstrapping)

The first primary problem is Erigon's inability to perform snapshot sync, that is, to bootstrap a node without having to replay all historical transactions from Genesis. Solving this not only improves user experience, but also makes it easier for the developers of Erigon to support large blockchains. Without snapshot sync, testing any potentially disruptive change requires a full replay since Genesis. This becomes more impractical as the supported blockchains grow. With snapshot sync, we can adopt a convention that testing is only performed for blocks above a certain height, and that all blocks below it have pre-computed history available for download as snapshots.

Granularity of history

The second primary problem is the granularity of history. In Erigon 1, the history of the blockchain state, and all other indices related to that history, are granular only down to a block. However, from the user's perspective, transaction granularity makes much more sense. Blocks exist only as an optimisation necessary to order transactions without requiring them to refer to one another. When state history and related indices have block granularity, the performance of accessing such history deteriorates as the number of transactions in a typical block grows. This is because an index only narrows down the search to the closest block, and the search for a specific transaction has to be done sequentially by replaying transactions one by one from the start of the block. Erigon 2 aims at recording state history at transaction granularity. All related indices would also be at transaction granularity. This means that indices will narrow down the search to a specific transaction, and the performance of this search will be independent of how many transactions there are in a typical block.
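The difference between the two granularities can be illustrated with a toy model. The sketch below is not Erigon's data model: the chain, the "transactions" that simply set a balance, and all function names are assumptions made up for illustration. It only shows that a block-granular index still needs a replay inside the block, while a transaction-granular index does not.

```python
from bisect import bisect_right

# Toy chain (hypothetical): each transaction sets one account's balance.
BLOCKS = [
    [("alice", 10), ("bob", 5)],               # block 0: global txs 0, 1
    [("alice", 7), ("bob", 8), ("alice", 3)],  # block 1: global txs 2, 3, 4
    [("bob", 1)],                              # block 2: global tx 5
]

def build_block_index(blocks):
    """account -> ([block numbers], [value at the END of that block])."""
    index = {}
    for n, block in enumerate(blocks):
        last = {}
        for account, value in block:
            last[account] = value
        for account, value in last.items():
            nums, vals = index.setdefault(account, ([], []))
            nums.append(n)
            vals.append(value)
    return index

def build_tx_index(blocks):
    """account -> ([global tx numbers], [value right after that tx])."""
    index, tx_num = {}, 0
    for block in blocks:
        for account, value in block:
            nums, vals = index.setdefault(account, ([], []))
            nums.append(tx_num)
            vals.append(value)
            tx_num += 1
    return index

def lookup(index, account, position):
    """Latest recorded value at or before `position`, or None."""
    nums, vals = index.get(account, ([], []))
    i = bisect_right(nums, position)
    return vals[i - 1] if i else None

def value_block_granular(blocks, index, account, block_num, tx_index):
    """Index reaches the previous block; the rest is replayed one by one."""
    value = lookup(index, account, block_num - 1)
    replayed = 0
    for acc, new_value in blocks[block_num][: tx_index + 1]:
        replayed += 1
        if acc == account:
            value = new_value
    return value, replayed

def value_tx_granular(blocks, index, account, block_num, tx_index):
    """Index reaches the exact transaction; nothing is replayed."""
    tx_num = sum(len(b) for b in blocks[:block_num]) + tx_index
    return lookup(index, account, tx_num), 0
```

Both functions return the same value for the same query, but the second element of the result (the number of replayed transactions) grows with the position inside the block only in the block-granular case.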

Three upgrades of Erigon 2

The current plan is to roll out Erigon 2 as a series of three upgrades. Each upgrade will require a re-sync from scratch, since the data models will be incompatible. Here is a rough description of these upgrades.

Upgrade 1

The upgrade introduces infrastructure (based on BitTorrent) for downloading and seeding static files.
Nodes download, then seed, static files for block headers, block bodies, transactions, and a lookup index for finding transactions by their hashes. Further on, they automatically produce and seed new static files, meaning that centralised seeder servers will only be required to bootstrap the swarms.
The upgrade establishes the first (perhaps naive) format for static files, with the notable features of dictionary-based compression and indexing based on minimal perfect hash tables.
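To give an idea of what indexing with a minimal perfect hash means, here is a minimal sketch. It is not the format or the algorithm used by Erigon: the brute-force seed search, the use of SHA-256, and the names `build_index` and `find_offset` are all assumptions chosen for brevity. The point is that a fixed set of n keys (for example transaction hashes) maps onto exactly n slots with no collisions, so the index stores only offsets and not the keys themselves.

```python
import hashlib

def slot(key: bytes, seed: int, n: int) -> int:
    """Map a key to one of n slots, using a seeded hash."""
    digest = hashlib.sha256(seed.to_bytes(4, "big") + key).digest()
    return int.from_bytes(digest[:8], "big") % n

def find_seed(keys, max_seed=1_000_000):
    """Brute-force a seed under which every key gets its own slot.

    Only practical for a handful of keys; real constructions split the
    key set into small buckets before searching.
    """
    n = len(keys)
    if len(set(keys)) != n:
        raise ValueError("keys must be distinct")
    for seed in range(max_seed):
        if len({slot(k, seed, n) for k in keys}) == n:
            return seed
    raise ValueError("no seed found")

def build_index(records):
    """records: list of (tx_hash, file_offset). Returns (seed, offsets)."""
    seed = find_seed([h for h, _ in records])
    table = [0] * len(records)
    for tx_hash, offset in records:
        table[slot(tx_hash, seed, len(records))] = offset
    return seed, table

def find_offset(index, tx_hash):
    """Offset for a key that was in the original set."""
    seed, table = index
    return table[slot(tx_hash, seed, len(table))]
```

Because the keys are not stored, a lookup for a hash that was never indexed still returns some offset. A reader of such an index would have to check the record found at that offset against the requested hash.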

Upgrade 2

Using the experience with Upgrade 1, this upgrade is likely to improve the format for static files, with more emphasis on the encoding of monotonic integer sequences (e.g. Elias-Fano).
Nodes download, then seed, the history of state, as well as indices for event logs and call traces, in addition to everything from Upgrade 1. The big difference from Erigon 1 here is that the granularity of indices is changed to per-transaction, which is likely to improve the performance of most historical queries (especially trace_filter). Further on, they automatically produce and seed new static files for state history, event log indices, and call trace indices, meaning that centralised seeder servers will only be required to bootstrap the swarms.
Full replay from genesis is still required to compute the state. However, because most of the history, event logs and call traces are already downloaded, the initial full replay will happen slightly faster. There may also be simpler techniques that use the “benefit of hindsight” to speed up the state computation.
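As an illustration of the Elias-Fano encoding mentioned above, here is a small sketch of the textbook scheme applied to a sorted list of integers, such as the transaction numbers at which some account changed. It is an assumption-laden toy rather than Erigon's file format: Python integers stand in for bit vectors, and the function names are made up.

```python
def ef_encode(values):
    """Encode a non-empty, non-decreasing list of non-negative ints."""
    n = len(values)
    if n == 0:
        raise ValueError("empty sequence")
    if values[0] < 0 or any(a > b for a, b in zip(values, values[1:])):
        raise ValueError("sequence must be non-negative and non-decreasing")
    universe = values[-1] + 1
    # Number of low bits stored verbatim: floor(log2(universe / n)).
    low_bits = max(0, (universe // n).bit_length() - 1)
    mask = (1 << low_bits) - 1
    lows = 0   # n fixed-width fields of low_bits each
    highs = 0  # upper parts in unary: bit (high + i) is set for element i
    for i, v in enumerate(values):
        lows |= (v & mask) << (i * low_bits)
        highs |= 1 << ((v >> low_bits) + i)
    return n, low_bits, lows, highs

def ef_decode(encoded):
    """Recover the original list from the output of ef_encode."""
    n, low_bits, lows, highs = encoded
    mask = (1 << low_bits) - 1
    out, pos = [], 0
    for i in range(n):
        while not (highs >> pos) & 1:  # advance to the i-th set bit
            pos += 1
        high = pos - i
        low = (lows >> (i * low_bits)) & mask
        out.append((high << low_bits) | low)
        pos += 1
    return out

def ef_size_bits(encoded):
    """Bits used by the two bit vectors (ignoring n and low_bits)."""
    n, low_bits, _, highs = encoded
    return n * low_bits + highs.bit_length()
```

For example, `ef_decode(ef_encode([2, 3, 5, 7, 11, 13, 24]))` gives back the same list. A production implementation would add a select structure over the upper bits so that the i-th element can be read without scanning from the start.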

Upgrade 3

Using the experience with Upgrade 2, this upgrade is likely to improve the format for static files, with more emphasis on encoding the intermediate commitments, such as Patricia trees (hexary and binary) and B+trees, with a flexible choice of hash functions.
Nodes download a reasonably recent state as a composition of static files, and only use replay to apply recent changes. As with other types of data, new files are then automatically produced and seeded. A new complexity here is that static files for state will sometimes need to be removed, as they get merged into larger files.
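The idea of state as a composition of static files, and of small files being replaced by merged larger ones, can be sketched as follows. This is a simplified model under stated assumptions: the `StateFile` structure, the transaction-number ranges, and the in-memory dictionaries are hypothetical, and deletions of keys are not modelled at all.

```python
from dataclasses import dataclass

@dataclass
class StateFile:
    start: int  # first transaction number covered (inclusive)
    end: int    # end of the covered range (exclusive)
    data: dict  # key -> latest value written within [start, end)

def get(files, key):
    """Read a key from files ordered oldest to newest; newest wins."""
    for f in reversed(files):
        if key in f.data:
            return f.data[key]
    return None

def merge_adjacent(files, i):
    """Return a new file list where files[i] and files[i + 1] are merged.

    The two small files drop out of the list, which is the point at
    which a node could delete them and stop seeding them.
    """
    older, newer = files[i], files[i + 1]
    if older.end != newer.start:
        raise ValueError("files must cover adjacent ranges")
    # Values from the newer file override those from the older one.
    merged = StateFile(older.start, newer.end, {**older.data, **newer.data})
    return files[:i] + [merged] + files[i + 2:]
```

Merging does not change what `get` returns for any key; it only reduces the number of files that have to be consulted.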

After these three upgrades, the full vision of Erigon 2 should be realised.

Erigon Blog
