Architecture of Erigon - separable and embeddable components
Erigon differs from many other implementations of Ethereum (or of other blockchain protocols) in that it is less monolithic and more modular. But the word "modular" here needs to be understood quite specifically. Modules are not just code packages that end up compiled into a single executable. In Erigon, modules (which we more often call "components") are parts of Erigon that can be taken out into a separate executable and then operated in their own operating system process, or even on their own computer in the network. Below is the illustration of how Erigon currently splits up into components (modules):
Components are shown as rectangles. In the current version of the Erigon code, all components shown as rectangles are “separable”, meaning that they can be run as separate processes. Where two overlapping rectangles are shown, more than one component of that type can run at the same time, and on multiple computers. All of the components can also currently run within the same process (they are “embeddable”). However, downloader/seeder components need to run on the same computer as Erigon even when they are separated, so that they can share downloaded and seeded files with it.
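As a rough illustration of what “separable and embeddable” can mean in code, here is a minimal Go sketch. It assumes a hypothetical `Core` interface with a single `BlockHash` method, and it uses the standard library `net/rpc` package over an in-memory `net.Pipe` as a stand-in for whatever transport Erigon actually uses between components. None of these names come from the Erigon code base. The point is only that the calling code (`describe`) is identical whether the component lives in the same process or behind a connection.

```go
package main

import (
	"fmt"
	"net"
	"net/rpc"
)

// Core is a hypothetical component API, not Erigon's real interface.
type Core interface {
	BlockHash(number uint64) (string, error)
}

// localCore holds the component's actual logic (the "embedded" form).
type localCore struct{ hashes map[uint64]string }

func (c *localCore) BlockHash(number uint64) (string, error) {
	h, ok := c.hashes[number]
	if !ok {
		return "", fmt.Errorf("block %d not found", number)
	}
	return h, nil
}

// CoreService adapts localCore to the method shape net/rpc expects.
type CoreService struct{ core *localCore }

func (s *CoreService) BlockHash(number uint64, reply *string) error {
	h, err := s.core.BlockHash(number)
	*reply = h
	return err
}

// remoteCore implements Core by forwarding each call over RPC (the "separated" form).
type remoteCore struct{ client *rpc.Client }

func (r *remoteCore) BlockHash(number uint64) (string, error) {
	var h string
	err := r.client.Call("CoreService.BlockHash", number, &h)
	return h, err
}

// separate puts the component behind a connection. An in-memory pipe is used
// here; a real deployment would listen on a network socket instead.
func separate(c *localCore) (Core, error) {
	server := rpc.NewServer()
	if err := server.Register(&CoreService{core: c}); err != nil {
		return nil, err
	}
	serverConn, clientConn := net.Pipe()
	go server.ServeConn(serverConn)
	return &remoteCore{client: rpc.NewClient(clientConn)}, nil
}

// describe is consumer code: it cannot tell which form of Core it was given.
func describe(c Core, number uint64) (string, error) {
	h, err := c.BlockHash(number)
	if err != nil {
		return "", err
	}
	return fmt.Sprintf("block %d -> %s", number, h), nil
}

func main() {
	core := &localCore{hashes: map[uint64]string{1: "0xabc"}}
	remote, err := separate(core)
	if err != nil {
		panic(err)
	}
	for _, c := range []Core{core, remote} {
		line, err := describe(c, 1)
		if err != nil {
			panic(err)
		}
		fmt.Println(line)
	}
}
```

Both loop iterations should print the same line, once through a direct call and once through the RPC path.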
The “lollipop” notation is explained below:
All interactions in the diagram above are numbered, with the numbers shown on a blue background. Below is the description of what each interaction is for:
1. ETH sentry connects to the Ethereum p2p network. It performs two main functions: peer discovery (via Kademlia DHT, via DNS lookup, via configured static peers, via node info saved in the database, or via boot nodes pre-configured in the source code) and peer management (handshakes, holding p2p connections even if Erigon is restarted).
2. ETH core interacts with the Ethereum p2p network via the Sentry component. Sentry presents a simple interface to the core, with functions to download data, receive notifications about gossip messages, upload data upon request from peers, and broadcast gossip messages either to a specially selected set of peers or to all peers. It is possible to have multiple Sentries to increase the connectivity to the network, or to obscure the position of the Core computer.
3. ETH core instructs the ETH downloader/seeder component to download (and then seed) specific files from the BitTorrent network. Files are specified by their “info hashes”, which are a form of content addressing. The files ETH core instructs it to download are block headers, block bodies, and in the near future also parts of the Ethereum state and various indices.
4. ETH downloader/seeder interacts with the BitTorrent network in order to retrieve files required by ETH core.
5. ETH TxPool uses ETH sentry to download the initial set of currently unconfirmed transactions, as well as to receive gossip about new transactions, re-broadcast them, and inject transactions created by users and submitted via ETH RPC api.
6. ETH TxPool connects to ETH core to be able to fetch the up-to-date information about account balances and account nonces for the transactions in the pool. It also subscribes to the stream of state updates to be able to maintain a partial cache of the state, to reduce the interaction in the long run.
7. ETH RPC api connects to ETH core to be able to access any data in the ETH core’s database or other sources (downloaded files), as well as other information (for example, the number of peers currently connected via ETH sentries). Through this connection RPC api also subscribes to the stream of state updates to be able to maintain a partial cache of the state, so it can respond to some of the queries using fewer interactions with the ETH core.
8. ETH RPC api connects to ETH TxPool to be able to query the transaction pool as well as inject new transactions to be broadcast to the Ethereum network.
9. User connects to ETH RPC api to query data about the current or historical state of Ethereum, subscribe to notifications, and send new transactions to the Ethereum network.
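The state-update subscription that ETH TxPool and ETH RPC api keep with ETH core can be pictured with a small Go sketch. Everything named here (`CoreClient`, `StateUpdate`, `StateCache`, `fakeCore`) is hypothetical and chosen for illustration; it is not Erigon's API. The sketch assumes a partial cache that only tracks accounts it has already been asked about: a miss goes to the core once, and later updates pushed by the core keep the cached entry fresh without another round trip.

```go
package main

import (
	"fmt"
	"sync"
)

// AccountInfo is the per-account data a pool or RPC layer might need.
type AccountInfo struct {
	Balance uint64
	Nonce   uint64
}

// StateUpdate is a hypothetical message pushed by the core on state changes.
type StateUpdate struct {
	Account string
	Info    AccountInfo
}

// CoreClient is a hypothetical view of ETH core as seen by another component.
type CoreClient interface {
	FetchAccount(account string) (AccountInfo, error)
}

// StateCache is a partial cache: it only holds accounts that were requested.
type StateCache struct {
	mu      sync.Mutex
	core    CoreClient
	entries map[string]AccountInfo
}

func NewStateCache(core CoreClient) *StateCache {
	return &StateCache{core: core, entries: make(map[string]AccountInfo)}
}

// Get answers from the cache when possible and asks the core otherwise.
func (c *StateCache) Get(account string) (AccountInfo, error) {
	c.mu.Lock()
	defer c.mu.Unlock()
	if info, ok := c.entries[account]; ok {
		return info, nil
	}
	info, err := c.core.FetchAccount(account)
	if err != nil {
		return AccountInfo{}, err
	}
	c.entries[account] = info
	return info, nil
}

// Apply folds one update into the cache. Updates for untracked accounts are
// dropped, because the next Get for them fetches fresh data anyway.
func (c *StateCache) Apply(u StateUpdate) {
	c.mu.Lock()
	defer c.mu.Unlock()
	if _, tracked := c.entries[u.Account]; tracked {
		c.entries[u.Account] = u.Info
	}
}

// fakeCore stands in for the real core and counts how often it is queried.
type fakeCore struct {
	accounts map[string]AccountInfo
	calls    int
}

func (f *fakeCore) FetchAccount(account string) (AccountInfo, error) {
	f.calls++
	info, ok := f.accounts[account]
	if !ok {
		return AccountInfo{}, fmt.Errorf("unknown account %q", account)
	}
	return info, nil
}

func main() {
	core := &fakeCore{accounts: map[string]AccountInfo{"alice": {Balance: 10, Nonce: 0}}}
	cache := NewStateCache(core)
	if _, err := cache.Get("alice"); err != nil { // first lookup hits the core
		panic(err)
	}
	cache.Apply(StateUpdate{Account: "alice", Info: AccountInfo{Balance: 7, Nonce: 1}})
	info, err := cache.Get("alice") // served from the cache
	if err != nil {
		panic(err)
	}
	fmt.Println(info.Balance, info.Nonce, core.calls)
}
```

In a real component the updates would arrive on a stream and `Apply` would be called from the goroutine that reads it; the mutex is there so that reads and stream updates can safely interleave.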
Rollup architecture
Though not yet realised, the modularity of the Erigon architecture makes it a very suitable platform for implementing rollups based on Ethereum. Here is a theoretical rollup node architecture:
One can observe on the diagram above that the “rollup” part is mostly a mirror image of the “ETH” part, with one extra connection (number 10) from Rollup core to ETH core. This connection is required for the Rollup Core to be notified of events happening in ETH core (usually interactions with the dedicated contract deployed on Ethereum). In the version of the architecture for the block-producing rollup nodes, there will also be a connection (not shown on the diagram above) between Rollup Core and ETH TxPool(s), so that transactions that “roll up” data into Ethereum blocks can be added directly, without having to go via JSON RPC and pay the extra cost of data transformations and of a less interactive interface (connections between components support duplex streams, for example).
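Since this architecture is theoretical, the following Go sketch is only one way the two ETH-facing connections of a block-producing rollup node could look. Go channels stand in for the streams between components, and all types (`ContractEvent`, `BatchTx`, `RollupCore`) and the fixed-size batching rule are assumptions made for illustration. The Rollup Core listens for contract events from ETH core, collects rollup transactions, and hands batches straight to ETH TxPool without a JSON RPC hop.

```go
package main

import "fmt"

// ContractEvent is a hypothetical notification from ETH core (connection 10).
type ContractEvent struct {
	Block uint64
	Name  string
}

// BatchTx is a hypothetical Ethereum transaction carrying rolled-up data.
type BatchTx struct {
	RollupTxs []string
}

// RollupCore holds its inputs and the direct link to ETH TxPool.
type RollupCore struct {
	EthEvents <-chan ContractEvent // notifications from ETH core
	RollupTxs <-chan string        // transactions from the rollup side
	EthTxPool chan<- BatchTx       // direct submission, no JSON RPC
	BatchSize int
	LastBlock uint64 // highest ETH block seen in an event
}

// Run processes both inputs until they are closed, flushes any partial
// batch, and then closes the output.
func (r *RollupCore) Run() {
	defer close(r.EthTxPool)
	events, txs := r.EthEvents, r.RollupTxs
	var pending []string
	for events != nil || txs != nil {
		select {
		case ev, ok := <-events:
			if !ok {
				events = nil // a nil channel is never selected again
				continue
			}
			if ev.Block > r.LastBlock {
				r.LastBlock = ev.Block
			}
		case tx, ok := <-txs:
			if !ok {
				txs = nil
				continue
			}
			pending = append(pending, tx)
			if len(pending) == r.BatchSize {
				r.EthTxPool <- BatchTx{RollupTxs: pending}
				pending = nil
			}
		}
	}
	if len(pending) > 0 {
		r.EthTxPool <- BatchTx{RollupTxs: pending}
	}
}

func main() {
	events := make(chan ContractEvent)
	txs := make(chan string)
	out := make(chan BatchTx)
	core := &RollupCore{EthEvents: events, RollupTxs: txs, EthTxPool: out, BatchSize: 2}
	go core.Run()
	go func() {
		events <- ContractEvent{Block: 100, Name: "Deposit"}
		for _, tx := range []string{"a", "b", "c"} {
			txs <- tx
		}
		close(events)
		close(txs)
	}()
	for batch := range out {
		fmt.Println("submitted batch:", batch.RollupTxs)
	}
	fmt.Println("last ETH block seen:", core.LastBlock)
}
```

With a batch size of 2 and three rollup transactions, the run submits one full batch and one partial batch flushed at shutdown.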
Since all the components are embeddable, it should be possible to realise the entire rollup node (including the ETH part) in one executable for convenience. Of course, for more scalable installations, any part can be taken out, and many of them can be multiplied.
Architecture after the transition to Proof Of Stake
The current design of “The Merge” does disrupt the architectural elegance presented above. There is no diagram for it yet, but it is easy to imagine a CL (Consensus Layer) component plugged into the ETH RPC api component. Also, for some time, CL will not be embeddable, and will not utilise the richness of the communication protocol used between other components (duplex streams, for example).
This is, of course, the area where more thinking and design need to happen. Perhaps the way to go is to create a version of a CL implementation that is embeddable. This is especially interesting for rollup implementations that will themselves use a CL implementation for their rollup consensus.