Ethereum client and JSON-RPC API for The Graph network

stmx38 · January 10, 2022, 12:29pm

Hello everyone!

For now, we are using an OpenEthereum node for our indexer and are starting to consider a second Ethereum node as a backup in case of failure. We are also considering an external service for this.

From the official documentation for indexers:

Ethereum endpoint - An endpoint that exposes an Ethereum JSON-RPC API. This may take the form of a single Ethereum client or it could be a more complex setup that load balances across multiple. It’s important to be aware that certain subgraphs will require particular Ethereum client capabilities such as archive mode and the tracing API.

It states that we need an archive node with tracing JSON-RPC API support.
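A rough way to check whether a given endpoint meets both requirements is to probe it directly. The sketch below assumes that a non-archive node returns a JSON-RPC error for a state lookup at a very old block, and that tracing is exposed through the OpenEthereum-style `trace_block` method. The function names `rpc_call` and `probe` are made up for illustration, and error behavior can differ between clients, so treat the result as a hint rather than a guarantee.

```python
import json
from urllib import request


def rpc_call(url, method, params):
    """POST a single JSON-RPC 2.0 request and return the decoded response."""
    body = json.dumps(
        {"jsonrpc": "2.0", "id": 1, "method": method, "params": params}
    ).encode()
    req = request.Request(
        url, data=body, headers={"Content-Type": "application/json"}
    )
    with request.urlopen(req, timeout=30) as resp:
        return json.load(resp)


def probe(url, call=rpc_call):
    """Return which capabilities the endpoint appears to have (heuristic)."""
    # Archive check (assumption): historical state is pruned on non-archive
    # nodes, so a balance lookup at block 1 is expected to return an error.
    zero_address = "0x0000000000000000000000000000000000000000"
    archive = "result" in call(url, "eth_getBalance", [zero_address, "0x1"])
    # Tracing check: trace_block is part of the OpenEthereum-style trace
    # module; clients without it answer with a "method not found" error.
    tracing = "result" in call(url, "trace_block", ["0x1"])
    return {"archive": archive, "tracing": tracing}
```

Calling `probe("http://localhost:8545")` against your own node (the URL is a placeholder) returns a dictionary such as `{"archive": ..., "tracing": ...}`. The `call` parameter exists only so the transport can be swapped out, for example in tests.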

Ethereum documentation - Nodes and clients | ethereum.org

Stand-alone clients

Client	Status	Archive	Tracing	CPU/RAM	Disk	Costs, $/m	Comment
Geth	Active			4+/8GB	9.3 TB	1,246.60	c5.2xlarge/gp2 (9.750) - N. Virginia
Openethereum	Deprecated			4+/8GB	9.5 TB	1,510.35	m5a.xlarge/gp2 (12.012) - Ohio
Erigon	Active			8/16GB	2 TB	467.16	m5a.xlarge/gp3 (3.020) - Ohio
Akula	Alpha			?	2 TB	?
Nethermind	Active			6/32GB	4.5+ TB	?
Besu	Active			4/16GB	3 TB	?
Parity Ethereum	Deprecated	-	-	-	-	-
Aleth	Deprecated	-	-	-	-	-

External services

Service	Price	Comment
Infura	0 / 50 / 225 / 1000 $/m
Alchemy	0 / 49 / custom $/m
Chainstack	0 / 49 / 349 / 990 $/m
QuickNode	9 / 99 / 299 / 300 $/m
Anyblock	0 / 179 / 449 / custom €/m	OpenEthereum

Please share your experience, which client you are using, and any additional information. If you prefer to use an external service instead of hosting your own node, please share which one and why.

I will try to keep this post updated with all information.

Thank you!

cryptovestor · January 10, 2022, 1:37pm

Thanks for posting this!

The only alternative to OE, for the tracing that is used by graph-node, is Erigon. It has come a long way with Graph compatibility but we still have work to do in order to match OE’s behavior. Many Indexers have already made the transition from OE to Erigon, and a small subset of those Indexers, with assistance from the Erigon team and some Edge&Node devs, are actively helping improve Erigon’s tracing capabilities for Graph.

We have an Erigon working group led by @chris that runs on Thursdays, you can find the event on the info@thegraph.foundation calendar.

We self host the following for indexing:

Mainnet Indexer
2x OE archive nodes
1x Geth archive node
4x Erigon archive nodes (not used in production yet)

Testnet Indexer
1x Geth rinkeby archive node
1x Erigon rinkeby archive node
Mainnet nodes are used for indexing and the above nodes are used for the testnet subgraph and indexer-agent

Costs are all upfront. The most painful part of self-hosting is paying upfront for disk growth, to make sure you always have the storage capacity to support the growth of the larger nodes. This can be very expensive if you factor in disk redundancy.

stmx38 · January 11, 2022, 4:00pm

@cryptovestor, thank you for the detailed reply!

We also started to use Erigon but ran into an issue when we hit the 2 TB DB limit. It is running as an application in a VM, and we are now considering running it in containers.

Before using external services we at least need some usage data, and we can probably use a proxy to collect the stats.
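As a minimal sketch of the stats part, assuming the proxy can hand us the raw JSON-RPC request bodies it forwards, the calls can be tallied per method. The function name `count_methods` is hypothetical, and the proxy itself (forwarding, logging, persistence) is left out.

```python
import json
from collections import Counter


def count_methods(raw_body, counter):
    """Add the JSON-RPC method names found in one request body to counter."""
    payload = json.loads(raw_body)
    # A body is either a single request object or a batch (a list of objects).
    calls = payload if isinstance(payload, list) else [payload]
    for call in calls:
        if isinstance(call, dict) and "method" in call:
            counter[call["method"]] += 1
    return counter
```

Feeding every forwarded body into one shared `Counter` gives a per-method request count that can be compared against the request limits of an external provider's pricing tiers.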

For example, Anyblock states that they are ready for The Graph, and I found their announcement on the forum.

I have some questions about the setup you described:

1. Why do you use Geth for The Graph?
2. How are you running Erigon: standalone or in containers?
3. Did you consider using external services, and why or why not?

cryptovestor · January 14, 2022, 11:31am

#1 Geth is (unofficially, I suppose) a reference client. I like to compare sync performance of OE against Geth for supported subgraphs (Geth cannot be used for trace features). I also like to use Geth with subgraphs that often have issues with OE. One example of this is the EIP721 subgraph which had some issues when syncing on OE (they might be fixed now, I don’t know, I still sync that one with Geth). So I have a graph-node instance that is only connected to this Geth instance and if I want to sync a subgraph with Geth I re-assign the subgraph to that graph-node.

#2 I have a mix of Erigon containers and VMs. I am moving to all containers using docker-compose in the near future in order to have a standardised execution environment. I have had some issues in the past with compiling Erigon myself, which are not an issue if I use their Docker images.

#3 I use Infura for mission-critical transaction execution. It just makes sense to use their very high investment in infrastructure for the most important transactions I execute, considering how cheap they are in the lower tiers. So this includes all transactions being executed by indexer-agent, mostly. I would never consider third parties for syncing subgraphs because it’s vastly cheaper for me to self-host a set of highly available and archive nodes and I have the skillset to manage the infrastructure.
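To illustrate the idea of a primary node with a backup (self-hosted or external), here is a minimal failover sketch. It is an assumption about how one could wire this up, not a description of my actual setup: `call_with_failover` and the `send` transport callback are hypothetical names, and `send(url, method, params)` is expected to return the decoded JSON-RPC response.

```python
def call_with_failover(endpoints, send, method, params):
    """Try each endpoint in order and return the first successful response."""
    errors = []
    for url in endpoints:
        try:
            response = send(url, method, params)
        except OSError as exc:
            # Covers connection refused, timeouts and DNS failures.
            errors.append((url, str(exc)))
            continue
        if "error" not in response:
            return url, response
        # Assumption: a JSON-RPC error is treated as a reason to try the
        # next endpoint, even though some errors would repeat everywhere.
        errors.append((url, response["error"]))
    raise RuntimeError(f"all endpoints failed: {errors}")
```

Listing the self-hosted node first and the external service last keeps paid requests to a minimum, since the backup is only contacted when the earlier endpoints fail.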

On the question of usage data, I would be very keen to explore this in more depth. At a minimum, I would like to have self-hosted dashboards similar to the Infura dashboard.

And ideally I would like to be able to break this data down by subgraph, so we can measure the activity generated by each one. Not sure if this is possible, but definitely something I would find very valuable for making cost and pricing decisions. If that sort of product existed, I would be happy to share the output usage data with the community and I’m sure other Indexers would too.

stmx38 · January 19, 2022, 10:43am

An interesting and related topic
Testnet Docker Guide by StakeSquid / Archive node options/Service Providers (WIP)
