Grafting in the decentralised Graph Network

schmidsi · March 10, 2022, 4:03pm

In this thread I would like to explore the possibility of an experimental feature called “grafting” in the decentralised Graph Network.

What is Grafting?

From the docs:

When a subgraph is first deployed, it starts indexing events at the genesis block of the corresponding chain (or at the startBlock defined with each data source). In some circumstances, it is beneficial to reuse the data from an existing subgraph and start indexing at a much later block. This mode of indexing is called Grafting. Grafting is, for example, useful during development to get past simple errors in the mappings quickly, or to temporarily get an existing subgraph working again after it has failed.

Note: Grafting requires that the Indexer has indexed the base subgraph. It is not recommended on The Graph Network at this time, and developers should not deploy subgraphs using that functionality to the network via the Studio.

Who uses it?

Most complex subgraphs at some point start to leverage grafting because it reduces indexing time tremendously for new versions. Examples are Uniswap, Synthetix and others. These bigger subgraphs sometimes take months to index from scratch, and grafting gives subgraph developers a very convenient tool to add new features without indexing from scratch. Also, if a subgraph fails, grafting is the only possibility for a hotfix.

I also remember the discussion that grafting will be obsolete as soon as we have the Firehose and other indexing speed improvements. I am not sure if this will ever be true. A subgraph that takes 100 days to sync, even if we achieve 100x indexing speed improvements, would still take 1 day to index. In a hot-fix situation, this is probably still too long.
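To make the arithmetic concrete, here is a tiny sketch. The 100-day baseline and the speedup factors are the illustrative figures from the paragraph above, not measurements, and `sync_days` is a name made up for this example.

```python
def sync_days(baseline_days: float, speedup: float) -> float:
    """Days needed to index from scratch after a uniform speedup factor."""
    if speedup <= 0:
        raise ValueError("speedup must be positive")
    return baseline_days / speedup


# Illustrative figures only: a subgraph that takes 100 days to sync today.
for factor in (1, 10, 100):
    print(f"{factor:>3}x faster -> {sync_days(100, factor):g} day(s) from scratch")
```

Even at the optimistic end, a full re-index is measured in days rather than minutes, which is the gap grafting fills in a hotfix situation.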

What is the problem with grafting on the decentralised Graph Network?

If I remember correctly, there are basically two problems with grafting:

1. A subgraph developer could deploy a faulty subgraph, which would make it impossible for the indexer to claim indexing rewards (I’m not 100% sure if this is correct).
2. A subgraph with multiple grafts would make it very hard for a new indexer to index. Let’s say a subgraph has 5 versions, each grafting on its predecessor. A new indexer would first need to manually index v1, then graft v2 on top of v1, then graft v3 on top of v2, and so on.
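The serial dependency in the second point can be sketched as a walk up the graft chain. This is only an illustration: the `GRAFT_BASE` mapping and the `indexing_order` helper are hypothetical, and real deployments are identified by IPFS hashes rather than names like "v1".

```python
from typing import Dict, List, Optional

# Hypothetical registry: deployment -> the deployment it grafts onto.
# None means the deployment indexes from scratch.
GRAFT_BASE: Dict[str, Optional[str]] = {
    "v1": None,
    "v2": "v1",
    "v3": "v2",
    "v4": "v3",
    "v5": "v4",
}


def indexing_order(target: str, graft_base: Dict[str, Optional[str]]) -> List[str]:
    """Return the deployments a new indexer must sync, oldest base first."""
    order: List[str] = []
    seen = set()
    current: Optional[str] = target
    while current is not None:
        if current in seen:
            raise ValueError(f"graft cycle detected at {current}")
        seen.add(current)
        order.append(current)
        current = graft_base[current]
    order.reverse()  # bases have to be indexed before whatever grafts onto them
    return order


print(indexing_order("v5", GRAFT_BASE))
```

Every entry in the returned list is a deployment the new indexer has to sync before the next one can start, so the work cannot be skipped or reordered.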

Experiment

To really test this in the wild, we deployed a new version (1.0.1) of the web3index subgraph on the network and grafted on top of 1.0.0 two days ago. As of today, there are already 10 indexers indexing the new subgraph. So it basically seems to work.

I asked on Discord about the experience, and it seems to have been smooth.

Open questions

1. What are the exact risks for an indexer?
2. If an indexer starts indexing late, could they not just ignore the grafting instructions in subgraph.yaml and index from scratch instead of indexing v1, v2, …, vX serially?
3. What are the implications of 2. with regard to Proof of Indexing?

adamfuller · March 10, 2022, 5:17pm

As you say, one of the main challenges with grafting is that it introduces complexity & dependencies for indexers on the network, who must index all of the graft bases (and graft bases may themselves be grafted…).

Ignoring the grafting instructions is not an option: grafting is part of the subgraph definition and can’t be ignored, at least with Graph Node as it is today (and changing that would introduce significant determinism issues). That also resolves the third open question: there is only one valid POI for a grafted subgraph, which comes from indexing the graft base subgraph up to block X and then indexing the subgraph as defined.

It will depend a bit on the exact subgraph, but I think a combination of better caching (i.e. where not all the indexing work needs to be re-done) and parallelisation will enable significant improvements, and with a model that is more legible than stitching several fully indexed subgraphs together. Right now subgraph execution is single-threaded and all-or-nothing. If processing can be parallelised, and more intermediate data can be stored and later re-used, then updating a subgraph won’t necessarily mean re-indexing from scratch. cc @abourget

That3Percent · March 10, 2022, 5:44pm

I agree that we can and should support Grafting on The Network. It is a useful way to express upgrades (especially when paired with contract upgrades that are scheduled to trigger as-of a certain block). And, the hotfix use-case is valuable as well. Whether this is the best way to deliver this functionality seems to be a matter of taste and/or context. Having the option can’t hurt.

The primary argument against Grafting seems to be that it “requires that the Indexer has indexed the base subgraph”. That seems more like a tooling problem. Graph-node should be modified such that if it encounters a grafted subgraph it first ensures the base subgraph is indexed up through the block required for that subgraph (recursively, if necessary). Indexing a Grafted subgraph need not require manual intervention from Indexers.
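A rough sketch of what that recursive resolution could look like, purely as an illustration: this is not Graph-node code, and the `Graft`, `Deployment` and `plan` names, the short deployment IDs and the block numbers are all made up. Each step says which deployment to index, the block it was grafted at (None if it starts from scratch) and the block it must reach (None meaning "keep following the chain head").

```python
from dataclasses import dataclass
from typing import Dict, FrozenSet, List, Optional, Tuple


@dataclass(frozen=True)
class Graft:
    base: str   # hypothetical ID of the base deployment
    block: int  # block at which the base's data is taken over


@dataclass(frozen=True)
class Deployment:
    id: str
    graft: Optional[Graft] = None


# (deployment id, grafted at block, index up to block)
Step = Tuple[str, Optional[int], Optional[int]]


def plan(
    dep_id: str,
    deployments: Dict[str, Deployment],
    until: Optional[int] = None,
    visiting: FrozenSet[str] = frozenset(),
) -> List[Step]:
    """Work out what must be indexed, bases first, before dep_id can run."""
    if dep_id in visiting:
        raise ValueError(f"graft cycle detected at {dep_id}")
    dep = deployments[dep_id]
    if dep.graft is None:
        return [(dep_id, None, until)]
    if until is not None and until < dep.graft.block:
        raise ValueError(f"{dep_id} is needed at a block before its own graft point")
    # The base only has to be indexed up to the graft block, recursively.
    steps = plan(dep.graft.base, deployments, dep.graft.block, visiting | {dep_id})
    steps.append((dep_id, dep.graft.block, until))
    return steps


# Hypothetical three-version chain with made-up block numbers.
deployments = {
    "v1": Deployment("v1"),
    "v2": Deployment("v2", Graft(base="v1", block=1000)),
    "v3": Deployment("v3", Graft(base="v2", block=2000)),
}
for step in plan("v3", deployments):
    print(step)
```

The point of the sketch is that the plan is fully determined by the manifests, so an indexer would not need to work out the order by hand.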

I designed the Grafting feature with determinism in mind, so there is nothing blocking a Grafted subgraph from being used on The Network today, with enhanced UX in the future.
