Hey everyone,
First of all, the arbitration team would like to thank you all for the rich and civil discussion that unfolded here. We’ve been silently watching and having some internal chats as well, so thank you for your patience. Regardless of the outcome of the dispute, we think there is a deeper and more interesting point being discussed: data integrity.
TL;DR is that we will:
- draw this dispute
- make sure relevant teams are aware of the “multiple unique POIs after deterministic error” issue
- recommend changes to the arbitration charter to address data integrity concerns and suggest the ideas discussed here about closures with risky POIs.
Dispute summary
First let’s outline the relevant facts that summarize this dispute:
- Fisherman initiated a dispute against an indexer for repeatedly presenting the same POI on an apparently healthy subgraph (multiple indexers have no issue indexing the subgraph).
- Disputed indexer shared logs indicating they are hitting a determinism error when indexing the subgraph.
- Fisherman shared data indicating the disputed indexer presented 3 unique POIs after hitting the determinism issue.
- Disputed indexer claims to use indexer-agent to manage their allocations, with no manual closing or POI overriding being done. They presented agent logs that support this claim. They mentioned they are using Erigon3 beta as the chain client, which could be malfunctioning and be the root cause.
- Another indexer shared an instance of a similar problem, where some indexers are able to sync a subgraph but they hit a determinism error, with 2 unique POIs created.
Dispute resolution
We believe there is not enough data to conclude there is a deliberate or intentional misuse of the protocol. It’s unclear whether this is a graph-node issue or a product of bad RPC data being piped in. However, as of today neither is a slashable offense, so drawing the dispute is the sensible choice.
Data integrity
This dispute has introduced an important topic for consideration: the chain of custody for the indexed data. Several indexers and members of the community made great points. The opinion of the Arbitrators can be summarized as follows:
- Indexers should be responsible for the chain of custody of the data they serve. It’s in the network’s best interest that they serve “good data” to the best of their knowledge.
- Indexers should not be punished for using blockchain client software as intended. An indexer cannot be burdened with manually checking the validity of the data they consume assuming they act in good faith and use reputable sources for their data.
- Indexers should be slashed if they willingly use a source of data (RPC/firehose) that is giving an incorrect input to graph-node. This could be a malicious RPC or a novel client implementation that is in its alpha/beta stages and full of bugs.
As some of you pointed out in this thread, a good solution would be to incorporate public POI cross-checking into the indexing software to get early alerts in case of divergences and maybe even prevent those POIs from being committed on-chain. We agree and have raised this internally with the relevant teams, some of which are already working on it.
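To make the cross-checking idea a bit more concrete, here is a minimal sketch in Python. It is only an illustration under stated assumptions, not how indexer-agent or graph-node work today: the function name, the indexer ids and the strict-majority rule are all hypothetical, and it assumes the public POIs for the same deployment and block have already been collected from several indexers.

```python
from collections import Counter


def find_divergent_pois(pois_by_indexer):
    """Compare public POIs reported for ONE deployment at ONE block.

    pois_by_indexer: dict mapping a (hypothetical) indexer id to the public
    POI string that indexer reports.

    Returns (consensus_poi, divergent_indexers). If no POI is held by a
    strict majority, consensus_poi is None and every indexer is returned,
    since nobody can be treated as the reference in that case.
    """
    if not pois_by_indexer:
        return None, []

    counts = Counter(pois_by_indexer.values())
    top_poi, top_count = counts.most_common(1)[0]

    # Assumption: only a strict majority counts as "consensus".
    if top_count * 2 <= len(pois_by_indexer):
        return None, sorted(pois_by_indexer)

    divergent = sorted(
        indexer for indexer, poi in pois_by_indexer.items() if poi != top_poi
    )
    return top_poi, divergent


# Made-up sample data: two indexers agree, one diverges.
reports = {"indexer-a": "0xaaa", "indexer-b": "0xaaa", "indexer-c": "0xbbb"}
consensus, divergent = find_divergent_pois(reports)
```

An indexer that finds itself in the divergent list could raise an alert or hold off on committing that POI on-chain until the cause is understood.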
In the meantime the arbitrators will also propose to the Graph Council an amendment to the Arbitration Charter. The suggested changes aim to give the arbitration council grounds for slashing indexers that intentionally harm the network by sourcing bad data without limiting their options (we don’t want to strictly prevent usage of novel clients, beta implementations, etc). The new proposed policy assumes indexers act in good faith but demands rectification if they are found to be producing incorrect results.
Here is a draft version of the new text. Feedback and suggestions are welcome:
- Indexing data integrity
The Graph Node software indexes data from blockchain inputs. If the input data is inaccurate, the resulting subgraph and any derived POI will also be incorrect. Depending on the subgraph code, indexing bad data may even cause subgraph failures which could be misinterpreted as determinism bugs. Upholding the quality of the data is essential for the network’s overall health and reliability. Indexers are responsible for ensuring the integrity and chain of custody of the data they serve, which includes sourcing blockchain data from reputable sources.
The Arbitrator is encouraged to resolve disputes as a Draw if they believe an incorrect POI to be the result of a blockchain client malfunction (RPC/firehose).
However, upon a discrepancy being noticed, the indexer should take reasonable measures to rectify the issue or submit a zero POI for any subsequent allocations. Note that the indexer must be notified by the Arbitrator by posting in the forum; the indexer will then be given a seven (7) day period to work with the Arbitrator and the community on addressing the problem, after which any new disputes against a non-zero POI can be resolved at the discretion of the Arbitrator.
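As a rough illustration of the timeline in the draft above, the rule could be sketched in Python as follows. This is one possible reading, not part of the charter text: it assumes the seven days are counted from the forum notification, that the relevant timestamp is when the new dispute is created, and that a zero POI is a 32-byte value of all zeros. The function and constant names are hypothetical.

```python
from datetime import datetime, timedelta, timezone

GRACE_PERIOD = timedelta(days=7)  # the seven (7) day period from the draft
ZERO_POI = "0x" + "00" * 32       # assumption: zero POI is 32 zero bytes


def discretionary_resolution_applies(notice_posted_at, dispute_created_at, disputed_poi):
    """Return True if a new dispute falls under Arbitrator discretion.

    Assumed reading of the draft: the dispute targets a non-zero POI and
    was created after the grace period following the forum notice.
    """
    # A zero POI is the remedy the draft asks for, so it is never covered.
    if int(disputed_poi, 16) == 0:
        return False
    return dispute_created_at > notice_posted_at + GRACE_PERIOD


# Made-up dates for illustration.
notice = datetime(2024, 1, 1, tzinfo=timezone.utc)
some_poi = "0x" + "ab" * 32
within_grace = discretionary_resolution_applies(notice, notice + timedelta(days=3), some_poi)
after_grace = discretionary_resolution_applies(notice, notice + timedelta(days=8), some_poi)
```

Here `within_grace` is False and `after_grace` is True; a dispute against `ZERO_POI` returns False regardless of timing.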
Once more, we wanted to thank everyone that contributed in this thread,
Arbitration Council