Request for information about disputes #GDR-24

@tmigone Thank you for reorganizing the allocation list; it is much easier to read this way.

@Tehn Thank you for sharing your experience. Your logs, together with the full discussion in the indexer channel on The Graph’s Discord, provide useful context. I am not a Firehose expert (I don’t use Firehose myself), but Marc-André | Ellipfra mentioned that the error block number often varies. The only logical explanations I can think of are that the subgraph had reverted to a different error block by the time you closed your allocation, or that rewinding the subgraph made the POI alternate between values.
It seems you were able to resolve this by clearing the call-cache and resyncing the subgraph.

@indexer_payne Thank you for your input. Dear Arbitration Team, is there any confirmation from the Graph Node team on whether they are looking into this issue? This seems like a very high-priority matter.

@inflex Thank you for the indexer-agent logs. I was hoping to see your graph-node logs as well, since they show the subgraph’s condition in more detail: block reverts, whether Firehose was used, and any specific actions taken, such as rewinding the subgraph. The graph-node and indexer-agent versions requested by @tmigone would also help. Please note that a thorough investigation was carried out before this dispute was opened, and it showed a clear, repeated and obvious fault. That is different from throwing accusations like

“you are not using indexer-agent to close allocations” or “closing unsynced subgraphs”

That is also why #GDR-24 was created in the first place: for further discussion.
An accusation would be exactly what you wrote above:

“Why would someone slash allocation that got 30grt rewards, hoping to get 120k grt from it?”

The reason has already been explained above.

You also said to count my 10k GRT deposit in. Is there a reason you now think that money should not have been put on the table for this dispute to happen? Disputes are created not only to get the indexer’s attention but also to urgently notify the Graph Node team in case there is a software issue. Lastly, I genuinely appreciate you not purging the subgraph, as it allows for further investigation.

After reviewing the input from @Tehn, @indexer_payne and @inflex, I’ll summarize the key points as best as I can:

  1. The indexer-agent will generate duplicate POIs due to deterministic errors. POIs might fluctuate back and forth, as mentioned by @Tehn, due to rewinding subgraphs.

  2. Although @Tehn’s issue occurred on the BSC chain (using Firehose) and is different from the subgraph in this dispute (on the Ethereum Mainnet chain), it does not eliminate the possibility of the indexer-agent submitting duplicate POIs back and forth.
    However, the disputed indexer submitted 3 unique POIs and 1 duplicate POI. In contrast, @Tehn’s issue involved 2 types of duplicate POIs. As mentioned above, this could be caused by variance in the error block number.

  3. Upon reviewing all of the disputed indexer’s allocations for other Ethereum Mainnet subgraphs, none of the PUBLIC POIs match those of other indexers. This raises serious concerns. The disputed indexer mentioned it might be using a faulty RPC provider, which may be related to Erigon3 Beta.

  4. Prior to the disputed indexer’s latest allocations update, it had allocated to a total of 216 subgraphs. Upon reviewing all of them, 215 are either synced, failed with deterministic errors that other indexers also encountered, or outdated, all of which is considered normal behavior. The only exception is the subgraph Qmb27RY3RqP98UMKbTgScf6F7hhokfMuS9fV7VAtPiZHwF, which the disputed indexer has failed to sync even though all other indexers are in consensus on it: specifically, 12 indexers are fully synced.

  5. The disputed indexer’s endpoint is functional but unable to process any queries. I am fairly sure this is because it is not running TAP. This has been the case since December 4, the official deadline for the Indexer Service (Rust) upgrade that an indexer needs in order to run TAP.

  6. While managing allocations across numerous subgraphs can be challenging, when the same issue occurs repeatedly across subgraphs or over an extended period without any visible attempt to resolve or mitigate it, it shifts from being an understandable error to a question of operational robustness.
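
For anyone who wants to reproduce the POI comparison described in points 2 and 3, here is a minimal sketch of the check in Python. It assumes you have already collected the POIs an indexer submitted when closing allocations on one subgraph, plus the POI that the synced indexers agree on. The function name `summarize_pois` and the sample values are hypothetical placeholders, not data from this dispute.

```python
from collections import Counter


def summarize_pois(submitted, consensus_poi):
    """Classify POIs submitted at allocation close (hypothetical helper).

    submitted: list of POI hex strings, one per closed allocation.
    consensus_poi: the POI the other synced indexers agree on.
    """
    # Normalize case so the same POI is not counted twice.
    counts = Counter(p.lower() for p in submitted)
    unique = sorted(p for p, n in counts.items() if n == 1)
    repeated = {p: n for p, n in counts.items() if n > 1}
    matches = consensus_poi.lower() in counts
    return {"unique": unique, "repeated": repeated, "matches_consensus": matches}


# Placeholder values only; these are not the POIs from this dispute.
example = ["0xaa", "0xbb", "0xcc", "0xdd", "0xDD"]
report = summarize_pois(example, consensus_poi="0xee")
print(report)
```

A POI that shows up under `repeated` and never matches the consensus value is the pattern worth a closer look; on its own it does not say whether the cause is the RPC, Graph Node, or how the allocation was closed.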

Key Questions:

  1. As mentioned by @inflex, the issue may be related to Erigon3 Beta. Is it a slashable offense to use unofficial software versions, whether on RPC or Graph Indexer stacks? There have been cases where some indexers faced problems specifically related to using Erigon3 Beta, as reported on The Graph Discord.

  2. If the issue is caused by the RPC, which led the Graph Node to classify it as a deterministic error, the indexer-agent would still be able to close allocations as usual. This is intended behavior and not an issue with the indexer-agent.
    However, could a malfunctioning RPC (e.g., eth-call failures) cause the Graph Node to flag this as deterministic? If so, is this intentional?
    This could lead to the indexer-agent continuously closing allocations with duplicate/incorrect POIs until the issue is flagged or disputed.

  3. As @indexer_payne mentioned, he encountered many subgraphs with deterministic errors that, according to the Graph Node team, were caused by RPC problems. Should the Graph Node classify such cases as non-deterministic so that the indexer-agent generates an error upon closure?
    It doesn’t seem reasonable to issue 1,000 slashes if the root cause is an RPC issue. Additionally, both @indexer_payne and @Tehn reported these issues to the Graph Node team and Discord, demonstrating good faith.

  4. According to GIP-0009: Arbitration Charter:

Incorrect Proofs of Indexing or query Attestations may be produced due to software malfunctions in the Graph Node software or the Indexer’s blockchain client. We will refer to these malfunctions collectively as determinism bugs. The Arbitrator is encouraged to resolve disputes as a Draw if they believe an incorrect PoI or Attestation to be the result of a determinism bug.

^If the disputed indexer’s RPC is faulty, why did its other Ethereum Mainnet subgraphs sync successfully (albeit with mismatched PUBLIC POIs), while only one subgraph shows this particular issue? According to @indexer_payne, he encountered dozens of subgraphs with deterministic errors that the Graph Node team attributed to RPC issues. In @Tehn’s case (Firehose), clearing the call-cache and resyncing resolved the issue. What if this problem had persisted for months until it was disputed? Would it still be considered a slashable offense, given that a fix exists even though the error is flagged as deterministic? (@Tehn, this is just an example, so don’t worry about it. You have shown good faith, and I believe all the other indexers will appreciate this. Thank you.)

  5. According to:

GDR-18

Closing allocations for rewards and reusing the same POI repeatedly on a subgraph where the network of indexers has consensus with a healthy sync status is a slashable offense. To the affected indexer we ask to review their processes for closing allocations. It’s always recommended to use the indexer-agent to manage allocations unless there is a specific need. If the agent is failing please do report the issue on a relevant channel so it can be investigated and a course of action suggested.

GDR-21

To the affected indexer we ask to review their setup to ensure their operation is running as expected and meeting the demands from the network.

I have approached this dispute with neutrality, aiming to gather and analyze all relevant facts to determine whether this issue stems from a software malfunction or an operational oversight by the indexer. As mentioned earlier, I am okay with any dispute result as long as proper investigations and checks are conducted. I leave this to the Arbitration Team to decide.

PS: It is heartwarming to see participation from other indexers. Cheers!
