Request for information about disputes #GDR-24

The Arbitrators are contacting Indexer address 0xdecba5154aab37ae5e381a19f804f3af4d1bcbb5 (decisionbasis.eth) and fisherman 0x4208ce4ad17f0b52e2cadf466e9cf8286a8696d5 (@Inspector_POI) about a new Dispute filed in the protocol.

Dispute ID: 0x488aac24d17c6b89f17db3fa4db97c16205efccdd1cfd80a791a7e396386b011

Subgraph deployment ID: Qmb27RY3RqP98UMKbTgScf6F7hhokfMuS9fV7VAtPiZHwF

To the fisherman: could you share the data you gathered that led you to file the dispute? Please provide all relevant information and records about the open dispute. This will likely include POIs generated for the affected subgraph(s).

Purpose of the Requirement

This requirement is related to the following dispute:

Dispute (0x488aac24d17c6b89f17db3fa4db97c16205efccdd1cfd80a791a7e396386b011)
├─ Type: Indexing
├─ Status: Undecided (0.74 days ago) [24 epochs left to resolve]
├─ Indexer: 0xdecba5154aab37ae5e381a19f804f3af4d1bcbb5
├─ Fisherman: 0x4208ce4ad17f0b52e2cadf466e9cf8286a8696d5
├─ SubgraphDeployment
│  └─ id: 0xbc6819c290cf2e48340e63f6dec80e9e8e9d579f2b812ce3c81b341bd23c8c1a (Qmb27RY3RqP98UMKbTgScf6F7hhokfMuS9fV7VAtPiZHwF)
├─ Economics
│  ├─ indexerSlashableStake: 126054.068493163038042542 GRT
│  └─ indexingRewardsCollected: 31.439073893523433108 GRT
├─ Allocation
│  ├─ id: 0x57da6d449d27bba187b8a93798c9a128545f8119
│  ├─ createdAtEpoch: 714
│  ├─ createdAtBlock: 0x75f1f52ddfdf9c54d199124bc92cf02eb1d77c6056937fba473d1a5dd222d808
│  ├─ closedAtEpoch
│  │  ├─ id: 718
│  │  └─ startBlock: 0xc209e05949448e4cecc079d81bb2ed0564e84de1484e2cca606931b9073c976f (#21195137)
│  └─ closedAtBlock: 0xbe50c670aeb5d5f716bd3af89e75a05cc6114f5f9ef53f0abef0e0be88d951f4 (#21198881)
└─ POI
   ├─ submitted: 0x5d087cdb84c09ab8ba790f947e6ce0735109a505ba692a58673981a5cf6ada4f
   ├─ match: Not-Found
   ├─ previousEpochPOI: Not-Found
   └─ lastEpochPOI: Not-Found

About the Procedure

The Arbitration Charter regulates the arbitration process. You can find it in Radicle project ID rad:git:hnrkrhnth6afcc6mnmtokbp4h9575fgrhzbay or at GIP-0009: Arbitration Charter.


Please use this forum for all communications.

Arbitration Team.

gm all

thanks @indexer_payne for dming and pinging on that one

it’s been some years since i last played with bogus poi and signal pumps (properly slashed there haha)

for now all subs that don’t sync i either kill with 0x0, or leave them hanging until killed by someone else

so, as i leave my allocation handling to indexer-agent, i hope this one will be resolved in my favor. Why would someone slash allocation that got 30grt rewards, hoping to get 120k grt from it? clearly predatory behavior, count me in for his 10k deposit. i recently failed a rescue for drained mnemonic, would be happy to compensate the victim there

btw, some proofs from my grafana, please observe “deterministic” flag

i’ll be resyncing that subgraph hoping to navigate around the issue

1 Like

Dear Arbitration Team,

I am disputing the decisionbasis.eth indexer (0xdecba5154aab37ae5e381a19f804f3af4d1bcbb5) for repeatedly closing the subgraph Qmb27RY3RqP98UMKbTgScf6F7hhokfMuS9fV7VAtPiZHwF with the same POI (0x5d087cdb84c09ab8ba790f947e6ce0735109a505ba692a58673981a5cf6ada4f). This has occurred a total of 8 times (technically 7 at the time of filing, as the most recent closure happened after this dispute was raised; I believe the disputed indexer had not yet noticed this dispute when making the 8th closure).

Upon reviewing this subgraph’s full allocation history, not a single other indexer has submitted a duplicate POI. This strongly suggests that the subgraph is healthy and does not have a history of deterministic errors.
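For anyone who wants to reproduce this check, the allocation history and the POI used to close each allocation can be queried from the network subgraph. Below is a minimal TypeScript sketch of the duplicate-POI check; the endpoint URL is a placeholder, and the Allocation field names and the Closed status filter are assumptions based on the network subgraph schema, so verify them against the deployment you query.

// Sketch: list closed allocations on one deployment and flag reused POIs.
// The endpoint is a placeholder; field names are assumptions to double-check.
const NETWORK_SUBGRAPH_URL = "https://example.com/network-subgraph"; // placeholder

const QUERY = `
  query ($deployment: String!) {
    allocations(
      where: { subgraphDeployment: $deployment, status: Closed }
      orderBy: closedAtEpoch
      orderDirection: desc
      first: 1000
    ) {
      id
      indexer { id }
      createdAtEpoch
      closedAtEpoch
      poi
    }
  }`;

async function findReusedPois(deploymentId: string): Promise<void> {
  const res = await fetch(NETWORK_SUBGRAPH_URL, {
    method: "POST",
    headers: { "content-type": "application/json" },
    body: JSON.stringify({ query: QUERY, variables: { deployment: deploymentId } }),
  });
  const { data } = await res.json();

  // Group closed allocations by (indexer, poi); any group larger than one means
  // the same indexer reused the same POI across allocations.
  const groups = new Map<string, string[]>();
  for (const a of data.allocations) {
    if (!a.poi) continue; // skip allocations closed without a POI
    const key = `${a.indexer.id}:${a.poi}`;
    groups.set(key, [...(groups.get(key) ?? []), a.id]);
  }
  for (const [key, allocs] of groups) {
    if (allocs.length > 1) {
      const [indexer, poi] = key.split(":");
      console.log(`${indexer} reused POI ${poi} on ${allocs.length} allocations:`, allocs);
    }
  }
}

// The bytes32 form of the disputed deployment, from the dispute details above.
findReusedPois("0xbc6819c290cf2e48340e63f6dec80e9e8e9d579f2b812ce3c81b341bd23c8c1a");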


According to the screenshots, Graph Explorer shows that the disputed indexer is the only one failing to sync, while all other indexers are synced and at chain head.

(Sorted by latest to oldest)
Allocation ID: 0xc3ad07fbabc57864335b5335881bba6d87d17d65
PUBLIC POI : 0x8c42bb96e744715430a2ae6114375de501e46773b5b46184fbc51e02b06bdc9e
POI : 0x5d087cdb84c09ab8ba790f947e6ce0735109a505ba692a58673981a5cf6ada4f
Closed Epoch : 750
Closed Start Block : 21425537

Allocation ID: 0x57da6d449d27bba187b8a93798c9a128545f8119
PUBLIC POI : 0x8c42bb96e744715430a2ae6114375de501e46773b5b46184fbc51e02b06bdc9e
POI : 0x5d087cdb84c09ab8ba790f947e6ce0735109a505ba692a58673981a5cf6ada4f
Closed Epoch : 718
Closed Start Block : 21195137

Allocation ID: 0x4e257818af72eb2bf9867fbecd90ec579c5c73a7
PUBLIC POI : 0x8c42bb96e744715430a2ae6114375de501e46773b5b46184fbc51e02b06bdc9e
POI : 0x5d087cdb84c09ab8ba790f947e6ce0735109a505ba692a58673981a5cf6ada4f
Closed Epoch : 714
Closed Start Block : 21166337

Allocation ID: 0x6e0036ea6495c9cb99339652d43529629e68babd
PUBLIC POI : 0x8c42bb96e744715430a2ae6114375de501e46773b5b46184fbc51e02b06bdc9e
POI : 0x5d087cdb84c09ab8ba790f947e6ce0735109a505ba692a58673981a5cf6ada4f
Closed Epoch : 706
Closed Start Block : 21108737

Allocation ID: 0x7fcb4f971c992c402c04a500a94e8dccec31f23e
PUBLIC POI : 0x8c42bb96e744715430a2ae6114375de501e46773b5b46184fbc51e02b06bdc9e
POI : 0x5d087cdb84c09ab8ba790f947e6ce0735109a505ba692a58673981a5cf6ada4f
Closed Epoch : 694
Closed Start Block : 21022337

Allocation ID: 0x8a5f73352d1b8ed28334400fa47d252326e26375
PUBLIC POI : 0x8c42bb96e744715430a2ae6114375de501e46773b5b46184fbc51e02b06bdc9e
POI : 0x75a1c9d26630732e5b29ed9c740ab11f8c71d869a18959f6053eda5829df0bc9 (Unique)
Closed Epoch : 665
Closed Start Block : 20813537

Allocation ID: 0x3f54b1bdb610d109611335a70202ada75217cda8
PUBLIC POI : 0x8c42bb96e744715430a2ae6114375de501e46773b5b46184fbc51e02b06bdc9e
POI : 0x5d087cdb84c09ab8ba790f947e6ce0735109a505ba692a58673981a5cf6ada4f
Closed Epoch : 642
Closed Start Block : 20647937

Allocation ID: 0x167d6df8dddf8e554bf7f11a95fa5954dc2fc4db
PUBLIC POI : 0x8c42bb96e744715430a2ae6114375de501e46773b5b46184fbc51e02b06bdc9e
POI : 0x5d087cdb84c09ab8ba790f947e6ce0735109a505ba692a58673981a5cf6ada4f
Closed Epoch : 631
Closed Start Block : 20568737

Allocation ID: 0x51b627bf85d9635e7ba8f453dc41b5693393a562
PUBLIC POI : 0x8c42bb96e744715430a2ae6114375de501e46773b5b46184fbc51e02b06bdc9e
POI : 0x5d087cdb84c09ab8ba790f947e6ce0735109a505ba692a58673981a5cf6ada4f
Closed Epoch : 624
Closed Start Block : 20518337

Allocation ID: 0x8a33b85335972f05f2662512089db042fecba2a6
PUBLIC POI : 0x8c42bb96e744715430a2ae6114375de501e46773b5b46184fbc51e02b06bdc9e
POI : 0x096079a657901029aa9217e9d5060244559cf2564b3744bcd53c8fe0ae3b0a87 (Unique)
Closed Epoch : 601
Closed Start Block : 20352737

Allocation ID: 0xc5d75a35a81e710f22a779a3ccf4fa713c25b873
PUBLIC POI : 0x8c42bb96e744715430a2ae6114375de501e46773b5b46184fbc51e02b06bdc9e
POI : 0x2c1ce2de884183ecb3bde8481f5715d45f8feeb2754dfaa71bbd9743dda5da31 (Unique)
Closed Epoch : 599
Closed Start Block : 20338337

Upon further investigation, the disputed indexer’s PUBLIC POIs were found to mismatch those of all other indexers for every allocation it closed on this subgraph, checked at the startBlock of each allocation’s closedAtEpoch on Ethereum Mainnet. This includes the 3 additional allocations with unique POIs. Refer below:


Based on the disputed indexer’s screenshot, the error occurs at block #20100104, so the disputed indexer would be expected to have correct data, matching all other indexers, prior to this block. This is shown in the attachment below.


This is consistent with the disputed indexer having failed to sync and stopped at block #20100104, as the disputed indexer’s PUBLIC POI remains the same for every block after this point, all the way to chain head.

By definition, a deterministic error results in the subgraph halting at the same block; indexer-agent would then provide the same POI, and the PUBLIC POI would remain consistent up to chain head for as long as the error persists.
According to the allocation list, however, there is a history of 3 UNIQUE POIs, still with the same PUBLIC POI. The expected indexer-agent behaviour for a deterministic error would be the same POI for all 11 allocations.
Disputed indexer, have you previously attempted to rewind this subgraph or resync it from scratch?

However, a deeper analysis reveals that the divergence began at block #12742160. Before this block, the disputed indexer’s PUBLIC POI matched that of all other indexers from block #1 to block #12742159.
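For transparency, the PUBLIC POI comparisons above can be repeated at any block height against each indexer’s status endpoint, which is also how the divergence block can be narrowed down by bisection. Below is a rough TypeScript sketch; it assumes the index-node API exposes a publicProofsOfIndexing query (present in recent graph-node releases), and the endpoint URLs and exact result field names are assumptions to verify against the graph-node schema in use.

// Sketch: fetch the public POI for one deployment at a fixed block from a list
// of indexer status endpoints and print them side by side. The query shape is
// assumed from recent graph-node index-node schemas; endpoints are placeholders.
const DEPLOYMENT = "Qmb27RY3RqP98UMKbTgScf6F7hhokfMuS9fV7VAtPiZHwF";
const BLOCK = 21195137; // e.g. the startBlock of epoch 718 used above

const QUERY = `{
  publicProofsOfIndexing(
    requests: [{ deployment: "${DEPLOYMENT}", blockNumber: ${BLOCK} }]
  ) {
    deployment
    proofOfIndexing
  }
}`;

async function publicPoi(statusUrl: string): Promise<string> {
  const res = await fetch(statusUrl, {
    method: "POST",
    headers: { "content-type": "application/json" },
    body: JSON.stringify({ query: QUERY }),
  });
  const { data } = await res.json();
  return data?.publicProofsOfIndexing?.[0]?.proofOfIndexing ?? "not found";
}

async function compare(statusUrls: string[]): Promise<void> {
  for (const url of statusUrls) {
    // Bisect by changing BLOCK: the first height where one indexer's value
    // differs from the rest is the divergence point (here, #12742160).
    console.log(url, await publicPoi(url).catch(() => "unreachable"));
  }
}

// Placeholder endpoints; real status URLs come from each indexer's metadata.
compare(["https://indexer-a.example/status", "https://indexer-b.example/status"]);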

Regarding the disputed indexer’s claim:
Why would someone slash allocation that got 30grt rewards, hoping to get 120k grt from it?
The slash is not based on a single allocation but on all of the above cases of duplicate POIs, combined with PUBLIC POI data that does not match other indexers. The slashing amount is determined by the network design and is at the discretion of the arbitration team.
I acknowledge that it may seem excessive when all the indexing rewards you have claimed for this subgraph are considered. However, the arbitration team should also take into account that this subgraph is ranked #32 in terms of query fees.

Clearly predatory behavior
You are correct that the situation is deemed predatory. I am in the process of identifying incorrect POIs, including those resulting from deterministic errors. While I trust your expertise as an experienced indexer, despite your past slashing history, I hope that you, the arbitration team, or others can provide additional context on how this subgraph could result in a “deterministic error.”
If such context is provided and proven, I am willing to accept the decision, whether it is a draw or a loss. However, the single piece of evidence presented by the disputed indexer does not explain why only 1 out of 13 indexers encountered issues, nor does it address why the PUBLIC POI stays the same across allocations while the submitted POIs consist of 3 unique values and 8 duplicates.

TL;DR:

  • 11 allocations were closed, all after the disputed indexer’s claimed error block (#20100104), with 8 duplicate POIs and 3 unique POIs, yet all with the same PUBLIC POI (0x8c42bb96e744715430a2ae6114375de501e46773b5b46184fbc51e02b06bdc9e), indicating that the disputed indexer is stuck at the same block for some reason.
  • Indexer-agent behavior: If a deterministic error is encountered, the same POI will be submitted repeatedly, making the presence of 3 unique POIs inconsistent with this claim. Moreover, the claimed error at block #20100104 occurred well before the first allocation by another indexer.
  • No duplicate POIs were submitted by any other indexers.
  • All other indexers show matching data, except for the disputed indexer.

Thanks @Inspector_POI for the investigation. To summarize your findings I’ve put together the following table:

AllocationID Created At Epoch Closed At Epoch POI
0xc5d75a35a81e710f22a779a3ccf4fa713c25b873 562 599 0x2c1ce2de884183ecb3bde8481f5715d45f8feeb2754dfaa71bbd9743dda5da31
0x8a33b85335972f05f2662512089db042fecba2a6 599 601 0x096079a657901029aa9217e9d5060244559cf2564b3744bcd53c8fe0ae3b0a87
0x51b627bf85d9635e7ba8f453dc41b5693393a562 608 624 0x5d087cdb84c09ab8ba790f947e6ce0735109a505ba692a58673981a5cf6ada4f
0x167d6df8dddf8e554bf7f11a95fa5954dc2fc4db 624 631 0x5d087cdb84c09ab8ba790f947e6ce0735109a505ba692a58673981a5cf6ada4f
0x3f54b1bdb610d109611335a70202ada75217cda8 631 642 0x5d087cdb84c09ab8ba790f947e6ce0735109a505ba692a58673981a5cf6ada4f
0x8a5f73352d1b8ed28334400fa47d252326e26375 642 665 0x75a1c9d26630732e5b29ed9c740ab11f8c71d869a18959f6053eda5829df0bc9
0x7fcb4f971c992c402c04a500a94e8dccec31f23e 665 694 0x5d087cdb84c09ab8ba790f947e6ce0735109a505ba692a58673981a5cf6ada4f
0x6e0036ea6495c9cb99339652d43529629e68babd 702 706 0x5d087cdb84c09ab8ba790f947e6ce0735109a505ba692a58673981a5cf6ada4f
0x4e257818af72eb2bf9867fbecd90ec579c5c73a7 707 714 0x5d087cdb84c09ab8ba790f947e6ce0735109a505ba692a58673981a5cf6ada4f
0x57da6d449d27bba187b8a93798c9a128545f8119 714 718 0x5d087cdb84c09ab8ba790f947e6ce0735109a505ba692a58673981a5cf6ada4f
0xc3ad07fbabc57864335b5335881bba6d87d17d65 742 750 0x5d087cdb84c09ab8ba790f947e6ce0735109a505ba692a58673981a5cf6ada4f
0x17a59a0b6b80b41e283e1447e8468ee0b60910aa 750 - -

@inflex, the error log you shared shows the subgraph deployment being flagged with a deterministic error at block height 20100104 (block hash 0x260e18eef4ecb61a0b7bb7e796fc70c55664a04112ed094a8e6388d2f952de87), which is approximately epoch 565, so somewhere during the first allocation in the table above.

The two points of concern at the moment we would need to answer are:

  • Why are no other indexers hitting this determinism issue? Can you share the specific versions of graph-node and indexer-agent you are using? Any specific actions you took on this subgraph that could help explain this behavior, like pruning, rewinding, etc.?
  • As @Inspector_POI mentioned, since the deterministic error you shared, there have been 3 unique POIs presented, even switching away from 0x5d087cdb84c09ab8ba790f947e6ce0735109a505ba692a58673981a5cf6ada4f and back to it. This is not normal behavior for indexer-agent when a deterministic error is found. Are you positive all the allocations listed above were closed using the indexer-agent? Can you provide logs for this?

Thanks.

I’m not making excuses for anyone, but there is definitely something wrong with the agent and indexer stack. I’ve encountered a similar problem where some indexers could successfully synchronize a subgraph while I always got a deterministic error. I’ve already started this thread on Discord, trying to understand what’s going on.

Here is an example.
Subgraph: Orbs TWAP - BSC | Graph Explorer
My logs: https://gist.githubusercontent.com/0x19dG87/687427f4cad3e093321e8383f1d7c8d3/raw/13feab77a1c9ebd9a709bf79434663a0da869ac1/QmPXBSGJV4ptejsATQw4udeF2QqJ5wRWkx995FNjexMFWy.log
I tried restarting, rewinding, and even recently dropping and synchronizing it back, but I still got the error:

transaction a65237ddb7cc9cc26f30c898369a91bc0d372b9ca52cfdd5a6cb7fbe75cc8117: error while executing at wasm backtrace:
    0: 0x3b56 - <unknown>!~lib/@graphprotocol/graph-ts/chain/ethereum/ethereum.SmartContract#call
    1: 0x6b2c - <unknown>!src/twap/handleOrderFilled: Mapping aborted at ~lib/@graphprotocol/graph-ts/chain/ethereum.ts, line 681, column 7, with message: Call reverted, probably because an `assert` or `require` in the contract failed, consider using `try_order` to handle this in the mapping.
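For context, the try_order suggestion in that message refers to the try_-prefixed call bindings that graph codegen generates: they return an ethereum.CallResult instead of aborting the mapping when the eth_call reverts. A minimal AssemblyScript sketch (TypeScript-style syntax) is below; the contract/event names, import path, and order() signature are made up just to mirror the handler named in the backtrace, and only the try_/CallResult pattern itself is standard graph-ts behavior.

// Hypothetical mapping sketch; only the try_ / CallResult pattern is real
// graph-ts codegen behavior. The binding names and order() signature are
// placeholders, not the actual Orbs TWAP subgraph code.
import { Twap, OrderFilled } from "../generated/Twap/Twap"; // hypothetical binding

export function handleOrderFilled(event: OrderFilled): void {
  const contract = Twap.bind(event.address);

  // try_order returns an ethereum.CallResult instead of trapping on a revert,
  // so a reverted eth_call no longer fails the whole subgraph deterministically.
  const result = contract.try_order(event.params.id); // hypothetical parameter
  if (result.reverted) {
    // Handle the revert gracefully: skip the event or record a sentinel entity.
    return;
  }
  const order = result.value;
  // ... build or update entities from `order` as usual ...
}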

Moreover, the agent generated 2 different POIs (where there should be only one if synchronization failed). You can see this in the attached screenshot. All of this was done by agent v0.21.11 and indexer node v0.36.0. I use both RPC and Firehose. I open and close allocations manually, without any automation.

And this is not the only example. Perhaps someone will be able to figure it out and explain what the reason is.

1 Like

I have dozens of those “error while executing at wasm backtrace” errors, and all of them are marked as DETERMINISTIC.

After asking a graph-node team member about them, he suggested it’s an RPC issue, but I never got any more info out of him besides that.

I didn’t find any logs for past allocations, as I guess these get overwritten when hundreds of allocations get reopened weekly. However, I noticed that I’m currently allocating to the subgraph in question (Qmb27RY3RqP98UMKbTgScf6F7hhokfMuS9fV7VAtPiZHwF), and I haven’t yet performed a graphman drop, which should assist the investigation. So below I attach the logs containing deployment-specific info from indexer-agent.

Here are the commands used (unbuffer to avoid output buffering on large logs, pino-pretty for formatting, grep for searching, the -C flag to show the 20 adjacent lines above and below); click on them to go to the pastebin logs.

indexer@dbnode ~ # unbuffer docker logs main-indexer-agent | pino-pretty | grep -C 20 'Qmb27RY3RqP98UMKbTgScf6F7hhokfMuS9fV7VAtPiZHwF' > deployment.txt

indexer@dbnode ~ # unbuffer docker logs main-indexer-agent | pino-pretty | grep -C 20 '0x17a59a0B6B80b41E283E1447E8468eE0b60910AA' > allocation.txt

indexer@dbnode ~ # unbuffer docker logs main-indexer-agent | pino-pretty | grep -C 20 '0x5d087cdb84c09ab8ba790f947e6ce0735109a505ba692a58673981a5cf6ada4f' > poi.txt

You can see how the agent closes it automatically with one of the POIs shown above.

As for the discrepancy in POIs and the different POIs submitted for earlier allocations, unfortunately no logs remain; here are the commands used:

indexer@dbnode ~ # unbuffer docker logs main-indexer-agent | pino-pretty | grep -C 20 '0x2c1ce2de884183ecb3bde8481f5715d45f8feeb2754dfaa71bbd9743dda5da31'
indexer@dbnode ~ # unbuffer docker logs main-indexer-agent | pino-pretty | grep -C 20 '0x096079a657901029aa9217e9d5060244559cf2564b3744bcd53c8fe0ae3b0a87'
indexer@dbnode ~ # unbuffer docker logs main-indexer-agent | pino-pretty | grep -C 20 '0x75a1c9d26630732e5b29ed9c740ab11f8c71d869a18959f6053eda5829df0bc9'

My guess is that this subgraph had issues due to a faulty RPC provider, maybe Erigon3-beta related.

I am not purging the subgraph just yet, but will do so as soon as this dispute is over.

By the way, @Inspector_POI, in the future please just DM me on Discord if you find faulty POIs. I agree we should all work towards eliminating these issues, and it’s good that someone spends time cross-checking deployments across indexers. I would just as well have shared the logs with you, or we could have hopped on a call to troubleshoot. A couple of months ago I was contacted by Shiyas from E&N, and we worked on exactly that, resyncing the subgraph and investigating the logs.

I just don’t think it’s suitable to open a dispute where accusations are thrown around and money is put on the table.

@tmigone Thank you for summarizing the allocation list into a much cleaner format; it reads much better this way.

@Tehn Thank you for voicing your experience. Based on your logs, and after reviewing the entire discussion in the indexer channel on The Graph’s Discord, I believe there is some useful context here. Although I am not a Firehose expert myself (as I don’t use Firehose), Marc-AndrĂ© | Ellipfra mentioned that there is often some variance in the error block number. The only logical explanation I can think of is that when you closed your allocation, it might have reverted to another error block or that rewinding the subgraph resulted in different POIs back and forth.
It seems like you were able to resolve this by clearing the call-cache and resyncing it.

@indexer_payne Thank you for your input. Dear Arbitration Team, is there any confirmation from the Graph Node team on whether they are looking into this issue? This seems like a very high-priority matter.

@inflex Thank you for the indexer-agent logs. I was hoping to see your graph-node logs as well, which show the subgraph’s condition in more detail, such as block reverts and whether Firehose or non-Firehose is used, along with any specific actions taken such as rewinding the subgraph, and the graph-node and indexer-agent versions requested by @tmigone. Do note that prior to this dispute a thorough investigation and checks were done, which show a clear, repeated and obvious fault, rather than throwing accusations like

“you are not using indexer-agent to close allocations” or “closing unsynced subgraphs”

This is also why #GDR-24 was created in the first place, for further discussion.
An accusation would be exactly what you mentioned above:

“Why would someone slash allocation that got 30grt rewards, hoping to get 120k grt from it?”

The reason has already been explained above.

You also mentioned counting in my 10k GRT deposit; any reason why you now take the opposite position, that money should not be put on the table for this dispute to happen? Disputes are created not only to get the indexer’s attention but also to urgently notify the Graph Node team in case there is a software issue. Lastly, I genuinely appreciate you not purging the subgraph, as it allows for further investigation.

After reviewing the input from @Tehn, @indexer_payne and @inflex, I’ll summarize the key points as best as I can:

  1. The indexer-agent will generate duplicate POIs due to deterministic errors. POIs might fluctuate back and forth, as mentioned by @Tehn, due to rewinding subgraphs.

  2. Although @Tehn’s issue occurred on the BSC chain (using Firehose) and is different from the subgraph in this dispute (on the Ethereum Mainnet chain), it does not eliminate the possibility of the indexer-agent submitting duplicate POIs back and forth.
    However, the disputed indexer submitted 3 unique POIs plus one POI value reused for the remaining allocations. In contrast, @Tehn’s issue involved 2 distinct duplicated POI values. As mentioned above, this could be caused by variance in the error block number.

  3. Upon reviewing all of the disputed indexer’s allocations for other Ethereum Mainnet subgraphs, none of the PUBLIC POIs match those of other indexers. This raises serious concerns. The disputed indexer mentioned they might be using a faulty RPC provider, possibly related to Erigon3 Beta.

  4. Prior to the latest allocation update from the disputed indexer, a total of 216 subgraphs had been allocated. Upon reviewing all of them, 215 are either synced, failed with deterministic errors that other indexers also encountered, or outdated subgraphs, all of which is considered normal behavior. Only the subgraph Qmb27RY3RqP98UMKbTgScf6F7hhokfMuS9fV7VAtPiZHwF, which the disputed indexer has failed to sync, has consensus among all other indexers: specifically, 12 indexers that are fully synced.

  5. The disputed indexer’s endpoint is functional but unable to process any queries, most likely because it is not running TAP. This has been the case since December 4, the official deadline for the Indexer Service (Rust) upgrade that enables TAP on an indexer.

  6. While managing allocations across numerous subgraphs can be challenging, when the same issue occurs repeatedly across subgraphs or over an extended period without any visible attempt to resolve or mitigate it, it shifts from being an understandable error to a question of operational robustness.

Key Questions:

  1. As mentioned by @inflex, it may be related to Erigon3 Beta. Is it a slashable offense to use unofficial software versions, whether on the RPC or Graph Indexer stacks? There have been cases where some indexers faced problems specifically related to using Erigon3 Beta, as reported on The Graph Discord.

  2. If the issue is caused by the RPC, which led the Graph Node to classify it as a deterministic error, the indexer-agent would still be able to close allocations as usual. This is intended behavior and not an issue with the indexer-agent.
    However, could a malfunctioning RPC (e.g., eth-call failures) cause the Graph Node to flag this as deterministic? If so, is this intentional?
    This could lead to the indexer-agent continuously closing allocations with duplicate/incorrect POIs until the issue is flagged or disputed.

  3. As @indexer_payne mentioned, he encountered many subgraphs with deterministic issues that the Graph Node team suggested were caused by RPC problems. Should Graph Node classify such cases as non-deterministic so that the indexer-agent raises an error upon closure?
    It doesn’t seem reasonable to issue 1,000 slashes if the root cause is an RPC issue. Additionally, both @indexer_payne and @Tehn reported these issues to the Graph Node team and on Discord, demonstrating good faith.

  4. According to GIP-0009: Arbitration Charter

Incorrect Proofs of Indexing or query Attestations may be produced due to software malfunctions in the Graph Node software or the Indexer’s blockchain client. We will refer to these malfunctions collectively as determinism bugs. The Arbitrator is encouraged to resolve disputes as a Draw if they believe an incorrect PoI or Attestation to be the result of a determinism bug.

^If the disputed indexer’s RPC is faulty, why are other Ethereum Mainnet subgraphs synced successfully (even if with mismatched PUBLIC POIs), while only one subgraph exhibits unique issues? Based on @indexer_payne’s statement, he encountered dozens of subgraphs with deterministic issues that the Graph Node team suggested were caused by RPC problems. In @Tehn’s case (Firehose), clearing the call-cache and resyncing resolved the issue. What if this problem had persisted for months until it was disputed? Would it still be considered a slashable offense if a solution exists, despite the error being flagged as deterministic? (@Tehn, this is just an example, don’t worry about it. You have shown good faith, and I believe all the other indexers will appreciate this. Thank you.)

  5. According to

GDR-18

Closing allocations for rewards and reusing the same POI repeatedly on a subgraph where the network of indexers has consensus with a healthy sync status is a slashable offense. To the affected indexer we ask to review their processes for closing allocations. It’s always recommended to use the indexer-agent to manage allocations unless there is a specific need. If the agent is failing please do report the issue on a relevant channel so it can be investigated and a course of action suggested.

GDR-21

To the affected indexer we ask to review their setup to ensure their operation is running as expected and meeting the demands from the network.

I have approached this dispute with neutrality, aiming to gather and analyze all relevant facts to determine whether this issue stems from a software malfunction or an operational oversight by the indexer. As mentioned earlier, I am okay with any dispute result as long as proper investigations and checks are conducted. I leave this to the Arbitration Team to decide.

PS: It is heartwarming to see participation from other indexers. Cheers!

1 Like

index-node logs, grepped for the related subgraph ID, shortly after a graphman restart of Qm..HF

agent was recently switched from 0.21.2(or3) to 0.21.9
node is 0.35.1

Are blockchain clients ever ratified by the council? No. There’s your answer.

Apparently yes. I remember I asked this question in particular, and whoever replied said it’s intended. Why? No clue.

People should appreciate indexers running developing blockchain clients in production. If it weren’t for these people during the Turbogeth era, there would be no Erigon running as smoothly as it does today.

Dumbest takes of the year.
EDIT: to add more context here - indexers using software as-is should never be slashed for using the software as intended. It’s NOT my job to monitor 100 Discord chats, 100 forum threads and 3000 subgraphs in my database for errors that occur at the software level.
EDIT #2: it is however MY job to report those errors to the relevant team, but it’s again NOT my problem that software versions are released twice a year, and even then with bugs out of the gate (see graph-node 0.36.0).

I appreciate all the work done by the team, but obviously some problems should not be pinned on indexers as though they were acting illegally. In my case, I don’t even understand why the agent allowed me to close the allocation when other indexers were far ahead of me. Basically, it should somehow know that you are lagging behind and refuse to close the allocation even if there is a deterministic error, because this error (as we found out) can be caused by RPC/Firehose, and the indexer will not know about it. That is a good point to consider in resolving this issue.

  • Point taken.
  • Now this is a serious issue that needs to be addressed with urgency.
  • Appreciation, yes—active bug reports, shared findings, and improvements all contribute to the development process, provided they are done in a controlled environment.
  • No, running unstable software in production without robust safeguards risks compromising network reliability.
  • However, this is on the RPC side. Graph Node should better assess the issues mentioned above, whether intended or not. Dear Graph Node team, further clarification would be greatly appreciated.
  • Fair point, an indexer should not be slashed, especially when relying on and putting trust in the software—excluding cases of using deprecated versions with long-expired grace periods.

“As long as the error occurs at the software level (deterministic), it’s not slashable. I can continue my operations as usual, regardless of faulty RPC or indexing software, since there are no updates. Until the issue gets patched, indexing incorrect data or serving faulty queries is not my problem.”

  • Correct me if I’m wrong, but it seems my question, intended to ensure fairness among all indexers and maintain the integrity of the data being served, is the dumbest take of the year.

My conclusion :

  • Legally? - Yes, deterministic errors are not slashable.
  • Disputable? - Yes, 12 indexers are fine, but only 1 failed. Deterministic or not? It depends on the software. How many subgraphs failed uniquely? One. This is the reason why this dispute/discussion happened in the first place, and why I remain very neutral on this case.
  • Ethically? - To minimize the risk to honest Indexers of being economically penalized while interacting with the protocol in good faith.
1 Like

This is exactly what I meant by that, yes. Buggy software is not my problem. Software that takes half a year to be updated is also not my problem. Solve the root cause, which is the graph-node team fixing bugs and iterating on updates faster.

My opinion on some of the questions and points repeated above.

  • Should graph-node flag errors originating from bad RPC as deterministic?

Yes. graph-node has only one input: RPC data (or firehose). If it receives bad data, resulting in an invalid view of the blockchain (which ultimately causes the subgraph to fail), the only correct classification graph-node can make is to flag it as deterministic, because any other indexer receiving the same data would encounter the same error. Garbage in, garbage out. graph-node works as designed here.

Note that not all bad RPC data results in subgraph failures. For subgraphs with proper error handling, this bad data will not result in a failed subgraph, but instead in a bad POI and ultimately in invalid query responses.

  • Are garbage RPC data or data corruption the responsibility of the indexer?

Yes. Indexers are responsible for the chain of custody of the data they serve. They are responsible for picking the data source, whether that’s a reputable RPC/firehose provider or proven blockchain node software. They must also take measures to ensure data integrity on their equipment (storage, memory, etc.). Any error in the data is their responsibility.

  • Should an indexer be slashed for running official software releases and taking reasonable measures for data integrity, which still result in a bad POI and bad data?

Not at this stage. Indexers running the official clients (graph-node, indexer-agent, released RPC) currently lack the tooling to avoid bad POIs when operating in good faith. Although public tools (graphcast, graphix) and internal tools at E&N (query comparison, POI checker, etc.) exist, indexers do not yet have easy, automated methods to handle these. An idea could be to add indexer-agent functionality that uses data from graphix or graphcast to regularly compare POIs or automate certain actions (such as auto-rewinds, preventing allocation closure, or blocking query requests).

  • Is the graph-node software to blame?

In a nutshell, no. For the vast majority of POI divergences, doubts can be cast on the RPC inputs, over which graph-node engineers have no control or visibility (they cannot compare the inputs from two different indexers). Are there graph-node determinism bugs that remain to this day? Yes, I’m sure of it. You can track the issues currently being investigated here:

I want to stress that we, as indexers, are responsible for data integrity and must design our infrastructure with this in mind. We could have lengthy discussions about what that means and what it entails, and I’m sure we all have different approaches, but that would steer us way off topic for this thread. One thing is for sure, though: once an indexer starts diverging on a subgraph, it needs to take action. (Again: not necessarily be slashed, but it needs to resolve the issue.)

One interesting thing I noted recently is that indexing more subgraphs and serving more queries helps identify these divergences. On a few occasions, I’ve noticed patterns of subgraph errors, invalid query responses, or POI divergences affecting subgraphs on a specific chain or specific shards. Once these clusters of errors are identified, the resulting action emerges quite easily, and it usually involves mass rewinding of subgraphs.

2 Likes

Hey guys,

this topic seems to concern all indexers, so im adding my few cents here as well.

First of all, i respectfully disagree with many views by Andre even though he helps us a lot and is obviously by far the best indexer out of all of us. Having been running a lot of RPCs elsewhere than the Graph, i can safely say that the same software versions of proven client releases have tended to have different end results when synced from scratch or after a snapshot was used. Missing blocks etc. It might not impact Graph much in the end but these things happen often and i heavily doubt anyone here runs a fully healthy archival RPC. Maybe it passes in The Graph, but it might not in some other protocols like DRPC, Blast etc. We have stumbled on many cases where an RPC that worked well on The Graph was not ok on another protocol, and vice versa. In such cases its sometimes hard to even know what IS a healthy RPC.

So yes while our job is to manage these RPCs, it is virtually impossible to have everything in perfect health and without a missing piece. I am guessing this is also why the Graph team does not recommend a certain software client/version. Any one of them can break, and even if its stable, things can still be missing.

I also dont really check the status of all the subgraphs we sync much, and i rely on the agent to NOT close an allocation in case its too far behind or broken and not eligible for rewards. This seems to fail often though.

While in the OP's case the indexer seems to have been the only one out of sync, there is a funny case with Airswap as well. Some indexers are at chain head, some are stuck at the same block (like us). I have tried rewinding, nuking it, changing RPCs, but nothing helped. Hence why i dont think this is my problem as an indexer anymore and it could likely be something to do with software, which is something i dont have control over.

Theres a double problem here - indexers might get scared to index new/broken subgraphs (or “questionable” ones like Airswap). That is definitely not good for the protocol. At the same time, many of us indexers support plenty of other chains as node operators or validators (we have around 80 in total, for example), so its literally impossible to stay on top of such a heavy protocol as the Graph, which is why we should rely on their software more as well. If in the end its faulty, that shouldnt be our problem or issue to deal with (while granted its good to report any issues on github etc.). Since the Graph team wants more indexers and not less, i would suggest things change (both software wise and dispute wise).

So TLDR; until things are polished and working smoothly on the Graph side, software wise, i dont think indexers should be slashed for using buggy and faulty software (Graph or third party, unless they decide to ratify which clients and versions are ok). If the agent manages to generate a POI and close allocation, we should assume we are ok. The only case where a slash should really happen is where there is a proven malicious intent (i think we had a good case of that not long ago). In all other cases, its better to investigate and resolve the issue (as it is likely a bug or a RPC problem where there isnt just one indexer affected).

2 Likes

I agree with ellipfra that indexers should strive to provide the best possible data. I also agree with mindstyle that it’s barely possible at the current stage. Btw, all my nodes, including that erigon3 beta, are connected to DRPC and serving queries there.

How hard would it be to integrate some POI cross-check into the agent so it doesn’t close allocations that seem far from consensus? Make it an optional flag for responsible indexers, idk.
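Roughly, the gate could look something like the sketch below (not actual indexer-agent code; fetchPeerPublicPois and close are injected stand-ins for whatever source and close path end up being used, e.g. graphix, graphcast/POI Radio, or peer status endpoints):

// Sketch of an opt-in consensus gate before closing an allocation. Nothing here
// is indexer-agent code; the fetch/close functions are injected stand-ins so the
// gate itself stays source-agnostic.
type PoiSample = { indexer: string; publicPoi: string };

function hasConsensus(localPoi: string, peers: PoiSample[], quorum = 0.5): boolean {
  if (peers.length === 0) return true; // nobody to compare against, allow closure
  const matching = peers.filter((p) => p.publicPoi === localPoi).length;
  return matching / peers.length >= quorum;
}

async function closeIfSafe(
  allocationId: string,
  localPoi: string,
  fetchPeerPublicPois: () => Promise<PoiSample[]>,
  close: (allocationId: string, poi: string) => Promise<void>,
): Promise<void> {
  const peers = await fetchPeerPublicPois();
  if (hasConsensus(localPoi, peers)) {
    await close(allocationId, localPoi);
    return;
  }
  // Out of consensus: hold the closure and alert the operator, who can then
  // decide between rewinding, resyncing, or closing with a zero POI.
  console.warn(`holding closure of ${allocationId}: local POI diverges from peers`);
}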

:+1:

And herein lies the reason why arbitration is a human endeavour and not a closed-loop process, which seems to piss a lot of people off due to the wishy-washy feel of the procedure, but it is simply an inconvenient reality until such a time as absolute verifiability is a thing, should it ever be.

In the meantime I think enabling Indexers to be as well-informed as possible on risky allocation closures is worthy of more discussion, per @inflex’s post below this one.

1 Like

The solution is POI Radio, it already does this. But hard agree that some integration between POI Radio and the agent would be cool, so that risky closures could maybe be put in a “holding” bucket in the actions queue, or at a minimum generate logging and notifications in the IE0 framework that give the Indexer some alerting that an allocation is about to be closed with a “weak” PoI.

2 Likes

Hey everyone,

First of all, the arbitration team would like to thank you all for the rich and civil discussion that unfolded here. We’ve been silently watching and having some internal chats as well; thank you for your patience. We think that, regardless of the outcome of the dispute, there is a deeper and more interesting point being discussed, which is that of data integrity.

TL;DR is that we will:

  • draw this dispute
  • make sure relevant teams are aware of the “multiple unique POIs after deterministic error” issue
  • recommend changes to the arbitration charter to address data integrity concerns and suggest the ideas discussed here about closures with risky POIs.

Dispute summary
First let’s outline the relevant facts that summarize this dispute:

  • Fisherman initiated a dispute against an indexer for repeatedly presenting the same POI on an apparently healthy subgraph (multiple indexers have no issue indexing the subgraph).
  • Disputed indexer shared logs indicating they are hitting determinism error when indexing the subgraph.
  • Fisherman shared data indicating the disputed indexer presented 3 unique POIs after hitting the determinism issue.
  • Disputed indexer claims to use indexer-agent to manage their allocations, with no manual closing or POI overriding being done. They presented agent logs that support this claim. They mentioned they are using Erigon3 beta as the chain client; this could be malfunctioning and be the root cause.
  • Another indexer shared an instance of a similar problem, with some indexers being able to sync a subgraph while they get a determinism error, with 2 unique POIs created.

Dispute resolution
We believe there is not enough data to conclude there was a deliberate or intentional misuse of the protocol. It’s unclear whether this is a graph-node issue or a product of bad RPC data being piped in; however, as of today neither is a slashable offense, so drawing the dispute is the sensible choice.


Data integrity

This dispute has introduced an important topic for consideration, which is the chain of custody for the indexed data. Several indexers and members of the community made great points; the opinion of the Arbitrators can be summarized as follows:

  • Indexers should be responsible for the chain of custody of the data they serve. It’s in the network’s best interest that they serve “good data” to the best of their knowledge.
  • Indexers should not be punished for using blockchain client software as intended. An indexer cannot be burdened with manually checking the validity of the data they consume assuming they act in good faith and use reputable sources for their data.
  • Indexers should be slashed if they willingly use a source of data (RPC/firehose) that is giving an incorrect input to graph-node. This could be a malicious RPC or a novel client implementation that is in its alpha/beta stages and full of bugs.

As was pointed out by some of you in this thread, a good solution would be to incorporate public POI cross-checking into the indexing software to get early alerts in case of divergences and maybe even prevent those POIs from being committed on-chain. We agree and have raised this internally with the relevant teams, some of which are already working on it.

In the meantime the arbitrators will also propose to the Graph Council an amendment to the Arbitration Charter. The suggested changes aim to give the arbitration council grounds for slashing indexers that intentionally harm the network by sourcing bad data without limiting their options (we don’t want to strictly prevent usage of novel clients, beta implementations, etc). The new proposed policy assumes indexers act in good faith but demands rectification if they are found to be producing incorrect results.

Here is a draft version of the new text, feedback and suggestions are welcome:

  1. Indexing data integrity

The Graph Node software indexes data from blockchain inputs. If the input data is inaccurate, the resulting subgraph and any derived POI will also be incorrect. Depending on the subgraph code, indexing bad data may even cause subgraph failures which could be misinterpreted as determinism bugs. Upholding the quality of the data is essential for the network’s overall health and reliability. Indexers are responsible for ensuring the integrity and chain of custody of the data they serve, which includes sourcing blockchain data from reputable sources.

The Arbitrator is encouraged to resolve disputes as a Draw if they believe an incorrect POI to be the result of a blockchain client malfunction (RPC/firehose).

However, once a discrepancy is noticed the indexer should take reasonable measures to rectify the issue or submit a zero POI for any subsequent allocations. Note that the indexer must be notified by the Arbitrator by posting in the forum; the indexer will then be given a seven (7) day period to work with the Arbitrator and the community on addressing the problem, after which any new disputes against a non-zero POI can be resolved at the discretion of the Arbitrator.


Once more, we want to thank everyone who contributed to this thread,
Arbitration Council

8 Likes