Hi, decisionbasis was using an old version of Grafana. What you see in that pic is exactly what you see in the last 3 screenshots I posted (the last one that was on discord is now attached here).
I replied to 2, 3 and 4 on Discord (pinging you) because my account's limitations on this forum did not allow me to reply here. I repeat the answers here:
2. {"jsonrpc":"2.0","id":1,"result":"nitro/v3.5.0-bdc2fd2/linux-amd64/go1.23.1"}
3. It is a problem to upgrade only if you had updated to 0.36.0 and then downgraded to 0.35.1 (our case), because the downgrade script changes some entries in the psql tables.
4. No, we were not aware of the deterministic error.
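For reference, the version string in answer 2 is the kind of result returned by a standard web3_clientVersion JSON-RPC call against the Arbitrum node; a minimal parsing sketch (the request shape is the standard JSON-RPC one, nothing here is node-specific):

```python
import json

# JSON-RPC request typically used to obtain the client version string
# quoted in answer 2 (web3_clientVersion is a standard method).
request = json.dumps({"jsonrpc": "2.0", "id": 1,
                      "method": "web3_clientVersion", "params": []})

# Response, verbatim from answer 2 above.
response = '{"jsonrpc":"2.0","id":1,"result":"nitro/v3.5.0-bdc2fd2/linux-amd64/go1.23.1"}'
result = json.loads(response)["result"]

# The string decomposes as client/version/platform/toolchain.
client, version, platform, toolchain = result.split("/")
print(client, version)  # nitro v3.5.0-bdc2fd2
```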
@tmigone, could you please let us know if you need any other info from our side to make a decision? We are in the process of transitioning to another server, and if the information regarding this subgraph is no longer needed we would like to start moving the agent to the new server.
Thank you for your responses. I have consolidated your screenshots for easier viewing.
From your screenshots, there are noticeable differences between your Grafana dashboard and your indexer-agent status for this subgraph: specifically, the same block hash associated with different error descriptions.
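For comparison, the indexer-side error can be pulled directly from graph-node's index-node status API (usually served on port 8030 at /graphql) via the indexingStatuses query; a minimal sketch, where the deployment ID, block values, and response are placeholders rather than real data from this dispute:

```python
import json

# GraphQL query against graph-node's index-node status API; the fields
# (health, fatalError, deterministic, block) are part of the public
# indexing-status schema. The deployment ID is a placeholder.
query = """
{
  indexingStatuses(subgraphs: ["QmPlaceholderDeploymentId"]) {
    subgraph
    health
    fatalError {
      message
      deterministic
      block { number hash }
    }
  }
}
"""

# Hypothetical response shaped like the screenshots under discussion;
# message and block values are illustrative only.
sample = json.loads("""
{
  "data": {
    "indexingStatuses": [{
      "subgraph": "QmPlaceholderDeploymentId",
      "health": "failed",
      "fatalError": {
        "message": "Mapping aborted at src/handlers/GNSTradingCallbacks/index.ts",
        "deterministic": true,
        "block": {"number": "123456", "hash": "0xabc..."}
      }
    }]
  }
}
""")

err = sample["data"]["indexingStatuses"][0]["fatalError"]
print(err["deterministic"], err["block"]["hash"])
```

Comparing the message, block hash, and deterministic flag from this endpoint against the Grafana panel is the quickest way to see whether the two views really disagree.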
Regarding the 0.36.1 patch notes, they provide instructions for indexers who downgraded from 0.36.0 to 0.35.1 before upgrading to 0.36.1. You can find the details here: graph-node/releases/tag/v0.36.1.
Additionally, according to The Graph’s Discord discussion, it’s noted that the version labeled as 0.36.0 with the date 2025-01-28 is actually 0.36.1, due to a labeling bug. The date is correct despite the incorrect version label.
You mentioned that you are using 0.35.1. However, the last log you provided indicates 0.36.0 (2025-01-28), which, as noted, corresponds to 0.36.1. Could you please confirm whether this is the same indexer used for allocating this subgraph?
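One way to confirm the running graph-node version independently of log banners is the index-node status API's version query, assuming your build exposes it on :8030/graphql; the response values below are hypothetical:

```python
import json

# Version query against graph-node's index-node status API.
query = "{ version { version commit } }"

# Hypothetical response; the actual values would show what is really running.
sample = json.loads('{"data": {"version": {"version": "0.35.1", "commit": "0000000"}}}')
running = sample["data"]["version"]["version"]
print(running)
```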
However, this does not necessarily indicate that version 0.35.1 is the root cause of this error. The initial allocation for this subgraph dates back to September 2024, a period during which most indexers were operating on version 0.35.1. Notably, there is no history of duplicate POIs, except in the case of the disputed indexer.
Could you provide the indexer-agent logs related to the closure of this subgraph? If no logs were captured during the last allocation closure, would you consider reproducing the error by allocating 1 GRT and then closing the subgraph while capturing the latest indexer-agent logs?
Dear Arbitration Team,
If the scope of this arbitration is to clarify if there is
we believe that we have extensively proved that we experienced a deterministic error and that the indexer-agent acted accordingly. If this discussion is instead continuing in order to technically understand what happened and improve the protocol, we are happy to carry it forward with the Graph team, providing all the data required. We do not think that @Inspector_POI is a necessary interlocutor in that case. Nevertheless, we are going to answer, once again, the points raised by @Inspector_POI, trying at the same time to explain why, in our opinion, this discussion has gone beyond the scope of this arbitration.
We double-checked both the Grafana dashboard and the indexer-status grep at the same time, and the screenshots we posted are correct. Despite the "noticeable" differences, both indicate the same source of error: "error while executing at wasm backtrace" and "Mapping aborted at src/handlers/GNSTradingCallbacks/index.ts". Therefore we do not understand what you want to prove with this point. Maybe that we are very good at using Photoshop but so stupid that we did not photoshop both images with the same exact error message? If requested by the Graph team, we are more than willing to give them temporary access to the Grafana dashboard. This line of argument seems deliberately in bad faith and not at all directed towards a technical understanding of the problem.
First: the picture you found by digging through Discord indicates 0.36.0 simply because one of the attempts to solve the "not a valid IPFS server" issue was upgrading the node. As soon as that issue was solved with the help of @DaMandal0rian, 0.36.0 was not working properly, hence we downgraded. What you found is an (out-of-context) error log, not a log showing correct functioning of the 0.36.0 version of the node. This seems like further evidence of acting in bad faith. We are currently running graph-node 0.35.1.

Second: we are well aware of the instructions posted in the 0.36.1 release but, nevertheless, even following those instructions, we were not able to successfully upgrade the node (see above). As mentioned earlier, since the decision to move to a larger server had already been made, we did not consider it worth the effort to spend more time upgrading the node on the old server, and instead chose to restart on the new server with the latest version of the node. Nevertheless, we do not think we owe you any additional explanation about our internal decisions and, since you are so quick and diligent in spitting out dates, I am afraid you do not really understand the time and effort necessary to run a node.

Third: you keep stating the "proof" of duplicate POIs and at the same time you mention the case of decisionBasis (@inflex), which, by the way, you started. From what we understood from that arbitration, duplicate POIs with deterministic errors are not something new. In addition, we can say that we were using erigon3 as the beacon layer for arbitrum-one but we switched to erigon2 given the widely discussed common issues. Given that you are well aware of that other arbitration, we believe you are acting in bad faith, showing a
about which we would like to draw the attention of the Arbitration Team.
If requested by the Graph team, we can open and close the allocation. Old logs are no longer available given the many docker cache prunings performed due to the already mentioned lack of memory on the current server. Nevertheless, we struggle to see how this request can be made in good faith. We proved the deterministic nature of the error; if the Graph development team (it is our understanding that @Inspector_POI does not belong to this team, but please correct us if we are wrong) deems it necessary to have access to these logs in order to potentially fix a bug, we are more than willing to provide them.
You've emphasized bad faith multiple times in your responses, in ways that aren't related to the discussion of this dispute.
You presented two different error logs from Grafana and the indexer-agent, citing different handlers (handleBorrowFeeCharged and handleLpFeeCharged) and varying error codes and line numbers. References to unrelated matters, such as Photoshop or insinuations about intelligence and bad faith (1), are unnecessary and detract from the core discussion.
In previous cases, such as GDR-24, the indexer provided logs promptly, including indexer-agent and graph-node logs. Your reluctance to provide similar documentation lacks clear justification. If your claim is valid, facilitating verification would be beneficial. The arbitration team has also explicitly requested your indexer-agent logs, and this request is made publicly to ensure transparency. So is it bad faith (2) if it is requested by the fisherman?
Your refusal to provide equivalent evidence, citing my role rather than the arbitration team's request, lacks justification. This is a public discussion; transparency is expected.
I didn't know Erigon2 or Erigon3 were in use for Arbitrum-One, especially since you answered my previous question saying that you are using arbitrum-nitro/v3.5.0; could you clarify further? Seeing that you stated
As previously highlighted in GDR-24, among over 10,000 subgraphs, this particular subgraph ranks approximately #200 in query fees, even after the current deterministic error. This underscores the critical importance of serving queries effectively. Calling this "clearly predatory behavior" bad faith (3): none of your bad-faith claims can change the fact that you're the only indexer submitting duplicate POIs, and this has been happening since day one of your allocation.
I'm interested to see whether this discussion paves the way towards determining whether this is a graph indexer-stack issue or a product of bad RPC data being piped in / indexer operational oversight.
But let’s be clear: I’m here for legitimate answers about why you’re the only indexer submitting duplicate POIs from day one. Since you’ve made it clear you’ll only cooperate with the arbitration team, I’ll let them handle your evasiveness.
First of all, we want to thank @Inspector_POI for your diligence in pushing the investigation forward and @nash16 for assisting with the requested information. We would appreciate it if we keep the discussion civil and use this forum, and not other channels, to share information.
We are still discussing the outcome for this dispute, with a few possible scenarios we’d like to understand more:
The provided Grafana screenshots match the data from the status endpoint, indicating we might be seeing a deterministic error at play. indexer-agent and graph-node logs would be tremendously helpful here as supporting evidence. @nash16, I understand you had to prune those; please confirm if that is the case. If no logs are available, we would kindly request that you follow the fisherman's suggestion of allocating 1 GRT and then closing, while making sure to capture both component logs (indexer-agent and graph-node).
If this is a deterministic error, seeing that other indexers are not experiencing it at the same block height, it could be a product of bad data being piped in. @nash16, it would help to understand your statement on Erigon2/Erigon3, specifically a timeline of when this switch happened and which specific versions of erigon we are discussing here.
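As a sketch of what capturing and then filtering those component logs for the failure signature could look like (the log lines below are illustrative, not real output; in practice they would come from the indexer-agent and graph-node containers, e.g. via docker logs, while the allocation is being closed):

```python
# Filter captured component logs for the error signatures quoted earlier
# in this thread. sample_logs stands in for the real captured output.
sample_logs = [
    "indexer-agent | INFO executing close allocation action",
    "graph-node    | ERRO Subgraph instance failed to run",
    "graph-node    | ERRO Mapping aborted at src/handlers/GNSTradingCallbacks/index.ts",
    "graph-node    | ERRO error while executing at wasm backtrace",
]

# The two signatures both Grafana and the status endpoint reportedly show.
signatures = ("wasm backtrace", "Mapping aborted")
relevant = [line for line in sample_logs if any(s in line for s in signatures)]
for line in relevant:
    print(line)
```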
In order to force the index-node to produce some logs, would it be OK if we run graphman restart on the subgraph, as done by @inflex in the other arbitration?
Regarding other useful logs to determine the nature of the error: could you consider the possibility of having direct access to our Grafana, or somehow to the indexer-agent itself? At the moment, opening a dummy allocation is quite disruptive due to a recent big undelegation: in order to open a single dummy allocation we would have to close many others. We were willing to do that for the transition to the new server anyway, but closing almost all the allocations while staying on the old server is very inconvenient. Thanks for your understanding.
Regarding the erigon3/erigon2 statement: we had synced erigon3 and, given its apparently good performance and stability at the beginning, we started using it both to sync subgraphs from mainnet and as the L1 for L2 chains like optimism and arbitrum-one (Arbitrum uses the L1 beacon as well). I do not recall the precise date at which we switched, but the last discussion we had with the erigon team about e3 issues dates to 05/12/2024. At that time we had already switched back to erigon2 for the graph stack due to the continuous crashes, but we were still syncing erigon3 to support the erigon team's debugging. We can say that the switch to erigon2 happened approximately between Nov and Dec 2024. The erigon versions in question should be e3-alpha3 to e3-alpha5.