An Arbitration Charter to clarify Arbitrator behavior

As The Graph ecosystem migrates subgraphs from E&N’s hosted service to The Graph decentralized network, I felt it would be worthwhile to add more clarity to how the Arbitrator will behave, as there will be more opportunities for dispute creation and slashing in the near future.

I put this into the form of an “Arbitration Charter”, the goal being that when this is sufficiently well developed, The Graph Council could vote to ratify it, thus signaling that this is the standard they will hold the Arbitrator to. Currently, the Arbitrator is set to a 2-of-3 Gnosis multisig w/ @ari, @davekaj, and @jannis as members.

Since I foresee this GIP acting as an opportunity to express feedback on the arbitration mechanism itself, I’d like to emphasize that the Arbitration Charter itself does not grant any new capabilities to the Arbitrator. Rather, it establishes norms that constrain how an Arbitrator should engage with the protocol, given their existing capabilities.

Nonetheless, feedback on the arbitration mechanism itself is always welcome, either here or in separate proposals.

For convenience, I’ve pasted the “Abstract” and “Motivation” of the GIP below. The full body of the GIP can be found on the zerim/arbitration-charter branch of the GIPs repo on Radicle (instructions for how to access that here: GIP-0001 and Getting Started with GIPs, GRPs, GRCs, etc).

Abstract

The Graph has a protocol role called an Arbitrator who is assigned through decentralized governance. The Arbitrator is an Ethereum Account that has the ability to decide the outcome of disputes in the protocol. The purpose of the Arbitration Charter is to establish norms that constrain the Arbitrator’s actions beyond what is expressible or currently expressed in smart contract code. We propose that any Arbitrator that does not comply with the norms laid out in this charter be replaced via decentralized governance.

Motivation

The Arbitrator is a protocol role that is assigned via decentralized governance, and as such there are certain parts of its behavior that are not specified in smart contract code running on-chain. Having a protocol charter for this actor’s behavior creates clarity for the ecosystem in how the role of the Arbitrator will be executed and establishes a standard for measuring the effectiveness of an Arbitrator, which can be referenced in protocol governance discussions around the appointment of an Arbitrator.

The substance of the Arbitration charter is intended to ensure that the Arbitrator is fulfilling their role of supporting a healthy and functioning network where Indexers perform their work correctly, while minimizing the risk to honest Indexers of being economically penalized while interacting with the protocol in good faith.

Feedback welcome!

12 Likes

I have not been able to see the details of the arbitration mechanism on Radicle (not able to find it there), so apologies if some of the thoughts expressed here are repetitive. I do believe that an arbitration mechanism is very meaningful to further demonstrate that The Graph seeks to uphold strong principles of integrity in the protocol.

Besides the need for a fair and impartial dispute resolution process, I also see a secondary motivation that the arbitration mechanism has the potential to pursue: progressive minimization. The arbitration process is a governance element which may be necessary to enact more frequently early on. The core arbitrator mandate is to come to an impartial resolution of a dispute, but it could also be expanded to include an assessment of the nature of the dispute concluding with thoughtful directional community feedback to develop solutions that prevent such disputes from recurring. A long-term aspirational goal may thus be that we have a solid arbitration mechanism in place without the need to enact it, therefore minimizing the need to action a governance process.

With the above in mind, here are some specific suggestions in response to your post, which are considerations for arbitrator responsibilities that may not yet be specified in the smart contract:

  • Arbitrators should demonstrate their attempt to seek statements on the dispute from all involved parties prior to evaluating the dispute

  • At least one meeting should be held amongst the arbitrators to mutually review the dispute prior to proceeding to a vote. Meeting notes should be produced that document the review and discussions of the dispute amongst the arbitrators

  • Perform a dry vote prior to proceeding with the multisig. If the dry vote is not unanimous, the arbitrators consult a mutually agreed-upon subject matter expert to obtain an impartial, non-binding opinion on the dispute

  • At least one majority voting arbitrator should produce a concluding statement accompanying the voting decision that includes the following components:

    • Reasoning of the decision
    • Risk assessment of future occurrences for like-kind disputes, incl. feasibility to mitigate such risks via protocol enhancements
    • If applicable, provide directional community recommendation for GIP development to address the cause of the dispute
3 Likes

Hey, @Oliver thanks for the excellent suggestions. Some of these are already captured in the Arbitration Charter and others I think would be worth adding.

BTW, try checking the Radicle repo again, I don’t think my changes had propagated properly before I closed my Radicle client. Should be there now.

1 Like

@Oliver per the discussion here the Radicle team says some of these replication issues will be solved in an upcoming breaking release (the joys of dogfooding the bleeding edge of decentralized tech :wink: ).

In the meantime, I’ve posted the full contents of the Arbitration Charter (no feedback incorporated yet) on HackMD here.

3 Likes

Thank you @Brandon, it helps to see the details. Here is some further feedback for consideration after having read the Charter.

  • Double Jeopardy
    I assume the double jeopardy rule would apply to past faults by indexers, meaning that it excludes new future faults. If true, then such clarification would be helpful. Right now, it is written in a way where it could be interpreted that indexers are exempt from future slashing for creating the same faults again post arbitration.

  • Penalty & Reward level specifications
    If not already defined on-chain, then I would highlight that outside of the maximum allowable slashing amount for indexers, the Arbitration Charter currently does not specify any amounts and/or ranges around penalties and rewards. Guidance on that would be useful to ensure fair and equal treatment across like-kind cases for all stakeholders and to take into consideration factors such as escalating penalty levels for repeat offenders.

  • Arbitration Decision Appeal
    While the arbitration itself is designed to be a final settlement on a dispute, there may be legitimate instances where an arbitration decision may want to be appealed by either party. Scope may be process related (i.e. one party claiming that rules of Arbitration Charter were not followed). This could be introduced as a delay of the execution of the arbitration decision, such as 48 hours, during which time appeals can be submitted.

1 Like

The GIP for the Arbitration Charter has been updated per the feedback received above and since the recent protocol town hall.

Can be found on the zerim/arbitration-charter branch of the GIPs repo and also posted here for convenience: Abstract - HackMD

Additional feedback welcome.

Thank you, Brandon.
This charter clarifies a lot and is very helpful. The slashing process has been one of the least clear parts for me since I joined the project, and I’m really glad to see clearer rules here and in the Townhall + Workshop.

Here is what I’m worried about. In the Risks and Security Considerations section (Abstract - HackMD) you speak about a “false sense of security to Indexers”, but after studying this doc and watching the Townhall + Workshop, I didn’t get this “false sense”. On the contrary, I got a strong sense of anxiety and some fear about the possible consequences of the current proposals for the ecosystem.

Here is why:

  1. We don’t have any “ideal/reference Indexer” and actually can’t have one, so we can’t be 100% sure that everything is OK with indexing or queries. Of course, we will create (and have partially already created) tools for comparing results between Indexers, but for now we can see that not all Indexers have the same POIs, for example, which means that someone will potentially be slashed.
  2. The current proposal to give half of the slashed amount to the Fisherman looks really bad, especially when the slash applies not to the rewards from the problem query or allocation but to the whole self-stake. It leads us to:
    2.1 I have only 100k GRT and can (technically) start my own Indexer. But it will not be very profitable; it could actually leave me at a loss by the end of the year because of gas, servers, etc. Additionally, I could be slashed for 2.5%, or for 5, 7.5, 10+%, depending on the number of subgraphs or queries.
    2.2 On the other side, I can be a Fisherman: my 100k is 10 attempts to get half of the slashed amount from the self-stake of other Indexers. Yes, I can lose some of my “bets”, but there is no other work (or cost) beyond running the same scripts that compare different Indexers. You don’t need to prove anything, and the risk/reward ratio is very good. You just need to pick mid-to-large Indexers and wait for possible discrepancies between them.
    2.3 As a result of 2.2, mid-to-large Indexers will constantly spend their time and nerves protecting their GRT and proving that they didn’t do anything wrong, instead of doing useful work for the community and delegators.
    2.4 Mid-to-large Indexers will split themselves into several small Indexers. It will cost them more for infrastructure, additional gas, employees, etc., but it is still cheaper than slashing. This protects them, at least partially, from a Fisherman who wants to get as much as possible.
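
To make the risk/reward comparison in 2.1 and 2.2 concrete, here is a rough sketch of the arithmetic. The 2.5% slash and the 50% Fisherman share are the numbers quoted above, and the 10k GRT deposit follows from “100k is 10 attempts”. The Indexer stake sizes, and the assumption that a losing Fisherman forfeits the whole deposit, are illustrative only and are not taken from the protocol.

```python
# Rough arithmetic for points 2.1-2.2. The stake sizes and the "lose the whole
# deposit" rule are illustrative assumptions, not protocol parameters.
SLASH_PCT = 0.025        # slashing percentage mentioned in this thread
FISHERMAN_SHARE = 0.5    # "half of the slashed amount"
DEPOSIT = 10_000         # assumed deposit per dispute (100k GRT = 10 attempts)


def fisherman_reward(indexer_self_stake: float) -> float:
    """GRT paid to the Fisherman if a dispute against this Indexer succeeds."""
    return indexer_self_stake * SLASH_PCT * FISHERMAN_SHARE


def break_even_win_rate(indexer_self_stake: float,
                        deposit_lost: float = DEPOSIT) -> float:
    """Win probability at which the Fisherman's expected profit is zero.

    Assumes (hypothetically) that a losing Fisherman forfeits `deposit_lost`.
    """
    reward = fisherman_reward(indexer_self_stake)
    return deposit_lost / (reward + deposit_lost)


if __name__ == "__main__":
    for stake in (1_000_000, 5_000_000, 20_000_000):
        print(stake, fisherman_reward(stake), round(break_even_win_rate(stake), 3))
```

Under these assumptions, the larger the targeted Indexer’s self-stake, the lower the win rate a Fisherman needs to break even, which is the concern behind 2.2.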

None of this helps the Network, from my personal point of view. It just burns ETH and consumes resources instead of doing something good.

I understand that slashing will not happen all the time and that we have a really good team of Arbitrators + Counsels, but still: the current proposals incentivize people to be Fishermen instead of Indexers or Delegators, which can create a lot of additional work for Arbitrators, Counsels, and Indexers, without any additional profit for them, only for the Fisherman if he/she was right.

Please, let’s think about:

  1. Change self-stake slashing to slashing the rewards from the allocation/queries; in the ideal case, only the Indexer’s part of the rewards.
  2. Reduce the Fisherman’s % of the slashed amount. Right now, bad actors can halve their losses just by being a Fisherman against their own Indexer.
  3. It would be very nice not to just burn GRT, but to automatically send it to charity. For example: 20% burn, 10-20% to the Fisherman, 60-70% to charity.
  4. If the Fisherman loses, do the same with his “bet”. For example: 20% burn, 10-20% to the Indexer, 60-70% to charity.
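
A minimal sketch of the split in suggestions 3 and 4, using whole-number percentages. The function name, the default 20/20/60 split, and the rule that any rounding remainder goes to charity are illustrative assumptions; nothing like this exists in the contracts today.

```python
def split_slashed_amount(amount_grt: int, burn_pct: int = 20,
                         winner_pct: int = 20, charity_pct: int = 60) -> dict:
    """Split a slashed amount (or a lost Fisherman deposit) three ways.

    Percentages are whole numbers and must sum to 100. `winner` is the
    Fisherman in suggestion 3 and the Indexer in suggestion 4.
    """
    if burn_pct + winner_pct + charity_pct != 100:
        raise ValueError("percentages must sum to 100")
    burn = amount_grt * burn_pct // 100
    winner = amount_grt * winner_pct // 100
    # Any rounding remainder goes to charity so the parts always sum to the total.
    charity = amount_grt - burn - winner
    return {"burn": burn, "winner": winner, "charity": charity}


if __name__ == "__main__":
    # Example: 25,000 GRT slashed (2.5% of a hypothetical 1M GRT self-stake).
    print(split_slashed_amount(25_000))
    # Example: the low end of the winner range, 10% to the winner and 70% to charity.
    print(split_slashed_amount(25_000, burn_pct=20, winner_pct=10, charity_pct=70))
```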
2 Likes

Overall, I support that there should be a meaningful incentive structure for the Fisherman role to perform a useful checks and balances function within the protocol. I believe the current design attempts to establish a penalty/rewards dynamics that is balanced between the Fishermen/Indexer groups. Without empirical data gathered yet, it may prove difficult to gain consensus on this question though and we are left with hypothetical scenarios at this point.

However, should there be larger agreement right now on the perception of the penalty/rewards structure to be imbalanced unfavorably to the indexer group, here is some feedback on your suggestions, @KonstantinRM:

  • It might be meaningful to not only change Fishermen parameters, but also Indexer parameters if the concern is that both a) the Fisherman role earns too much and b) Indexer role gets penalized too harshly
  • The concept of “charity” may be interesting to explore as an overarching idea in a separate thread. One practical challenge would be how to get community consensus on what charities we would want to support. An alternative to that could be to share it within the Graph community in the form of passing it on to the Graph Foundation wallet where it could be used to fund future grants.

Here are some further considerations to address the overall concerns raised by @KonstantinRM:

  • One suggestion could be to keep the current structure but reduce slash % from 2.5% down to 2% or 1.5%. This would both reduce Indexer risk exposure while keeping the penalty principle in place. It also reduces Fishermen rewards potential down to lower levels.
  • Cap Fishermen rewards to their deposit amount: 1) open up possibility for larger deposits >10K 2) Cap Fisherman reward to either a) 50% of indexer allocation stake or b) Fisherman deposit amount, whichever comes first. Idea would be that Fishermen cannot earn more than their own skin in the game for each dispute. It would also likely reduce the max slash amounts for indexers in many cases
  • Introduce longer Fisherman deposit lock period. To minimize incentive for Fishermen to open the dispute case flood gates, every time they open a dispute their deposit gets locked for something like 30-60 days. Idea would be that Fishermen would make a more conscious quality decision on what to dispute, knowing their deposit will be locked for a while.
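
As a rough illustration of how the first two suggestions interact, the sketch below caps the Fisherman reward at the deposit and lets the slash percentage vary. It reads option (a) as the existing 50% share of the slashed amount and “whichever comes first” as “whichever is lower”; both readings, and all the numbers in the example, are assumptions for illustration.

```python
def capped_fisherman_reward(slashable_stake: float, deposit: float,
                            slash_pct: float = 0.025,
                            share: float = 0.5) -> float:
    """Fisherman reward under the proposed cap.

    uncapped = stake * slash_pct * share   (the current-style reward)
    capped   = min(uncapped, deposit)      (never more than own skin in the game)
    """
    uncapped = slashable_stake * slash_pct * share
    return min(uncapped, deposit)


if __name__ == "__main__":
    # Hypothetical 5M GRT stake, 10k GRT deposit, current 2.5% slash.
    print(capped_fisherman_reward(5_000_000, 10_000))
    # Same stake with a larger 50k deposit and a reduced 1.5% slash.
    print(capped_fisherman_reward(5_000_000, 50_000, slash_pct=0.015))
```

With the cap in place, a small deposit can no longer win a reward many times its size, so larger rewards require larger deposits.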

Anyway, my actual own proposal would be to go live with the current structure, so we can start gathering actual data. I propose that governance is asked to commit to providing a deeper review after 90 or 180 days of going live, so that the community has a chance to reflect and respond to the effectiveness of the current structure, along with the possibility to propose changes then, if necessary. I would offer my own support in providing such an analysis so that the community can draw meaningful conclusions on this subject.

1 Like

Thanks for sharing your feelings and ideas here.

I think there’s a lot more that can be done to improve Indexers’ confidence when interacting w/ the dispute + slashing mechanism:

  1. Expose an optional endpoint in Indexer Service to request Proofs of Indexing (PoIs). Since PoIs are hashed w/ an Indexer’s public key, Indexers could cross-check against one another to check for inconsistencies without actually providing other Indexers w/ any data that could be used to submit a valid PoI on-chain and collect indexing rewards.
  2. Update the Indexer Agent to cross-check PoIs before submitting a PoI rather than the current behavior which is to check for inconsistent PoIs after PoIs have been submitted.
  3. Build integration testing or fuzzing tools to help Indexers spot determinism bugs across versions of Graph Node, in their dev ops configuration or even across two different subgraphs that are intended to be functionally equivalent.
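
To illustrate why PoIs bound to an Indexer’s key can be cross-checked without being reusable, here is a toy sketch. The `toy_poi` hash below is NOT the PoI construction used by Graph Node; it is a stand-in that only captures the property described in item 1. The function names and the idea of fetching `peer_poi` from a peer’s endpoint are hypothetical.

```python
import hashlib


def toy_poi(indexer_address: str, state_digest: bytes) -> str:
    """Toy stand-in for a PoI: a hash bound to the Indexer's address.

    Assumption: the real PoI mixes the Indexer's identity with a digest of
    the indexed state; this function only mimics that property.
    """
    return hashlib.sha256(indexer_address.lower().encode() + state_digest).hexdigest()


def cross_check(my_state_digest: bytes, peer_address: str, peer_poi: str) -> bool:
    """True if the peer's PoI matches what our own indexed state implies for them."""
    return toy_poi(peer_address, my_state_digest) == peer_poi


if __name__ == "__main__":
    state = b"digest-of-subgraph-state-at-block-N"   # placeholder bytes
    alice, bob = "0xAlice", "0xBob"                   # placeholder addresses
    bob_poi = toy_poi(bob, state)                     # what Bob would expose
    print(cross_check(state, bob, bob_poi))           # Alice agrees with Bob
    print(bob_poi == toy_poi(alice, state))           # Bob's PoI is not Alice's
```

An Indexer Agent following item 2 could run a check like this against a few peers before submitting its own PoI, and hold off if the results disagree.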

All of the above is being tracked internally at E&N and also I believe @eva is tracking #3 at The Graph Foundation and looking for grant submissions in this area to help Indexers build more confidence in the PoIs they are submitting.

My inclination is to get as far as we can on the tooling side and get a good baseline of behavior before tweaking economics, but some changes I am in favor of:

  1. Only make allocated stake slashable as opposed to an Indexer’s entire stake.
  2. Make shorter allocations slashable for a smaller % than longer allocations that collected more indexing rewards (right now they are slashed at the same amount).
  3. Encode some of the rules described in the Arbitration Charter as smart contract logic.

There could be a reason for tweaking the Fisherman incentives, and @Oliver presents one such solution above that could make sense. At the moment, the problem feels too theoretical for me to comment on whether this might be necessary. The disputes that I’m aware of thus far have been very well researched by the Fisherman and haven’t looked anything like the spray and pray approach that you seem to imply might be incentivized. Once we give Indexers the tools to have greater confidence in their PoIs, I think we’re even less likely to see Fishermen put thousands of GRT at stake on a dispute simply as a gamble.

I will add some of the ideas discussed above to a “Future Work” section of the Arbitration Charter, though I don’t think these should block the charter itself, as the charter only moves us in the right direction by giving Indexers far more protections than they currently have.

3 Likes

In general I am in support of the charter as it stands today, but do share some of the concerns @KonstantinRM has kindly shared around the incentives and severity of punishment in the current proposal and support the refinements @Oliver has proposed.

One of the key reasons we are not seeing a lot of input on this topic is likely because we don’t have the data nor learned experience/wisdom upon which to draw from. This is mostly theoretical for Indexers til they feel the impact in a real sense. I suspect that some of @KonstantinRM 's thoughts on this topic come from his experience with thinking about the implications of the open disputes around POIs for P2P (please correct me if I am wrong about that, Konstantin!), and more indexers are going to develop an opinion on this topic when it impacts them or another Indexer they are close to. I think that this dynamic plays well into having a robust, data-driven iteration process for the charter. At least for the early part of its implementation/refinement.

This plays into my concerns around the recent events surrounding the first subgraph fatal error on mainnet, and the reality we face in terms of on-chain events where stakeholders end up being punished monetarily for first-time incidents outside their scope of control or knowledge.

To summarise the event:

  • One of the ten Phase1 migration subgraphs suffered a fatal indexing error at a specific block
  • A fatal indexing error means that the Indexers allocating stake to that subgraph cannot continue syncing the subgraph.
  • If the Indexer didn’t catch the issue within two(?) epochs and manually settle the allocation, they were forced to settle the allocation with a 0x0 POI which means they automatically forfeited all indexerRewards and queryFees for the allocation.
  • This occurred due to a bug in the subgraph rather than through the action of the Indexer, yet the Indexer is punished immediately - they lose the rewards accrued over the lifetime of the allocation - there is no arbitration over the application of this punishment (and I am of course impressing my own bias upon this action by calling it a punishment)
  • The existing arbitration process specifically states an example of the fatal subgraph error and that rewards must not be collected for that allocation (Section 9)

My biggest concern (in lieu of technology improvements) is that events like the subgraph fatal error and moreover unexpected events that don’t fit the existing model of arbitration, have no home within the process and instead can breed contempt within the community due to a sense of unfair play, that a stakeholder might be economically punished despite acting in good faith. I can see these issues causing stakeholders to act differently in terms of risk within the protocol, for example in the case of the fatal subgraph error some Indexers may decide to only support a small subset of subgraphs that they deem to be very stable, going against the in-protocol dynamics (signal, total allocations etc).

So going back to @Oliver’s quote - I feel that the “first fatal subgraph error on mainnet” incident on migration is a good example of a type of dispute that falls outside the more programmatic nature of the arbitration process as it stands today, is the type of issue that could be solved with technology and “thoughtful directional community feedback to develop solutions that prevent such disputes from recurring” per Zorro’s suggestion. We need to be accommodating in officially recognising such unexpected issues as the first fatal subgraph error on mainnet, lest future first-time issues breed ill-will within the community and net-negative incentive alignments within the protocol.

I would like to explore those thoughts further on Office Hours 14, and understand if others think that these sort of events, which will likely be par for the course on mainnet, fall under the eye of arbitration (both for punishment, compensation and future fixes to avoid such events in the future) or if they represent the price of playing the game, and should/can be mitigated entirely through technology improvements (Graph stack enhancements etc.).

1 Like

Of course, I didn’t have enough time for it before because we were busy with new subgraphs. But with these disputes, I postponed other activities and focused on the Charter and on investigating our cases.
My words about the Fisherman and his possible actions are based on our disputes, so let me clarify a little.

Right now, if our script works properly, the Network has several different Indexers with wrong POIs.
All of these Indexers (which I found accidentally while we were checking our POIs these past days) are fairly small, around 1M self-stake and 0-6M delegations.

When we got these disputes, other Indexers obviously also had wrong POIs, but they didn’t get disputes because our self-stake is relatively big and looks better for potential slashing under the current rules.

Another thing that looked bad from my point of view: the Fisherman created these disputes right after the Workshop (or pretty close to it, 05.05 if I’m not mistaken). It was a new address, with tokens from other sources, and he/she created 7 disputes against us and 2 against framework-labs. You can look at it here: Address 0x992bb240b1ef27bc95a2e4767d9de6f8bf6d9632 | Etherscan. Just 9 disputes with several minutes between each of them.
And yes, it worries me personally, because it looks more like an attempt to get money than an attempt to find someone who did something wrong, or who keeps doing it with no concern for the Network.

1 Like

This scenario wasn’t explicitly considered in the writing of the charter. I would be in favor of modifying the proposal to allow an Indexer to submit the last valid PoI when closing an allocation if their allocation was created before the subgraph error occurred. Let me know if this addresses your particular concern.
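
A minimal sketch of how that modified rule could be expressed as a decision function. The names are hypothetical, comparing block numbers is an assumption (the charter may reason in epochs), and the zero PoI is written as a 32-byte zero hex string, which is an assumption about the encoding.

```python
from typing import Optional

# Assumed representation of the "0x0" PoI: 32 zero bytes as hex.
ZERO_POI = "0x" + "00" * 32


def poi_to_submit(allocation_created_block: int,
                  failure_block: Optional[int],
                  current_poi: str,
                  last_valid_poi: Optional[str]) -> str:
    """Pick the PoI to close an allocation with under the proposed exception.

    - No subgraph failure: submit the normal, current PoI.
    - Allocation opened before the failure: submit the last valid PoI
      (rewards may be collected).
    - Allocation opened at or after the failure: submit the zero PoI
      (no rewards).
    """
    if failure_block is None:
        return current_poi
    if allocation_created_block < failure_block and last_valid_poi is not None:
        return last_valid_poi
    return ZERO_POI
```

For example, an allocation created at block 100 on a subgraph that failed at block 150 would close with the last valid PoI, while one created at block 160 would close with the zero PoI.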

More generally, I totally agree with you here that this is an area where we’ll need to iterate as more community members become familiar with the mechanisms.


Without betraying any information shared in confidence, I can say that the timing of the disputes was because the Fisherman in this instance wasn’t sure if they were allowed to dispute Indexers for faults that occurred before the Arbitration Charter was ratified. This was clarified in the Protocol Townhall, hence the timing of the disputes. The reason for the new address, as I understand it, was to not create ill will among Indexers.

That being said, none of the mechanisms in the protocol are intended to be punitive to honest participants–this is one of the reasons that the Arbitration Charter allows the Arbitrator to exercise discretion. Over time, as Graph Node and Indexer tooling matures and Indexers build confidence interacting with the protocol, there will hopefully be fewer and fewer instances in which the Arbitrator must exercise this discretion.

For now, I know the Arbitrators are aware of the immature state of tooling in the protocol, and while I cannot speak for them, I do not expect that they will punish Indexers for inadvertent PoI inconsistencies, especially since as @cryptovestor notes, this is the first time many of these issues are being encountered in the decentralized network.

1 Like

Update: The Arbitration Charter GIP has been updated based on a lot of the feedback in the above thread. It can still be found on the zerim/arbitration-charter branch of the GIPs repo. Looking forward to getting your feedback on the changes.

For reference, here are the recent commits (can’t wait until Radicle adds PR support):

5236bff (HEAD -> zerim/arbitration-charter, rad/zerim/arbitration-charter) arbitration-charter: Add items to future work based on community feedback
5fcac46 arbitration-charter: allow indexers to collect indexing rewards right after subgraph fails
90a2906 arbitration-charter: add "future work" section
9dd6b6b arbitration-charter: add clauses based on community input
326f8ec arbitration charter: Add missing links to forum discussions
665dbc6 Add proposal for arbitration charter
2 Likes

if a subgraph has a bug that prevents indexing up until the current epoch, then a zero PoI should be submitted and indexing rewards must not be collected for that subgraph.

This mechanism is problematic.

The PoI contains data which attests to the fact that the subgraph has failed and in what way. Eventually when we migrate to verifiable queries, the error status would be a part of the PoI in such a way that the error message could be validated as the correct response to any query.

For the protocol to be decentralized and the Arbitrator role to be removed, there needs to be a consensus around the failure state of the subgraph - which the PoI provides. Once one Indexer attests to the subgraph failing, the protocol should still incentivize further validation of that failure status until consensus is achieved. This may include incentivizing Indexers to start indexing from scratch - which could be expensive but necessary to validate the failure status.

The mechanism by which the protocol signals the value of further consensus of the failure state is curation. Once the failing subgraph is replaced, curation should move to the new subgraph, and this is the time that Indexers should migrate over.

Furthermore, there is still value in having the subgraph indexed for historical queries at least until such time as the failed subgraph is replaced (and even then, possibly for some fuzz testing to ensure compatibility between the old and new versions). Query fees should still be paid after a subgraph fails for both historical queries and for serving error responses. Otherwise there is no way for a Consumer to know whether the subgraph failed or the Indexer(s) just abandoned it save for some additional mechanism.

It is worth noting why not paying for failed subgraphs has been proposed. The idea is that once a subgraph has failed, it becomes trivial to index and therefore there is no additional work that needs to be done or compensated. In theory, this is an attack vector because a Curator could deploy a subgraph that fails quickly and then collect rewards on it indefinitely. However, this attack already exists for subgraphs that do not fail. One could, for example, specify a single call-handler on a non-existent contract. Such a subgraph would also be trivial to index indefinitely. Since the mechanism does not protect against this attack, it is not being helpful and its drawbacks outweigh the potential benefits.

4 Likes

Minor spelling error:

Make shorter allocations slashable for a smaller amount that longer allocations for indexing disputes

that → than

1 Like

Attestation that is only correct with respect to the previous official version of the Indexer software

I think the language should account for the possibility that multiple previous versions of the software may have been sunset during the grace period and that any of them would be valid. There is other language to be modified to this effect in the same paragraph.

2 Likes

if a subgraph has a bug that prevents indexing up until the current epoch, then a zero PoI should be submitted and indexing rewards must not be collected for that subgraph.

I’m realizing now after having read my previous post on this a second time when linking someone to this discussion that I failed to mention what this should be changed to.

Instead, the same mechanism should be used to query the PoI for the same block as though the subgraph had not entered into a deterministic failure state. The way that the PoI is written it will continue to be updated and provide security.

This simplifies the indexer-agent as well as the logic as to how an Indexer should behave. The Indexer can continue to open and close allocations, and serve queries, as if nothing has happened. Historical queries would give results and queries after the failed block would give attestable errors.
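
A small sketch of the query-side behavior described above, under the assumption that a query “after the failed block” means one at or beyond the block where the deterministic failure occurred. The `respond` function and the response shapes are hypothetical and are not the Indexer Service API.

```python
from typing import Callable, Optional


def respond(query_block: int, failed_block: Optional[int],
            run_query: Callable[[int], dict]) -> dict:
    """Serve historical data normally; return an error past the failure point.

    In the proposal both kinds of response would be attestable, so a
    Consumer can tell a failed subgraph from an abandoned one.
    """
    if failed_block is not None and query_block >= failed_block:
        return {"error": f"deterministic failure at block {failed_block}"}
    return {"data": run_query(query_block)}


if __name__ == "__main__":
    lookup = lambda block: {"block": block}   # placeholder query executor
    print(respond(90, 100, lookup))           # historical query: data
    print(respond(120, 100, lookup))          # past the failure: error
```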

1 Like

Hi @That3Percent could you rephrase this?
In the form of “If this happened → then you should do: step 1, step 2, step 3”

Because for now, I understood this in the wrong way (I think), like:
If Subgraph has a bug (like mStable or like Enzyme?) do this:

  1. Don’t worry and close allocations without any additional actions (with the usual generated POI, not 0x0).
  2. Keep working with this subgraph.
  3. Order Lambo.
1 Like

I think you understand this in the correct way.

If a subgraph has a deterministic failure, continue as normal by eventually closing the allocation with the usual non-zero PoI. If there is still curation on the failed subgraph, consider opening another allocation.

3 Likes

Hello,

I generally agree with the charter and am glad we are having this discussion here, and that we have a fantastic arbitration team. The rules are very well thought through and I think Indexers are generally happy with the level of clarity at this stage. There is clear evidence of a presumption of innocence both in the charter and in the approach used by the Arbitrators, the broader Graph team, and the community, which is great!
Although I am generally happy with how things are, I would like to share my view on a few points which I believe are worth considering.

  1. I would like to agree with @KonstantinRM on the point of slashing based on the Indexer’s own stake. My argument is that the current mechanism makes perfect sense for slashing incorrect query responses, because of the economic security metrics etc. Indexing rewards, however, do not really depend on the Indexer’s self-stake, so it would make more sense to apply slashing to the indexing rewards rather than tie it to a % of the stake.
  2. I would like to highlight my observation regarding the recent update to chapter 9:

An exception to this rule is if the allocation being closed was opened before the subgraph bug occurred. In this case, the Indexer may submit the last valid PoI they produced for the subgraph and collect indexing rewards for the allocation.

So the exception to the rule, to not penalise Indexers for a subgraph failure, is perfectly fine, but without clarity on how far back the last valid PoI may count, it creates the opposite problem: an incentive to stay on the failed subgraph for as long as possible.
What is even more worrying is that, if such behaviour is supported, we may see more precedents, and some Indexers may even deploy their own subgraphs which they know will fail at some point and signal on them, creating an exclusive environment in which they enjoy rewards on a subgraph where only they, as the only ones allocated before the failure, are allowed to submit a valid PoI…

Regards,
vict | grassets.tech

3 Likes