Addressing the disparity between indexing and query incentives during the bootstrapping phase

It was pointed out by @cryptovestor, @chris, and @indexer_payne during a recent Indexer Office Hours that during the early bootstrapping phase of the network, where the incentives from indexing rewards dominate those of query fees, we may not see the type of behavior from Indexers intended by the protocol design.

This is a question I’ve been tracking for a while, and in the spirit of working in public, I’d like to share my early thoughts on the topic here. I’ll also be joining the Indexer Office Hours tomorrow to continue the discussion. I hope to see many of you there!


First, I’d like to define the hypothesized problem clearly, as I’ve heard some misunderstandings of what the effects of this dynamic might be:

  1. The first problem is that some Indexers may choose to index subgraphs, but not serve queries. These Indexers may do this because, to them, the marginal benefit of serving queries in exchange for query fees does not exceed:
    • The operational overhead of serving queries (e.g. creating Agora cost models, setting up scalable query infrastructure, configuring Indexer Service to use state channels, and paying additional gas fees)
    • The additional risk of slashing due to query-related faults.

(I’ve heard some comments that Indexers might stop responding to Curator signal altogether, or stop indexing difficult-to-index subgraphs, as a result of this issue. These concerns are actually orthogonal to the disparity between indexing rewards and query fees; they are addressed by the fact that indexing rewards are a function of Curator signal.)

  2. The second problem is that Indexers opting out of serving queries makes queries more expensive for the rest of the network. This problem is more subtle, so I’ll elaborate with an example.

First, recall that Indexers do not receive query fees directly, but rather settle query fees into a rebate pool, and later claim them according to a rebate function:

$$\%\_of\_rebate\_pool\_claimed = (\%\_of\_contributed\_fees)^{\alpha} \cdot (\%\_of\_effective\_stake)^{1-\alpha}$$

(effective stake is a concept we introduce to normalize the allocated stake for allocations of varying duration)

This is the Cobb-Douglas function, and in protocols like The Graph and 0x, it is what incentivizes Indexers to stake tokens in rough proportion to the amount of work they are performing for the network.

Now let’s look at how this impacts the cost of query fees by examining a couple of scenarios (you can play with them using this spreadsheet). I assume an $\alpha$ of 0.77, which is its current value in the network (this value for $\alpha$ is discussed here). For simplicity, I assume all rebates are received by the Indexer, but in actuality, a portion is shared with Delegators. This should not impact the analysis:

Scenario A: Two Indexers interacting optimally with the rebate mechanism

| Indexer | Contributed Fees | Effective Stake | % of Rebate Pool Claimed | Rebates Claimed |
|---|---|---|---|---|
| Alice | 5000 | 240000 | 66.667% | 5000 |
| Bob | 2500 | 120000 | 33.333% | 2500 |
| Total | 7500 | 360000 | 100% | 7500 |

Note that Alice and Bob, when interacting optimally with the mechanism, allocate stake proportional to their share of query fees, and receive 100% of their contributed query fees back as rebates. Importantly, the entire rebate pool is claimed.

Scenario B: Two active Indexers and one lazy Indexer

Now, let’s keep the above setup, except that we’ll add a “lazy” Indexer Charlie, who allocated stake to collect indexing rewards, but does not serve any queries:

| Indexer | Contributed Fees | Effective Stake | % of Rebate Pool Claimed | Rebates Claimed |
|---|---|---|---|---|
| Alice | 5000 | 240000 | 56.84% | 4263.17 |
| Bob | 2500 | 120000 | 28.42% | 2131.59 |
| Charlie | 0 | 360000 | 0% | 0 |
| Total | 7500 | 720000 | 85.26% | 6394.76 |

The presence of the lazy Indexer, compared to the previous setup, means that Alice and Bob only receive ~85% of the rebates they otherwise would have.

Put differently, to receive the same amount of rebates as in Scenario A, they would have to charge roughly 17% more for queries, due to the presence of the lazy Indexer.
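The two scenarios can be reproduced with a few lines of Python. This is only a minimal sketch of the rebate function as written above, not the contract implementation: the function name `rebate_shares` and the dictionary layout are hypothetical, $\alpha$ is fixed at the 0.77 assumed in this post, and Delegator sharing is ignored, as in the tables.

```python
ALPHA = 0.77  # value of alpha assumed in this post


def rebate_shares(indexers, alpha=ALPHA):
    """Return each Indexer's claimed share of the rebate pool.

    `indexers` maps a name to (contributed_fees, effective_stake).
    Hypothetical helper for illustration only.
    """
    total_fees = sum(fees for fees, _ in indexers.values())
    total_stake = sum(stake for _, stake in indexers.values())
    return {
        name: (fees / total_fees) ** alpha * (stake / total_stake) ** (1 - alpha)
        for name, (fees, stake) in indexers.items()
    }


scenario_a = {"Alice": (5000, 240000), "Bob": (2500, 120000)}
# Scenario B adds Charlie, who allocates stake but contributes no fees.
scenario_b = {**scenario_a, "Charlie": (0, 360000)}

for label, scenario in (("A", scenario_a), ("B", scenario_b)):
    pool = sum(fees for fees, _ in scenario.values())
    shares = rebate_shares(scenario)
    claimed = sum(shares.values())
    print(f"Scenario {label}: {claimed:.2%} of the rebate pool is claimed")
    for name, share in shares.items():
        print(f"  {name}: {share:.2%} of pool, {share * pool:.2f} in rebates")
    # Price increase needed for the claimed rebates to match the fees contributed.
    print(f"  break-even premium on queries: {1 / claimed - 1:.1%}")
```

Changing Charlie's effective stake in `scenario_b` is a quick way to see how the unclaimed portion of the pool grows with the lazy Indexer's share of stake.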

The greater the share of allocated stake controlled by lazy Indexers, the greater the premium that active Indexers must charge on queries to compensate for their effects.
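This relationship can be made explicit with a short derivation. It is a sketch under assumptions that go slightly beyond the scenarios above: the symbols $\ell$ (the share of total effective stake held by lazy Indexers), $f_i$ (active Indexer $i$'s share of contributed fees), $s_i$ (its share of effective stake) and $r_i$ (its claimed share of the rebate pool) are introduced here for illustration. Lazy Indexers are assumed to contribute no fees, and active Indexers are assumed to split the remaining stake in proportion to their fees, as Alice and Bob do.

$$
s_i = f_i\,(1-\ell) \quad\Longrightarrow\quad r_i = f_i^{\alpha}\, s_i^{1-\alpha} = f_i\,(1-\ell)^{1-\alpha}
$$

$$
\sum_i r_i = (1-\ell)^{1-\alpha}, \qquad \text{premium} = (1-\ell)^{-(1-\alpha)} - 1
$$

With $\alpha = 0.77$ and $\ell = 0.5$, as in Scenario B, this gives $0.5^{0.23} \approx 0.8526$ of the pool claimed and a premium of about 17.3%, matching the table.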


Now that we’ve sufficiently defined the problem, let’s discuss possible solutions. These embody one or more of the following strategies:

  1. Lower the marginal cost of serving queries vs not serving queries.
  2. Increase the cost of not serving queries.
  3. Increase the marginal benefit of serving queries vs not serving queries.

Now that we’ve outlined the basic strategies, let’s get into the actual solutions:

  1. "Use it or lose it" tax. Staking in the network implies a commitment to index and serve queries roughly proportional to an Indexer’s amount of allocated stake. As we’ve seen in [Problem 2, Scenario B] above, an Indexer that allocates stake but does not serve a comparable amount of queries imposes a cost on the rest of the network. We could construct a mechanism that captures this imposed cost as a tax levied on the Indexer’s allocated stake, that is distributed to the remainder of Indexers in a rebate pool. This increases the cost of not serving queries and also mitigates the negative effects of Indexers not serving queries by subsidizing Indexers that are serving queries.
    • (The devil is in the details here, to design a mechanism that is efficient to calculate and cannot be gamed).
  2. Subsidize queries. Queries, unlike indexing, cannot be subsidized at the protocol level because it is trivial to spoof queries as they take place off-chain between an Indexer and a Consumer. In fact, the protocol charges a tax (the opposite of a subsidy) to specifically make spoofing queries expensive. However, The Graph Foundation could, in theory, subsidize select gateways to the decentralized network that are known to be sources of legitimate query volume, as a way of boosting demand for queries during the bootstrapping phase of the network. This would increase the marginal benefit of serving queries vs. not serving queries.
  3. Layer 2 scaling. Some of the additional costs of serving queries come from on-chain calls to collect and claim, each of which needs to be called at least once per allocation per subgraph. By moving these transactions to an L2, we can decrease the marginal cost to an Indexer of serving queries vs. not serving queries.
  4. Social Engineering. Many Delegators in the network are aligned, either ideologically or financially (i.e. in vesting contracts) with the long-term success of the network. The community could encourage Delegators to delegate stake to Indexers that serve queries, and undelegate stake from Indexers who do not serve queries. This would both increase the benefit of serving queries and increase the cost of not serving queries.
  5. More predictable slashing risks. Given that an Indexer can serve an unbounded number of queries, each of which carries a risk of slashing, in the absence of mitigating mechanisms, Indexers would carry an unbounded amount of risk from serving queries. The Arbitration Charter takes the first steps in addressing this by capping the amount of slashing an Indexer could incur for serving queries per allocation per epoch. Future work might include more tooling for Indexers to guarantee indexing + query determinism, as well as enshrining some of the protections offered by the Arbitration Charter at the smart contracts layer. These measures all serve to decrease the expected cost to Indexers of serving queries vs. not serving queries.

Looking forward to getting feedback on all of the above. I’ll aim to keep the list above updated if I find that I have missed anything in the solution space.


Thanks for raising this topic. I’ve also thought about these problems a lot.
We have at least two problems here:

  1. A relatively small volume of queries, and of potential rewards from them, in the beginning.
  2. Slashing risk. Even with the Arbitration Charter in its current state, we don’t have full transparency on how all of this will work.

#1 is not such a big problem, because query volume will grow; we just need to fix the existing mechanisms and implement things like closeAndAllocate.
But #2 is quite painful: we don’t have tools to protect ourselves, yet we can already serve queries and could potentially face Disputes. Talking about “capping” without concrete numbers doesn’t give Indexers enough transparency.

So I think safety should be covered first, in as much detail as we can.
After that, it would be good to implement economic mechanisms that incentivize Indexers not only to index but also to serve queries. In that order, what remains is a question of proper calculations, optimizations, and so on. Without clarification of, and some changes to, the slashing mechanism (at least making slashing depend on the allocation’s percentage of self-stake, or somehow shifting slashing onto potential rewards), serving queries is simply too risky for Indexers.
