GIP-0025: DataEdge


GIP: 0025
Title: DataEdge
Authors: Zac Burns zac@edgeandnode.com
Created: 2022-03-08
Updated: 2022-03-08
Stage: Draft
Discussions-To: GIP-0025: DataEdge

Abstract

This GIP introduces DataEdge: a gas-efficient method to bridge data into subgraphs.

As L1s become increasingly costly, developers seek to minimize those costs. Rising costs are not just a hypothetical concern but an existential threat to the feasibility of blockchain for many use-cases. DataEdge solves this problem by reducing L1 gas costs to their theoretical lower limit.

We will first introduce DataEdge as a general concept and then propose a specific instantiation of the DataEdge to be used by the protocol. Flagship use cases within The Graph Protocol will include the Cross-Chain Epoch Oracle and the Query Version Registry.

High Level Description

The DataEdge smart contract has a minimal interface containing only an empty fallback() method. A message consists of the selector and payload bytes of a call to the contract. Its meaning is arbitrary and is defined entirely by the subgraph that decodes it.

A DataEdge contract will be deployed and designated by The Graph Council as the DataEdge for all protocol interfaces for use in The Graph Protocol.

Detailed Specification

In terms of specification, there is no more to add to the DataEdge. An empty contract that sends payloads into the void is self-explanatory, even if befuddling. The remainder of this GIP therefore establishes norms and design patterns to guide users of DataEdge in making the best use of such a general-purpose tool.

Subgraph ABI

Technically, the 4 bytes in the selector and any payload bytes can be taken as a whole message for maximum efficiency. However, if the selector identifies a method signature, graph-node can have a descriptive ABI in the subgraph manifest, and smart contract authors can call methods with descriptive names. The price of the descriptive manifest and developer ease is less than 100 gas per transaction.
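One way to sanity-check the gas figure is to price the selector bytes directly. The sketch below assumes EIP-2028 calldata pricing (4 gas per zero byte, 16 gas per non-zero byte) and counts only the 4 selector bytes. The selector value is a made-up placeholder, and any additional framing of the payload is not counted because this GIP does not specify one.

```python
# Calldata pricing per EIP-2028 (assumed to apply to the target chain).
ZERO_BYTE_GAS = 4
NONZERO_BYTE_GAS = 16


def calldata_gas(data: bytes) -> int:
    """Return the calldata gas charged for `data`."""
    zeros = data.count(0)
    return zeros * ZERO_BYTE_GAS + (len(data) - zeros) * NONZERO_BYTE_GAS


# Hypothetical 4-byte selector with no zero bytes (not a real method selector).
selector = bytes.fromhex("a1b2c3d4")
payload = bytes.fromhex("0102030405")

# Extra gas paid for prefixing the payload with the selector.
selector_overhead = calldata_gas(selector + payload) - calldata_gas(payload)
print(selector_overhead)  # 64
```

Under these assumptions the worst case for the selector alone is 4 × 16 = 64 gas, which is in line with the "less than 100 gas" figure.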

The Graph Protocol’s instantiation of DataEdge will use the selector as a namespace. For example, the crossChainEpochOracle(bytes _payload) and queryVersionRegistry(bytes _payload) methods would each correspond to a selector with its own payload encoding. This namespacing simplifies the development of the protocol by allowing each subsystem’s encoding to evolve independently.
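The namespacing idea can be sketched as a dispatch table keyed by selector. This is an illustration only: the selector values are placeholders rather than the real selectors of the methods named above, the decoder functions are hypothetical, and everything after the first 4 bytes is treated as the raw payload.

```python
from typing import Callable, Dict, Tuple


def split_message(calldata: bytes) -> Tuple[bytes, bytes]:
    """Split calldata into its 4-byte selector and the remaining payload."""
    if len(calldata) < 4:
        raise ValueError("calldata shorter than a 4-byte selector")
    return calldata[:4], calldata[4:]


# Placeholder selectors (assumptions, not derived from the method signatures).
EPOCH_ORACLE = bytes.fromhex("00000001")
QUERY_VERSION_REGISTRY = bytes.fromhex("00000002")


def decode_epoch_oracle(payload: bytes) -> str:
    # Stand-in for the oracle's own payload decoding.
    return f"epoch-oracle:{payload.hex()}"


def decode_query_version_registry(payload: bytes) -> str:
    # Stand-in for the registry's own payload decoding.
    return f"query-version-registry:{payload.hex()}"


# Each subsystem owns one selector and can change its decoder independently.
DECODERS: Dict[bytes, Callable[[bytes], str]] = {
    EPOCH_ORACLE: decode_epoch_oracle,
    QUERY_VERSION_REGISTRY: decode_query_version_registry,
}


def handle(calldata: bytes) -> str:
    """Route a message to the decoder for its namespace, if one exists."""
    selector, payload = split_message(calldata)
    decoder = DECODERS.get(selector)
    if decoder is None:
        return "ignored"  # unknown namespace: the subgraph simply skips it
    return decoder(payload)


print(handle(EPOCH_ORACLE + b"\x2a"))
```

Because the contract itself is empty, adding a new namespace in this model only requires a new decoder entry on the subgraph side.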

Compression

DataEdge message decoder implementations can rely on the fact that transactions cannot be re-ordered on a per-account basis. Even in a long reorg, the transaction nonce prevents such a re-ordering. Therefore the encoding can use stateful compression techniques, so long as the only data dependencies are the previous transactions from a given account.

Consider an example from the Cross-Chain Epoch Oracle. This oracle’s job is to provide a list of block numbers from foreign chains on a recurring interval. Regular block numbers are a classic example of time-series data and lend themselves to stateful compression. Taking the previous block numbers from each foreign chain as state (derived only from previous transactions from the oracle’s account), delta-of-delta encoding reduces the amortized block number size down to less than 2 bytes per entry.
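A minimal sketch of delta-of-delta encoding for one foreign chain follows. The zigzag and variable-length integer (LEB128-style) byte layout is an assumption chosen for illustration and is not the oracle's actual wire format. The block numbers are made up.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass
class ChainState:
    """Per-chain state derived only from the account's earlier messages."""
    last_block: int = 0
    last_delta: int = 0


def zigzag(n: int) -> int:
    # Map signed to unsigned so small magnitudes stay small: 0,-1,1,-2 -> 0,1,2,3
    return 2 * n if n >= 0 else -2 * n - 1


def unzigzag(u: int) -> int:
    return u // 2 if u % 2 == 0 else -((u + 1) // 2)


def write_varint(u: int) -> bytes:
    # 7 data bits per byte; the high bit marks "more bytes follow".
    out = bytearray()
    while True:
        byte = u & 0x7F
        u >>= 7
        if u:
            out.append(byte | 0x80)
        else:
            out.append(byte)
            return bytes(out)


def read_varint(data: bytes, pos: int) -> Tuple[int, int]:
    result, shift = 0, 0
    while True:
        byte = data[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return result, pos
        shift += 7


def encode_block(state: ChainState, block: int) -> bytes:
    """Encode `block` as the change in delta relative to the previous entry."""
    delta = block - state.last_block
    dod = delta - state.last_delta
    state.last_block, state.last_delta = block, delta
    return write_varint(zigzag(dod))


def decode_block(state: ChainState, data: bytes, pos: int = 0) -> Tuple[int, int]:
    """Inverse of encode_block; returns the block number and the next position."""
    u, pos = read_varint(data, pos)
    delta = state.last_delta + unzigzag(u)
    block = state.last_block + delta
    state.last_block, state.last_delta = block, delta
    return block, pos


blocks = [14_000_000, 14_000_277, 14_000_553, 14_000_830, 14_001_106]
enc_state, dec_state = ChainState(), ChainState()
encoded = [encode_block(enc_state, b) for b in blocks]
decoded = [decode_block(dec_state, e)[0] for e in encoded]
sizes = [len(e) for e in encoded]
print(decoded == blocks, sizes)
```

The first two entries are expensive because there is no useful history yet. Once the interval between entries is nearly constant, each entry only carries a small correction, which is where the amortized saving comes from.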

Batching

In most cases, the bulk of the transaction cost will be the base fee and other fixed overhead, rather than the payload itself. DataEdge implementations should support batching by concatenating multiple payloads to lower the amortized cost.
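The effect of batching can be estimated with a short sketch. It assumes the 21,000 gas intrinsic transaction cost and EIP-2028 calldata pricing, ignores execution gas, and uses a hypothetical 1-byte length prefix per payload so the decoder can split the batch again. The prefix scheme is an assumption, not part of this GIP.

```python
from typing import List

BASE_TX_GAS = 21_000   # intrinsic gas of an Ethereum transaction
ZERO_BYTE_GAS = 4      # EIP-2028 calldata pricing
NONZERO_BYTE_GAS = 16


def calldata_gas(data: bytes) -> int:
    zeros = data.count(0)
    return zeros * ZERO_BYTE_GAS + (len(data) - zeros) * NONZERO_BYTE_GAS


def batch(payloads: List[bytes]) -> bytes:
    """Concatenate payloads, each preceded by a 1-byte length (assumed framing)."""
    out = bytearray()
    for p in payloads:
        if len(p) > 255:
            raise ValueError("payload too long for a 1-byte length prefix")
        out.append(len(p))
        out += p
    return bytes(out)


def unbatch(data: bytes) -> List[bytes]:
    """Split a batch produced by `batch` back into its payloads."""
    payloads, pos = [], 0
    while pos < len(data):
        n = data[pos]
        pos += 1
        if pos + n > len(data):
            raise ValueError("truncated batch")
        payloads.append(data[pos:pos + n])
        pos += n
    return payloads


def amortized_gas(payloads: List[bytes]) -> float:
    """Gas per payload when all payloads share one transaction."""
    return (BASE_TX_GAS + calldata_gas(batch(payloads))) / len(payloads)


payloads = [b"\x01\x02"] * 10
print(amortized_gas(payloads[:1]))  # 21048.0
print(amortized_gas(payloads))      # 2148.0
```

In this example the per-payload cost drops by roughly a factor of ten, because the fixed 21,000 gas dominates the few dozen gas of calldata.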

Copyright Waiver

Copyright and related rights waived via CC0.

This is a good first step in starting to think about subgraphs more as read-oriented rollup chains, which has been discussed elsewhere.

Taking the previous block numbers from each foreign chain as state (derived only from previous transactions from the oracle’s account), delta-of-delta encoding reduces the average block number size down to less than 2 bytes per entry.

This is a really interesting idea for making the epoch block oracle more efficient!

Posting here a reference implementation of the DataEdge contract: block-oracle/DataEdge.sol on the main branch of the edgeandnode/block-oracle repository on GitHub.

Hey guys, I would love to help out or join discussions on this. I have worked on something simple that has a similar flavour.
https://neopost.netlify.app/

DataEdge is what this needs to work seamlessly! :clap: :clap:

FYI: I have moved this proposal to GRC-0001, per the guidelines laid out in GIP-0001, as in its current form it does not seem to propose any changes to the subgraph API or core protocol logic, but rather a design pattern that might be standardized in the community.

I also changed the stage to proposal.

To be moved to draft I would recommend the following changes:

  • As subgraphs are multi-blockchain, I would make it clear that this proposal is Ethereum-specific and freely uses Solidity/EVM-specific concepts like “selector”.
  • I would disambiguate which concepts are being introduced by this proposal vs. which are pre-existing (e.g. “selector”), and provide references to the existing concepts.
  • Show how you derived the result that having an ABI in the subgraph manifest adds less than 100 gas per transaction in this pattern.
  • Show what an example subgraph handler that consumes a DataEdge smart contract might look like.
  • Add references to, and describe the relationship with, prior art in other execution layers that simply use transaction call data for storage (e.g. rollups).

I disagree with this assessment. The GIP clearly states:

The GIP proposes that a DataEdge instance be deployed to Ethereum mainnet as a part of the core protocol logic. The outcome would be that The Graph Council blesses contract address 0xFFFF... as The Graph Protocol’s DataEdge, which will in the future be a dependency of the Epoch Oracle and Query Version Registry - core protocol components. Further out I expect information like the Query Version Registry sent over the DataEdge bridge will be used in the automated protocol dispute process. This is in my mind a core protocol implementation decision point and therefore a GIP, not a GRC.

Great! The DataEdge as a design pattern works today. We’ve already tested the full pipeline including using a human-readable ABI in the subgraph. You can feel free to start using it for your project, whether or not it is accepted in this GIP as a dependency of The Graph’s core protocol logic.

For contributing to the discussion, it is happening here in this forum so feel free to post your thoughts.

I’m certainly missing a key part here, but knowing that

  • DataEdge makes data available to subgraphs, but not to other Ethereum smart contracts, and
  • Epoch Oracle data will be necessary to manage allocations in the core graph contracts.

How does DataEdge make the future Epoch Oracle accessible to the core Graph contracts?

It doesn’t (yet). In the meantime, cross-checking Indexers (Fishermen) as well as Arbitrators would use a subgraph query as a reference.

To close the loop when automating the dispute protocol, a contract implementing query validation logic would have to be deployed. Such a thing might be called a “Subgraph Bridge”.

Ok right, I forgot that epoch information isn’t checked when closing an allocation, but is used to compute the POI. There is no need for the smart contract to access the Epoch Oracle (yet).

Ah, the “specific instantiation” part of the proposal seemed to allude to future work and only seemed to include enough specific detail to be illustrative (I may have been thrown off by the “draft” label as noted above):

The Graph Protocol’s instantiation of DataEdge will use the selector as a namespace. For example, the crossChainEpochOracle(bytes _payload) and queryVersionRegistry(bytes _payload) methods would each correspond to selectors with their own payload encoding.

I do think that this design pattern is of broad general interest and could probably stand alone as a community standard (a GRC, and the first one!), since most of the proposal is broadly applicable and the only thing that makes it “concrete” is deploying a specific instance of the empty fallback() contract for use in the future GIPs for the Epoch Block Oracle or Query Version Registry.

Don’t want to bikeshed though, so if you feel strongly here I’m happy to move back to the GIP list.

The only thing I really feel strongly about is getting the protocol DataEdge contract approved and deployed in a timely manner. IMO none of the changes to the GIP requested are worth the opportunity cost of time taken from other priorities including but not limited to the dependent outcomes of the cross-chain rewards, query versioning, etc. Execution is the most important thing.
