Chain identification & aliasing for The Graph Protocol

Title: Chain identification & aliasing for The Graph Protocol
Authors: Adam Fuller adam@edgeandnode.com
Created: 2022-07-28
Updated: 2022-08-12
Stage: Draft

Abstract

The Graph community is in the process of adding support for many more protocols & networks on The Graph Network. Graph Node and The Graph Network currently rely on a freeform string label to match subgraph manifests with the correct upstream providers of blockchain data (RPC endpoints, firehoses). This GIP proposes that The Graph Network adopts CAIP-2 naming conventions for the addition of chains to the network, with a list of supported aliases for backwards compatibility & readability.

Motivation

This will bring The Graph Network in line with a community-developed standard, and make for a more robust process as more protocols and chains are added in future.

Detailed Specification

Currently, only Ethereum mainnet subgraphs are supported on The Graph Network. These are identified via the “mainnet” identifier, due to the Ethereum origins of The Graph. The Epoch Block Oracle will unlock the addition of further indexed chains. This is therefore a timely moment to re-align on how chains are identified within the protocol.

Chain Agnostic Improvement Proposals (CAIPs) describe standards for blockchain projects that are not specific to a single chain. CAIP-2 defines a way to identify a blockchain (e.g. Ethereum Mainnet, Görli, Bitcoin, Cosmos Hub) in a human-readable, developer-friendly and transaction-friendly way.

The chain_id is a case-sensitive string in the form

chain_id:    namespace + ":" + reference
namespace:   [-a-z0-9]{3,8}
reference:   [-a-zA-Z0-9]{1,32}

Each namespace covers a class of similar blockchains. Usually it describes an ecosystem or standard, such as cosmos or eip155. These are equivalent to the protocol specified on a subgraph manifest.

Examples:

# Ethereum mainnet
eip155:1

# Cosmos Hub (Tendermint + Cosmos SDK)
cosmos:cosmoshub-2
cosmos:cosmoshub-3

# Solana Mainnet
solana:4sGjMW1sUnHzSxGspuhpqLDx6wiyjNtZ

# Solana Devnet
solana:8E9rvCKLFQia2Y35HXjjpWzj8weVo44K
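The chain_id grammar above can be checked mechanically. The following is a minimal Python sketch, not part of any existing Graph tooling: the regular expressions are copied from the specification text above, while the names `ChainId` and `parse_chain_id` are hypothetical and used only for illustration.

```python
import re
from typing import NamedTuple

# Grammar copied from the chain_id definition quoted above.
NAMESPACE = r"[-a-z0-9]{3,8}"
REFERENCE = r"[-a-zA-Z0-9]{1,32}"
CHAIN_ID_RE = re.compile(rf"({NAMESPACE}):({REFERENCE})")


class ChainId(NamedTuple):
    """Hypothetical container for the two halves of a CAIP-2 chain_id."""

    namespace: str
    reference: str

    def __str__(self) -> str:
        return f"{self.namespace}:{self.reference}"


def parse_chain_id(value: str) -> ChainId:
    """Split a chain_id into namespace and reference, or raise ValueError."""
    # fullmatch ensures the whole string conforms, not just a prefix.
    match = CHAIN_ID_RE.fullmatch(value)
    if match is None:
        raise ValueError(f"not a valid CAIP-2 chain_id: {value!r}")
    return ChainId(match.group(1), match.group(2))
```

Because the match is case-sensitive, a freeform label such as mainnet (no namespace) or EIP155:1 (upper-case namespace) is rejected, which is why the aliases discussed below are needed for backwards compatibility.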

Namespaces are maintained in a dedicated repository.

This GIP proposes that The Graph Network adopts CAIP-2 identifiers for the addition of new chains, via GIP-0008 (Subgraph API Versioning and Feature Detection). When The Graph Council approves support for new kinds of data sources, it should approve a CAIP chain_id. It may also approve a list of aliases as a comma separated list, for backwards compatibility.

For example, approving the addition of Polygon:

feature: eip155:137
aliases: polygon,matic
experimental: true
queryDisputes: true
indexingDisputes: true
indexingRewards: true
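As an illustration of how tooling might consume such an approval, here is a small Python sketch that reads the record into a structured value. The GIP does not define a parser or a machine-readable format beyond the lines shown above, so `NetworkApproval` and `parse_approval` are assumptions made for this example only.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class NetworkApproval:
    """Hypothetical in-memory form of one approval record."""

    feature: str                                   # the approved CAIP-2 chain_id
    aliases: List[str] = field(default_factory=list)
    flags: Dict[str, bool] = field(default_factory=dict)


def parse_approval(text: str) -> NetworkApproval:
    fields: Dict[str, str] = {}
    for line in text.strip().splitlines():
        # Split on the first colon only, so "eip155:137" stays intact.
        key, sep, value = line.partition(":")
        if not sep:
            raise ValueError(f"expected 'key: value', got {line!r}")
        fields[key.strip()] = value.strip()

    feature = fields.pop("feature")  # KeyError if the record has no feature
    aliases = [a.strip() for a in fields.pop("aliases", "").split(",") if a.strip()]

    # Every remaining key is treated as a boolean flag.
    flags: Dict[str, bool] = {}
    for key, value in fields.items():
        if value not in ("true", "false"):
            raise ValueError(f"{key} must be true or false, got {value!r}")
        flags[key] = value == "true"
    return NetworkApproval(feature, aliases, flags)
```

Splitting on the first colon only is the one non-obvious choice here: the chain_id itself contains a colon, so a naive split on every colon would break the feature line.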

Once The Graph Council has approved a given addition, it should be added to a dedicated “Protocols & networks” section of the networks.md (link):

Network      Alias
eip155:1     mainnet
eip155:137   polygon,matic

If it introduces a new protocol, that should also be added accordingly:

Protocol   Aliases
eip155     ethereum
cosmos
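To show how the network table could drive alias resolution, the sketch below builds a lookup from every accepted name to its CAIP-2 identifier and rejects an alias that is claimed by two networks. The function names and the (chain_id, aliases) row shape are assumptions for illustration; the GIP does not prescribe a data format for networks.md.

```python
from typing import Dict, Iterable, Tuple


def build_alias_index(rows: Iterable[Tuple[str, str]]) -> Dict[str, str]:
    """Map each CAIP-2 id and each of its aliases to the CAIP-2 id.

    `rows` holds (chain_id, comma_separated_aliases) pairs, mirroring the
    two columns of the network table.
    """
    index: Dict[str, str] = {}
    for chain_id, aliases in rows:
        names = [chain_id] + [a.strip() for a in aliases.split(",") if a.strip()]
        for name in names:
            # setdefault returns the existing owner if the name is already taken.
            owner = index.setdefault(name, chain_id)
            if owner != chain_id:
                raise ValueError(f"alias {name!r} is claimed by {owner} and {chain_id}")
    return index


def resolve_network(name: str, index: Dict[str, str]) -> str:
    """Return the CAIP-2 id for a manifest network name or alias."""
    try:
        return index[name]
    except KeyError:
        raise ValueError(f"unknown network: {name!r}") from None


index = build_alias_index([
    ("eip155:1", "mainnet"),
    ("eip155:137", "polygon,matic"),
])
```

With this index, resolve_network("matic", index) and resolve_network("eip155:137", index) both yield eip155:137, so existing manifests keep working while new ones can use the CAIP-2 identifier directly.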

If the network is approved for indexing rewards, that network should then be added to the Subgraph Oracle, so that subgraphs can be approved for indexing rewards, and the Epoch Block Oracle, so that indexers can close allocations. Indexers themselves may also need to update their Graph Node and Indexer configuration to support the new network, and any aliases.

Removal of rewards for a given network can be achieved by a similar governance decision, followed by removal of the network from the same components.

This GIP will require changes to support aliasing in several components:

  • Graph Node will need to support aliases for network names & protocol names as part of its configuration. This is already a requirement, given the name changes on certain EVM networks (e.g. Polygon, Gnosis Chain)
  • The Indexer Agent will need to match a subgraph’s stated network with the Epoch Block Oracle’s when closing allocations
  • The Subgraph Oracle will need to be updated to support a growing list of protocols & networks which are eligible for indexing rewards
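The Indexer Agent requirement in the list above can be sketched as follows. This is only an illustration under stated assumptions: the Epoch Block Oracle's real data format is not described in this GIP, so the dictionary keyed by CAIP-2 identifier, the block numbers, and the function name `epoch_block_for` are all hypothetical.

```python
from typing import Dict, Optional


def epoch_block_for(
    manifest_network: str,
    aliases: Dict[str, str],
    oracle_blocks: Dict[str, int],
) -> Optional[int]:
    """Find the epoch block for a subgraph's stated network.

    `aliases` maps alias -> CAIP-2 id; a name that is not an alias is assumed
    to already be a CAIP-2 id. `oracle_blocks` stands in for whatever the
    Epoch Block Oracle reports, keyed by CAIP-2 id (an assumption).
    """
    chain_id = aliases.get(manifest_network, manifest_network)
    # None signals that the oracle does not track this network.
    return oracle_blocks.get(chain_id)


# Illustrative values only; these are not real epoch blocks.
aliases = {"mainnet": "eip155:1", "polygon": "eip155:137", "matic": "eip155:137"}
oracle_blocks = {"eip155:1": 100, "eip155:137": 200}
```

The point of the sketch is that the match happens on the CAIP-2 identifier, so a subgraph declaring matic and one declaring eip155:137 close allocations against the same epoch block.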

Handling hard forks

CAIP-2 does not have explicit handling of hard forks, and some of the drawbacks of this approach are discussed in this thread. Hard forks introduce a challenge for indexers - if the fork is contentious, it becomes unclear which fork should be processed for a given chain identifier.

This GIP and the Epoch Block Oracle GIP propose that The Graph Council approves CAIP identifiers. In a case where both forks are “viable” (which is to say there is meaningful activity present on both forks), it is reasonable to expect that The Graph Community will want to support applications on both chains. As such, participants will need to be able to specify which fork they are interested in.

In cases such as Ethereum, where EIP-155 integers are not fork friendly, the majority chain will be unchanged, so will already be supported on the network. The minority chain will need to be added, once the new chain identifier is established, via governance approval.

In cases where identifiers are fork-friendly, both new identifiers may need to be explicitly added. If the hard fork is planned in advance, The Graph Council could pre-approve the chain identifiers, but it is more likely that this will be done after the fork has taken place.

If for whatever reason The Graph Council does not want to support a given fork, then that network’s rewards eligibility can be removed.

There is a chance that The Graph community takes a different view on the alias for different CAIPs (for example, which CAIP should be considered “mainnet”). However, this GIP proposes that aliases are tightly coupled to CAIPs, even if that impacts the “sovereignty” of the community, because changing the CAIP that corresponds to a given alias could cause significant confusion.

Backwards Compatibility

Alias support in Graph Node, and a list of accepted aliases for networks, will provide backwards compatibility for this feature.

The Epoch Block Oracle has been implemented with CAIP-2 identifiers as the expected network identifiers.

Rationale and Alternatives

  • Continue to rely on custom string identifiers across networks
  • Fully adopt CAIP end-to-end, which would not be backwards compatible

Copyright Waiver

Copyright and related rights waived via CC0.

Thanks for making this suggestion. Could the “reference” be changed to [a-zA-Z0-9]{1,64}?

EOSIO-based chains use a 64 character chain id.

E.g. the EOS blockchain might be: eosio:aca376f206b8fc25a6ed44dbdc66547c36c6c33e3a119ffbeaef943642f0e906

Is there a “preferred” alias that should be used? E.g. avoid “mainnet” and use “ethereum”.

Interesting - I wonder if that was raised for CAIP-2 itself (I can’t obviously see it in the discussions). I can’t see a harm in increasing to 64 characters?

I think for historical reasons “mainnet” on The Graph Network corresponds to mainnet Ethereum, and I think short-term backwards compatibility is important here. But in general I do agree that it’s not a great alias.

As a follow up here, based on some discussions around implementation:

  • We explored an approach that would encode network names and aliases into Graph Node, based on the specVersion. While there is an appeal to this more robust approach, it creates a Graph Node release dependency to add new networks, which isn’t in line with longer-term permissionless addition of networks to the protocol
  • There is a concern that maintaining a list of aliases undermines the value of adopting the CAIP-2 standard. This is a valid concern, so it’s important to emphasize that aliases are primarily there to be backwards compatible with the existing mainnet naming
  • There is a potential benefit of CAIP-2 usage which is the validation of providers (i.e. does it have the correct chain ID for EVM networks?). This is not currently implemented
  • Use of aliases creates a requirement to be “alias aware” in a range of network components, to ensure functionality & usability. This adds complexity, but will be crucial for functionality and legibility
  • The networks.md file is currently the “source of truth” for configuration on the network, and this will become more important as configuration requirements increase
  • The proposal includes the use of “matic”, as an alias for Polygon (and “xdai” for Gnosis chain). This is based on a legacy hosted service configuration, which is not a good reason to introduce complexity to the network
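On the provider validation point in the list above (noted there as not currently implemented), a check for EVM networks could look roughly like this Python sketch. It assumes the caller has already obtained the provider's response to the standard `eth_chainId` JSON-RPC method, which returns the chain ID as a hex string; the function name `provider_matches_chain` is hypothetical.

```python
def provider_matches_chain(chain_id: str, eth_chain_id_hex: str) -> bool:
    """Compare a CAIP-2 eip155 id with a provider's eth_chainId result.

    `eth_chain_id_hex` is the hex quantity returned by eth_chainId,
    for example "0x1" on Ethereum mainnet.
    """
    namespace, _, reference = chain_id.partition(":")
    if namespace != "eip155":
        raise ValueError(f"this check only applies to eip155 chains, got {chain_id!r}")
    # The eip155 reference is the decimal chain ID; the RPC result is hex.
    return int(reference) == int(eth_chain_id_hex, 16)
```

For example, provider_matches_chain("eip155:137", "0x89") is true because 0x89 is 137 in decimal, whereas the same provider configured under eip155:1 would be flagged as a mismatch.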

There is the practical consideration of the timing of this shift, given that the testnet program for Gnosis is kicking off this week. The following is a proposal:

  • Add support for “gnosis” as a network on the hosted service, Studio and graph-cli (i.e. we will not be using “xdai”). This network identifier will be used for the testnet, and the configuration changes for this are in progress. ← we are here
  • Add aliasing support to Graph Node, ahead of introduction of the Epoch Block Oracle and any new networks on mainnet.
  • Revisit the networks.md to better scale for the increased configuration requirements (include aliases, more machine readable) - will require indexer & core developer feedback!
  • The Graph Council can then approve the addition of CAIP-2 networks, with aliases, on the mainnet deployment
  • In the future, CAIP-2 networks could be permissionlessly added, and we can migrate away from alias usage via migrations in graph-cli, and potentially based on Graph Node manifest versioning

This GIP looks good to me. I agree that extending the number of characters used for the reference part of the ID makes sense and should do no harm.

An open question I have is about whether we really have to split up protocol (e.g. ethereum/eip155, cosmos etc.) from specific chains like eip155:1, eip155:137. Are there any places where we distinguish between the protocol and the specific chain? If not, I’d be all for simplifying and not introducing aliasing in two places.

Thanks @jannis, yes agreed that I think we can alias only for the specific chain, and the protocol-level aliasing is not necessary given the current implementation (and less aliasing is better!)

Hey @adamfuller, should we set up a GGP to approve this GIP, so we cover all bases needed for going multi-chain? If so, could you allocate a GIP number to this proposal?

The antelope (EOSIO) CAIP ids will be truncated to the first 32 characters of the chain ID to deal with this.
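A quick Python sketch of that truncation, using the EOS chain ID quoted earlier in the thread. The helper name and the default namespace string "antelope" are assumptions for illustration; check the namespace registry for the exact value.

```python
def antelope_chain_id(full_chain_id: str, namespace: str = "antelope") -> str:
    """Build a chain_id from the first 32 characters of a 64-character chain ID."""
    # Slicing keeps the reference within the {1,32} length limit of the grammar.
    return f"{namespace}:{full_chain_id[:32]}"


EOS_CHAIN_ID = "aca376f206b8fc25a6ed44dbdc66547c36c6c33e3a119ffbeaef943642f0e906"
print(antelope_chain_id(EOS_CHAIN_ID))
```

The resulting reference is 32 characters long, so it fits the existing reference pattern without needing the 64-character extension discussed above.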