Objective of This GRC
GraphOps first presented the idea of a Gossip Network for Indexers in late 2021 and, following our Core Developer Grant in July, we have dedicated our resources to making it a reality!
This GRC presents our approach for a Gossip Network for Indexers (called Graphcast). In sharing this post, we intend to solicit feedback from the wider community which we will use to inform the design decisions for this project. Please read the GRC in full and share your thoughts and feedback in the comments below.
High-Level Description
Graphcast will act as a decentralized, distributed peer-to-peer (P2P) communication tool that allows Indexers across the network to exchange information in real time. Today, network participants coordinate with one another through the protocol by submitting on-chain transactions that update the shared global state in The Graph Network. These transactions cost gas, which makes some types of signaling or coordination between participants too expensive. In other words, sending signals to other network participants has a high cost that is determined by the cost of transacting on the Ethereum blockchain. Graphcast solves this problem.
Graphcast will act as an optional off-chain layer of infrastructure that Indexers can opt into independently of on-chain protocol operations. The cost of exchanging P2P messages is near zero, providing an ideal environment for participants to coordinate without concern for cost. It is worth noting that this comes with tradeoffs, most importantly that the integrity (i.e. the truthfulness) of a message is not guaranteed by default. Nevertheless, Graphcast aims to provide message validity guarantees (i.e. that the message is valid and signed by a known protocol participant).
We intend to release a Software Development Kit that allows developers to build Radios, which are gossip-powered applications that Indexers can run to serve a given purpose. In this GRC we also present a number of use cases for Graphcast, each of which may become a Radio. Our proof of concept is focused on the use case of real-time cross-checking of Indexer Proofs of Indexing (POIs).
Motivation
The main goal of Graphcast is to unlock new forms of healthy coordination among Indexers. Currently, if Indexers want to coordinate in the network, they need to send on-chain transactions to exchange messages. The gas costs of those transactions create a lower bound on the value of a message or signal that is worth posting on-chain, which inevitably prices out some coordination that would otherwise be beneficial to the participants. Graphcast provides an off-chain P2P messaging layer that addresses the cost aspect of this problem because messaging through the network is near-free. Of course, that comes with its own set of tradeoffs, and Graphcast should not be thought of as a replacement for existing on-chain coordination.
Developers building tooling for Indexers and other participants focus mostly on their application logic (and rightly so), which can lead to design decisions that aim at minimizing additional complexity. Since centralized command and control of an application is easy and efficient, that naturally leads to the centralization of the data in those applications. Graphcast provides a neutral messaging backend on top of which ecosystem tooling developers can build their applications (called Radios). In other words, the aim is to give developers a construct that makes it much easier to build their ecosystem applications in a decentralized manner from the start. An additional bonus is that developers also won’t have to manage backend servers!
This approach to building ecosystem tooling will also democratize access to the underlying data that powers these applications, which will deter metadata monopolies that may emerge from tools using centralized messaging solutions that gain large-scale adoption. It also enables different applications to be built on top of the same underlying stream of gossip messages.
There are, of course, some considerations to this approach, which include:
- No strong guarantees about the integrity (honesty) of messages passed in the network.
- Difficulties in testing the Radios, since the Radio developer would need to spin up multiple instances to ensure their logic is behaving as expected across a network of Radios. We will provide a test harness (possibly in the form of a Dockerfile) that will help Radio developers with this process.
- Lack of multi-language support (at least in the beginning).
- Inspectability and visibility limitations, which can lead to difficulties in monitoring and debugging.
Use Cases
Graphcast will unlock tangible benefits for Indexers to coordinate with each other, as it will practically enable a whole new design space for low-cost coordination. Some examples of Radios include:
- Conducting auctions and coordination for warp syncing subgraphs, substreams, and Firehose data from other Indexers.
- Self-reporting on active query analytics, including subgraph request volumes, fee volumes, etc.
- Self-reporting on indexing analytics, including subgraph indexing time, handler gas costs, indexing errors encountered, etc.
- Self-reporting on stack information including graph-node version, Postgres version, Ethereum client version, etc.
- Real-time cross-checking of subgraph data integrity, with active bail-out in the case of diverging from stake-weighted POI consensus.
- AutoAgora cross-Indexer signals/negotiations for improved automated query pricing.
The potential use cases for Graphcast in the long run are vast and open to the imagination and creativity of the community. Radios could also focus on the coordination of other network participants, such as Curators or Delegators. For example, a Curator might do subgraph discovery through a Radio built for that purpose, curating on a subgraph only once they know an Indexer can sync the deployment up to the chain head.
Technology Stack
After reviewing the landscape of P2P networking stacks, we plan on using Waku V2. Waku is built on libp2p (used by the Ethereum Beacon Chain) and prioritizes adaptive networking, node privacy, message unlinkability, and modularity as design goals.
We will start by implementing the Graphcast SDK, a TypeScript SDK that abstracts over Waku, provides identity resolution, and that will serve as a base client to Graphcast. It will expose a rich interface for building Radios.
Our perspective on the benefits of using Waku over alternative solutions is covered in the points below, but we welcome feedback and alternative approaches. All our code will be open-source.
This project will have many iterations, so design decisions being made for the first iteration should by no means limit exploring different ones in the future. Gossip messages are defined using Protobuf, providing a good basis for portability and SDK implementations in other languages if required.
Initially, we considered a few other options:
- Building our solution on top of libp2p
We could always leverage the underlying technology that Waku and most other gossip protocols are built on: libp2p. But that would add development overhead and would mean having to “reinvent the wheel” when it comes to features that Waku provides out of the box, like message signing and encryption, protection against network-level spam, and a level of abstraction over the raw protocol in general.
- Using a solution like GunDB or OrbitDB
We explored the idea of using a distributed multi-party database that peers would write into directly, but we decided that the Graphcast SDK should keep an ephemeral approach and that, if needed, data storage should be handled by the Radio.
- Other gossip clients built on top of libp2p
Of course, Waku has alternatives that aim to accomplish much the same thing. The reason we went with Waku instead of one of those is that Waku is the most “battle-tested” of them all and the development team (as well as the community) behind it is committed to further enhancing the product and making it the main solution of choice for messaging in web3.
Proof of Concept
We are currently iterating on a proof of concept release, intended to validate our approach before we release an MVP for others to try. The proof of concept is focused on a single Radio implementation: real-time POI cross-checking. The key requirement for an Indexer to earn indexing rewards is to submit a valid Proof of Indexing. The importance of valid POIs causes many Indexers to alert each other on subgraph health in community discussions.
The Radio works by aggregating normalized POIs (nPOIs) from all participating Indexers and weighting them by Indexer stake. If the local nPOI does not match the nPOI that is backed by the most stake, the Radio updates the cost model for the diverged subgraph deployment to an extremely high price to deter query traffic. This real-time view of aggregate stake-weighted POIs enables rapid detection of POI divergence. Learn more about how this works in its GitHub repo.
The proof of concept is split into two parts: the Gossip SDK implementation and the POI cross-checker Radio that is built on top of it.
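As a rough illustration of the comparison described above, here is a minimal TypeScript sketch. The type and function names (`PoiReport`, `topStakedNpoi`, `hasDiverged`) are hypothetical and are not taken from the proof of concept. It also assumes one report per Indexer and represents stake as a plain number for simplicity.

```typescript
// Hypothetical report shape; field names are illustrative, not the SDK's API.
interface PoiReport {
  indexer: string; // on-chain Indexer address
  npoi: string;    // normalized POI for one deployment at one block
  stake: number;   // Indexer stake (assumed: one report per Indexer)
}

// Sum the stake behind each distinct nPOI and return the nPOI with the most
// stake. Returns undefined when there are no reports. Ties keep the nPOI
// that was seen first.
function topStakedNpoi(reports: PoiReport[]): string | undefined {
  const stakeByNpoi: Record<string, number> = {};
  for (const report of reports) {
    stakeByNpoi[report.npoi] = (stakeByNpoi[report.npoi] || 0) + report.stake;
  }
  let best: string | undefined = undefined;
  let bestStake = -1;
  for (const npoi of Object.keys(stakeByNpoi)) {
    if (stakeByNpoi[npoi] > bestStake) {
      best = npoi;
      bestStake = stakeByNpoi[npoi];
    }
  }
  return best;
}

// True when remote reports exist and the local nPOI differs from the
// nPOI backed by the most stake.
function hasDiverged(localNpoi: string, reports: PoiReport[]): boolean {
  const consensus = topStakedNpoi(reports);
  return consensus !== undefined && consensus !== localNpoi;
}
```

In this sketch, a Radio would call `hasDiverged` each time new reports arrive and, on a `true` result, raise the price in the cost model for that deployment. How the cost model is updated is left out here.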
The SDK is the core that abstracts all the necessary components of each Radio away from the user (the user being the Radio developer). That includes:
- Connecting to Graphcast, e.g., other peers in the network.
- Interactions with the Ethereum network and The Graph stack.
- Resolving the sender identity with the help of a registry contract that matches the message sender to their on-chain identity.
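To make the identity resolution step more concrete, here is a minimal TypeScript sketch. It is only an assumption about how such a lookup could behave: the on-chain registry contract is mocked as an in-memory map, signature recovery is not shown, and all names (`GossipMessage`, `Registry`, `resolveIndexer`, `withResolvedSenders`) are hypothetical.

```typescript
// Hypothetical message shape; `signer` stands for the address recovered from
// the message signature (signature recovery itself is not shown here).
interface GossipMessage {
  signer: string;
  payload: string;
}

// In-memory stand-in for the on-chain registry contract:
// Graphcast identity (lowercase address) -> on-chain Indexer address.
type Registry = Record<string, string>;

// Look up the Indexer that registered this signer, if any.
function resolveIndexer(registry: Registry, signer: string): string | undefined {
  const key = signer.toLowerCase();
  return Object.prototype.hasOwnProperty.call(registry, key)
    ? registry[key]
    : undefined;
}

// Keep only messages whose signer resolves to an on-chain identity, and
// attach that identity so the Radio can use it.
function withResolvedSenders(
  registry: Registry,
  messages: GossipMessage[]
): { indexer: string; payload: string }[] {
  const resolved: { indexer: string; payload: string }[] = [];
  for (const msg of messages) {
    const indexer = resolveIndexer(registry, msg.signer);
    if (indexer !== undefined) {
      resolved.push({ indexer: indexer, payload: msg.payload });
    }
  }
  return resolved;
}
```

Messages from unregistered signers are simply dropped in this sketch. A real Radio might instead log them or count them for monitoring.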
We will continue developing the proof of concept over the next few weeks in order to test the most critical pieces and mechanisms. Thereafter we will move on to developing a Minimum Viable Product (MVP) intended to be run by the wider community.
You can find more information about the proof of concept in its GitHub repo. It’s important to note that this proof of concept is not meant to run in production, but just to demonstrate how all of the concepts of Graphcast will work.
The proof of concept will be demonstrated during the Monthly Core Dev Call tomorrow (Sep 1). That demo will be recorded and we will share the recording in the comments of this post.
Risks and Challenges
Everything in software is a trade-off, and Graphcast is no different. There are some risks and challenges that need to be considered:
- Data integrity by reputation
Since sending messages on Graphcast is free, in particular situations some participants might see incentives to send false or misleading messages. This is why, depending on the nature of the Radio, a reputation system can be vital. That system could take the form of local or shared reputation models, or of external or in-protocol economic security through slashing.
- Versioning
We need a strategy for versioning the base layer of Graphcast, whether it’s distributed as an SDK or a standalone application. Changes in the logic (both breaking and non-breaking) could cause issues in the coordination process. An automated release process would be the optimal solution, with a handy way to notify users of a new release when they run their Radio, prompting them to update.
- Dependency failure
As with all software, Graphcast will not be built “from scratch.” It will rely mainly on the Waku TypeScript SDK, but also on other libraries in the Node.js ecosystem, which naturally adds the risk of dependency failure. We can mitigate that risk by having regular CI (continuous integration) builds and dependency checks.
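To illustrate what a local reputation model could look like, here is a minimal TypeScript sketch. It is not a proposed design: the scoring rule (an exponential moving average), the starting score, the update rate, the trust threshold, and all names are assumptions made for the example.

```typescript
// Per-sender scores in [0, 1], kept locally by a single Radio instance.
type Scores = Record<string, number>;

// Assumed neutral starting point for senders we have not observed yet.
const INITIAL_SCORE = 0.5;

function scoreOf(scores: Scores, sender: string): number {
  return Object.prototype.hasOwnProperty.call(scores, sender)
    ? scores[sender]
    : INITIAL_SCORE;
}

// Move the sender's score a fraction `rate` of the way toward 1 when their
// message was consistent with what we later observed, or toward 0 when not.
function recordObservation(
  scores: Scores,
  sender: string,
  consistent: boolean,
  rate: number = 0.2
): void {
  const target = consistent ? 1 : 0;
  const current = scoreOf(scores, sender);
  scores[sender] = current + rate * (target - current);
}

// A Radio could ignore messages from senders below the threshold.
function isTrusted(scores: Scores, sender: string, threshold: number = 0.3): boolean {
  return scoreOf(scores, sender) >= threshold;
}
```

With these example parameters, a previously unknown sender stays trusted after two inconsistent messages and drops below the threshold after the third. A shared reputation model, or one backed by slashing, would need a different design.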
Next Steps
Here are our main objectives for the near future:
- Collect feedback from Indexers, as well as the wider community.
- Publish the on-chain registry contract that will be used to connect a Graphcast participant’s identity to their on-chain identity in The Graph protocol.
- Publish the Graphcast SDK so that it can be used as a library (on npm, for instance) for the development of Radios.
- Further develop the Graphcast SDK and the POI cross-checker Radio to get them to an MVP stage. You can follow the progress on the GitHub page of the proof of concept.
- Discuss the prospects and future of Graphcast with other Indexers during Indexer Office Hours (IOH).
Copyright Waiver
Copyright and related rights waived via CC0.