Empowering Subgraphs with Verifiable Confirmations

Introduction:
Greetings, Graph enthusiasts! :rocket:

We’re happy to introduce a proposal that’s set to improve TheGraph. Imagine TheGraph as your go-to tool for seamless communication between different blockchains and as a source of rock-solid, verifiable data. Sound intriguing? Let’s dive in!

The Idea:

Our core concept is to provide subgraphs with a powerful feature: an accompanying lineage of confirmations for every event they index. This ensures that every piece of data queried through TheGraph comes with a verifiable proof of validity. The lineage of confirmations is produced by a decentralised network of nodes, effectively equipping TheGraph Node with an associated validator/oracle network.

Understanding the Necessity:

We recognise the importance of block confirmations in inter-blockchain communication. While a fully decentralised TheGraph undoubtedly mitigates many concerns, it’s crucial to remember that TheGraph is designed with a trust-without-verification mechanism: it operates optimistically. For TheGraph to drive inter-chain data protocols, there will be a persistent demand for a validation lineage.

Benefits:

  • Guaranteed Authenticity: When you fetch data from TheGraph Node, you can trust it’s the real deal. Each query can be reused to obtain an associated history of confirmations.
  • Historic Blockchain Data for Smart Contracts: Smart Contracts can be created to verify the confirmations associated with a query, meaning that TheGraph can be used to enable data-intensive on-chain applications.
  • Inter-Chain Communication: TheGraph Node becomes a bridge between different blockchains. You can query TheGraph for data, and the same query can produce a series of confirmations derived from the indexing process.
  • Boosted Trust: Businesses and developers can confidently rely on centralised TheGraph Node operators as a trusted data source for their applications, knowing that every piece of information has passed validation.

Validator Network:

Our decentralised nodes are the guardians of this validation process. They work to ensure the data is valid and that query responses can be verified. Validation performed by these nodes can involve re-requesting events from RPC nodes or producing Merkle-Patricia-Trie proofs.
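As a rough illustration of the first option (re-requesting events from an RPC node), a validator’s check might look like the sketch below. The event field names and the `fetch_log` callable are hypothetical stand-ins, not part of any existing API; a real node would back `fetch_log` with a JSON-RPC call such as `eth_getLogs`.

```python
from typing import Callable, Optional

# Hypothetical event shape: a plain dict. The field names are assumptions
# made for this sketch, not the Graph Node's actual schema.
Event = dict

# Fields that must match between the indexed event and the RPC node's log.
COMPARED_FIELDS = ("address", "topics", "data", "block_hash")


def validate_event(
    indexed: Event,
    fetch_log: Callable[[str, int], Optional[Event]],
) -> bool:
    """Return True if the RPC node reports the same log the indexer recorded."""
    onchain = fetch_log(indexed["tx_hash"], indexed["log_index"])
    if onchain is None:
        # The RPC node does not know this log at all.
        return False
    return all(onchain.get(f) == indexed.get(f) for f in COMPARED_FIELDS)


# Usage with an in-memory stand-in for the RPC node.
indexed = {
    "tx_hash": "0xabc", "log_index": 0, "address": "0x01",
    "topics": ["0xddf2"], "data": "0x2a", "block_hash": "0xb1",
}
rpc_logs = {("0xabc", 0): dict(indexed)}


def fake_fetch(tx_hash: str, log_index: int) -> Optional[Event]:
    return rpc_logs.get((tx_hash, log_index))


is_valid = validate_event(indexed, fake_fetch)
```

Passing the fetcher in as a parameter keeps the check independent of any particular RPC client, so several validators could each point it at a different provider.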

Integration:

To make this magic happen, we’ll pair TheGraph Node with a Sidecar node. The Sidecar will keep a close eye on the indexing behaviour of TheGraph Node, publishing each indexed event to the Log Store Network, and then subscribe to a data stream to receive indexed event confirmations in real-time. The validation logic will be embedded into the Log Store Network nodes, and the nodes can be made available for anyone to participate. The synergy between these components ensures a seamless flow of verifiable data.

The Role of Log Store Network:

It serves as a decentralised time-series database constructed atop the Streamr Network. The Streamr Network facilitates real-time peer-to-peer data transport. Custom validation logic is being integrated into the network to empower centralised processes with the validator and oracle mechanisms outlined in this proposal.

Key Use Cases:

Imagine the possibilities:

  • Centralised processes can query TheGraph Node and then the Sidecar Node to obtain data that can be verified on-chain.
  • Businesses can confidently build on-chain applications on top of a centralised TheGraph Node, knowing that the data they rely on is backed by a solid history of confirmations.

Conclusion:

This proposition promises to take TheGraph to new heights. It’s not just an evolution; it’s a revolution! We believe this innovation will bring tremendous value to TheGraph and the entire Web3 ecosystem. Now, we’re eager to hear your thoughts, engage with the community, and bring this exciting concept to life!


Side note:

I’m aware of the discrepancy between mentions of “TheGraph”, and how it should be phrased “The Graph” :sweat_smile:


Hi @ryanwould, great to see proposals like this one to improve the protocol.

One important clarification:

This is not the case: The Graph has a few mechanisms for verifiability (proofs of indexing, attestations for queries from indexers) and an arbitration mechanism to resolve disputes. There is active research from core devs (especially Semiotic) to keep improving verifiability and to add more automated dispute resolution.

Other than that, could you please explain a bit more how adding the proposed Log Store Network would provide verifiability? I can’t see that in the proposal - it should already be possible to build a verifiability solution by proving state from Ethereum without the need for a side chain or oracle network, so I’m curious how you see these sidecar nodes improving things. Do you propose some sort of consensus mechanism?

And who would pay for running these nodes? Do you have an incentive mechanism in mind?

I’m not familiar with the Streamr Network so I’m not sure if that’s what addresses these questions, so I’m curious to hear more.

Btw, if you want to turn this into a concrete proposal, please refer to the GIP process (originally here, though we’re simplifying the process a bit so you can use this newer version which will hopefully become the official process soon).


Can you open this a little bit?


Hello @Pablo

I appreciate you engaging the topic.
I believe feedback like this is necessary before converting this into a concrete proposal.

In the context of this discussion, it’s important to note that the Sidecar Node and the proposed use of a separate validator network to confirm the validity of indexed events are initially considered in the context of centralised Graph Nodes.

Objective

The primary goal of this proposal is to enable a single node to leverage The Graph’s indexing capabilities for cross-chain messaging while providing cryptographic proof of the data’s source on the indexed blockchain.

Current State of The Graph

At present, when a query is made, there’s no straightforward way for the querying party to obtain cryptographically verifiable proof of the data’s validity in the query response. The querying party places trust in The Graph Nodes involved in the request to produce Proofs of Indexing (PoIs) relevant to the resolved fields. This is what is meant by “trust-without-verification”.

While incorrectly resolved PoIs can serve as grounds for dispute resolution, it’s essential to recognise that this process is optimistic: it assumes that at least one party in the network is acting in good faith. Resolving a dispute also requires a challenge period, which in this case spans 7 epochs (roughly 7 days).

This evaluation is based on the assumption that the subgraph is indexed by a decentralised network of Graph Nodes, rather than a single centralised Graph Node. This approach works well in most cases, providing low-latency outcomes.

However, when data is intended for inter-chain communication, it becomes vital for the destination chain to verify the data’s validity through cryptographic proofs, especially if the Graph Node operates in a centralised manner or if the data queried occurred within the last 7 epochs.

From the perspective of a blockchain receiving a query response from a Node querying The Graph, it’s challenging to prove that the intermediary node conducting the query hasn’t tampered with the data. Even if PoIs are used, they do not confirm that the post-processed query response accurately represents the indexed data.

Verifiability

For a centralised node to provide cryptographic proofs about the query response from The Graph, it needs to obtain cryptographically signed, tamper-proof confirmations about the response. This can be achieved through a network of validators or through the decentralised Graph Nodes themselves.

Implementation

For Decentralised Graph Nodes

This approach assumes that a significant number of Graph Nodes serving as indexers operate the Sidecar Node.

  1. The Sidecar Node observes queries to the Graph Node.
  2. Upon receiving a Query Response, it signs the Query Request and the hashed response and publishes this message to a Streamr Data Stream, which is essentially a Gossip Topic. The Log Store Network facilitates storage of the data stream.
  3. Other Graph Node Sidecars subscribe to this data stream, run the same query internally, and sign and publish their own result to the Data Stream.
  4. The centralised Node that initiated the query subscribes to this data stream, receiving a continuous stream of cryptographically verifiable confirmations about the response’s validity, as shared by the network of Graph Nodes.
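To make steps 2 and 3 a bit more concrete, here is a minimal sketch of how a Sidecar might build and sign a confirmation. All function names and the message layout are assumptions for illustration. Note that HMAC-SHA256 is used only as a stand-in so the example runs with the standard library; a real deployment would use an asymmetric signature (for example, the node’s Ethereum key) so that anyone can verify without sharing a secret.

```python
import hashlib
import hmac
import json


def canonical(obj) -> bytes:
    """Serialise deterministically so every node hashes identical bytes."""
    return json.dumps(obj, sort_keys=True, separators=(",", ":")).encode()


def confirmation_digest(query: str, variables: dict, response: dict) -> bytes:
    """Bind the query request to the hash of its response."""
    request_hash = hashlib.sha256(
        canonical({"query": query, "variables": variables})
    ).digest()
    response_hash = hashlib.sha256(canonical(response)).digest()
    return hashlib.sha256(request_hash + response_hash).digest()


def sign_confirmation(node_id: str, key: bytes, digest: bytes) -> dict:
    """Produce the message a Sidecar would publish to the data stream.

    HMAC is a placeholder for a real asymmetric signature (assumption).
    """
    signature = hmac.new(key, digest, hashlib.sha256).hexdigest()
    return {"node": node_id, "digest": digest.hex(), "signature": signature}


def verify_confirmation(confirmation: dict, key: bytes) -> bool:
    """Check that the confirmation was produced with the given key."""
    expected = hmac.new(
        key, bytes.fromhex(confirmation["digest"]), hashlib.sha256
    ).hexdigest()
    return hmac.compare_digest(expected, confirmation["signature"])


# Usage: two Sidecars that resolve the same query to the same response
# produce the same digest, so their confirmations can be compared directly.
query = "{ tokens { id } }"
response = {"data": {"tokens": []}}
digest = confirmation_digest(query, {}, response)
message = sign_confirmation("sidecar-1", b"demo-key", digest)
```

Because the digest depends only on the request and the response, the querying node in step 4 can group incoming confirmations by digest and see at a glance how many Graph Nodes agree with the response it received.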

This process essentially makes the Graph Nodes validators of their own query responses. The party querying The Graph that additionally requires confirmations compensates the network for them.

For Centralised Graph Node

In this approach, the Sidecar Node publishes each indexed event to the Log Store Network, where the network validates these events and provides real-time signed confirmations of the validation process. This allows the GraphQL query to access both the raw indexed events and their corresponding signed confirmations, which in turn enables on-chain cryptographic verification of the data indexed by the Graph Node.
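One way to picture what the consumer does with those confirmations is a simple tally per event, sketched below. The message fields, the validator set, and the `min_confirmations` threshold are all assumptions for illustration; the proposal does not fix a quorum rule, and signature verification is assumed to have happened before this step.

```python
from collections import defaultdict
from typing import Iterable, Set


def confirmed_events(
    confirmations: Iterable[dict],
    validators: Set[str],
    min_confirmations: int,
) -> Set[str]:
    """Return ids of events confirmed by enough distinct known validators.

    Each confirmation is a dict with hypothetical keys:
    "event_id", "validator", and "valid".
    """
    votes = defaultdict(set)
    for c in confirmations:
        # Ignore unknown signers and negative validation results.
        if c["validator"] in validators and c["valid"]:
            # A set de-duplicates repeated messages from the same validator.
            votes[c["event_id"]].add(c["validator"])
    return {
        event_id
        for event_id, voters in votes.items()
        if len(voters) >= min_confirmations
    }


# Usage: three known validators, two confirmations required.
stream = [
    {"event_id": "e1", "validator": "a", "valid": True},
    {"event_id": "e1", "validator": "b", "valid": True},
    {"event_id": "e1", "validator": "a", "valid": True},  # duplicate
    {"event_id": "e2", "validator": "a", "valid": True},
    {"event_id": "e2", "validator": "x", "valid": True},  # unknown signer
]
accepted = confirmed_events(stream, {"a", "b", "c"}, min_confirmations=2)
```

An on-chain verifier would apply the same idea, counting distinct recovered signer addresses against a registered validator set before accepting the event.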

Log Store and Streamr Networks

The Log Store Network serves as a decentralised, tamper-proof time-series database. It’s being designed to incorporate additional functionality such as validation and aggregation mechanisms. The team is also researching programmability in traditional languages, enabling verifiable compute over real-time data. The goal is to empower centralised processes to validate their data or processes effectively, and to simplify the creation of bespoke data protocols.

The Streamr Network acts as the data transport layer for the Log Store Network, allowing access permissioning of node addresses on-chain for publishing and subscribing to data streams.

This combination of technologies forms the basis for enhancing The Graph’s verifiability.

References

The contracts in this repository outline the incentive mechanism of the Log Store Network.


Hello @eray,

Please refer to the implementation brief here:

The aim of the Sidecar for the centralised Graph Node is to accompany the indexed data with a series of confirmations about the validity of any events indexed.
