Hi! I’m Henri Pihkala from Streamr – a decentralized real-time data infrastructure project.
The Streamr Network is a peer-to-peer network for publishing and subscribing to data in real time. The Network consists of nodes that interconnect peer-to-peer using the Streamr protocol to form a topic-based publish-subscribe messaging system. Topics in this messaging system are called streams. The job of the Network is to deliver the messages published to a stream to all subscribers of that stream.
Late last year we completed three testnets with over 90 thousand interconnected nodes streaming messages to each other. Earlier this year we launched the production-ready decentralized Network, which maintains over four thousand interconnected Broker nodes.
Streamr Network properties
- Censorship-resistant publish/subscribe messaging at scale
- Sub-second ordered message delivery
- Cryptographically signed and end-to-end encrypted messages
- Composable smart contract access control
- Runs anywhere JavaScript runs, including the browser
- Free to use, peer-to-peer architecture
- Pseudonymous messaging
Streamr is a power user of The Graph (TG). Among many other use cases, it depends on TG for complex real-time state queries of its on-chain stream registry, which determine which Ethereum identities have permission to publish or subscribe to streams of data in the Network. We believe that TG could also use Streamr internally to interconnect its own Indexer network and extend its capabilities.
By introducing decentralized real-time communications into TG’s stack, the TG protocol will be better able to maintain itself, and do so without compromising on TG’s open and decentralized nature. On Streamr, every data point is cryptographically signed by the data publisher using their Ethereum private key, so that provenance and tamper-proof guarantees are as strong as Ethereum itself.
Indexer performance and problems could be quickly identified, benchmarked, and solved if the Indexer network openly shared real-time stateful logs containing:
- Observed block height & hashes
- Proof of Indexing (PoI) hashes & PoI data
- Subgraph indexing logs, warnings and errors
- Local machine activity logs
- And more possibilities
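To make the list above more concrete, here is a minimal sketch of how a few of these data points could be typed as messages. All type and field names are hypothetical and chosen for illustration only; they do not come from TG or Streamr, and a real schema would need input from the Indexer developers.

```typescript
// Hypothetical message shapes for some of the data points listed above.
// None of these names are defined by The Graph or Streamr.
type IndexerStateMessage =
  | { kind: "block"; indexer: string; network: string; height: number; hash: string }
  | { kind: "poi"; indexer: string; subgraphDeployment: string; blockHeight: number; poi: string }
  | { kind: "log"; indexer: string; subgraphDeployment: string; level: "info" | "warning" | "error"; text: string };

// Short human-readable summary of a message, e.g. for a dashboard or an alert.
function describe(msg: IndexerStateMessage): string {
  switch (msg.kind) {
    case "block":
      return `${msg.indexer} saw block ${msg.height} (${msg.hash}) on ${msg.network}`;
    case "poi":
      return `${msg.indexer} computed PoI ${msg.poi} for ${msg.subgraphDeployment} at block ${msg.blockHeight}`;
    case "log":
      return `${msg.indexer} [${msg.level}] ${msg.subgraphDeployment}: ${msg.text}`;
  }
}
```

A tagged union like this lets subscribers handle each message kind explicitly, and new kinds (for example local machine activity) could be added later without changing the existing ones.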
In this hypothetical upgrade, Indexers interconnect and form a network topology, publishing stateful data points to each other as well as to interested external subscribers, such as the TG core devs and subgraph developers.
Real-time data sharing and interconnection should improve support outcomes as well as protect Indexers from slashing. For example, Indexers could compare indexing hashes with their peers in real time. If others saw different data, an Indexer could run a recovery process before submitting erroneous data as its PoI.
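The peer comparison described above could look roughly like the following sketch. It assumes each Indexer receives its peers' PoI hashes for the same subgraph and block, and simply checks its own hash against the most commonly reported one. The names `PoiReport`, `PoiCheck`, and `checkPoi` are hypothetical, and a real implementation would also need to decide how much to trust each peer.

```typescript
// Hypothetical report received from a peer over a shared stream.
interface PoiReport {
  indexer: string; // peer identity, e.g. an Ethereum address
  poi: string;     // PoI hash the peer computed for the same subgraph and block
}

interface PoiCheck {
  agrees: boolean;              // true if our PoI matches the most common peer PoI, or no peers reported
  mostCommonPoi: string | null; // null when there are no peer reports
  supporters: number;           // number of peers that reported mostCommonPoi
}

function checkPoi(ownPoi: string, peers: PoiReport[]): PoiCheck {
  // Count how many peers reported each distinct PoI hash.
  const counts: Record<string, number> = {};
  for (const report of peers) {
    counts[report.poi] = (counts[report.poi] || 0) + 1;
  }

  // Pick the hash with the highest count.
  let mostCommonPoi: string | null = null;
  let supporters = 0;
  for (const poi of Object.keys(counts)) {
    const n = counts[poi] || 0;
    if (n > supporters) {
      mostCommonPoi = poi;
      supporters = n;
    }
  }

  return { agrees: mostCommonPoi === null || mostCommonPoi === ownPoi, mostCommonPoi, supporters };
}
```

An Indexer could run a check like this before submitting its PoI and trigger a recovery process (or at least raise an alert) when `agrees` is false.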
Since TG’s Indexer software is written in TypeScript and runs on Node.js, it would be straightforward to include the streamr-client NPM package in the TG Indexer. Using this library, every TG Indexer becomes a node in the relevant Streamr peer-to-peer topologies, and can broadcast messages to other Indexers listening for events in the same stream/topic. The TG Indexer use case can be modeled as a stream per subgraph, a global stream joined by all Indexers, or a combination thereof.
The added messaging functionality consumes some CPU, memory, and bandwidth. The amount depends on the volume of messages in the streams, but in this use case it should remain negligible.
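As a rough illustration of the combined model (a global stream plus one stream per subgraph), here is a self-contained sketch. It deliberately does not call the streamr-client package: the `PubSub` interface, the in-memory stand-in, the placeholder owner address, and the stream ID layout are all assumptions made for this example. In a real integration, the client library's publish and subscribe operations would sit behind an interface like this one.

```typescript
type Handler = (message: unknown) => void;

// Minimal publish/subscribe surface this sketch relies on (hypothetical).
interface PubSub {
  publish(streamId: string, message: unknown): void;
  subscribe(streamId: string, handler: Handler): void;
}

// In-memory stand-in so the sketch runs without any network.
class InMemoryPubSub implements PubSub {
  private handlers: Record<string, Handler[]> = {};

  publish(streamId: string, message: unknown): void {
    for (const handler of this.handlers[streamId] || []) {
      handler(message);
    }
  }

  subscribe(streamId: string, handler: Handler): void {
    const list = this.handlers[streamId] || [];
    list.push(handler);
    this.handlers[streamId] = list;
  }
}

// Placeholder owner address and path layout; both are assumptions.
const OWNER = "0x0000000000000000000000000000000000000000";
const globalStream = (): string => `${OWNER}/indexers/all`;
const subgraphStream = (deploymentId: string): string => `${OWNER}/indexers/subgraph/${deploymentId}`;

// Subgraph-specific data goes to the per-subgraph stream.
function announcePoi(bus: PubSub, deploymentId: string, poi: string): void {
  bus.publish(subgraphStream(deploymentId), { poi });
}

// Chain-level observations go to the global stream joined by all Indexers.
function announceBlock(bus: PubSub, height: number, hash: string): void {
  bus.publish(globalStream(), { height, hash });
}
```

With this split, an Indexer only needs to subscribe to the per-subgraph streams for the subgraphs it actually indexes, while the global stream carries the data every Indexer cares about.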
In short, we are looking for feedback towards conducting a proof of concept where Indexers share their real-time state, with the goal of enabling automatic recovery functionality and other reactions based on the data. We’re interested in hearing about any technical limitations we might encounter and opportunities to explore, and in collaborating more closely with the TG core devs.
Any links to previous work or discussions in this direction, as well as thoughts, ideas, and wish lists, would be highly appreciated.
Thanks!