Hello Graph Community!
Our last post described a formal definition of the data (Network Graph), its interface (Network Graph API), and a way to retrieve data through a GraphQL interface during subgraph generation.
It’s time to unveil another proposal: the Manager-Worker data ingestion model.
Our long experience indexing various networks at Figment helped us propose an optimal, resilient approach to network data ingestion. The model described in the document attached below is a modification of the architecture used at Figment, which supports indexing of many networks and powers products such as Hubble and the Transaction Search API.
We decided that creating an external index of network data is a fast and highly reliable approach, and one that does not require an indexer to maintain old network nodes forever. Thanks to the regime defined in GIP-13, which applies throughout the development and maintenance of the indexer, we can be sure that no API or data will be broken.
We propose an indexing infrastructure based on two horizontally scalable services:
The Worker - a well-tailored implementation of the network API, responsible for fast fetching and mapping of network data.
The Manager - an implementation of the Network Graph API and a flow control system that tracks network ingestion progress.
Together they form the Network Indexer, to which the subgraph runtime can subscribe and from which it can retrieve data.
Because both components scale horizontally, workers can be added to retrieve data faster or removed to reduce ingestion costs, depending on current needs. Managers can likewise scale up and down based on the number of incoming requests.
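To make the division of responsibilities more concrete, here is a minimal sketch of the idea in Python. It is not the implementation from the attached document: the `Worker`, `Manager`, and `Block` names, the `fetch` and `ingest` methods, and the round-robin scheduling are all assumptions made for illustration, and the network call is replaced by a stand-in.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass


@dataclass(frozen=True)
class Block:
    height: int
    payload: str


class Worker:
    """Hypothetical worker: fetches one height from a network node and maps it."""

    def fetch(self, height: int) -> Block:
        # Stand-in for a real network API call (assumption, no node is contacted).
        return Block(height=height, payload=f"block-{height}")


class Manager:
    """Hypothetical manager: tracks ingestion progress and hands heights to workers."""

    def __init__(self, workers: list[Worker]):
        self.workers = workers
        self.blocks: dict[int, Block] = {}
        self.synced_height = 0  # highest height ingested so far

    def ingest(self, target_height: int) -> None:
        # Only heights above the current progress marker are requested.
        heights = range(self.synced_height + 1, target_height + 1)
        with ThreadPoolExecutor(max_workers=len(self.workers)) as pool:
            # Spread heights across the worker pool in round-robin order.
            futures = [
                pool.submit(self.workers[i % len(self.workers)].fetch, h)
                for i, h in enumerate(heights)
            ]
            for future in futures:
                block = future.result()
                self.blocks[block.height] = block
        self.synced_height = max(self.synced_height, target_height)


manager = Manager(workers=[Worker() for _ in range(4)])
manager.ingest(target_height=10)
manager.ingest(target_height=12)  # only heights 11 and 12 are fetched here
```

In this sketch, changing the number of `Worker` instances passed to the `Manager` is the only knob for ingestion throughput, which mirrors the scaling argument above in a very simplified form.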
The document does not define how network data is stored. We believe storage has to be adjusted to the needs of each network implementation. This way we are not limited to a database approach: we can store data in a file-based store (for durability and lower maintenance costs) or in an in-memory database for fast access.
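One way to keep storage pluggable is to have the indexer depend only on a small interface. The sketch below assumes a hypothetical `BlockStore` interface with `put` and `get` methods, plus two illustrative backends (`MemoryStore` and `FileStore`); none of these names come from the proposal, and a real network would likely need a richer interface.

```python
import json
import tempfile
from pathlib import Path
from typing import Optional, Protocol


class BlockStore(Protocol):
    """Hypothetical storage interface the indexer would depend on."""

    def put(self, height: int, data: dict) -> None: ...
    def get(self, height: int) -> Optional[dict]: ...


class MemoryStore:
    """In-memory backend: fast access, no durability."""

    def __init__(self) -> None:
        self._data: dict[int, dict] = {}

    def put(self, height: int, data: dict) -> None:
        self._data[height] = data

    def get(self, height: int) -> Optional[dict]:
        return self._data.get(height)


class FileStore:
    """File-based backend: one JSON file per height."""

    def __init__(self, root: Path) -> None:
        self.root = root
        self.root.mkdir(parents=True, exist_ok=True)

    def _path(self, height: int) -> Path:
        return self.root / f"{height}.json"

    def put(self, height: int, data: dict) -> None:
        self._path(height).write_text(json.dumps(data))

    def get(self, height: int) -> Optional[dict]:
        path = self._path(height)
        return json.loads(path.read_text()) if path.exists() else None


def save_and_load(store: BlockStore, height: int, data: dict) -> Optional[dict]:
    # The caller does not know or care which backend it was given.
    store.put(height, data)
    return store.get(height)


memory_result = save_and_load(MemoryStore(), 1, {"hash": "abc"})
with tempfile.TemporaryDirectory() as tmp:
    file_result = save_and_load(FileStore(Path(tmp)), 1, {"hash": "abc"})
```

Both backends return the same data for the same call, so the choice between them can be made per network without touching the code that uses the store.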
Once an entire network has been synced, there is no need to re-fetch the data from the network node over and over again. After the initial indexing process, the Network Graph API’s indexes ensure that the data is always indexed and returned in a timely manner for every new subgraph deployment.
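The "sync once, serve many times" behaviour can be illustrated with a small read-through index. This is only a sketch under stated assumptions: `NetworkIndex`, `get_range`, and `fake_node` are hypothetical names, and the node is simulated by a function that records which heights were requested from it.

```python
from typing import Callable


class NetworkIndex:
    """Hypothetical index: asks the node only for heights it has not seen yet."""

    def __init__(self, fetch_from_node: Callable[[int], dict]):
        self._fetch_from_node = fetch_from_node
        self._index: dict[int, dict] = {}

    def get_range(self, start: int, end: int) -> list[dict]:
        blocks = []
        for height in range(start, end + 1):
            if height not in self._index:
                # First request for this height: go to the network node.
                self._index[height] = self._fetch_from_node(height)
            blocks.append(self._index[height])
        return blocks


node_calls: list[int] = []


def fake_node(height: int) -> dict:
    # Simulated node (assumption): records every height it is asked for.
    node_calls.append(height)
    return {"height": height}


index = NetworkIndex(fake_node)
first = index.get_range(1, 5)   # initial sync reaches the node
second = index.get_range(1, 5)  # a later subgraph deployment is served from the index
```

After both calls, `node_calls` contains each height exactly once, which is the property described above: later subgraph deployments read from the index rather than from the node.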