Semiotic Labs May 2024 Update

:woman_astronaut: Summary

Over the last month, Scalar TAP work focused on debugging the TAP indexer integration and building out its lower-priority features; feedback from InfraDAO’s testers surfaced several bugs. In parallel, we engaged the community around the Verifiable Firehose project, seeking feedback on the flat-head verification tool and the era-file-sink demo, which shows how The Graph could serve as a solution for EIP-4444. Development continued on the Verifiable Data Service, which provides verifiable indexing and verifiable queries, for example over swap events on Uniswap-v2. For the Subgraph:SQL Data Service, preparations for the private beta launch dominated: we deployed on testnet and iterated on tester feedback. Finally, the AI team concentrated on narrowing the performance gap between models trained on synthetically generated and manually generated data, scaling datasets, and refining methodologies to improve model accuracy and data quality.

:tada: Looking back (what was delivered)

Scalar TAP

Last month, we were mostly busy debugging the TAP indexer integration and developing more of its lower-priority features. Thanks to our interactions with InfraDAO’s testers, we refined our setup guide and found numerous bugs and annoyances that only an outsider would see. However, we haven’t yet completed a full, successful allocation lifecycle on the InfraDAO side. We also prepared a set of test plans for fault injection tests.

PRs (indexer-rs):

  • Epoch-free network subgraph queries #150
  • Update dependencies to get better GraphQL errors #151 #154
  • Add an eager receipt timestamp check (to avoid timestamps too far from the present) #155
  • Implement limit of receipts per RAV request #159
  • Add defaults and examples for indexer-service config #160
  • Add metrics to tap-agent #161
  • Add an eager receipt max value check #164
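As a rough illustration of the eager receipt checks and the receipt limit listed above, here is a minimal Python sketch. It is not the indexer-rs implementation (which is written in Rust). The `Receipt` fields, the function names, and every threshold are hypothetical, and reading the per-request limit as "split into batches" is an assumption.

```python
from dataclasses import dataclass

# All names and thresholds here are hypothetical, chosen for illustration only.
MAX_TIMESTAMP_SKEW_NS = 30 * 1_000_000_000  # assumed tolerance: 30 seconds
MAX_RECEIPT_VALUE = 10**18                  # assumed per-receipt value cap
MAX_RECEIPTS_PER_RAV_REQUEST = 1_000        # assumed limit per RAV request


@dataclass(frozen=True)
class Receipt:
    timestamp_ns: int  # when the sender claims the receipt was created
    value: int         # amount the receipt is worth


def eager_check(receipt: Receipt, now_ns: int) -> bool:
    """Reject a receipt on arrival if it is obviously unacceptable."""
    # Timestamp check: too far from the present, in either direction.
    if abs(receipt.timestamp_ns - now_ns) > MAX_TIMESTAMP_SKEW_NS:
        return False
    # Max value check: a single receipt may not exceed the cap.
    if receipt.value > MAX_RECEIPT_VALUE:
        return False
    return True


def rav_request_batches(receipts, limit=MAX_RECEIPTS_PER_RAV_REQUEST):
    """Split receipts into consecutive batches of at most `limit` items."""
    return [receipts[i:i + limit] for i in range(0, len(receipts), limit)]
```

The point of checking eagerly is that a bad receipt is rejected when it arrives, rather than being stored and only discovered later during aggregation.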

Verifiable Firehose

Last month, we continued reaching out to core devs, Indexers, and other interested parties for review and feedback on flat-head, our flat file verification tool, and era-file-sink, our demo for using The Graph as a solution for EIP-4444. Based on input from Indexers, we added support to flat-head for verifying flat files stored using SeaweedFS. We also shared our era-file-sink demo and a short video demonstration with the Nethermind Eth client team and await feedback.

Verifiable Data Service

We have continued to develop the verifiable data service we shared last month. This service provides verifiable indexing and verifiable queries like “Give me all the swap events emitted from the uniswap-v2 contract from block x to block y.” Also, recall that the queries do not transform the data: the idea is that other downstream processes, e.g., a coprocessor, can consume this data and perform verifiable compute.
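To make the "queries do not transform the data" point concrete, here is a minimal Python sketch of such a query as a pure selection over indexed events. The `Event` type, its fields, the `query_events` function, and the inclusive block range are all assumptions for illustration. The service's real interface is not described in this post, and the proof that would accompany the result is omitted.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    block: int     # block number in which the event was emitted
    contract: str  # emitting contract (hypothetical label, not a real address)
    name: str      # event name, e.g. "Swap"
    data: bytes    # raw event payload, passed through untouched


def query_events(index, contract, name, from_block, to_block):
    """Select matching events without modifying them (inclusive block range)."""
    return [
        e for e in index
        if e.contract == contract
        and e.name == name
        and from_block <= e.block <= to_block
    ]
```

Because the result is a subset of the indexed records, with nothing aggregated or rewritten, a downstream process such as a coprocessor receives exactly the data that was indexed.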

Subgraph:SQL Data Service

The past month has been about preparing for the private beta launch. As a result, we’ve deployed our SQL gateway and Indexer on testnet. Along with a few other testers, we have been running queries against it and fixing anything that comes up, whether bugs or UX improvements. We have an open PR on graph-node that’ll enable other Indexers to start participating in the private beta.

PRs (graph-node):

  • graph, graphql, server, store: Subgraph Sql Service #5382

AI

This past month, we continued our research around reducing the gap in performance between models trained on synthetically and manually generated data, implementing additional filtering to improve training data quality, and scaling our synthetic dataset further to explore the impact of query difficulty on data quality. This included additional investigations into the impact of training data size on model performance. Seeing promising results from these investigations, we expanded our benchmarks to further assess our methodology’s performance.

:rocket: Looking ahead (upcoming priorities)

Scalar TAP

Verifiable Firehose

Our plan for next month depends on interest and feedback from other teams. For example, if teams are interested in post-merge verification of Ethereum historical data, we can add support for it. We can also consider adding support for other blockchains.

Verifiable Data Service

Our current spec is here. The key feature of the design is that indexed data is committed to using a vector commitment scheme, perhaps with additional structure (e.g., a Verkle Tree). We are currently selecting an appropriate commitment scheme based on the observation that the latency of verifiable query generation is paramount; this verifiable data service is the source for all downstream processes, and any latency introduced here will impact all downstream processes. We will continue to update the document linked above as the design matures.
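For readers unfamiliar with vector commitments, here is a minimal Python sketch of the idea, using a plain binary Merkle tree over SHA-256. This is a stand-in for illustration only: it is not the scheme from the spec and not a Verkle tree, and the function names are hypothetical. It shows the three operations such a scheme offers: commit to a list of values, open one position with a proof, and verify that opening against the commitment.

```python
import hashlib


def _h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def _levels(values):
    """Build every tree level, leaves first. Odd levels repeat their last node."""
    if not values:
        raise ValueError("cannot commit to an empty vector")
    level = [_h(b"\x00" + v) for v in values]  # 0x00 prefix marks leaf hashes
    levels = [level]
    while len(level) > 1:
        if len(level) % 2:
            level = level + [level[-1]]  # toy padding, not production-safe
            levels[-1] = level
        level = [_h(b"\x01" + level[i] + level[i + 1])  # 0x01 marks inner nodes
                 for i in range(0, len(level), 2)]
        levels.append(level)
    return levels


def commit(values):
    """Return the commitment (Merkle root) for the whole vector."""
    return _levels(values)[-1][0]


def open_at(values, index):
    """Return the sibling hashes proving the value stored at `index`."""
    proof = []
    for level in _levels(values)[:-1]:
        proof.append(level[index ^ 1])  # sibling of the current node
        index //= 2
    return proof


def verify(root, index, value, proof):
    """Recompute the root from one value and its proof, then compare."""
    node = _h(b"\x00" + value)
    for sibling in proof:
        if index % 2 == 0:
            node = _h(b"\x01" + node + sibling)
        else:
            node = _h(b"\x01" + sibling + node)
        index //= 2
    return node == root
```

In this sketch, `open_at` recomputes the whole tree on every call. That is exactly the kind of proof-generation cost the latency concern above is about, and a real service would cache or incrementally maintain the structure.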

Subgraph:SQL Data Service

We’ll launch the private beta on testnet with a limited set of users. Our goal is to release something that lets us gather feedback for the final product, not something that is necessarily feature-complete (or even nice to use). We’ll ask for extensive feedback from Indexers and users during this phase, and we’ll use it to build the jobs-based data service framework on which the full SQL service will eventually be built. More on this in a future post! You can also look forward to a blog post we’re writing to launch with the private beta. We will open-source the code for our VS Code extension, which serves as a “playground” for the SQL service. Based on feedback from one of our internal testers, we’re also working on splitting out some of the extension’s functionality into a Node.js SDK. This will enable other devs and teams to build their own interfaces to Subgraph:SQL.

AI

This coming month, we plan to continue our investigations into using various filtering methods to improve data quality and advance our research into further automating the synthesis and selection of input data used in the generation process.

Events

Sam presented at Coinbase’s Machine Learning & Blockchain Research Summit: recording
