I have been thinking about a concept I call “Personal Subgraphs” for a few weeks and would like to propose the idea to the community. I hope this creates discussion and ideas around Subgraphs built for a User instead of a Protocol, as well as the privacy of Subgraphs, and allowing users to own their data in the web3 world.
The simplest way to describe a Personal Subgraph is that it is a View on an Account. Normally, Subgraphs have been built to give a View on a Protocol. As you can see in The Graph Explorer, almost every Subgraph is tied to a Protocol.
I believe there are many problems Personal Subgraphs can solve, but I would like to focus on one specifically - web3 wallets.
It is tough for a wallet application, such as Rainbow, to use Subgraphs to display a user’s holdings. The main reasons are as follows:
- Indexing an ERC-20 token can be extremely time-consuming - The Graph currently will index a contract address from a startBlock - and will query the Ethereum RPC for every event on that contract address. This makes indexing a coin like USDT or USDC incredibly slow.
- Wallets must display many different ERC-20s a user holds.
- Wallets must show balances for multiple tokens the user holds. The only way to know this is to index everything (as Etherscan does) - or deliberately choose what to index.
- OpenZeppelin made a great attempt by deliberately choosing the top 20 ERC-20 tokens for a subgraph, taking a combination of token lists and filtering for the most common. But even this is too slow, and it does not solve the wallet problem, since some users may hold tokens outside of the Top 20.
- Wallets must display many different NFTs a user holds - NFTs have become super popular, and users now want to see both ERC-20s and NFTs.
These reasons prevent web3 wallets from using Subgraphs. However, I believe there is a way to pull it off right now, without requiring the Graph Node software to index every token that has ever existed.
Personal Subgraphs provide a solution to the wallet problem. I will break down the solution into three parts - what needs to be built at the Indexing Layer, what needs to be built at the Smart Contract Layer, and what needs to be built at the Product Layer.
Let’s start with Indexing first. This would involve a change to The Graph Node. The changes needed are as follows:
- Ethereum events can have up to 4 Topics, which are explained nicely here by the Ethers Documentation.
- Graph Node would have to add in the ability to filter on topics other than topic0.
- Graph Node does not have this ability today. It has been discussed in this issue on GitHub. I have copied the example suggested by the issue creator below:
- This should speed up Subgraph syncing immensely for Subgraphs that want to filter on multiple topics.
- It becomes possible to filter for Transfer events that contain our user’s address in the `from` or `to` fields for ERC-20s. The same applies to ERC-721s.
- I believe it is also possible to filter for Transfer events with a user address across all smart contracts on the network. This might still be slow, and it would have to be tested. If it is too slow, we can filter on specific ERC-20s and NFTs chosen by the user. Example:
- Bob owns 10 ERC-20s and 5 different ERC-721 tokens.
- These 15 contracts are all in the Subgraph Manifest. Transfer events are indexed for each contract, and they filter on the user’s address for both the `from` and `to` fields.
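To make the topic filtering above concrete, here is a minimal Python sketch of the filter objects at the JSON-RPC level, where `eth_getLogs` takes a positional `topics` array with `null` as a wildcard and an optional `address` list. This is not Graph Node manifest syntax, and the function names (`address_to_topic`, `transfer_filters`, `log_matches`) are hypothetical names chosen for illustration. Two filters are needed because the user can appear as either the sender (topic 1) or the recipient (topic 2).

```python
from typing import Optional, Sequence

# keccak256("Transfer(address,address,uint256)"): topic0 shared by the
# standard ERC-20 and ERC-721 Transfer events.
TRANSFER_TOPIC0 = "0xddf252ad1be2c89b69c2b068fc378daa952ba7f163c4a11628f55a4df523b3ef"


def address_to_topic(address: str) -> str:
    """Left-pad a 20-byte address to the 32-byte form used for indexed topics."""
    lowered = address.lower()
    hex_part = lowered[2:] if lowered.startswith("0x") else lowered
    if len(hex_part) != 40:
        raise ValueError(f"not a 20-byte address: {address}")
    return "0x" + hex_part.rjust(64, "0")


def transfer_filters(user: str, contracts: Optional[Sequence[str]] = None) -> list:
    """Build two eth_getLogs filters: transfers sent by `user`, then received by `user`.

    A None topic is a wildcard. Leaving out "address" searches every contract;
    passing `contracts` restricts the search to the tokens the user chose.
    """
    user_topic = address_to_topic(user)
    filters = []
    for topics in ([TRANSFER_TOPIC0, user_topic], [TRANSFER_TOPIC0, None, user_topic]):
        log_filter = {"topics": topics}
        if contracts:
            log_filter["address"] = list(contracts)
        filters.append(log_filter)
    return filters


def log_matches(log_topics: Sequence[str], filter_topics: Sequence) -> bool:
    """Apply the positional topic-matching rule to one log, locally."""
    for position, wanted in enumerate(filter_topics):
        if wanted is None:
            continue  # wildcard position
        if position >= len(log_topics):
            return False
        allowed = wanted if isinstance(wanted, list) else [wanted]
        if log_topics[position].lower() not in [t.lower() for t in allowed]:
            return False
    return True
```

In Bob's example, `transfer_filters(bob, contracts=his_15_contracts)` would produce the two filters for his chosen tokens, while `transfer_filters(bob)` leaves out the contract list to cover the whole network. Whether the second form is fast enough is exactly what would have to be tested.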
Let’s now discuss how to implement it in the Smart Contracts. There are two ways to think about it.
- We could just deploy these Personal Subgraphs to the current network, and it should work.
- If we filter on topics 1, 2, and 3 for the Transfer event, and Graph Node can keep up, then ERC-20 balances should be obtainable for every user.
- Note - this assumes that we can query the entire blockchain with all 4 topics as filters - i.e. you check every smart contract, but filter extremely tightly on the 4 topics. I am fairly certain this is possible, but I have not confirmed it.
- Same for ERC-721s.
- This would require no change to the Smart Contract Layer.
- Let’s consider that the user wants a specific View into their account. A View on 10 ERC-20s and 5 NFTs that they currently own.
- We can anchor this information on-chain, to give triggers to the Subgraph to know what to index. A simple example follows:
- This shows how a user can indicate the specific data they want to be indexed. As you can see, the View is derived from both the sender’s account and the User’s account. This means that anyone can build a View into any account.
- A key thing I want to point out is that with this setup, a personal wallet subgraph can become auto-deployable code. The only parameters that change would be the sender and the User. This would create a unique and deterministic subgraph ID for anyone’s view into an Ethereum account.
- I believe it could be some sort of a web3 personal data primitive. It needs more thought and discussion though.
- The Subgraph also does not need to index from the startBlock of the ERC-20 contract. At the moment an ERC-20 is anchored, the Subgraph can do an `eth_call` to get the balance of the user for that token, and then index events forward from there.
- Of course, this would be expensive. I tested it and it cost $20 to add 4 tokens through events.
- It could also just be anchored on IPFS as a JSON file that the Subgraph parses. Thus, instead of emitting 20 events, one for each ERC-20 that is added, you emit a single IPFS hash to query, which contains the list of the 20 tokens. At roughly $5 per event (the $20 for 4 tokens above), this should cost about $5 to anchor as many tokens as you’d like.
- However, a better solution would be to build it into the core Graph Node software, rather than anchoring it to an expensive L1 or a semi-expensive L2. This would make it free. I currently do not have a suggestion on how to fit this into the manifest, so help here would be great.
- Deploying Subgraphs to the decentralized network is currently too expensive for an individual to pay for a Personal Subgraph. Each new Subgraph created costs about 0.1 ETH, or around $450 at today’s prices. For now, we can test with the Hosted Service, but long term it would have to become cheaper.
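To illustrate the earlier point about skipping the ERC-20's startBlock, here is a minimal Python sketch of "read the balance once with `eth_call`, then index forward". It only builds the JSON-RPC request body and applies Transfer events to a running balance. The names `balance_of_call`, `Transfer`, and `apply_transfer` are hypothetical, and none of this is the proof of concept's actual mapping code.

```python
from dataclasses import dataclass

# First 4 bytes of keccak256("balanceOf(address)"), the ERC-20 balance getter.
BALANCE_OF_SELECTOR = "0x70a08231"


def balance_of_call(token: str, user: str, block_number: int) -> dict:
    """Build the JSON-RPC eth_call request for token.balanceOf(user) at one block."""
    user_hex = user.lower()[2:].rjust(64, "0")  # ABI-encode the address argument
    return {
        "jsonrpc": "2.0",
        "id": 1,
        "method": "eth_call",
        "params": [
            {"to": token, "data": BALANCE_OF_SELECTOR + user_hex},
            hex(block_number),
        ],
    }


@dataclass
class Transfer:
    sender: str
    recipient: str
    value: int


def apply_transfer(balance: int, user: str, transfer: Transfer) -> int:
    """Update the user's balance for one Transfer event.

    The eth_call result reflects the state at the end of the anchor block, so
    only events from later blocks should be applied (assumption of this sketch).
    """
    user = user.lower()
    if transfer.sender.lower() == user:
        balance -= transfer.value
    if transfer.recipient.lower() == user:
        balance += transfer.value
    return balance
```

The design choice here is that the Subgraph never replays a token's history: one call gives the starting balance, and every later balance is that number plus or minus the Transfer events that mention the user.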
Let’s now discuss how we could create a product that would allow anyone to create their own personal subgraph. I can think of two ways this would happen:
- A user coming to a website to do it themselves, to monitor their own balances in a wallet
- A Web3 Wallet that ties into this system and does it behind the scenes for its users.
In either case, this is what it could look like:
- Offer a list of NFTs and ERC-20s in a dropdown. This could be an existing curated list such as Token Lists. The user picks the tokens they own so they can be added to the wallet.
- For rare and exotic tokens not on a Token List, allow a user to directly paste in a contract address of any token.
- With this information, the Subgraph can now be deployed. The Subgraph would be a template, such as the OpenZeppelin Subgraphs. It would require the following inputs:
- The address of the user.
- The address of the sender/creator of the Subgraph, to create the unique View. And as explained before, it would be unique and deterministic.
- All the tokens the user chose from the dropdown, to filter on the specific tokens (if needed).
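As a rough sketch of how a product could package these three inputs, the Python below serializes them and derives a stable identifier from the sender and user addresses alone. SHA-256 is only a stand-in to show the "unique and deterministic" property. It is an assumption of this sketch, not how The Graph derives subgraph IDs, and the names `view_id` and `template_inputs` are hypothetical.

```python
import hashlib
import json


def view_id(sender: str, user: str) -> str:
    """Derive a stable identifier for one sender's View of one user account.

    Addresses are lowercased first so that checksum casing does not change the ID.
    """
    material = (sender.lower() + ":" + user.lower()).encode("utf-8")
    return hashlib.sha256(material).hexdigest()


def template_inputs(sender: str, user: str, tokens: list) -> str:
    """Serialize the template inputs as canonical JSON (deduplicated, sorted tokens)."""
    payload = {
        "viewId": view_id(sender, user),
        "sender": sender.lower(),
        "user": user.lower(),
        "tokens": sorted({t.lower() for t in tokens}),
    }
    return json.dumps(payload, sort_keys=True)
```

Because the token list is kept out of the identifier, adding or removing a token changes the inputs but not the ID of the View, which matches the idea that only the sender and the user define it.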
I have implemented a proof of concept here. I hacked it together, since there is no ability to filter on topics 1, 2, and 3. It contains the following:
- A simple PersonalSubgraphAnchor contract that emits events naming the ERC-20s to track for an address, as a trigger for the Subgraph.
- A Subgraph that tracks the Anchor contract, and then uses Data Source Templates to track all ERC-20 tokens that are emitted for that View.
- The Ethereum Account we are Indexing is Binance 14 (as labeled by Etherscan). This was chosen as it would be a highly active account on Ethereum with many transactions.
- The Subgraph does not index any of the tokens from the start. It does a contract call to get the balances of Binance 14 and then starts to index every block for those tokens.
- The personal subgraph is deployed here. It can be seen that it is successfully syncing this account for 4 tokens, AAVE, DYDX, USDT, and USDC. It has been syncing smoothly for over two weeks.
- It does not include the ability to RemoveToken for watching, but ideally a real implementation would.
- It only works well with established standards. When talking about ERC-20s or ERC-721s, any contracts that don’t follow the standard (e.g. CryptoPunks) might not work.
- It only improves Subgraphs that need filtering on all event topics. It works very well for ERC-20s and ERC-721s. It may have other good uses as well. But many protocol Subgraphs will gain no benefit from this upgrade.
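The anchor-and-template flow of the proof of concept can be pictured with a small in-memory Python model. This is a sketch, not the deployed Subgraph: the class `ViewRegistry` and the handler names are hypothetical, and the removal handler models the RemoveToken ability that the proof of concept does not yet have.

```python
class ViewRegistry:
    """In-memory stand-in for a Subgraph that follows an anchor contract."""

    def __init__(self):
        # (sender, user) -> {token address: block at which tracking started}
        self.tracked = {}

    def on_token_added(self, sender, user, token, block):
        """Handle an anchor event that adds a token to a View.

        Returns True when a new per-token data source would need to be created,
        and False when the token is already tracked for this View.
        """
        tokens = self.tracked.setdefault((sender.lower(), user.lower()), {})
        if token.lower() in tokens:
            return False
        tokens[token.lower()] = block
        return True

    def on_token_removed(self, sender, user, token):
        """Handle a hypothetical removal event; True if the token was tracked."""
        tokens = self.tracked.get((sender.lower(), user.lower()), {})
        return tokens.pop(token.lower(), None) is not None

    def should_index(self, sender, user, token, block):
        """Index only events after the anchor block, assuming the starting
        balance was read by contract call at the anchor block itself."""
        start = self.tracked.get((sender.lower(), user.lower()), {}).get(token.lower())
        return start is not None and block > start
```

Keying the registry on the (sender, user) pair keeps each View separate, so two people watching the same account with different token lists do not interfere with each other.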
As I noted above, this has the potential to become a web3 data primitive, but it needs more thought. I will list some ideas I have on what could be built.
- There is an open PR to the contracts where you can see an implementation of Subgraphs as NFTs. The concept of Subgraphs as NFTs is very interesting, especially when you start to think about data ownership.
- I believe every data protocol focused on “owning your personal data” could just build Subgraph NFTs, and build their protocol on top of The Graph.
- The cost is a bit insane to launch an ERC-20 for a personal subgraph. We need to solve this of course. But it is a problem everyone faces.
- In any discussion of personal data, privacy is needed. We will need to think about how to protect users’ data if people are building personal subgraphs.
- The API access to query the subgraph could be limited to the NFT holder. As the web3 ownership economy matures, we likely will see the disappearance of API Keys, replaced by NFTs as access keys.
- For the Personal Wallet Subgraph, we could build a module that allows for the charting of the user’s financial assets. This would allow for a decentralized competitor to Blockfolio (FTX).
- Most users own multiple accounts across multiple chains. We can aggregate these into a single Subgraph.
- The trick here is privacy. If you combine your work wallet and your personal savings wallet, you’ve just doxxed your savings to your employer.
- In an ideal world, every person should be able to mint a basic Subgraph NFT for their own data very cheaply, or for free.
- They should be able to stake some small amount of GRT in perpetuity. The rewards from this GRT could get routed directly to an Indexer, to pay for ongoing Indexing and query fees of their Personal Subgraph.
I will list the important parts to pull from this long post:
- Web3 wallets can integrate with The Graph with an upgrade to Graph Node to filter on all topics.
- I believe that if we could get this implemented, we could reach 10,000+ Personal Subgraphs deployed, if a web3 wallet decided to integrate. (Gas issues still exist; the Hosted Service could be used in the short term.)
- There is a rough POC showing how it would work (without the filtering).
- I believe Subgraphs as NFTs could be a web3 data primitive.
- Personalized Subgraphs could be a huge part of this, although I believe much effort will be needed on the privacy side.
Long term I hope this builds more discussion and ideas around centering Subgraphs around Users instead of Protocols, as well as the privacy around these Subgraphs.
- Are there other use cases for filter-by-topic syncing? (The open feature request makes me believe there are).
- What would the upgrade for Graph Node look like and how hard would it be? What would the manifest change look like to filter only on an event topic?
- How could we get web3 wallets such as Rainbow to use this?
- How can we make it cheap enough so anyone can deploy a Personal Subgraph to the Decentralized Network? Could we partition all of them to an L2?
- How do we deal with privacy long term, when users want to combine several of their accounts in a single View without doxxing themselves and their connected accounts?
- Is there a better, more general name than Personal Subgraphs?