I’m currently developing dchan, an imageboard whose backend logic is entirely implemented in a subgraph, including content moderation.
However, time-travel queries pose a security risk in this scenario: indexers could end up serving illegal content if a user queries a block from before the moderation took place.
Hence, I would like to be able to disable this feature to protect both users and indexers.
An option in the subgraph.yaml's datasource definition would suffice.
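For readers unfamiliar with the feature, here is a minimal sketch of what a time-travel query looks like compared to a regular one. The `block: { number: ... }` argument is the standard way to pin a subgraph query to a block; the entity and field names (`posts`, `id`, `comment`) and the block number are hypothetical and not taken from dchan's actual schema.

```python
def build_posts_query(block_number=None):
    """Build a GraphQL query string for a hypothetical `posts` entity.

    With no block number, the query is answered at the latest indexed block.
    With a block number, it becomes a time-travel query against that block.
    """
    if block_number is None:
        block_arg = ""
    else:
        block_arg = f"(block: {{ number: {block_number} }})"
    return f"{{ posts{block_arg} {{ id comment }} }}"


# Regular query: sees the moderated state.
latest = build_posts_query()

# Time-travel query: sees the state as of an earlier (hypothetical) block,
# which may predate the moderation.
before_moderation = build_posts_query(block_number=13_000_000)
```

The second query is the problematic case: it asks the indexer for the subgraph state as it was before the content was removed.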
Hi @opdchan - this is an interesting idea! I think theoretically this configuration could be made at the Graph Node level (the query node), but I am interested in @Brandon’s thoughts about whether this would work on the decentralised network, given query attestation requirements?
There doesn’t seem to be a straightforward solution here.
First, some context - in The Network all queries are time-travel queries. To satisfy the requirement for attestations, each query must have a single deterministic response. If you use the Edge & Node Gateway, the Gateway will replace block numbers with block hashes, or, if no block number is specified, inject the hash of the latest block the indexer reports into the query. If we did not inject block hashes into the query, the output of the query would be ambiguous and attestations would not be possible.
Time-travel queries are also used by the Fisherman to cross-check the results of Indexers. A Fisherman may run time-travel queries in an automated fashion, as part of securing the network, in order to create protocol disputes. Similarly, the Arbitrator may require time-travel queries to securely resolve protocol disputes and investigate determinism issues.
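A toy model of that substitution, to make the determinism point concrete. This is only a sketch of the idea, not the Gateway's actual implementation; the function name, the dictionary shapes, and the placeholder hashes are all assumptions.

```python
def pin_block(block_arg, latest_hash, hash_by_number):
    """Return a block constraint that always names one exact block hash.

    block_arg      -- None, {"number": int}, or {"hash": str} from the client
    latest_hash    -- hash of the latest block the indexer reports
    hash_by_number -- mapping from block number to block hash
    """
    if block_arg is None:
        # No block given: pin to the latest block the indexer reports.
        return {"hash": latest_hash}
    if "hash" in block_arg:
        # Already unambiguous.
        return {"hash": block_arg["hash"]}
    # A block number can refer to different blocks across re-orgs,
    # so it is replaced with the hash of one specific block.
    return {"hash": hash_by_number[block_arg["number"]]}


chain = {100: "0xaaa", 101: "0xbbb"}  # placeholder hashes
pinned_latest = pin_block(None, "0xbbb", chain)
pinned_number = pin_block({"number": 100}, "0xbbb", chain)
```

Either way, the query that reaches the indexer names a single block, so there is exactly one correct response to attest to.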
This task may also require specific engineering work for each applicable jurisdiction. This becomes especially complex when multiple jurisdictions may apply to a specific query. In a decentralized setting, an Indexer and Consumer may not be located in the same country, for example.
This would be a complex undertaking with a much larger scope than adding a config to disable time-travel queries generally.
This is a big hindrance to adoption for any kind of dapp that has to deal with user input, as it makes serving the resulting content via a subgraph a no-go and requires a roundabout way of retrieving that data.
Would it be possible to reduce the time-travel scope to a limited number of previous blocks? That way indexers are prevented from serving possibly malicious data for an indefinite amount of time without the need of outright disabling the feature.
You’re asking for censorship within the context of a decentralized indexing layer built on top of technology that is inherently censorship resistant (blockchain).
I’m not very familiar with the implementation details of dchan and what chain you are indexing, but assuming Ethereum for a moment, any data which has been “moderated” is still rooted in L1 and would remain so indefinitely. Any Ethereum archive node would still produce this data in some form when queried as-of a specific block hash. If you’re talking about data within IPFS files, that is not a censorship-resistant technology, but deploying IPFS support to mainnet is an active area of research for the same reason.
Any specific number of blocks chosen for limiting time-travel queries does not solve the underlying issue, which is that content moderated by your team could still be queried for at least N blocks. If N is increased enough to take into account Arbitration and protocol disputes (56 days, IIRC), the content may not be “sufficiently censored”.
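To put rough numbers on this, here is a small sketch of a sliding window of N queryable blocks. The window rule itself is hypothetical (it is the proposal under discussion, not an existing feature), and the 13-second block time is an assumption used only for illustration.

```python
def queryable(block, head, window):
    """True if `block` is among the last `window` blocks ending at `head`."""
    return head - window < block <= head


def last_exposed_head(moderated_at, window):
    """Highest chain head at which the last pre-moderation block
    (moderated_at - 1) is still inside the window."""
    return (moderated_at - 1) + window - 1


def blocks_for_days(days, seconds_per_block):
    """Approximate number of blocks produced in `days` days."""
    return (days * 24 * 60 * 60) // seconds_per_block


# Content moderated at block 1000 with a 100-block window:
# block 999 still holds the unmoderated state.
still_visible = queryable(999, last_exposed_head(1000, 100), 100)
gone = queryable(999, last_exposed_head(1000, 100) + 1, 100)

# Window size needed to cover a 56-day dispute period,
# assuming roughly 13 seconds per block.
dispute_window = blocks_for_days(56, 13)
```

Whatever N is chosen, the unmoderated state stays reachable until the head has moved N blocks past it, and a window long enough for disputes is on the order of hundreds of thousands of blocks under these assumptions.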
At the protocol level, I don’t think it’s possible to enforce this anyway. The only thing you could do would be to disallow queries attested to by an allocation that is started some blocks ahead of a “censored block”. This would at minimum require a GIP approved by The Graph Council. Even still, with that solution there would be a large range of time where the data would still be able to be queried at the protocol level just for the protocol to be able to function.
Hmm… thinking on this a bit more… the idea to not allow certain queries to be attested to at the protocol level depending on the start block of an allocation isn’t possible either. This is because the chain can re-org (possibly changing the block-height of the allocation), and we can’t have a re-org cause an Indexer to be liable for slashing.
Thank you for the insight @That3Percent .
I’d say it’s best if we disregard this proposal then.
The issue is caused by a very direct approach to having the subgraph return everything needed for the dapp to properly function… including user generated content, which is inherently potentially problematic or malicious. Looks like that isn’t a possibility for now.
One possible solution would be to just return an IPFS reference to the content, so that each jurisdiction can deal with it at that level. Much more elegant than completely breaking the protocol.
IPFS is a different matter altogether (albeit not one that we’ve solved). For the moment, reading IPFS files is not supported on mainnet anyway. So if you want to deploy the subgraph to mainnet you would have to do it the way you describe and read the IPFS files on the client, kicking the problem to IPFS.
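A minimal sketch of that client-side approach, assuming the subgraph only returns a content identifier and the client decides whether and where to fetch it. The `cid` field, the blocklist, and the placeholder CIDs are hypothetical; the default gateway URL is the public ipfs.io gateway, used here purely as an example.

```python
def content_url(post, blocked_cids, gateway="https://ipfs.io/ipfs/"):
    """Return the URL to fetch a post's content from, or None if the
    client's own blocklist says the content must not be loaded.

    post         -- dict returned by the subgraph, with a hypothetical "cid" field
    blocked_cids -- set of CIDs this client (or its jurisdiction) refuses to load
    """
    cid = post["cid"]
    if cid in blocked_cids:
        return None
    return gateway + cid


blocked = {"cid-removed"}  # placeholder identifiers, not real CIDs
ok_url = content_url({"cid": "cid-allowed"}, blocked)
hidden = content_url({"cid": "cid-removed"}, blocked)
```

The subgraph response stays deterministic (it only ever contains the reference), while the decision to serve or hide the content moves to the client and the IPFS gateway it uses.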
In the future, we would like to index IPFS data as well, but there needs to be some consensus around the availability of that data. It’s worth keeping this issue in mind, because if the availability of IPFS data depends on jurisdiction, then getting consensus around data availability is even harder than we anticipated.
@opdchan How would limiting the number of queryable recent blocks solve the problem in any way?
If I understand the problem correctly, the right solution would be to approach it from a completely different angle and, at the indexing level, only treat data as valid once it is “finalized”. Plenty of networks expose this as a parameter. This is also how POI consistency might be done.
As for dApp input - this problem is quite well known in any distributed system where you need to retrieve data from distributed sources that may have received different input. To mark the main line in a DAG, you may want an additional parameter that indicates the “correct” history. In other cases, you would need to always fetch the last known block to make sure you have “never history”.
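As a rough sketch of the “finalized only” idea: data is exposed only once its block is a configurable number of confirmations behind the chain head. The function names, the `block` field, and the confirmation count are assumptions for illustration, not an existing Graph Node parameter.

```python
def finalized_head(head, confirmations):
    """Highest block number considered final, given the current head."""
    return max(head - confirmations, 0)


def visible_entities(entities, head, confirmations):
    """Keep only entities whose block is at or below the finalized head.

    entities -- list of dicts with a hypothetical "block" field holding the
                block number in which the entity was created
    """
    cutoff = finalized_head(head, confirmations)
    return [e for e in entities if e["block"] <= cutoff]


posts = [{"id": "a", "block": 90}, {"id": "b", "block": 98}]
final_posts = visible_entities(posts, head=100, confirmations=5)
```

With a head at block 100 and 5 confirmations, only data up to block 95 is treated as valid, so the post from block 98 is withheld until it finalizes.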
I don’t see how disabling time-travel queries would solve the problem here.