What we are doing is not far away from what you guys are describing.
Technology involved: Docker, Jest, Hardhat (a forked version until this is resolved: Pull Request #979 in nomiclabs/hardhat on GitHub, "feat: set relative block timestamp based on forked block" by fubhy), GitHub Actions
About Hardhat: We run a local Hardhat development stack (mainnet fork) inside of Jest. We are not using Hardhat as a “task runner” because other tools (e.g. the Jest test runner) are much more capable and solve things like parallelization (better CPU core utilization, etc.) very nicely. So consider our Jest + Hardhat stack a custom breed where we only use the Hardhat EVM by directly importing lower-level functions from their npm package. We don’t use the HRE (Hardhat Runtime Environment) or any of the other globals that are exposed by their package. I’ve brought this up with the Hardhat maintainers in the past but haven’t heard back from them yet. Since that approach is a massive deviation from what they advertise their tool as (a task runner), I don’t expect them to reconsider and turn Hardhat into a library first and foremost, but who knows. We also use Jest instead of Mocha and have built our own Jest Matcher & Asymmetric Matcher toolbelt (custom assertions for transactions, Ethereum primitives, function input/output, etc.). Nonetheless, we are very happy with Hardhat, especially in comparison to Ganache: stack traces, forking with caching, good performance, a nice abstraction around ethereumjs-vm, etc.
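To give a rough idea of what such a custom assertion can look like, here is a minimal sketch of a matcher that compares Ethereum addresses case-insensitively. This is not our actual toolbelt: the name `toMatchAddress` and the address pattern are made up for illustration. The only thing taken from Jest is the matcher contract, i.e. a function that returns `{ pass, message }`.

```typescript
// Result shape that Jest expects a custom matcher to return.
interface MatcherResult {
  pass: boolean;
  message: () => string;
}

// 0x-prefixed, 20-byte hex string. Checksum validation is deliberately left out.
const ADDRESS_PATTERN = /^0x[0-9a-fA-F]{40}$/;

// Hypothetical matcher: passes when both values are well-formed addresses
// that are equal once casing is ignored.
function toMatchAddress(received: unknown, expected: string): MatcherResult {
  if (typeof received !== "string" || !ADDRESS_PATTERN.test(received)) {
    return {
      pass: false,
      message: () => `expected ${String(received)} to be a 0x-prefixed 20-byte address`,
    };
  }

  if (!ADDRESS_PATTERN.test(expected)) {
    return {
      pass: false,
      message: () => `expected value ${expected} is not a 0x-prefixed 20-byte address`,
    };
  }

  const pass = received.toLowerCase() === expected.toLowerCase();
  return {
    pass,
    message: () =>
      pass
        ? `expected ${received} not to match address ${expected}`
        : `expected ${received} to match address ${expected}`,
  };
}
```

Inside a Jest setup file this would be registered with `expect.extend({ toMatchAddress })` and then used as `expect(receipt.to).toMatchAddress(contract.address)`. For TypeScript you would additionally have to declare the matcher on Jest's matcher types.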
About Jest: For some reason, the Ethereum community is quite invested in Mocha. We’ve had a much better experience with Jest. Considering that Jest is also the #1 test runner in the rest of the JavaScript community, I’m surprised that Ethereum is still locked into Mocha so much. Jest shines especially because of its parallelized test runner and the ability to control large parts of the test environment & initialization of the test context. Once you’ve mastered it, the interactive CLI in “watch” mode is also quite a nice productivity boost (interactively narrow down the test suites you want to “watch”, re-run them with ease, prioritize previously failed tests, etc.).
Our integration tests for the subgraph work pretty much the same as laid out by you here:
- Boot the docker environment (including a custom Hardhat docker image, postgres, ipfs, graph-node). For testing, we don’t mount any volumes.
- Run the test suite
The difference is that our test suite manages all the rest: contract deployment, deployment of the subgraph, interacting with the protocol, asserting state (onchain + graphql queries).
For that, we’ve written a few hacky utility functions that
a) deploy our subgraph while replacing handlebars template placeholders with the addresses (and other dynamic values like start block, etc.) from the deployment output
b) wait for the subgraph to be synced
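Step a) can be sketched roughly like this. To keep the example dependency-free, it uses a tiny hand-rolled `{{placeholder}}` substitution as a stand-in for the real Handlebars library. `renderTemplate` and the placeholder names `address` and `startBlock` are hypothetical, not our actual helpers.

```typescript
type TemplateValues = Record<string, string | number>;

// Replaces every {{key}} placeholder with the matching value.
// Throws on unknown keys so that a missing deployment output fails loudly
// instead of producing a broken manifest.
function renderTemplate(template: string, values: TemplateValues): string {
  return template.replace(/\{\{\s*(\w+)\s*\}\}/g, (_match: string, key: string) => {
    if (!Object.prototype.hasOwnProperty.call(values, key)) {
      throw new Error(`Missing template value: ${key}`);
    }
    return String(values[key]);
  });
}

// Fragment of a subgraph manifest template (illustrative only).
const manifestTemplate = [
  "source:",
  "  address: '{{address}}'",
  "  startBlock: {{startBlock}}",
].join("\n");

// Values that would normally come from the deployment output of the test run.
const manifest = renderTemplate(manifestTemplate, {
  address: "0x" + "ab".repeat(20),
  startBlock: 11681281,
});
```

The rendered manifest is then written to disk and deployed to the graph-node in the docker environment before the tests start interacting with the protocol.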
The tests then continue by interacting with the protocol. Custom Jest Matchers then poll the subgraph until it has been synced up to the block of the last relevant transaction receipt and then run assertions against that block by querying the store (using graphql) at that exact block number.
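The polling part can be sketched as below, assuming the subgraph's GraphQL endpoint exposes the indexed block through the `_meta { block { number } }` field and accepts a `block: { number: ... }` argument on queries. The function names, the `funds` entity in the usage note, and the injected `fetcher` (whatever you use to POST a query and return its `data`) are hypothetical.

```typescript
interface MetaResponse {
  _meta: { block: { number: number } };
}

// Sends a GraphQL query and resolves with the `data` part of the response.
type MetaFetcher = (query: string) => Promise<MetaResponse>;

const META_QUERY = "{ _meta { block { number } } }";

// Polls until the subgraph has indexed at least `targetBlock`,
// or throws once `timeoutMs` has elapsed.
async function waitForBlock(
  fetcher: MetaFetcher,
  targetBlock: number,
  intervalMs: number = 500,
  timeoutMs: number = 30000,
): Promise<void> {
  const deadline = Date.now() + timeoutMs;

  while (true) {
    const data = await fetcher(META_QUERY);
    if (data._meta.block.number >= targetBlock) {
      return;
    }

    if (Date.now() > deadline) {
      throw new Error(`Subgraph did not reach block ${targetBlock} in time`);
    }

    await new Promise<void>((resolve) => setTimeout(resolve, intervalMs));
  }
}

// Builds a query that reads an entity collection at an exact block number,
// so that assertions are not affected by blocks indexed afterwards.
function queryAtBlock(entity: string, fields: string, blockNumber: number): string {
  return `{ ${entity}(block: { number: ${blockNumber} }) { ${fields} } }`;
}
```

A matcher would first `await waitForBlock(fetcher, receipt.blockNumber)` and then run something like `queryAtBlock("funds", "id name", receipt.blockNumber)` against the same endpoint before comparing the result with the expected value.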
… You see, we’ve experienced all the same pain that you are describing. And yes, it certainly means that there is a lot to do in terms of (subgraph) developer tooling: Unit testing, integration testing, etc.
Integration testing is possible already, but it’s painful.
As Jannis said: We need custom test utils. I’d be happy to eventually open source ours, although I am not sure how helpful those would be for the majority of the community, as most people seem to be using Mocha. We also need ways to manipulate the graph-node from within the test suite so that it plays well with features like EVM snapshotting (e.g. also revert the subgraph to the block # of the snapshot).