Roadmap to L2 utilization

Apologies if there is already a related post; I did not come across one when searching.

I have seen a number of posts and comments on other sites from potential network participants asking about the barrier to entry created by gas prices. Curators, Delegators and Indexers are all subject to the extremely high gas prices on the Ethereum network right now, which likely makes it impossible for average participants to get involved: an action like a Signal or Delegation only makes financial sense with a large principal.

With that said, are there any plans to adopt an L2 solution that uses zk-Rollups or Optimistic Rollups to reduce these costs, as many other projects have? I would love to be able to answer these concerns with good knowledge, but I always seem to see the same “The Graph doesn’t support L2” kind of answer, with no explanation of why or when it might.

Can we explore this topic here?
Thanks

7 Likes

Hi flynn,

Funny you made this thread, as I was thinking of making something similar just yesterday. I figure this can be a good place to throw down my thoughts on the matter, particularly in the context of The Graph’s current design.

Note: I am not a member of the development team for The Graph, so everything here is somewhat speculative and may not be totally indicative of the true situation that the core protocol faces.


Primary Issue(s) with L2 Expansion
It seems apparent to me that L2 expansion is difficult (but not impossible) because of 2 primary factors:

  1. The dependence of The Graph’s different protocol branches on each other.
  2. The inherent security design of Optimistic rollups, which leads to issues with finality & atomicity.

To expand on both of these:

The dependence of The Graph’s different protocol branches on each other
As far as I can tell, most (though not all) of The Graph’s components are not segregated from one another; they rely on each other to perform their own actions. For example, the GNS contract relies on the curation contract (and its associated state), the curation contract relies on the state of the rewards contract, and the rewards contract relies on the staking contract. This appears to be intentional, because quite frankly, that’s how our network operates. The interaction between network branches (curation, indexing, delegation, etc.) is what drives the cryptoeconomic incentive model that The Graph thrives on. In my opinion, it’s a necessity.

The evident primary issue here is that this operational dependency makes it difficult (but not impossible) to segment pieces of the protocol. If only the “curation” contract functions are migrated to an L2, it becomes much harder to understand the total state of the network, as that state is now divided between 2 separate “chains”. In other words, using curation on an Optimistic Rollup like Arbitrum means that a more complicated mechanism must be put in place to ensure that the state on Ethereum mainnet matches that on Arbitrum, and vice versa. However, this doesn’t mean it’s impossible. Interoperability is an inherent complication with L2s, and the players working hardest to remedy it are in DeFi, where the problem is dubbed “liquidity fragmentation”. Essentially it boils down to “How can 2 separate chains share the same state atomically?”, which is especially important for The Graph in order to prevent certain economic attacks (e.g. indexers indexing the same subgraph, but getting different rewards due to the different amounts curated on each network). This brings us to issue #2, with Optimistic Rollups.
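
To make the economic attack mentioned above concrete, here is a toy calculation. It assumes a simplified model in which a subgraph's share of indexing rewards is proportional to its share of total curation signal; the subgraph names and signal amounts are made up for illustration, and this is not the actual rewards contract logic.

```python
from fractions import Fraction

def reward_share(signal_by_subgraph, subgraph):
    """Share of rewards for `subgraph`, assuming rewards are split
    in proportion to curation signal (a simplifying assumption)."""
    total = sum(signal_by_subgraph.values())
    return Fraction(signal_by_subgraph[subgraph], total)

# Hypothetical curation signal recorded separately on each chain.
l1_signal = {"subgraph_a": 900, "subgraph_b": 100}
l2_signal = {"subgraph_a": 100, "subgraph_b": 900}

# Each chain, looking only at its own state, reaches a different answer.
share_l1 = reward_share(l1_signal, "subgraph_a")
share_l2 = reward_share(l2_signal, "subgraph_a")

# Only the combined state gives the network-wide picture.
combined = {name: l1_signal[name] + l2_signal[name] for name in l1_signal}
share_global = reward_share(combined, "subgraph_a")

print(share_l1, share_l2, share_global)
```

An indexer on the same subgraph would be paid very differently depending on which chain's view is used, which is why the two views have to be reconciled somehow.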

The inherent security design of Optimistic rollups, which leads to issues with finality & atomicity

I’ll keep this brief since it assumes some conversational familiarity with Optimistic Rollups, but essentially ORs carry inherent security features that make state sharing quite difficult, most notably the L2 → L1 challenge period, which lasts a few days (I believe 7 on Arbitrum) to leave enough time for a fraud proof to be published. Again, this is a necessity of ORs, since they accept all transactions optimistically.
This appears to introduce certain problems, specifically around atomicity and finality. As explained here and here, current approaches to sharing state between L1s and L2s revolve around a protocol built on intermediaries (sometimes trusted, but trust is NOT an inherent requirement) that facilitate the messaging between layers, particularly from L2 to L1. The primary point here is that optimistic rollups don’t have an “out of the box” way of maintaining bidirectional state atomicity; it appears to work only one way (L1 → L2), through an ETH bridge.

However, approaches to this are up-and-coming. The 2 I personally know of are Hop and Connext, which appear to have (decentralized?) protocols to facilitate messaging between layers efficiently. Essentially, they follow a pattern sometimes found in dAMMs (decentralized automated market makers) to ensure that liquidity is present for the cross-chain operations being performed. Again, I’m not super well-versed in the specifics of the protocols, but at the core they act as an intermediary to facilitate moving assets (fungible and non-fungible) between chains.

Okay, so what are some good solutions?

This is where I’ll start getting creative, and throwing down some notes of ideas that could lead to a successfully interconnected Graph network:

  1. Work with Connext to see how their protocol could be used to maintain state between L1 and L2, while still maintaining decentralization. Again, not a member of the core dev team, but this could be an option considering that the state channels used to settle queries run on the Vector network.

  2. Prioritize atomicity in one direction only; essentially, if an ETH state change occurs, send a ticket to the L2 to have the same change occur there. Since it appears that Arbitrum’s retryable tickets are atomic from L1 → L2, this means that (theoretically, again I am not an expert) the L2 will at least have the synced changes from actions performed on the L1. This seems like the least palatable solution, and I think there could be a dozen race conditions here that would make the system quite complicated.

  3. Use a centralized oracle to propagate changes between chains. I was actually wrong before, this is probably the least palatable solution, for obvious reasons.

  4. Wait until ZK-Rollups reach EVM maturity. This one isn’t worth talking about too much, since the details of EVM-compatibility on projects like zkSync and StarkNet don’t include much concrete information on cross-chain messaging, but the inherent design of ZK Rollups alleviates many of the issues found in Optimistic Rollups. I personally have high hopes for EVM-compliant rollups with regard to interoperability, and think they’re a more palatable long-term solution to the issues described above. But for now, only time will tell.

  5. Redesign (some of) the contracts to allow for time-based segmentation, and move those components into the rollup layer/L2. For example, if delegation and staking/allocations could be moved to the L2, then mandatory “optimistic-rollup” network parameters could be set to facilitate this (e.g. a minimum allocation length of 8 epochs, assuming the Optimistic Rollup challenge period is 7 days/epochs). This is not a bad solution to the issue, but I imagine it relies on a significant refactor due to the current linkage of the protocol as it stands.

  6. Migrate the entirety of the network over to the L2, provided the L2 has had enough time to prove its security is truly equivalent to the L1 (functionally speaking). This one needs little explanation: just point all users to Arbitrum. Bit of a nightmare I’m sure, though.

  7. Probably quite a few more that I’m either forgetting or haven’t thought of yet. I don’t have too much of a team to bounce my ideas off of. :wink:
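
As a rough illustration of the time-based parameter in idea 5, the sketch below derives a minimum allocation length from a challenge period. The function name and the "cover the challenge period, plus one extra epoch" rule are assumptions inferred from the "8 epochs for a 7-day challenge period" example, which in turn assumes an epoch of roughly one day; this is not an actual protocol parameter or formula.

```python
DAY_SECONDS = 24 * 60 * 60

def min_allocation_epochs(challenge_period_s: int, epoch_length_s: int) -> int:
    """Smallest allocation length (in epochs) that outlasts the challenge period.

    Hypothetical rule: enough whole epochs to cover the challenge period,
    plus one epoch of margin.
    """
    if challenge_period_s < 0 or epoch_length_s <= 0:
        raise ValueError("challenge period must be >= 0 and epoch length > 0")
    # Ceiling division using integers only.
    epochs_to_cover = -(-challenge_period_s // epoch_length_s)
    return epochs_to_cover + 1

# 7-day challenge period with 1-day epochs, as in the example above.
print(min_allocation_epochs(7 * DAY_SECONDS, DAY_SECONDS))
```

If the epoch length or the challenge period changed, the minimum would need to be recomputed, which is one reason such a parameter would probably need to be governable.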


I would love to get some more input on this, but I think the core team is working really hard to make this work. We all know how big a deal it would be for the protocol, and I’m sure they want to make it happen (as well as taking a much-deserved offsite to spread all the good news) just as much as we do. I’m sure @ariel or @Brandon will speak up at some point.

At the very least, hopefully this kicks off some good discussion on L2’s and how our network can embrace the rollup-centric future.

12 Likes

Wow thanks for the response here!

This is exactly what I was looking for as far as getting the talking started between people that actually know what they are talking about. As for myself, I would say I still have a very shallow understanding of all of these concepts so I would honestly just embarrass myself if I tried to chime in and act like I had anything more intelligent to add to the conversation.

Nonetheless, I am very interested in learning more about all of this and the potential solutions that could be pursued for this very real problem. In the meantime I’m going to try to brush up on rollups and other L2 tech as much as I can.

Looking forward to hearing more from the core dev team!

1 Like

@flynntropolis what @rburdett shares captures the main challenge with making use of a L2.

We’ve seen some DeFi protocols deploy on multiple rollups. The way that worked is that each deployment is its own app, in the sense that it is isolated from the other chains. So when Uniswap deployed on Optimism, it started with zero state; new liquidity is created in the new deployment, with no necessary connection between Uniswap Mainnet and Uniswap Optimism. Their main issue is liquidity fragmentation.

In our case the protocol contracts encode the logic for all the incentive mechanisms and rules, which are interrelated. That means that if we deployed the protocol in its entirety on a rollup, we would split the indexer stake, delegation and curation signal across the two deployments, and all the indexer components rely on them.

That means we need to find a way to effectively partition the architecture so that some actions happen on L1 and others on L2. We have global state that each contract needs to query to perform its functions, and L1<->L2 state changes have restrictions that we need to account for in the design.

I’m currently researching the different options, which involves:

  • Performing gas estimation to measure the improvement we get
  • Looking into how all the global state is interrelated to find opportunities to partition into a new architecture
  • Investigating bridge design
  • Creating a POC of L1<->L2 messaging
2 Likes

This makes a lot of sense when you compare The Graph’s intricacies with something a bit simpler like DeFi liquidity pools, so thank you for providing that insight. As someone who is a web2 dev but just beginning to learn about web3, my knee-jerk reaction to solving a problem like this, purely from a design standpoint, is to use some kind of adapter pattern that allows existing actors to continue to participate through a contract that has parity with the L1 solution, but actually interacts on L2.

I’m not sure exactly what that would look like in this ecosystem of technology, but I wonder if there could be a way of leveraging an L2 solution like Polygon to abstract away the heavy lifting being done on mainnet, so that all net-new operations are made there, with proxy calls to mainnet when needed.

I don’t think there’s really a way to completely mitigate operating on existing L1 contracts that already have GRT in them unless some kind of administrator transfer could be done to allow moving staked tokens from L1 to L2 in a migration-like operation.

Again, I probably have no idea what I’m saying here, but this is all very interesting nonetheless :smiley:. Thanks for your time and insights and I look forward to seeing more discussion and progress on this matter.

I’m leaning more toward using a rollup like Arbitrum / Optimism.

Some general principles for the design:

  • L1 to L2 communication is fast while L2 to L1 is slow due to the fraud-proofs in optimistic rollups.
  • You can’t move state automatically, or copy it over.
  • Any action that moves state around should be user initiated.
  • Define what parts of the whole can only be used on L2 or both (opt-in).
  • Bridge design: A bridge between L1 and L2 can lock tokens that are then minted and unlocked in L2, or it can just lock tokens and update any other state on L2 to make a mechanism available.
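
The principles above can be sketched as a toy lock-and-mint bridge. Everything here is hypothetical (the `ToyBridge` class, its method names, and the 7-day `CHALLENGE_PERIOD`); it models L1 → L2 as immediate and L2 → L1 as delayed, and it only moves balances when a user asks. It is an in-memory model for reasoning about the design, not contract code.

```python
from dataclasses import dataclass, field

CHALLENGE_PERIOD = 7 * 24 * 60 * 60  # seconds; assumed optimistic-rollup delay

@dataclass
class ToyBridge:
    l1_balance: dict = field(default_factory=dict)  # spendable tokens on L1
    l1_locked: int = 0                              # tokens escrowed by the bridge
    l2_balance: dict = field(default_factory=dict)  # minted tokens on L2
    pending: list = field(default_factory=list)     # (user, amount, ready_at)

    def deposit(self, user, amount):
        """User-initiated L1 -> L2: lock on L1, mint on L2 (modeled as instant)."""
        if amount <= 0 or self.l1_balance.get(user, 0) < amount:
            raise ValueError("invalid amount or insufficient L1 balance")
        self.l1_balance[user] -= amount
        self.l1_locked += amount
        self.l2_balance[user] = self.l2_balance.get(user, 0) + amount

    def request_withdrawal(self, user, amount, now):
        """User-initiated L2 -> L1: burn on L2 now, release on L1 later."""
        if amount <= 0 or self.l2_balance.get(user, 0) < amount:
            raise ValueError("invalid amount or insufficient L2 balance")
        self.l2_balance[user] -= amount
        self.pending.append((user, amount, now + CHALLENGE_PERIOD))

    def finalize_withdrawals(self, now):
        """Release only the withdrawals whose challenge period has elapsed."""
        still_pending = []
        for user, amount, ready_at in self.pending:
            if now >= ready_at:
                self.l1_locked -= amount
                self.l1_balance[user] = self.l1_balance.get(user, 0) + amount
            else:
                still_pending.append((user, amount, ready_at))
        self.pending = still_pending

    def invariant_holds(self):
        """Locked L1 tokens must back every L2 token and pending withdrawal."""
        owed = sum(self.l2_balance.values()) + sum(a for _, a, _ in self.pending)
        return self.l1_locked == owed

bridge = ToyBridge(l1_balance={"alice": 100})
bridge.deposit("alice", 40)
bridge.request_withdrawal("alice", 10, now=0)
bridge.finalize_withdrawals(now=CHALLENGE_PERIOD)
print(bridge.l1_balance, bridge.l2_balance, bridge.invariant_holds())
```

Because nothing is copied automatically, the only state the two sides need to agree on is the escrowed total, which is what `invariant_holds` checks.
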
4 Likes

Took some time to think about this more, and I think this is ultimately the best choice for the protocol considering the current state of L2’s on Ethereum.

The rationale:

  • There is a fundamental difference between an L2 and a sidechain like Polygon; rollups inherit the security of Ethereum (generally), whereas sidechains have their own consensus and security mechanisms that make them fast and cheap, but (in my opinion) theoretically less secure overall. However, for the time being, sidechains are a necessary part of Ethereum scaling until L2s come to better fruition (the GRT billing contract, for example).
  • The bridge-design approach is (again, theoretically) extensible to other rollups of any type, including ZKs and other optimistic rollups like Optimism. Different bridges would likely be required for different rollups, but essentially they all follow a similar pattern: halt the ability to make changes to the L1 state with some tokens, atomically bridge/unlock those tokens on the L2, and allow changes to be made inside that walled garden until the user decides to bridge back.
  • There are some caveats to this. As Ariel mentioned, the interdependence within The Graph’s protocol is something that throws a wrench into an “easy” design for L2 compatibility. It means that a structural change may be required, or at the very least a redesign to support a part segregated from the whole. Without more research into this (I’m picturing an interconnected diagram that shows all dependencies and interactions of contracts across The Graph protocol) it’s hard to say what can be segregated and what can’t. Personally the hardest part I’m imagining is reward distribution to indexers, but again, without more research on my end it’s hard to say.

With that in mind, a user-initiated bridge makes the most sense, as it doesn’t attempt to automatically manage state (which can cause race-condition issues) and also more elegantly handles the issue of lockups.

I think it’s also worth mentioning that I (personally) believe that this is an issue fundamental to Optimistic rollups, and that ZK-rollups don’t have this problem as much. There are certainly tradeoffs, but as far as I know, the atomicity and finality issues with ZKs are much easier to deal with than with ORs, since ZK-rollups submit validity proofs with their state updates instead of relying on a fraud-proof challenge period.

4 Likes
  2. The inherent security design of Optimistic rollups, which leads to issues with finality & atomicity.

Limited interoperability and extensive fragmentation in token bridges (with weak security guarantees) and messaging protocols will be improved by CCIP (Chainlink’s Cross-Chain Interoperability Protocol). Future use cases that come to mind: cross-chain yield harvesting, cross-chain collateralized loans, and redirecting a tx to a higher-throughput chain if the one you’re on is congested. Maybe The Graph can utilise it too in the future.

Thanks for starting this thread @flynntropolis. In addition to what @rburdett and @Ariel have said, which I largely agree with, I’ll introduce a few other considerations.

DeFi Interoperability. As noted by others, we’ll need to design a partitioning of protocol state across L1 and L2. This should account for the benefits of having DeFi interoperability of things like GRT, Curation Shares, Delegation Shares, etc. on L1.

Vesting Contracts. Many Indexers & Delegators in the network today are interacting with the protocol via special vesting contracts that have allowlisted certain functions in the protocol running on L1. Need to think through how these users will participate when some functionality moves to L2.

Timing. There are many dimensions to this: How long until rollups see large adoption and increase gas costs on L1 due to induced demand? Projecting forward an increase in # of subgraphs on the network and # of Indexers/subgraph, how long until the gas cost improvements from something like Arbitrum/Optimism are completely washed out? How long should we invest in 1.X of the protocol and how long should we wait for improvements on ZK Rollups + EVM → ZK Rollup compilers to emerge?
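
One way to frame the "washed out" question is a back-of-the-envelope break-even estimate. The sketch below assumes total gas spend scales with (# of subgraphs) × (# of Indexers per subgraph) × (cost per transaction), that both counts grow at constant yearly rates, and that the L2 cuts cost per transaction by a fixed factor. The function name and all inputs are hypothetical, and the model ignores the induced-demand effect on L1 gas prices raised above.

```python
import math

def years_until_savings_erased(cost_reduction_factor: float,
                               subgraph_growth: float,
                               indexers_per_subgraph_growth: float) -> float:
    """Years until growth in activity offsets a one-time per-tx cost reduction.

    cost_reduction_factor: e.g. 10.0 means L2 transactions cost 1/10 as much.
    *_growth: yearly growth rates, e.g. 1.0 means doubling every year.
    """
    if cost_reduction_factor < 1:
        raise ValueError("cost_reduction_factor must be >= 1")
    if subgraph_growth <= -1 or indexers_per_subgraph_growth <= -1:
        raise ValueError("growth rates must be greater than -1")
    # Solve (1 + g_s)^t * (1 + g_i)^t = cost_reduction_factor for t.
    yearly_log_growth = (math.log1p(subgraph_growth)
                         + math.log1p(indexers_per_subgraph_growth))
    if yearly_log_growth <= 0:
        return math.inf  # activity is not growing, so savings never erode
    return math.log(cost_reduction_factor) / yearly_log_growth

# Hypothetical inputs: 10x cheaper transactions, subgraphs doubling yearly,
# Indexers per subgraph flat.
print(round(years_until_savings_erased(10.0, 1.0, 0.0), 2))
```

Plugging in measured growth rates and gas estimates would turn this from an illustration into an actual planning number.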

Subgraph/Indexer Migrations. We’ve already seen how much work it is just to migrate subgraphs from Edge & Node’s hosted service to The Graph decentralized network. For L2, we’ll have to do this again for both subgraphs and Indexers. Can we do this in a way that doesn’t lead to outages in any production apps relying on The Graph? Related to the timing considerations above: how many times do we want to go through such a heavy migration? Is it worth migrating to a stopgap solution, or should we hold off until the long-term higher-throughput solutions are more stable?

Multiblockchain Bridges. Assuming we move PoI submission and allocation management to the L2, we’ll likely need a bridge to the various other L1s The Graph supports, in order to have deterministic PoIs + indexing rewards. Which L2 designs lend themselves better to bridging to other (potentially high throughput) L1 chains?


Personally, I’m very intrigued by the ZK Porter/ Volition lines of research, because they offer orders of magnitude greater throughput, by introducing the option to store data off-chain.

I still have a lot of questions as to precisely what kinds of data availability guarantees these will offer longer term, i.e., will it follow a LazyLedger model, an Arweave model, or something else? The Graph needs a source of truth on data availability for other protocol logic, so a built-in source of truth on data availability could be desirable for other reasons (e.g. storing an entire subgraph manifest on L2 storage, not just the subgraph ID), depending on what the storage costs and guarantees are.

A higher throughput chain also would potentially allow for higher resolution bridging to other fast L1s. While rollups increase throughput of using Ethereum, they don’t improve latency (i.e. you are still dependent on L1 block times for block production frequency).

I also like Arbitrum a lot for a few simple reasons: it is EVM compatible, gives a large throughput gain, is simpler to reason about, and is available sooner. One nice thing about designing our L2 architecture w/ Arbitrum in mind as the deployment target: many of the decisions would likely carry over to using zkSync (zkPorter) or Starknet (Volition), once both those chains support EVM compilation. By the time we make serious headway on our L2 design/implementation we’ll likely have a lot more data on how both those projects have developed, as well as how gas costs for Indexers, Subgraph Developers and others evolve as more subgraphs are deployed to The Graph’s decentralized network.

10 Likes