GRC-006: Mainline — A Decentralized Firehose Data Service on Horizon
Stage: RFC (Request for Comment)
GRC: 006
Authors: @cargopete (Petko Pavlovski)
Related: GRC-005: Dispatch · GIP-0066: Graph Horizon · GIP-0054: GraphTally · The Graph 2026 Technical Roadmap · Firehose documentation · Substreams Data Service · Lodestar — How to Build a Horizon Data Service
1. Abstract
This GRC introduces Mainline — a Horizon data service that serves raw, fork-aware, cursor-resumable Firehose block streams over gRPC. Mainline sits one level below the in-flight Substreams Data Service in the Graph stack: it is the decentralized substrate that Substreams, Subgraphs, Tycho, Token API, and the JSON-RPC Dispatch service all consume. Indexers stake GRT, provision it to the FirehoseDataService contract, register the chains they serve, and get paid per streamed gigabyte and per Fetch request via GraphTally (TAP v2). The contract inherits from the same DataService base as the SubgraphService, reusing HorizonStaking, GraphTallyCollector, and PaymentsEscrow unchanged. Verification is tiered after GRC-005 Dispatch: Tier 1 uses Merkle-root comparison against canonical chain headers (proof-based); Tier 2 uses quorum across operators; Tier 3 uses reputation plus cursor-attestation audit trails. The proposal ships with a concrete reference implementation plan in Rust, bootstrapping from firehose-ethereum and firehose-solana as the first two chains. The strategic thesis: The Graph cannot credibly market Substreams, Tycho, or any streaming-first data service as “decentralized” while the blocks underneath those products come exclusively from StreamingFast’s proprietary endpoint. Mainline fixes that.
2. Specification
2.1 Data Service Contract
The FirehoseDataService contract inherits the same stack as SubgraphService: DataService base + DataServicePausable + DataServiceFees + DataServiceRescuable + DataServiceUpgradeable (GIP-0066). No new base contracts. No new payment primitives.
// SPDX-License-Identifier: GPL-2.0-or-later
pragma solidity 0.8.27;
import { DataService } from "@graphprotocol/horizon/contracts/data-service/DataService.sol";
import { DataServicePausable } from "@graphprotocol/horizon/contracts/data-service/extensions/DataServicePausable.sol";
import { DataServiceFees } from "@graphprotocol/horizon/contracts/data-service/extensions/DataServiceFees.sol";
import { DataServiceRescuable } from "@graphprotocol/horizon/contracts/data-service/extensions/DataServiceRescuable.sol";
import { IGraphPayments } from "@graphprotocol/horizon/contracts/interfaces/IGraphPayments.sol";
// Declared abstract: function bodies are elided in this specification sketch.
abstract contract FirehoseDataService is
DataService,
DataServicePausable,
DataServiceFees,
DataServiceRescuable
{
// --- Protocol parameters (proposed defaults; shown as constants for brevity — governance-tunable in the implementation) ---
uint256 public constant MIN_PROVISION_TOKENS = 25_000 ether; // 25k GRT
uint256 public constant STAKE_TO_FEES_RATIO = 4; // 4:1 (vs 5:1 on Subgraph/Dispatch)
uint64 public constant MIN_THAWING_PERIOD = 21 days; // longer than Dispatch (14d)
uint32 public constant MAX_VERIFIER_CUT_PPM = 500_000; // max 50% of a slash goes to the verifier
// --- Chain registry (permissioned at Phase 1; bond-based by Phase 3) ---
struct ChainManifest {
uint64 genesisBlock;
bytes32 genesisHash;
string firehoseProtoType; // e.g. "sf.ethereum.type.v2.Block"
uint32 firstStreamableBlock;
uint32 reorgDepth; // irreversibility horizon
bool supportsFetch; // true for archive-backed chains
}
mapping(bytes32 => ChainManifest) public chains; // chainId => manifest
// --- Indexer registration ---
struct IndexerService {
string url; // gRPC endpoint (TLS)
bytes32[] chainIds; // chains served
Tier tier; // verification tier indexer subscribes to
uint32 geoHint;
uint64 advertisedLIB; // most recently advertised irreversible block (simplified — per-chain mapping in a full implementation)
}
enum Tier { Reputation, Quorum, ProofBacked }
mapping(address => IndexerService) public services;
// --- Core methods (overrides from DataService) ---
function register(address indexer, bytes calldata data) external override;
function startService(address indexer, bytes calldata data) external override;
function stopService(address indexer, bytes calldata data) external override;
// Firehose-specific
function advertiseChain(bytes32 chainId, uint64 lib) external;
function collect(
address indexer,
IGraphPayments.PaymentTypes paymentType,
bytes calldata data
) external override returns (uint256);
function slash(
address indexer,
uint256 tokens,
uint256 reward,
bytes calldata evidence
) external override;
}
Recommended defaults and rationale:
| Parameter | Value | Rationale vs. SubgraphService / Dispatch |
|---|---|---|
| MIN_PROVISION_TOKENS | 25,000 GRT | Higher than Dispatch (10k) because Firehose requires multi-TB storage; lower than SubgraphService (100k) because there is no curation market to align with. |
| STAKE_TO_FEES_RATIO | 4:1 | A provider collecting 6,250 GRT in fees locks their full 25,000 GRT provision — meaningful skin-in-the-game. |
| MIN_THAWING_PERIOD | 21 days | Longer than Dispatch's 14 days because Firehose disputes need a re-derivation window against archive. |
| MAX_VERIFIER_CUT_PPM | 500,000 (50%) | Matches SubgraphService. |
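For concreteness, the stake-locking arithmetic behind the 4:1 row, as a minimal sketch in Rust (whole-GRT units for readability; assumes the DataServiceFees semantics described above, where collecting fees locks fees × ratio of the provision):

// Stake-to-fees arithmetic behind the table above (sketch, not protocol code).
const STAKE_TO_FEES_RATIO: u128 = 4;
const MIN_PROVISION_TOKENS: u128 = 25_000; // GRT, whole tokens

/// Fees collectable before a provision of `provision` GRT is fully locked.
fn max_fees_per_thawing_cycle(provision: u128) -> u128 {
    provision / STAKE_TO_FEES_RATIO
}

fn main() {
    // 25,000 / 4 = 6,250 GRT per 21-day thawing cycle
    assert_eq!(max_fees_per_thawing_cycle(MIN_PROVISION_TOKENS), 6_250);
}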
2.2 Service Interface
Mainline exposes the existing StreamingFast sf.firehose.v2 protobuf contract unchanged. This is non-negotiable: every Firehose consumer in the wild already speaks this schema. The service surface (per the Firehose protocol reference and the published Buf schema):
// Consumed verbatim from streamingfast/firehose
package sf.firehose.v2;
service Stream {
// Historical + live, cursor-resumable, fork-aware
rpc Blocks(Request) returns (stream Response);
}
service Fetch {
// Single-block lookup by num or hash
rpc Block(SingleBlockRequest) returns (SingleBlockResponse);
}
service EndpointInfo {
rpc Info(InfoRequest) returns (InfoResponse);
}
Recommendation: expose all three services. Do not cherry-pick.
- `Stream` is the primary product — live + historical streaming with `STEP_NEW`, `STEP_UNDO`, and `STEP_IRREVERSIBLE` fork steps, and string cursors that encode ForkDB position.
- `Fetch` is essential for institutional and RPC-derivative use cases — a single deterministic block-by-hash or block-by-number lookup. Graph-node's historical block-hash backfill specifically needs this (graph-node issue #3518).
- `EndpointInfo` is how consumers (and the Horizon gateway) discover which chains and block ranges a given indexer serves. Mainline requires it to be truthful and refreshed on every block.
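A minimal consumer sketch of the Stream service, assuming tonic/prost stubs generated from the published sf.firehose.v2 Buf schema — the module path sf_firehose_v2 and exact field spellings are illustrative and should be checked against the generated code:

// Minimal sf.firehose.v2 consumer sketch (tonic + prost, stubs assumed generated).
use sf_firehose_v2::{stream_client::StreamClient, Request};

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    let mut client = StreamClient::connect("https://mainline-operator.example:443").await?;

    let request = Request {
        start_block_num: 20_000_000, // historical start; negative values are relative to head
        cursor: String::new(),       // empty = no resume point
        final_blocks_only: false,    // receive STEP_NEW / STEP_UNDO, not only STEP_IRREVERSIBLE
        ..Default::default()
    };

    let mut blocks = client.blocks(request).await?.into_inner();
    while let Some(resp) = blocks.message().await? {
        // Persist resp.cursor after processing each message so the stream can be
        // resumed from any compliant Mainline operator (§2.7).
        println!("step={:?} cursor={}", resp.step(), resp.cursor);
    }
    Ok(())
}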
Verification responses additionally embed a Mainline-specific signed attestation as a gRPC trailer:
// Mainline extension
message MainlineAttestation {
bytes chain_id = 1;
uint64 block_number = 2;
bytes block_hash = 3;
bytes state_root = 4; // chain-native canonical root
bytes payload_hash = 5; // sha256 of the protobuf Block payload
bytes cursor = 6;
bytes indexer_sig = 7; // EIP-712 signature over (chain_id, block_num, block_hash, payload_hash)
}
The attestation is the anchor for all three verification tiers in §2.6.
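A sketch of indexer-side attestation construction, assuming the sha2 crate; the state_root field and the EIP-712 signing step are elided, since this GRC does not pin the signing domain:

// Indexer-side MainlineAttestation construction (sketch; signing elided).
use sha2::{Digest, Sha256};

struct MainlineAttestation {
    chain_id: [u8; 32],
    block_number: u64,
    block_hash: [u8; 32],
    payload_hash: [u8; 32],
    cursor: Vec<u8>,
    // state_root and indexer_sig omitted for brevity.
}

fn attest(
    chain_id: [u8; 32],
    block_number: u64,
    block_hash: [u8; 32],
    raw_block_proto: &[u8],
    cursor: Vec<u8>,
) -> MainlineAttestation {
    // payload_hash commits to the exact protobuf bytes the consumer received,
    // so tampering with the stream is detectable after the fact.
    let payload_hash: [u8; 32] = Sha256::digest(raw_block_proto).into();
    MainlineAttestation { chain_id, block_number, block_hash, payload_hash, cursor }
    // indexer_sig = EIP-712 signature over (chain_id, block_number, block_hash,
    // payload_hash) — depends on the operator's key setup.
}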
2.3 Chain Registration
Every supported chain requires a ChainManifest on-chain. The manifest pins the protobuf schema URI and genesis parameters so consumers can validate they are talking to the right chain.
Recommendation: three-phase chain onboarding.
| Phase | Mechanism | Entry bar |
|---|---|---|
| Phase 1 (launch) | Governance allowlist — the Graph Council approves ChainManifest entries. | 0 GRT; reputation only. Bootstraps the initial chain set. |
| Phase 2 (year 1) | Bond-based permissionless — anyone can register a chain by locking 50,000 GRT as a chain bond. | 50,000 GRT bond, slashable on fraudulent manifest. |
| Phase 3 (steady state) | Curation-weighted — chains are prioritized by the signal curators place on them. | Mirrors subgraph curation. |
At Phase 1, the initial chain set is the intersection of (a) chains with open-source Firehose instrumentation and (b) chains the Graph Foundation already supports for Subgraphs: Ethereum, Solana, Base, Arbitrum, Optimism, Polygon, BSC, NEAR, and Starknet (firehose-docs supported protocols).
Indexers register per-chain via advertiseChain(chainId, lib); discovery happens through a Mainline network subgraph that indexes these events plus the IndexerService URL. Consumers (and the gateway, when used) use the subgraph to build their operator pool exactly as Dispatch does today.
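For illustration, a Phase-1 manifest entry for Ethereum mainnet, mirroring the Solidity struct in §2.1 (the genesis hash is Ethereum's well-known mainnet genesis; the reorg_depth value is a placeholder choice, not normative):

// Illustrative ChainManifest entry for Ethereum mainnet (sketch).
struct ChainManifest {
    genesis_block: u64,
    genesis_hash: &'static str,
    firehose_proto_type: &'static str,
    first_streamable_block: u64,
    reorg_depth: u32,
    supports_fetch: bool,
}

const ETHEREUM_MAINNET: ChainManifest = ChainManifest {
    genesis_block: 0,
    genesis_hash: "0xd4e56740f876aef8c010b86a40d5f56745a118d0906a34e69aec8c0db1cb8fa3",
    firehose_proto_type: "sf.ethereum.type.v2.Block",
    first_streamable_block: 0,
    reorg_depth: 64, // placeholder: roughly two epochs under post-merge finality
    supports_fetch: true,
};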
2.4 Payment Model
All payments flow through GraphTally (TAP v2), GraphTallyCollector, and PaymentsEscrow — the same primitives that power Subgraph queries and Dispatch requests today (What changes with Graph Horizon). No new payment primitive is proposed. Adding one would be a mistake given that indexer-service-rs and indexer-tap-agent v2.0.0+ already mandate TAPv2 exclusively.
Firehose is unusual because its units of work span two very different shapes:
- Streaming — long-lived connections billed by bytes pushed to the consumer (aligned with StreamingFast’s own commercial metering: “In Firehose, billable bytes consists of all egress bytes” (streamingfast.io/pricing)).
- Fetch — one-shot requests billed per block retrieved.
Recommendation: two-lane pricing under one collector.
| Lane | Unit | Default price | Notes |
|---|---|---|---|
| Stream — live tail | per GiB egress | 0.50 GRT / GiB | Live blocks are small, billed on what the consumer actually pulls. |
| Stream — historical backfill | per GiB egress | 0.35 GRT / GiB | Lower than live; served from merged-blocks storage, amortized across many consumers. |
| Fetch | per block | 0.00004 GRT / block | ~$4 per million Fetch blocks at GRT = $0.10. |
| Stream — subscription lane | flat monthly | 500 GRT / mo per chain | Unlimited egress per chain, analogous to Helius’ LaserStream model. Offered at the indexer’s discretion; default off. |
TAP receipts are signed per-burst (not per-block) to avoid signature overhead on a stream that emits thousands of blocks per minute. The indexer-tap-agent aggregates into RAVs every 60 seconds and submits collect() hourly — identical cadence to Dispatch. Subscription-lane billing is modeled as a single monthly RAV with a capacity claim rather than a per-block receipt stream.
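A sketch of the per-burst metering described above — egress bytes accumulate per stream and one receipt covers the whole burst. The types are illustrative, not the actual tap-core or indexer-tap-agent API:

// Per-burst receipt metering (sketch). Lives in the consumer SDK: egress bytes
// accumulate per stream, and a single TAP receipt is signed per burst boundary
// instead of per block.
use std::time::{Duration, Instant};

struct BurstMeter {
    bytes_since_flush: u64,
    last_flush: Instant,
    flush_interval: Duration, // e.g. 60 s, matching the RAV aggregation cadence
}

impl BurstMeter {
    fn new(flush_interval: Duration) -> Self {
        Self { bytes_since_flush: 0, last_flush: Instant::now(), flush_interval }
    }

    /// Record one block's egress; returns the byte count to bill when a burst
    /// boundary is reached, i.e. when one receipt should be signed.
    fn on_block_sent(&mut self, payload_len: usize) -> Option<u64> {
        self.bytes_since_flush += payload_len as u64;
        if self.last_flush.elapsed() >= self.flush_interval && self.bytes_since_flush > 0 {
            let billed = self.bytes_since_flush;
            self.bytes_since_flush = 0;
            self.last_flush = Instant::now();
            Some(billed)
        } else {
            None
        }
    }
}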
The two-lane model matters because it aligns with operator economics: bytes for high-throughput institutional consumers; flat-rate for predictable budget consumers; per-block for the Fetch service which is a genuinely different workload.
2.5 Service Level Agreements
Firehose SLAs are more constrained than Dispatch’s because the ground truth is globally unambiguous — there is one canonical chain.
| Metric | Tier-1 target | Tier-2 target | Tier-3 target | Enforcement |
|---|---|---|---|---|
| Live block latency (p50) | ≤ 500 ms post-finality | ≤ 1,000 ms | ≤ 2,000 ms | Gateway quality score |
| Live block latency (p99) | ≤ 1,500 ms | ≤ 3,000 ms | ≤ 5,000 ms | Gateway quality score |
| Completeness (missing blocks / 10,000) | 0 | ≤ 1 | ≤ 5 | On-chain dispute window |
| Fork-handling correctness | Every undo emitted within 1 block of canonicalization | 2 blocks | 3 blocks | On-chain dispute window |
| Availability (30-day rolling) | 99.9% | 99.5% | 99.0% | Gateway quality score + reputation |
| Fetch p50 latency | ≤ 200 ms | ≤ 500 ms | ≤ 1,000 ms | Gateway quality score |
| Historical backfill throughput | ≥ 200 MB/s per connection | ≥ 100 MB/s | ≥ 50 MB/s | Gateway quality score |
Advertised LIB (last irreversible block) must not regress. Any regression is treated as evidence for a Tier-1 dispute.
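A minimal watcher sketch for the LIB-monotonicity rule (illustrative types; in practice this would run in the gateway or any third-party monitor):

// Flags any regression in an operator's advertised last-irreversible-block,
// which per the SLA above is evidence for a Tier-1 dispute.
use std::collections::HashMap;

#[derive(Default)]
struct LibWatcher {
    // (indexer address, chain id) -> highest LIB seen so far
    seen: HashMap<(String, String), u64>,
}

impl LibWatcher {
    /// Returns Some((previous, advertised)) if the advertised LIB regressed.
    fn observe(&mut self, indexer: &str, chain: &str, lib: u64) -> Option<(u64, u64)> {
        let key = (indexer.to_owned(), chain.to_owned());
        let entry = self.seen.entry(key).or_insert(lib);
        if lib < *entry {
            return Some((*entry, lib)); // dispute evidence
        }
        *entry = lib;
        None
    }
}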
2.6 Verification & Dispute Mechanism
This is the hard part, and it is structurally easier than Dispatch because Firehose output is deterministic: the canonical Ethereum block at height N has exactly one correct protobuf representation (modulo schema version). That means Firehose can reach cryptographic verification for a meaningful subset of claims, not just economic.
Recommendation: three tiers, modeled directly on GRC-005 Dispatch but with stronger Tier 1 than Dispatch can achieve.
| Tier | Mechanism | What it proves | Slashing enabled? |
|---|---|---|---|
| Tier 1: Proof-backed | Merkle-root comparison: the Mainline attestation commits to the chain's canonical state root (Ethereum stateRoot, Solana bank hash, etc.). A watcher derives the expected root from the chain's own canonical source (L1 for L2s; consensus layer for beacon-backed chains; a Merkle proof against a trusted header for Ethereum L1) and compares. Disagreement → slash. | Block payload matches canonical chain state | Yes |
| Tier 2: Quorum | Gateway or client queries k of n indexers for the same (chain, block_num) via Fetch; mismatching payload_hash values flag the minority. | Block payload agrees with majority of independent operators | Economic only |
| Tier 3: Reputation + attestation audit trail | All responses carry MainlineAttestation; consumers retain them. Consistent misreporting is visible retroactively and feeds into gateway routing weight. | Tamper-evident audit log | No — quality-score penalty |
Why Tier 1 is genuinely achievable here when it wasn’t in Dispatch. In Dispatch, an eth_call result has no on-chain root to compare against without full re-execution (GRC-005). A Firehose block, by contrast, is the thing the canonical chain commits to via its block hash. Given a trusted block header source (Ethereum L1 headers via light-client proofs; beacon chain for consensus-layer correctness; L1 inbox for rollups), a verifier can recompute the expected payload_hash and check the indexer’s attestation against it. The DisputeManager contract (already live for SubgraphService) is the right home for this; it needs a new FirehoseDisputeVerifier that knows how to interpret MainlineAttestation payloads per chain.
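A watcher-side sketch of the Tier-1 comparison, with TrustedHeader and the payload re-derivation step standing in for the per-chain verifier logic described above:

// Tier-1 check (sketch): compare an operator's attestation against a trusted
// canonical header (light client, L1 inbox for rollups, etc.).
struct TrustedHeader {
    number: u64,
    hash: [u8; 32],
}

struct Attestation {
    block_number: u64,
    block_hash: [u8; 32],
    payload_hash: [u8; 32],
}

enum Verdict {
    Consistent,
    /// block_hash disagrees with the canonical chain — slashable under Tier 1.
    CanonicalMismatch,
    /// block_hash is canonical, but the served payload bytes were wrong.
    PayloadMismatch,
}

fn check(att: &Attestation, header: &TrustedHeader, rederived_payload_hash: [u8; 32]) -> Verdict {
    assert_eq!(att.block_number, header.number);
    if att.block_hash != header.hash {
        Verdict::CanonicalMismatch
    } else if att.payload_hash != rederived_payload_hash {
        // Re-deriving the Firehose protobuf Block from an archive takes time —
        // one reason MIN_THAWING_PERIOD allows a 21-day re-derivation window.
        Verdict::PayloadMismatch
    } else {
        Verdict::Consistent
    }
}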
Dispute flow:
- Any party posts a `DisputeAttestation(indexer, chainId, blockNum, claimedHash, evidenceHash)` with a bond (10,000 GRT).
- During the 21-day window, any party can post a counter-attestation with a valid header proof.
- At window close, the contract compares `claimedHash` against the canonical value derived from the winning evidence. A fraudulent indexer loses `min(5 × fees_collected_for_block_range, provision_balance)`; a successful challenger collects the verifier's cut (up to 50% per `MAX_VERIFIER_CUT_PPM`).
For chains where header-proof infrastructure is not yet available (e.g., novel L1s), Tier 1 degrades to Tier 2 automatically — the contract records which tier each chain supports.
2.7 Cursor & Reorg Semantics
Firehose cursors are opaque strings that encode the ForkDB position (firehose.streamingfast.io). They are provider-specific today — a cursor emitted by StreamingFast’s endpoint may not be accepted by Pinax’s endpoint, and vice versa. This is unacceptable in a decentralized operator set where consumers must be able to switch operators mid-stream.
Recommendation: mandate a Mainline-standardized cursor format across all Tier-1 and Tier-2 operators.
mainline-cursor-v1 := base64url(
chainId (4 bytes) || libNum (8) || libHash (32) || headNum (8) || headHash (32) || forkSteps_seen (varint)
)
This cursor is portable: any compliant Mainline operator can resume a stream from any other compliant operator’s cursor, because the cursor contains only globally-addressable chain state (block numbers + hashes), not operator-local ForkDB internals. Operators that cannot honor a portable cursor (e.g., because their local fork history is too shallow) must return a specific gRPC status code (FAILED_PRECONDITION with MAINLINE_CURSOR_UNRESUMABLE) so the client can either re-derive from a nearby finalized block or switch operators.
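An encoder sketch for mainline-cursor-v1, assuming the base64 crate's Engine API; the byte order (big-endian here) and varint flavor (unsigned LEB128 here) are assumptions this GRC would still need to pin down:

// mainline-cursor-v1 encoder (sketch). Field widths follow the layout above.
use base64::{engine::general_purpose::URL_SAFE_NO_PAD, Engine as _};

// Unsigned LEB128 varint (assumed flavor — not pinned by this GRC).
fn leb128(mut v: u64, out: &mut Vec<u8>) {
    loop {
        let byte = (v & 0x7f) as u8;
        v >>= 7;
        if v == 0 { out.push(byte); break; }
        out.push(byte | 0x80);
    }
}

fn encode_cursor(
    chain_id: [u8; 4],
    lib_num: u64,
    lib_hash: [u8; 32],
    head_num: u64,
    head_hash: [u8; 32],
    fork_steps_seen: u64,
) -> String {
    let mut buf = Vec::with_capacity(4 + 8 + 32 + 8 + 32 + 10);
    buf.extend_from_slice(&chain_id);
    buf.extend_from_slice(&lib_num.to_be_bytes()); // big-endian assumed
    buf.extend_from_slice(&lib_hash);
    buf.extend_from_slice(&head_num.to_be_bytes());
    buf.extend_from_slice(&head_hash);
    leb128(fork_steps_seen, &mut buf);
    URL_SAFE_NO_PAD.encode(buf)
}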
Reorg handling in the billing layer: STEP_UNDO messages are billed at the same per-byte rate as STEP_NEW, because the operator incurred the same egress cost. The alternative (free undos) creates a perverse incentive for operators to push speculative blocks.
3. Rationale
The key design decisions, and what was considered and rejected:
1. Why a separate DS instead of a SubgraphService extension? SubgraphService is tied to subgraph-specific concepts (allocations, POIs, curation signal, maxPOIStaleness). Firehose has none of those. Conflating them would pollute the SubgraphService data model — the graph-network-subgraph schema already carries maxPOIStaleness and stakeToFeesRatio fields tagged [SubgraphService] (graph-network-subgraph schema.graphql). A separate FirehoseDataService contract keeps concerns clean and matches the Data Service Framework intent: one contract per data service type.
2. Why not build on the Substreams DS contract directly? Because Mainline is consumed by the Substreams DS. The right architectural layering is: Firehose DS (raw blocks) → Substreams DS (transformation) → Subgraph DS (storage + query). Collapsing this into a single contract creates circular dependencies and forces Firehose consumers who don’t need Substreams to participate in Substreams economics.
3. Why GraphTally per-byte instead of per-block? Block sizes vary by 3+ orders of magnitude across chains (a Solana block can exceed 10 MB; an Arweave “block” can be tiny). Per-byte pricing gives indexers predictable revenue per unit of work and gives consumers predictable costs. StreamingFast’s own commercial metering validates this (streamingfast.io/pricing), as does Helius (helius.dev/laserstream). Per-block pricing was rejected.
4. Why expose StreamingFast’s protobuf contract unchanged? Because every Firehose consumer on Earth — graph-node, substreams-sink-postgres, substreams-sink-kafka, Tycho, every custom Rust binary in the StreamingFast ecosystem — speaks sf.firehose.v2. A custom Mainline protobuf contract would require every consumer to integrate a new SDK. The cost-benefit is obviously wrong.
5. Why Tier-1 proof-backed verification from Phase 3, not Phase 1? Because writing a correct FirehoseDisputeVerifier is real work, and Dispatch launched with slash() reverting unconditionally (GRC-005). The honest sequencing is: ship with economic security (stake lock + reputation), add quorum dispatch via the gateway, and bring slashing online when verifiers for major chains are audited.
4. Phased Rollout
| Phase | Scope | Timing (target) | Exit criterion |
|---|---|---|---|
| Phase 0 — Reference implementation | FirehoseDataService contract on Arbitrum Sepolia; one operator (cargopete) running firehose-ethereum Mainnet + firehose-base; gRPC endpoint live; TAP receipts flowing end-to-end. | Q1 → Q2 2026 | Full payment loop demonstrated on testnet; at least one external consumer (graph-node dev stack) pulling blocks. |
| Phase 1 — Limited mainnet | Contract on Arbitrum One. 3–5 invited operators across 4 chains (Ethereum L1, Base, Solana, Arbitrum One). Governance-allowlisted chains. Tier-2 quorum verification via gateway. Slashing disabled. | Q2 → Q3 2026 | 10 paying consumers; 99.5% availability on all chains across all operators for 30 consecutive days. |
| Phase 2 — General availability | Permissionless operator registration. Bond-based chain registration (50k GRT bond). Subscription-lane pricing enabled. Substreams DS officially routes its upstream Firehose traffic through Mainline where operators are co-located. | Q4 2026 | Substreams DS running at least one production consumer through Mainline. |
| Phase 3 — Verification tier | FirehoseDisputeVerifier for Ethereum L1 and at least one L2 goes live. slash() activated. Tier-1 proof-backed verification operational for supported chains. | Q1–Q2 2027 | First successful on-chain slash of a fraudulent operator (demonstrated on testnet first). |
These timings are deliberately aggressive and depend on Horizon contract audits, which are ongoing (Graph Horizon overview). They should be read as targets, not commitments.
5. Open Questions & Risks
Operator bar is materially higher than Subgraph indexing. Running a Firehose stack for Ethereum mainnet requires an instrumented geth or reth node (currently ~2 TB pruned), plus the merged-blocks archive in object storage. StreamingFast’s own documentation notes that Solana alone consumes ~61 GiB compressed per day (~22 TiB/year) of merged blocks (firehose-setup/solana), and Ethereum’s Substreams backend is sized at ~3 TiB for low-to-medium-complexity processing (streamingfast.io/pricing). Many current Subgraph indexers will not have the storage or bandwidth profile for Mainline. This is a feature, not a bug — it concentrates the operator set on serious infra teams — but it means we should not expect the Subgraph indexer count to map 1:1 onto the Mainline operator count. Realistically, 10–30 operators globally is the ceiling for Phase 2.
Who maintains the chain-specific patches? Geth-firehose, firesol, and the various other patches live in StreamingFast’s repos and are effectively maintained by StreamingFast. A decentralized operator set depends on these patches keeping up with hard forks. Concretely: when Ethereum ships a hard fork, the instrumented geth fork must ship within the fork’s activation window, or every Mainline operator’s Ethereum stream breaks simultaneously. Recommendation: the Graph Foundation (specifically my team’s chain-integrations work) should fund a formal SLA with StreamingFast for timely patch delivery on the top 10 chains, and should mirror and independently build these patches as a redundancy hedge. This is a genuine single-point-of-failure at the human/organizational level that cannot be papered over with decentralization rhetoric.
How does Mainline interact with the chain integrations I ship day-to-day? When we onboard a new chain to The Graph’s Subgraph support (currently a cross-team workflow between my team, StreamingFast, and the indexer community), we already have the Firehose instrumentation as a deliverable. Mainline changes the deliverable target: instead of “get StreamingFast’s Firehose endpoint to the indexer community,” it becomes “get the Firehose instrumentation into a state where any Mainline operator can run it.” This is mostly a docs and packaging shift — the code already exists — but it is work I will need to staff.
Bandwidth economics for small operators. An operator in a region with expensive egress (ex-US, ex-EU) may not be cost-competitive on a per-GiB basis. The subscription lane partially mitigates this by letting operators amortize fixed egress across a monthly commitment. Longer-term, geographic pricing tiers may be warranted; I do not recommend building that into Phase 1.
Proprietary-chain problem. Some chains have no open-source Firehose instrumentation (some Cosmos-based chains, newer alt-L1s). Mainline cannot serve these until someone invests in the instrumentation. This is the same gap that exists for Subgraphs today and is orthogonal to this proposal.
Race condition with the “Horizon-based P2P data service MVP” on the 2026 roadmap. The roadmap references a “Horizon-based P2P data service MVP” for Substreams in Q3 (The Graph Technical Roadmap). If that MVP bundles Firehose into Substreams, Mainline is redundant. If it layers Substreams over a separate Firehose DS, Mainline is that DS. This GRC exists partly to force that architectural decision explicitly rather than letting it get answered by default.
Cursor portability assumption. The portable cursor in §2.7 assumes Mainline operators maintain a ForkDB deep enough to resume from an arbitrary recent block. Operators that run a minimal ForkDB (only N blocks deep) will reject resumption from cursors outside that window. The protocol handles this gracefully with MAINLINE_CURSOR_UNRESUMABLE, but it means not every operator can serve every cursor; the gateway's routing must respect this.
6. Reference Implementation Plan
6.1 Repository structure
Model on graphprotocol/substreams-data-service and cargopete/dispatch:
graphprotocol/firehose-data-service/
├── contracts/ # Solidity (Hardhat + Foundry)
│ ├── FirehoseDataService.sol
│ ├── FirehoseDisputeVerifier.sol (Phase 3)
│ └── test/
├── mainline-service/ # Rust — the indexer-side daemon
│ ├── src/
│ │ ├── main.rs
│ │ ├── grpc/ # re-exports sf.firehose.v2 stubs via tonic
│ │ ├── attestation/ # MainlineAttestation signing
│ │ ├── billing/ # TAP receipt verification
│ │ └── chain_adapter/ # pluggable per-chain adapters (eth, sol, ...)
│ └── Cargo.toml
├── mainline-gateway/ # Rust — optional managed gateway (mirrors dispatch-gateway)
├── mainline-sdk/ # Rust + TypeScript consumer SDKs (TAP signing)
├── subgraph/ # The Graph network subgraph for Mainline state
└── docs/
Language: Rust throughout the off-chain stack. This matches GRC-005 Dispatch, matches GRC-004’s push on Rust as a first-class language (GRC-004), and matches the Rust-based indexer-service-rs and indexer-tap-agent v2.0.0 that the rest of the Horizon stack runs on. Go is not proposed for the service layer despite Firehose being a Go ecosystem — Mainline is not reimplementing Firehose, it is wrapping it in Horizon primitives that are all Rust.
6.2 Operator stack (reference topology)
┌─────────────────────────────────────────────────────────────┐
│ Instrumented node (geth-firehose / firesol / ...) │
│ ├─ dmlog → firecore reader │
│ └─ merged-blocks → object store (S3 / Ceph / GCS) │
└─────────────────────────────────────────────────────────────┘
│
▼
┌─────────────────────────────────────────────────────────────┐
│ firehose-core (Relayer + gRPC server), unmodified │
│ Port 13042 — speaks sf.firehose.v2 natively │
└─────────────────────────────────────────────────────────────┘
│ gRPC (internal)
▼
┌─────────────────────────────────────────────────────────────┐
│ mainline-service (Rust, this GRC) │
│ - TAP receipt validation │
│ - MainlineAttestation signing per block │
│ - Per-chain advertised-LIB publishing │
│ - Quality metrics (latency, throughput, completeness) │
│ - TLS termination │
└─────────────────────────────────────────────────────────────┘
│ sf.firehose.v2 (TLS)
▼
Consumers
Indexers continue to run the indexer-agent and indexer-tap-agent they already run for Subgraphs and Dispatch. Mainline adds one new binary (mainline-service) plus the Firehose stack itself.
6.3 Initial chain coverage
Phase 0: Ethereum Mainnet only. Smallest risk surface, best-understood Firehose integration, and the single most-demanded chain by existing Graph consumers.
Phase 1 expansion: Base, Solana, Arbitrum One. Base because it is fast-growing and fully-EVM (same geth-firehose patch); Solana because it is the most commercially interesting streaming market (per LaserStream’s traction); Arbitrum because the Graph protocol itself runs there and the dogfooding matters.
Phase 2: Optimism, Polygon, BSC, NEAR, Starknet, plus any chain sponsoring its own onboarding.
7. Economic Analysis
These numbers are rough estimates intended to size the opportunity, not commitments. All figures assume GRT at $0.10 for ease of reasoning.
7.1 Operator unit economics (Ethereum Mainnet only, Phase 2)
Annual fixed costs for one operator (single-chain, single-region):
| Line item | Estimate | Notes |
|---|---|---|
| Instrumented full node (compute) | $6,000 | 32 GB RAM, 16 vCPU, 4 TB NVMe, 24×7 |
| Archive/merged-blocks storage | $4,800 | ~5 TB @ $0.02/GB/mo object storage |
| Egress (assume 50 TB/year) | $4,500 | Hyperscaler egress rates; cheaper on bare metal |
| Operator labor share | $10,000 | Marginal cost for an existing indexer |
| Total | ~$25,300/yr | |
Break-even revenue at 4:1 stake-to-fees ratio and 25k GRT provision:
- Provision locked: 25,000 GRT = $2,500 at $0.10
- Max fees collectable before provision fully locked: 25,000 / 4 = 6,250 GRT per thawing cycle (21 days) = ~$625
- Annualized max fees (~17.4 thawing cycles per year): ~108,600 GRT ≈ ~$10,800
- At 0.35 GRT/GiB historical pricing, an operator needs ~18 TiB of paid egress per 21-day thawing cycle (~310 TiB/year) to approach that ceiling
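The same arithmetic as a checkable sketch (plain floats for readability; these are estimates, not protocol constants):

// Break-even arithmetic from the list above (sketch).
fn main() {
    let provision = 25_000.0_f64; // GRT
    let ratio = 4.0;              // stake-to-fees
    let cycle_days = 21.0;        // MIN_THAWING_PERIOD
    let price_per_gib = 0.35;     // GRT/GiB, historical lane

    let fees_per_cycle = provision / ratio;                 // 6,250 GRT
    let cycles_per_year = 365.0 / cycle_days;               // ~17.4
    let annual_ceiling = fees_per_cycle * cycles_per_year;  // ~108,600 GRT (~$10,800 at $0.10)
    let gib_per_cycle = fees_per_cycle / price_per_gib;     // ~17,900 GiB ≈ 18 TiB per cycle

    println!("{fees_per_cycle} {annual_ceiling:.0} {gib_per_cycle:.0}");
}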
This says something important: at Phase 2 volume, provision locking, not hardware, is the binding economic constraint. The stake-to-fees ratio should be revisited once real demand data exists. I prefer erring low (4:1) and tightening later rather than over-collateralizing and starving supply.
7.2 Pricing floor
The recommended default of 0.35 GRT/GiB historical backfill is roughly 40% below StreamingFast's managed Firehose pricing on a comparable-service basis (adjusting for the commercial bundling). This is deliberate: a decentralized service must be price-competitive with the centralized incumbent to earn migration, and the overhead cost of Mainline's on-chain settlement is real but small relative to the operator's underlying cost structure. The subscription lane is priced against Helius' LaserStream positioning ($999/mo ≈ 10,000 GRT/mo at $0.10): a Mainline-native subscription is per-chain at 500 GRT/mo, so a consumer subscribing to only a handful of chains pays well under the all-in incumbent price, with fine-grained per-chain control.