Do We Really Need Ethereum Blobs?!
Currently, the most popular and widely supported solution for scaling Ethereum is Layer-2 (L2) networks, specifically rollups. In a rollup, computations are performed off-chain, while transaction data is stored on-chain, where consensus is also established. This ensures the same level of security as the Ethereum mainnet.
In practice, this works by having an off-chain sequencer collect transactions, bundle them into a block, and then submit the new state root and transaction data to the blockchain. Rollups can be categorized into two main types:
Optimistic Rollups
In optimistic rollups, the system assumes that the new state root is valid (hence the name “optimistic”), so no immediate verification is performed. However, the new state root can be challenged within a certain time frame (e.g., 7 days). If someone submits a valid fraud-proof within this period, the state root is reverted to the last valid state, and the sequencer who submitted the invalid state root is penalized.
Pros: Optimistic rollups are cost-effective.
Cons: Moving assets between L1 and L2 is slow due to the challenge period. Additionally, implementing the fraud-proof mechanism is complex—Optimism, for example, had to build a full MIPS virtual processor to support it.
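The lifecycle described above can be sketched as a toy state machine. All names and parameters here are illustrative, not any production rollup's actual contract logic:

```python
CHALLENGE_PERIOD = 7 * 24 * 3600  # 7 days, in seconds

class OptimisticRollup:
    """Toy model of the optimistic rollup lifecycle (not a real contract)."""

    def __init__(self):
        self.finalized_root = "genesis"
        self.pending = None  # the optimistically accepted, not-yet-final root

    def submit_state_root(self, root, sequencer, bond, now):
        # The new root is accepted optimistically: no verification happens here.
        self.pending = {"root": root, "sequencer": sequencer,
                        "bond": bond, "submitted_at": now}

    def challenge(self, fraud_proof_valid, now):
        # Anyone may challenge during the window by submitting a fraud proof.
        assert self.pending and now - self.pending["submitted_at"] <= CHALLENGE_PERIOD
        if fraud_proof_valid:
            slashed = self.pending["bond"]  # the sequencer forfeits its bond
            self.pending = None             # roll back to the last valid root
            return slashed
        return 0

    def finalize(self, now):
        # Once the window passes unchallenged, the root becomes final.
        if self.pending and now - self.pending["submitted_at"] > CHALLENGE_PERIOD:
            self.finalized_root = self.pending["root"]
            self.pending = None
```

The key property is visible in the code: nothing checks the root at submission time; security rests entirely on someone submitting a valid fraud proof before the window closes.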
zk-Rollups
In zk-Rollups, the sequencer submits a zero-knowledge proof (zk-proof) along with the new state root and transaction data. This proof mathematically verifies that the new state root is correct without requiring explicit transaction validation on-chain.
Pros: Verifying a zk-proof on-chain is efficient and cost-effective, and transactions are faster because there is no need for a long challenge period.
Cons: Generating zk-proofs is computationally intensive and complex. However, improving algorithms and more powerful hardware are gradually reducing this challenge.
The Role of Blobs in Ethereum Rollups
Before blobs were introduced, rollups stored all transaction data in calldata, the same section where smart contract call parameters are recorded. While calldata is cheaper than contract storage, it is still priced for permanent on-chain availability, and it quickly became a cost bottleneck as rollup demand grew. This is where blobs come into play.
Blobs were introduced in Ethereum through EIP-4844. They are data packets attached to blocks, designed to offer a more scalable way to store transaction data. Unlike calldata, which remains permanently on-chain, a blob is only retained for around 18 days (4096 epochs). In exchange, blobs allow much larger and cheaper data storage than calldata.
The 18-day retention period is longer than the challenge period for optimistic rollups, making blobs suitable for storing data for fraud proofs.
Blobs function as sidecars to Ethereum blocks. Their contents are not accessible to smart contracts; instead, each blob is associated with a KZG commitment, a polynomial commitment that uniquely represents the blob's contents, much as a cryptographic hash would.
This KZG commitment enables the creation of inclusion proofs, which can verify whether a specific data entry exists at a certain position within a blob.
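The algebra behind such an evaluation proof is the factor theorem: p(z) = y exactly when (x − z) divides p(x) − y. The prover publishes the quotient q(x); the verifier checks the identity p(s) − y = q(s)·(s − z) at a point s that in real KZG is secret and is checked "in the exponent" via a pairing over BLS12-381. The sketch below works over a toy prime field and checks the identity by direct evaluation, which is insecure and for illustration only:

```python
P = 2**31 - 1  # toy prime field (real KZG uses the BLS12-381 scalar field)

def poly_eval(coeffs, x):
    """Horner evaluation of a polynomial (coeffs[i] is the x**i coefficient)."""
    acc = 0
    for c in reversed(coeffs):
        acc = (acc * x + c) % P
    return acc

def quotient(coeffs, z, y):
    """Synthetic division: q(x) = (p(x) - y) / (x - z), exact iff p(z) == y."""
    c = list(coeffs)
    c[0] = (c[0] - y) % P
    n = len(c) - 1
    q = [0] * n
    q[n - 1] = c[n]
    for i in range(n - 1, 0, -1):
        q[i - 1] = (c[i] + z * q[i]) % P
    assert (c[0] + z * q[0]) % P == 0, "p(z) != y, no valid proof exists"
    return q

# Prover: claims p(z) == y and publishes the quotient as the proof.
p_coeffs = [5, 3, 2, 7]             # p(x) = 5 + 3x + 2x^2 + 7x^3
z = 11
y = poly_eval(p_coeffs, z)
q_coeffs = quotient(p_coeffs, z, y)

# Verifier: checks the factor-theorem identity at a point s.
# (Real KZG keeps s secret and verifies this equation with a pairing.)
s = 123456789
assert (poly_eval(p_coeffs, s) - y) % P == (poly_eval(q_coeffs, s) * (s - z)) % P
```

Because a blob is encoded as the evaluations of a polynomial, this same check is what lets one prove that a specific data entry sits at a specific position within the blob.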
- In optimistic rollups, this feature is useful for validating fraud proofs.
- In zk-Rollups, the zero-knowledge proof must also verify that the transactions generate the expected KZG commitment, ensuring cryptographic consistency (as the KZG commitment is a public parameter).
Ethereum nodes are responsible for ensuring the availability of blob data. However, the design does not require every node to store every blob indefinitely. Ethereum's roadmap (full danksharding, with data availability sampling) applies erasure coding to blobs, distributing fragments across multiple nodes; as long as a sufficient number of nodes retain their assigned portions, the entire blob can be reconstructed when needed.
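A minimal sketch of the idea behind such erasure coding, using Reed–Solomon-style encoding over a small prime field (real implementations use optimized codes and much larger fields): k data values define a degree-(k−1) polynomial, shares are its evaluations at n points, and any k surviving shares reconstruct the data.

```python
P = 2**31 - 1  # toy prime field

def lagrange_at(shares, x0):
    """Evaluate the unique degree-(k-1) polynomial through `shares` at x0 (mod P)."""
    total = 0
    for i, (xi, yi) in enumerate(shares):
        num, den = 1, 1
        for j, (xj, _) in enumerate(shares):
            if i != j:
                num = num * (x0 - xj) % P
                den = den * (xi - xj) % P
        total = (total + yi * num * pow(den, -1, P)) % P
    return total

def encode(data, n):
    """Systematic encoding: data sits at x = 0..k-1, parity shares at x = k..n-1."""
    k = len(data)
    base = list(enumerate(data))  # (x, y) points defining the polynomial
    return base + [(x, lagrange_at(base, x)) for x in range(k, n)]

def reconstruct(shares, k):
    """Recover the original k data values from ANY k surviving shares."""
    assert len(shares) >= k, "not enough shares survived"
    pts = shares[:k]
    return [lagrange_at(pts, x) for x in range(k)]
```

For example, encoding 3 data values into 6 shares tolerates the loss of any 3 shares; losing more than n − k shares makes the data unrecoverable, which is exactly why the protocol must incentivize enough nodes to keep their fragments.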
Nodes that store blob data must provide inclusion proofs to verify that they indeed hold the data. Ensuring data availability is an essential function of the Ethereum protocol itself.
Blobs seem like an ideal solution to calldata’s scalability bottleneck. They offer cheaper, larger, and more efficient data storage while maintaining availability guarantees.
But do we really need them? Is this entire mechanism truly necessary?
First and foremost, unlike calldata, blobs do not provide permanent on-chain storage. Since blob data is only available for around 18 days, it suffices for generating fraud proofs, but anyone who later needs to reconstruct the correct state root still requires access to the historical data. This means an external archiving solution is necessary to preserve blob data. But if such an off-chain storage solution already exists, do we even need blobs in the first place?
Validium and Off-Chain Storage Solutions
Rollups that store transaction data off-chain are known as Validiums. A strong candidate for archiving blob data and supporting Validium-type rollups is Ethereum Swarm.
Swarm uses an erasure coding mechanism similar to that of Ethereum blobs to ensure redundant storage across multiple nodes. Additionally, in Swarm each data chunk is hashed using a Binary Merkle Tree (BMT), with the chunk's address being the Merkle root. This allows for inclusion proofs similar to those provided by the KZG commitment scheme for blobs.
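A minimal sketch of a BMT-style inclusion proof (using SHA-256 from the standard library as a stand-in for Swarm's Keccak-256, and small toy segments rather than Swarm's 128 segments of 32 bytes per chunk):

```python
import hashlib

def H(data: bytes) -> bytes:
    # sha256 as a stdlib stand-in for the Keccak-256 used by Swarm's BMT
    return hashlib.sha256(data).digest()

def bmt_root(segments):
    """Fold a power-of-two list of segments into a binary Merkle root."""
    level = [H(s) for s in segments]
    while len(level) > 1:
        level = [H(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
    return level[0]

def inclusion_proof(segments, index):
    """Collect the sibling hashes from leaf `index` up to the root."""
    level = [H(s) for s in segments]
    proof = []
    while len(level) > 1:
        proof.append(level[index ^ 1])  # sibling at the current level
        level = [H(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
        index //= 2
    return proof

def verify(leaf, index, proof, root):
    """Recompute the root from a leaf and its sibling path; O(log n) hashes."""
    node = H(leaf)
    for sibling in proof:
        node = H(sibling + node) if index & 1 else H(node + sibling)
        index //= 2
    return node == root
```

The proof is logarithmic in the chunk size, which is what makes the randomized spot-checks of the redistribution game cheap to verify on-chain.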
Currently, Swarm implements only a positive incentive system, in which data availability is verified using, among other checks, randomized inclusion proofs, much like with blobs. In simple terms, rewards are distributed through a recurring redistribution game, in which nodes can claim a reward pot by proving with inclusion proofs that they still store their assigned data.
This system could easily be enhanced with a stake-based guarantee for data storage. If a storage node fails to provide proof of inclusion, it wouldn’t just lose potential rewards—it could also forfeit a significant stake, ensuring stronger data availability guarantees.
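A toy sketch of how such a stake-based round might look; the class names and the slashing fraction are hypothetical, not part of Swarm's actual incentive contracts:

```python
class StorageNode:
    def __init__(self, name, stake, reliable=True):
        self.name, self.stake, self.reliable = name, stake, reliable

    def prove_inclusion(self, chunk_id):
        # Stand-in for a real BMT inclusion proof over the challenged chunk.
        return self.reliable

def redistribution_round(nodes, reward_pot, challenged_chunk):
    """One round: provers split the pot; failures are slashed, not just skipped."""
    winners, slashed = [], 0
    for node in nodes:
        if node.prove_inclusion(challenged_chunk):
            winners.append(node)
        else:
            penalty = node.stake // 2   # hypothetical slashing fraction
            node.stake -= penalty
            slashed += penalty
    # Slashed stake is added to the pot, rewarding honest storers further.
    share = (reward_pot + slashed) // max(len(winners), 1)
    for node in winners:
        node.stake += share
    return winners, slashed
```

The difference from today's purely positive scheme is the `penalty` branch: failing a proof now has a direct cost, which is what turns probabilistic checking into a meaningful availability guarantee.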
As we can see, a well-designed incentive system can ensure the same level of data availability as blobs. The only real advantage blobs have over Swarm is that Ethereum itself guarantees data availability.
But the key question remains: do we actually need this guarantee?
In optimistic rollups, the data availability (DA) challenge mechanism can be introduced alongside fraud proofs to address data loss risks.
If, despite the guarantees mentioned earlier, a portion of the data becomes unavailable on Swarm, a DA challenge can be initiated. The storage node responsible for the missing data must then respond by providing the requested chunk via calldata. If it fails to do so, the rollup is rolled back, just like in the case of a fraud-proof.
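The challenge flow described above can be sketched as follows (the window length, names, and hash interface are hypothetical):

```python
CHALLENGE_WINDOW = 100  # blocks; hypothetical parameter

class DAChallenge:
    """Toy data-availability challenge for a Validium-style rollup."""

    def __init__(self, chunk_address, expected_hash, opened_at):
        self.chunk_address = chunk_address
        self.expected_hash = expected_hash
        self.opened_at = opened_at
        self.resolved = False

    def respond(self, chunk_data, hash_fn, now):
        # The storage node posts the chunk via calldata; anyone can check
        # that it hashes to the chunk's known address/commitment.
        in_time = now - self.opened_at <= CHALLENGE_WINDOW
        if in_time and hash_fn(chunk_data) == self.expected_hash:
            self.resolved = True
        return self.resolved

    def expired(self, now):
        # Unanswered past the window: the rollup must roll back,
        # exactly as with a successful fraud proof.
        return not self.resolved and now - self.opened_at > CHALLENGE_WINDOW
```

Note that the challenge is self-verifying: because the chunk's address commits to its content, a response cannot be faked, and a missing response is itself the proof of unavailability.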
This mechanism ensures full validation of the state root without requiring blobs.
For optimistic rollups like Optimism, blobs are unnecessary. Storing transaction data on Swarm is much cheaper and more efficient than using blobs.
In zk-Rollups, the key requirement is proving that the blob commitment (a special hash) corresponds to the correct transactions.
Here, KZG commitments are more suitable than Swarm's Merkle roots, which are built with Keccak-256. Conventional hash functions such as Keccak are computationally expensive to evaluate inside a zero-knowledge circuit, which makes proof generation over BMT-based commitments costly.
Fortunately, Swarm’s Single Owner Chunk (SOC) solution allows defining custom chunk types while keeping them within the same address space. This is the same mechanism erasure coding relies on to generate dispersed replicas when the data to be protected consists of only a single chunk.
Thus, we could define chunks whose addresses are derived from KZG commitments, or retain the Binary Merkle Tree structure but replace Keccak-256 with a zk-friendly hash function such as Poseidon.
With these modifications, Swarm’s data availability mechanisms would continue to function seamlessly, while also becoming more zk-Rollup friendly. Swarm is a low-cost, highly adaptable general-purpose storage solution that also provides a reliable method for archiving transaction data.
An additional advantage of Swarm is that, beyond data storage, it also incentivizes data transfer—a crucial aspect that Ethereum itself currently lacks any built-in mechanism for.
Conclusion: Are Blobs a Step in the Wrong Direction?
Based on the above, I believe that introducing blobs is a fundamentally flawed approach. It’s essentially a hack—a makeshift temporary storage layer bolted onto the blockchain. However, this is unnecessary, as existing solutions already effectively address the problem.
Even the term “sidecar” highlights that blobs are not an organic part of the system—they have simply been tacked onto blocks rather than integrated into Ethereum’s architecture. The movement of blob data also places an unnecessary burden on the network. A blockchain should not be responsible for the temporary storage of large data packages.
Ethereum should stay a Blockchain—storage should be handled by storage engines.