Full text of Vitalik's ETHCC speech: How to optimize Ethereum in the future?

Vitalik Buterin

24-07-15 16:48

Original translation: 0xxz, Golden Finance

EthCC7 was recently held in Brussels, and the organizer invited Ethereum founder Vitalik to give a keynote speech.

It is worth noting that 2024 is the 10th anniversary of the Ethereum ICO. After Vitalik's speech, three of Ethereum's original core founders, Vitalik Buterin, Joseph Lubin and Gavin Wood, posed for a group photo together to mark the occasion.

This article is the keynote speech of Ethereum founder Vitalik at EthCC7 recently.

Talk Topic

Strengthening L1: Optimizing Ethereum to be a highly reliable, trustworthy, and permissionless Layer 2 base layer

Ethereum Vision Spectrum

I think there is a spectrum of possible different roles that the Ethereum base layer might play in the ecosystem over the next five to ten years. You can think of it as a spectrum from left to right.

On the left side of the spectrum, it basically tries to be a very minimalist base layer that basically just acts as a proof validator for all L2s. Maybe also provide the ability to transfer ETH between different L2s. But other than that, that's basically it.

On the right side of the spectrum, there's basically a refocus on dApps running primarily on L1, with L2 only being used for some very specific and high-performance transactions.

In the middle of the spectrum there are some interesting options. I put Ethereum as a base layer for L2 second from the left. On the far left I put an extreme version, and the extreme version is that we completely throw away the execution client part of the entire Ethereum, just keep the consensus part, add some zero-knowledge proof validators, and basically turn the entire execution layer into a Rollup as well.

So the very extreme option is on the far left. Moving to the right, Ethereum can still be a base layer, but it can also try to provide more functionality for L2s. One idea in this direction is to further reduce Ethereum's slot time, which is currently 12 seconds, maybe down to 2-4 seconds. The purpose of this is to make based rollups feasible as the main way L2s operate. Right now, if you want an L2 to have a top-tier UX, it needs its own pre-confirmations, which means either a centralized sequencer or its own decentralized sequencer. If L1 consensus speeds up, then L2s will no longer need to do this. And if you really enhance the scalability of L1, then the need for L2s will also decrease.

So, it's a spectrum. Right now I'm focusing on the second version on the left, but the things I'm suggesting here also apply to other visions, and the suggestions here don't actually hinder other visions. This is something I think is very important.

Ethereum's robustness advantage

A big advantage of Ethereum is that there is a large and relatively decentralized staking ecosystem.

The left side of the above picture is a chart of all Bitcoin mining pools, and the right side is a chart of Ethereum stakers.

Bitcoin's hashrate distribution is not very good at the moment, with two mining pools adding up to more than 50% of the hashrate, and four mining pools adding up to more than 75%.

And Ethereum's situation is actually better than the chart shows, because the second-largest segment, the gray one, is actually unidentified, which means it could be a combination of many people, and there may even be a lot of independent stakers in it. And the blue segment, Lido, is actually a weird, loosely coordinated structure consisting of 37 different validators. So, Ethereum actually has a relatively decentralized staking ecosystem that performs quite well.

There are a lot of improvements we can make in this regard, but I think there is still value in recognizing this. This is one of the unique advantages that we can really build on top of it.

Ethereum's robustness advantages also include:

· Having a multi-client ecosystem: There are Geth execution clients and non-Geth execution clients, and the proportion of non-Geth execution clients even exceeds that of Geth execution clients. A similar situation also occurs in consensus client systems;

· International community: people are in many different countries, including projects, L2, teams, etc.;

· Multi-center knowledge ecosystem: there is the Ethereum Foundation, there are client teams, and even teams like Paradigm's Reth have been increasing their leadership in open source recently;

· A culture that values these attributes

So, the Ethereum ecosystem already has these very strong advantages as a base layer. I think this is a very valuable thing and should not be given up easily. I would even say that there are clear steps that can be taken to further advance these advantages and even make up for our weaknesses.

Where does Ethereum L1 fall short of high standards and how can it be improved?

This is a poll I did on Farcaster about half a year ago: If you are not doing Solo staking, what is stopping you from doing it?

I can repeat this question in this room, who is doing Solo staking? If you are not doing Solo staking, who thinks the 32 ETH threshold is the biggest barrier, who thinks it is too difficult to run a node is the biggest barrier, who thinks the biggest barrier is not being able to put your ETH into DeFi protocols at the same time? Who thinks the biggest barrier is the fear of having to put your private key on a running node that is more vulnerable to theft?

As you can see, the top two barriers that are unanimously agreed upon are: the minimum requirement of 32 ETH and the difficulty of operating a node. It is always important to recognize this.

A lot of times when we start to dig into how to maximize how people can double-use their collateral in DeFi protocols, we find that a large number of people are not even using DeFi protocols at all. So let's focus on the main issues and what we can do to try to solve them.

Starting from running a validating node, or, in other words, starting from a threshold of 32 ETH. Actually, these two questions are related because they are both functions of the number of validators in Ethereum Proof of Stake.

Today we have about 1 million validator entities, each with a deposit of 32 ETH, so if the minimum requirement changes to 4 ETH, then we would have 8 million or maybe more than 8 million, maybe 9 million or 10 million validators. If we want to reduce to 100,000 validators, then the minimum requirement may have to go up to about 300 ETH.
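The trade-off between the minimum deposit and the validator count can be checked with a few lines of arithmetic. The sketch below assumes a fixed total stake of roughly 32 million ETH (1 million validators times 32 ETH, the figures from the talk) and assumes every validator holds exactly the minimum; the function name is hypothetical.

```python
# Assumption: total stake stays fixed at ~32 million ETH
# (about 1 million validators x 32 ETH, as stated in the talk).
TOTAL_STAKED_ETH = 32_000_000


def validator_count(total_staked_eth: int, min_deposit_eth: int) -> int:
    """Upper bound on the number of validators if each holds exactly the minimum."""
    return total_staked_eth // min_deposit_eth


for deposit in (32, 4, 300):
    count = validator_count(TOTAL_STAKED_ETH, deposit)
    print(f"min deposit {deposit:>3} ETH -> about {count:,} validators")
```

With a 4 ETH minimum this gives 8 million validators, and with a 300 ETH minimum it gives a little over 100,000, matching the rough numbers above.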

So, it's a trade-off. Ethereum has historically tried to be in the middle of the trade-off. But if we can find any way to improve it, then we gain extra headroom that we can choose to spend either on reducing the minimum requirement or on making it easier to run a node.

In fact, right now I think that aggregating signatures isn't even the main difficulty in running a node. In the beginning, we'll probably focus more on reducing the minimum requirements, but eventually it's going to be both.

So, there are two techniques that can improve both of these aspects.

One technique is to allow staking or finality without requiring every validator to sign. Basically, you need some kind of random sampling, randomly sampling enough nodes to achieve significant economic security.

Right now, I think we have far more than enough economic security. The cost of doing a 51% attack, in terms of the amount of ETH slashed, is one-third of 32 million ETH, which is about 11 million ETH. Who is going to spend 11 million ETH to break the Ethereum blockchain? No one, not even the US government.

The current situation is like having a house where the front door is protected by four layers of steel plate, but the windows are just shoddy glass that someone can easily break with a baseball bat. I think Ethereum is like that to some extent: if you want to do a 51% attack, you have to lose 11 million ETH, but in reality there are many other ways to attack the protocol, and we really should strengthen those defenses more. So if instead you have a subset of validators doing finality, the protocol is still secure enough, and you can really increase the level of decentralization.
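To make the sampling idea concrete, here is a minimal sketch under stated assumptions: every validator holds 32 ETH, the slashable amount is one-third of the signing stake (the figure used in the talk), and the committee is drawn with an ordinary seeded pseudo-random generator. A real protocol would need unbiasable on-chain randomness; the function names and the 100,000 committee size are hypothetical.

```python
import random

DEPOSIT_ETH = 32  # assumption: every validator holds exactly 32 ETH


def sample_committee(num_validators: int, committee_size: int, seed: int) -> list:
    """Pick a random subset of validator indices to sign for finality.

    Uses a seeded PRNG purely for illustration, not protocol-grade randomness.
    """
    rng = random.Random(seed)
    return rng.sample(range(num_validators), committee_size)


def slashable_eth(num_signers: int, deposit_eth: int = DEPOSIT_ETH) -> int:
    """ETH an attacker loses: one-third of the stake that signs."""
    return num_signers * deposit_eth // 3


committee = sample_committee(1_000_000, 100_000, seed=7)
print("full set :", slashable_eth(1_000_000), "ETH at stake")
print("committee:", slashable_eth(len(committee)), "ETH at stake")
```

Even a committee one-tenth the size of today's validator set still puts roughly a million ETH at risk for an attacker, which illustrates why a sampled subset may be "secure enough".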

The second technique is better signature aggregation. Instead of supporting 30,000 signatures per slot, you can do some advanced stuff like STARKs, and eventually we might be able to support more than that. That's the first part.

The second part is making it easier to run a node.

The first step is history expiration, that is, EIP-4444, and there has been a lot of progress on that.

The second step is stateless clients. Verkle has been around for a long time, and another possible option is a binary hash tree built on a STARK-friendly hash function such as Poseidon. Once you have that, you no longer need a hard disk to verify Ethereum blocks. Later on, you can also add a Type 1 ZKVM that can STARK-verify the entire Ethereum block, so that you can verify arbitrarily large Ethereum blocks by downloading the data, or even just data availability sampling data, and then you only need to verify a proof.

If you do that, it will become much easier to run a node. One very annoying thing about not having stateless clients right now is that if you want to change your hardware or software setup, usually you either need to start from scratch and lose a day, or you need to do something very dangerous and put your keys in two places, which can get you slashed. If we have stateless clients, you no longer need to do that.

You can simply start a new standalone client, shut down the old one, move the keys over, and start the new one. You only lose one epoch.

Once you have ZKVM, the hardware requirements are basically reduced to almost zero.

So, the 32 ETH threshold and the difficulty of running a node are both problems that can be solved technically. I think there are a lot of other benefits to doing this: it will really improve people's ability to stake individually, give us a better ecosystem for individual staking, and help avoid the risk of staking centralization.

There are other challenges with proof of stake, such as risks associated with liquid staking, risks associated with MEV. These are also important issues that need to continue to be considered. Our researchers are thinking about these.

Recovering from a 51% attack

I really started to think seriously and rigorously. It's surprising that many people don't think about this topic at all and just treat it as a black box.

What would happen if a 51% attack really happened?

Ethereum could be 51% attacked, Bitcoin could be 51% attacked, a government could be 51% attacked, like buying off 51% of the politicians.

One problem is that you don’t want to rely on just prevention, you want to have a recovery plan as well.

A common misconception is that people think 51% attacks are about reversing finality. People focus on this because this is something that Satoshi Nakamoto emphasized in the white paper. You can double spend, after I bought my private jet, I 51% attack, get my Bitcoin back, and I can keep my private jet and fly around.

Actually more realistic attacks might involve deposits on exchanges and things like breaking DeFi protocols.

But reversals are actually not the worst thing. The biggest risk we should worry about is actually censorship. 51% of the nodes stop accepting blocks from the other 49% of the nodes or any node that tries to include a certain type of transaction.

Why is this the biggest risk? Because finality reversal has slashing, there is immediate on-chain verifiable evidence that at least a third of the nodes did something very, very wrong and they were punished.

Whereas a censorship attack is not programmatically attributable: there is no immediate programmatic evidence to say who did something bad. Now, if you are an online node, you could check whether a certain transaction has failed to be included within 100 blocks, but we don't even have software written to do this kind of check.
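The kind of check described above could look roughly like the following. This is only a sketch under simplifying assumptions: the class and method names are hypothetical, transactions are identified by plain strings, and it ignores everything a real detector would need to consider (whether the transaction pays enough fees, whether it is still valid, how it was gossiped, and so on).

```python
from dataclasses import dataclass, field


@dataclass
class CensorshipMonitor:
    """Flags transactions that stay pending for too many blocks (hypothetical sketch)."""

    max_delay_blocks: int = 100  # the 100-block window mentioned in the talk
    first_seen: dict = field(default_factory=dict)  # tx hash -> height when first seen

    def observe_pending(self, tx_hash: str, height: int) -> None:
        """Record the block height at which we first saw a pending transaction."""
        self.first_seen.setdefault(tx_hash, height)

    def observe_block(self, height: int, included: set) -> list:
        """Process a new block and return the transactions that look censored."""
        for tx_hash in included:
            self.first_seen.pop(tx_hash, None)  # included, so no longer pending
        return sorted(
            tx_hash
            for tx_hash, seen in self.first_seen.items()
            if height - seen >= self.max_delay_blocks
        )


monitor = CensorshipMonitor()
monitor.observe_pending("0xaa", 1000)
monitor.observe_pending("0xbb", 1000)
print(monitor.observe_block(1050, {"0xbb"}))  # nothing overdue yet
print(monitor.observe_block(1100, set()))     # "0xaa" has waited 100 blocks
```

Note that this only detects a suspicious delay locally; it does not by itself tell different nodes how to agree on a response.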

Another challenge with censorship is that if someone wants to attack, they can do so, they start by delaying transactions and blocks they don't like for 30 seconds, then delay it for a minute, then delay it for two minutes, and you don't even have consensus on when to respond.

So, I say, actually censorship is the bigger risk.

There's an argument in blockchain culture that if there's an attack, the community will come together and they'll obviously do a minority soft fork and cut off the attacker.

That may be true today, but that relies on a lot of assumptions about coordination, ideology, all kinds of other things, and it's not clear how true that will be in 10 years. So what a lot of other blockchain communities are starting to do is they say, we have things like censorship, we have these inherently more unattributable bugs. So we have to rely on social consensus. So let's just rely on social consensus and proudly admit that we're going to use it to solve our problems.

I'm actually advocating for going in the opposite direction. We know that it's mathematically impossible to fully coordinate an automated response and an automated fork to a majority attacker who's censoring. But we can get as close as we can.

You can create a fork that, based on some assumptions about network conditions, actually brings in at least a majority of the nodes online. The point I'm trying to make here is that what we actually want is to try to make the response to a 51% attack as automated as possible.

If you're a validator, then your node should be running software that automatically forks away from the majority chain if it detects that transactions are being censored or that certain validators are being censored, and all honest nodes will automatically coordinate on the same minority soft fork because of the code they run.

Of course, again there is the mathematical impossibility result, at least anyone who is offline at the time will not be able to tell who is right and who is wrong.

There are a lot of limits, but the closer you get to this goal, the less work social consensus needs to do.

If you imagine a 51% attack that actually happens. It's not going to be like, all of a sudden at some point in time, Lido, Coinbase, and Kraken are going to put out a blog post at 5:46 that basically says, hey guys, we're censoring now.

What's going to happen is that you're going to see a social media war at the same time, you're going to see all kinds of other attacks at the same time. If in fact a 51% attack does happen, by the way, I mean, we shouldn't assume that Lido, Coinbase, and Kraken are going to be in power in 10 years. The Ethereum ecosystem is going to become increasingly mainstream, and it needs to be very resilient to that. We want the social layer to be as lightly burdened as possible, which means we need the technical layer to at least present a clear winning candidate, and if they want to fork off of a chain that's censoring, they should rally around a minority soft fork.

I advocate that we do more research and come up with a very specific proposal.

Proposal: Raise the Quorum Threshold to 75% or 80%

I think that the threshold for Quorum (Note: Quorum mechanism is a voting algorithm commonly used in distributed systems to ensure data redundancy and eventual consistency) can be raised from today's two-thirds to around 75% or 80%.

The basic argument is that if a malicious chain, such as a censored chain, attacks, it becomes very, very difficult to recover. However, on the other hand, if you increase the proportion of Quorum, what is the risk? If the Quorum is 80%, then instead of 34% of the nodes being offline to stop finality, it is 21% of the nodes being offline to stop finality.

This is a risk. So let's see what happens in practice. From what I've read, I think we've only had one instance where finality was down for about an hour due to more than a third of the nodes being offline. And then, have there been any incidents where 20% to 33% of the nodes were offline? I think at most once, at least zero. Because in practice very few validators are offline, I actually think that the risk of doing this is pretty low. The payoff is basically that the threshold that an attacker needs to hit is greatly increased, and the range of scenarios where the chain goes into safe mode in the event of a client vulnerability is greatly increased, so people can really collaborate to figure out what the problem is.

If the threshold for Quorum goes from 67% to 80%, then the share that a single client needs to reach before it alone can finalize also goes from 67% to 80%, so the value that a minority client can provide really starts to increase.
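The 34% and 21% figures above follow directly from the quorum threshold. The sketch below assumes finality needs at least the quorum fraction of validators online, so finality stalls once the offline share strictly exceeds one minus the quorum; the function name is hypothetical.

```python
import math
from fractions import Fraction


def min_offline_percent_to_halt(quorum: Fraction) -> int:
    """Smallest whole percentage of validators that must be offline to stop finality.

    Assumption: finality needs at least `quorum` of validators online, so it
    stalls once the offline share strictly exceeds 1 - quorum.
    """
    return math.floor((1 - quorum) * 100) + 1


for label, quorum in (("2/3", Fraction(2, 3)), ("75%", Fraction(3, 4)), ("80%", Fraction(4, 5))):
    print(f"quorum {label}: finality stalls at {min_offline_percent_to_halt(quorum)}% offline")
```

Using exact fractions avoids floating-point rounding at the boundary, which matters because the answer changes right at the threshold.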

Other censorship concerns

The other censorship-related items are inclusion lists, or some alternative to inclusion lists. The whole multiple parallel proposers idea, if it works, could potentially even be a replacement for inclusion lists. You also need account abstraction, specifically some kind of in-protocol account abstraction.

The reason you need it is, because right now, smart contract wallets don't really benefit from inclusion lists. Smart contract wallets don't really benefit from any kind of protocol-level censorship resistance guarantees.

If there was an in-protocol account abstraction, then they would benefit. So, there are a lot of things, actually a lot of these things have value in both the L2-centric vision and the L1-centric vision.

Of the different ideas that I talked about, I think about half are probably specific to an L2-focused Ethereum, but the other half are basically valuable both for L2s as users of the Ethereum base layer and for L1, that is, direct user-facing applications, as users.

Light clients everywhere

In a lot of ways, it's a little sad how we interact with the space. We're decentralized, we're trustless, but who in this room runs a light client on their computer that verifies consensus? Very few. Who uses Ethereum through a browser wallet that trusts Infura? In five years, I'd like to see the number of hands raised reversed. I'd like to see wallets that don't trust Infura for anything. We need to integrate light clients.

Infura can continue to provide data. I mean, if you don't have to trust Infura, that's actually good for Infura because it makes it easier for them to build and deploy infrastructure, but we have tools that can remove the trust requirement.

What we can do is have a system where the end user runs something like the Helios light client. It should actually run directly in the browser, directly verifying the Ethereum consensus. If the user wants to verify something on the chain, or interact with the chain, then they just verify the Merkle proof directly.

If you do that, you actually get a level of trustlessness in your interaction with Ethereum. This is for L1. In addition, we need an equivalent solution for L2.

On the L1 chain, there are block headers, there is state, there is a sync committee, there is consensus. If you verify the consensus and you know what the block header is, you can walk the Merkle branch and see what the state is. So how do we provide light client security guarantees for L2s? The state root of the L2 is there, and if it's a based rollup, there's a smart contract that stores the block headers for the L2. Or if you have preconfirmations, then you have a smart contract that stores who the preconfirmers are, so you determine who the preconfirmers are and then listen for signatures from a two-thirds subset of them.

So once you have the Ethereum block headers, there's a fairly simple chain of trust, hashes, Merkle branches, and signatures that you can verify, and you can get light client verification. The same is true for any L2.
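"Walking a Merkle branch" can be illustrated with a small sketch. This is a simplified binary Merkle tree using SHA-256, chosen only because it is in the Python standard library; Ethereum's actual state uses a Merkle-Patricia trie with Keccak-256, so this shows the principle rather than the real format. All function names are hypothetical.

```python
import hashlib


def h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def _next_level(level: list) -> list:
    """Hash adjacent pairs to build the parent level (expects an even length)."""
    return [h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]


def merkle_root_and_branch(leaves: list, index: int):
    """Return the root and the sibling hashes on the path from leaf `index` to the root."""
    level = [h(leaf) for leaf in leaves]
    branch = []
    while len(level) > 1:
        if len(level) % 2:
            level.append(level[-1])  # duplicate the last node on odd-sized levels
        branch.append(level[index ^ 1])  # sibling of the current node
        level = _next_level(level)
        index //= 2
    return level[0], branch


def verify_branch(leaf: bytes, index: int, branch: list, root: bytes) -> bool:
    """Recompute the root from a leaf and its branch; only the trusted root is needed."""
    node = h(leaf)
    for sibling in branch:
        node = h(node + sibling) if index % 2 == 0 else h(sibling + node)
        index //= 2
    return node == root


accounts = [b"alice:10", b"bob:25", b"carol:7", b"dave:0", b"erin:3"]
root, proof = merkle_root_and_branch(accounts, 2)
print(verify_branch(b"carol:7", 2, proof, root))    # True
print(verify_branch(b"carol:700", 2, proof, root))  # False
```

The verifier never sees the full data set: it needs only the root (which a light client gets from a verified block header), the claimed leaf, and a logarithmic number of sibling hashes.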

I've brought this up to people in the past, and a lot of times the response is, wow, that's interesting, but what's the point? A lot of L2s are multisig. Why don't we trust the multisig to verify the multisig?

Fortunately, as of last year, that's actually no longer true. Optimism and Arbitrum are Stage 1 rollups, which means they actually have proof systems running on-chain. There's a security committee that can override them in case there are vulnerabilities, but the security committee needs to pass a very high voting threshold, like 75% of 8 people, and Arbitrum's is growing to 15 people. So, in the case of Optimism and Arbitrum, they're not just multisigs: they have actual proof systems, and those proof systems actually have power, at least in terms of majority power in deciding which chain is right or wrong.

The EVM is even further along, I believe it doesn't even have a security committee, so it's completely trustless. We're really starting to move forward on that, and I know a lot of other L2s are moving forward on that as well. So L2 is more than just multisig, so the concept of light clients for L2 is actually starting to make sense.

We can already verify Merkle branches today, just by writing code. Tomorrow, we can also verify ZKVM proofs, so you can fully validate Ethereum and L2s in your browser wallet.

Who wants to be a trustless Ethereum user in a browser wallet? Great. Who would rather be a trustless Ethereum user on their phone? From a Raspberry Pi? From a smartwatch? From the space station? We'll get to that, too. So, what we need is the equivalent of an RPC configuration that contains not only which servers you're talking to, but also the actual light client validation instructions. That's something we can work towards.

Quantum-Resistant Strategies

The timeline for quantum computing is shortening. Metaculus thinks quantum computers will arrive in the early 2030s, and some think sooner.

So we need a quantum-resistant strategy. We do have one. There are four parts of Ethereum that are vulnerable to quantum computing, and there are natural replacements for each of them.

· Verkle trees: the quantum-resistant alternative is STARKed Poseidon hashes or, if we want to be more conservative, Blake;

· Consensus signatures: we currently use BLS aggregate signatures, which can be replaced with STARK aggregate signatures;

· Blobs: these use KZG, which can be replaced with separately encoded Merkle trees plus STARK proofs;

· User accounts: these currently use ECDSA over secp256k1, which can be replaced with hash-based signatures together with account abstraction and aggregation, smart contract wallets, ERC-4337, etc.

Once we have that, users can set their own signing algorithms, and basically use hash-based signatures. I think we really need to start thinking about actually building hash-based signatures in a way that user wallets can easily upgrade to hash-based signatures.
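For readers unfamiliar with hash-based signatures, here is a minimal sketch of the simplest scheme of this kind, a Lamport one-time signature. It is an illustrative choice, not something the talk specifies: a production wallet would use a more compact, multi-use construction, and each Lamport key pair must sign only one message. Security rests only on the hash function, which is why this family is considered quantum-resistant.

```python
import hashlib
import secrets

BITS = 256  # we sign the 256-bit SHA-256 digest of the message


def _h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def keygen():
    """Secret key: 256 pairs of random values. Public key: their hashes."""
    sk = [(secrets.token_bytes(32), secrets.token_bytes(32)) for _ in range(BITS)]
    pk = [(_h(zero), _h(one)) for zero, one in sk]
    return sk, pk


def _digest_bits(message: bytes) -> list:
    digest = _h(message)
    return [(digest[i // 8] >> (7 - i % 8)) & 1 for i in range(BITS)]


def sign(sk, message: bytes) -> list:
    """Reveal one secret per digest bit. The key must never be reused."""
    return [sk[i][bit] for i, bit in enumerate(_digest_bits(message))]


def verify(pk, message: bytes, signature: list) -> bool:
    """Check that each revealed secret hashes to the matching public value."""
    if len(signature) != BITS:
        return False
    bits = _digest_bits(message)
    return all(_h(sig) == pk[i][bit] for i, (sig, bit) in enumerate(zip(signature, bits)))


sk, pk = keygen()
sig = sign(sk, b"send 1 ETH to bob")
print(verify(pk, b"send 1 ETH to bob", sig))    # True
print(verify(pk, b"send 100 ETH to bob", sig))  # False
```

The signature here is 8 KB (256 values of 32 bytes), which hints at why aggregation and account abstraction are mentioned alongside hash-based signatures: the schemes are simple and conservative, but far larger than ECDSA signatures.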

Protocol Simplification

If you want a strong base layer, the protocol needs to be simple. It shouldn't have 73 random hooks and some backward compatibility that exists because of some random stupid idea that some random person named Vitalik came up with in 2014.

So there's value in trying to really simplify, start to really eliminate technical debt. Logs are currently based on bloom filters, they don't work very well, they're not fast enough, so there needs to be improvements to Log that add stronger immutability, and we're already doing that on the stateless side, basically limiting the amount of state access per block.

Ethereum is currently an incredible combination of RLP, SSZ, and API. Ideally, we should only use SSZ, or at least get rid of RLP. For state, we should move to binary Merkle trees; once you have a binary Merkle tree, all of Ethereum is on binary Merkle trees.

Other items include fast finality, that is, Single Slot Finality (SSF), and cleaning up unused precompiles, such as the MODEXP precompile, which often causes consensus errors. If we can delete it and replace it with high-performance Solidity code, that would be great.

Summary

As a strong base layer, Ethereum has very unique advantages, including some advantages that Bitcoin does not have, such as consensus decentralization, such as significant research on 51% attack recovery, etc.

I think it is necessary to really strengthen these advantages while recognizing and correcting our shortcomings, making sure we meet very high standards. These ideas are completely compatible with an aggressive L1 roadmap.

One of the things I'm most pleased with about Ethereum, especially the core development process, is that our ability to work in parallel has greatly improved. This is a strong point, and we can actually work on a lot of things in parallel. So caring about these topics doesn't actually affect the ability to improve the L1 and L2 ecosystem. For example, improving the L1 EVM to make it easier to do cryptography. It's currently too expensive to verify a Poseidon hash in the EVM. 384-bit cryptography is also too expensive.

So there are some ideas on top of EOF, like SIMD opcodes, EVM-MAX, etc. There is an opportunity to attach this kind of high-performance coprocessor to the EVM. This is better for Layer 2 because L2s can verify proofs more cheaply, and it is also better for Layer 1 applications because privacy protocols that rely on zk-SNARKs become cheaper.

Who has used privacy protocols? Who wants to pay 40 fees instead of 80 fees using privacy protocols? More people. The second group can use it on Layer 2, and Layer 1 can get significant cost savings.

Ethereum "Big Three" reunited

2024 is the 10th anniversary of the Ethereum ICO. EthCC 2024 invited three of Ethereum's original core founders, Vitalik Buterin, Joseph Lubin, and Gavin Wood, to attend.

After Vitalik’s speech, they were invited to take a group photo: