Original Title: What I would love to see in a wallet
Original Author: Vitalik Buterin
Original Translation: DeepTech Techflow
Special thanks to Liraz Siri, Yoav Weiss, and the feedback and review from ImToken, Metamask, and OKX developers.
A key layer of the Ethereum infrastructure stack is the wallet, but it is often underestimated by core L1 researchers and developers. The wallet is the window between users and the Ethereum world: users benefit from the decentralization, censorship resistance, security, privacy, or other properties offered by Ethereum and its applications only to the extent that the wallet itself also has those properties.
Recently, we have seen Ethereum wallets make significant progress in improving user experience, security, and functionality. The purpose of this article is to give my own views on some of the features an ideal Ethereum wallet should have. This is not an exhaustive list; it reflects my cypherpunk bias, which focuses on security and privacy, and it is almost certainly incomplete with respect to user experience. However, I believe wishlists are less effective for optimizing user experience than simply deploying and iterating based on feedback, so a wishlist is most valuable when it focuses on security and privacy properties.
There is now an increasingly detailed roadmap for improving the cross-L2 user experience, with a short-term part and a long-term part. Here, I will discuss the short-term part: ideas that could theoretically be implemented today.
The core idea is (i) built-in cross-Layer-2 sending, and (ii) chain-specific addresses and payment requests. Your wallet should be able to provide you with an address (following the style of this ERC draft) like this:
When someone (or some application) gives you an address in this format, you should be able to paste it into the wallet's "recipient" field and click "send." The wallet should then automatically handle the send in whatever way it can:
· If you already have enough of the required type of token on the target chain, send the token directly
· If you have the required type of token on another chain (or multiple other chains), use protocols like ERC-7683 to send the tokens (which is essentially a cross-chain DEX)
· If you have different types of tokens on the same chain or on another chain, use a decentralized exchange to convert them to the correct type of currency on the correct chain and send it. This should require the user's explicit permission: the user will see how much they are paying in fees and how much the recipient is receiving.
A model of a wallet interface that supports cross-chain addresses
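The three cases in the list above can be sketched as a small routing function. This is only an illustration: the `Holding` and `Plan` types, the strategy names, and the `plan_send` function are hypothetical, and the swap case ignores prices and fees that a real wallet would have to quote to the user.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Holding:
    chain: str
    token: str
    amount: int  # in the token's smallest unit


@dataclass(frozen=True)
class Plan:
    strategy: str
    needs_confirmation: bool


def plan_send(holdings, target_chain, token, amount):
    """Pick a strategy for paying `amount` of `token` on `target_chain`."""
    # Case 1: enough of the right token already sits on the target chain.
    on_target = sum(h.amount for h in holdings
                    if h.token == token and h.chain == target_chain)
    if on_target >= amount:
        return Plan("direct_transfer", needs_confirmation=False)

    # Case 2: enough of the right token exists across one or more chains,
    # so a cross-chain protocol (the article mentions ERC-7683) can move it.
    anywhere = sum(h.amount for h in holdings if h.token == token)
    if anywhere >= amount:
        return Plan("cross_chain_transfer", needs_confirmation=False)

    # Case 3: only other tokens are available, so a DEX swap is needed.
    # The article requires explicit user permission here.
    if any(h.amount > 0 for h in holdings):
        return Plan("swap_then_send", needs_confirmation=True)

    return Plan("insufficient_funds", needs_confirmation=False)
```

Keeping the swap case as the only one that sets `needs_confirmation` mirrors the requirement that the user sees the fees paid and the amount received before any conversion happens.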
The above applies to the use case of "you copy-paste an address (or ENS name, e.g., vitalik.eth@optimism.eth) and someone pays you." If a dapp requests a deposit (e.g., see this Polymarket example), the ideal flow is to extend the web3 API so that the dapp can make a chain-specific payment request. Your wallet can then fulfill that request in whatever way it needs to. For the user experience to work well, the getAvailableBalance request also needs to be standardized, and wallets need to think carefully about which chains to store users' assets on by default in order to maximize security and ease of transfer.
Chain-specific payment requests can also be embedded in a QR code, which a mobile wallet can scan. In face-to-face (or online) consumer payment scenarios, the receiver issues a QR code or web3 API call saying "I want X units of token Y on chain Z, with reference ID or callback W," and the wallet is free to fulfill that request in any way it can. Another option is a claim link protocol, where the user's wallet generates a QR code or URL containing a claim authorization to retrieve a certain amount of funds from their on-chain contract, and it is the receiver's job to figure out how to move those funds to their own wallet.
Another related topic is gas payment. If you receive assets on an L2 where you do not yet hold ETH and need to send a transaction on that L2, the wallet should be able to automatically use a protocol (e.g., RIP-7755) to pay for the gas on a chain where you do hold ETH. If the wallet expects you to make more transactions on that L2 in the future, it should also simply use a DEX to send over, for example, a few million gas worth of ETH, so that future transactions can pay for gas directly there (as it's cheaper).
One way I conceptualize the challenge of account security is that a good wallet should serve two purposes simultaneously: (i) protecting users from being hacked or attacked by the wallet developer and (ii) protecting users from their own mistakes.
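As a minimal sketch of what such a request might carry, the example below serializes the X, Y, Z, and W fields from the paragraph above into a compact string that could be placed in a QR code or URL. The `PaymentRequest` type, its field names, and the JSON encoding are assumptions made for illustration; they are not taken from any existing standard.

```python
import json
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class PaymentRequest:
    chain: str      # chain Z
    token: str      # token Y
    amount: str     # X units, kept as a decimal string to avoid float rounding
    recipient: str  # address that should receive the funds
    reference: str  # reference ID or callback W


def encode_request(req: PaymentRequest) -> str:
    """Serialize the request deterministically (hypothetical wire format)."""
    return json.dumps(asdict(req), sort_keys=True, separators=(",", ":"))


def decode_request(payload: str) -> PaymentRequest:
    """Parse a payload produced by encode_request."""
    return PaymentRequest(**json.loads(payload))


# Example usage with a placeholder recipient address.
request = PaymentRequest(
    chain="optimism",
    token="USDC",
    amount="12.50",
    recipient="0x" + "ab" * 20,
    reference="order-42",
)
payload = encode_request(request)
```

The request deliberately says nothing about where the payer's funds currently sit, which is what leaves the wallet free to choose how to fulfill it.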
The "missteps" on the left are unintentional. However, when I saw it, I realized it fit the context very well, so I decided to keep it.
My preferred solution to this, for over a decade, has been social recovery and multi-signature wallets with hierarchical access control. A user's account has two layers of keys: a master key and N guardians (e.g., N = 5). The master key can perform low-value and non-financial operations. A majority of the guardians is required to (i) perform high-value operations, such as sending all funds in the account, or (ii) change the master key or any guardian. If desired, the master key can also be allowed to perform high-value operations subject to a time lock.
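The two-layer rule can be written down as a short authorization check. This is a sketch under stated assumptions: the `Account` class, the `low_value_limit` threshold, and the way signers are represented are all hypothetical, and the optional time-locked path for the master key is left out.

```python
from dataclasses import dataclass


@dataclass
class Account:
    master_key: str
    guardians: frozenset
    low_value_limit: int  # hypothetical cutoff between low and high value

    def is_authorized(self, signers, value, changes_keys=False):
        """Return True if `signers` may perform an operation worth `value`."""
        signers = set(signers)
        approvals = len(signers & self.guardians)
        majority = approvals > len(self.guardians) // 2

        # High-value operations and key or guardian changes need a
        # majority of guardians (time-locked master path omitted here).
        if changes_keys or value > self.low_value_limit:
            return majority

        # Low-value and non-financial operations: the master key suffices.
        return self.master_key in signers or majority


account = Account(
    master_key="master",
    guardians=frozenset({"g1", "g2", "g3", "g4", "g5"}),
    low_value_limit=100,
)
```

With N = 5, three guardians form a majority, so losing the master key or up to two guardians does not lock the user out of the account.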
The above is a basic design that can be expanded. Session keys and permission mechanisms like ERC-7715 can help support different balances between convenience and security for various applications. More intricate guardian structures, such as having multiple time-locked durations at different thresholds, can help maximize the chance of successfully recovering a legitimate account while minimizing the risk of theft.
For experienced crypto users who are part of a community of other experienced crypto users, a viable option is the keys of your friends and family. If you ask each person to give you a fresh address, then nobody needs to know who they are; in fact, your guardians do not even need to know each other. The chance that they will collude without one of them tipping you off is small. However, this option is not available to most new users.
The second option is institutional guardians: companies that provide services where they only sign transactions upon receiving additional confirmation information from you, e.g., confirmation codes, or video calls for high-value users. People have long tried to create these, e.g., I gave an introduction to CryptoCorp in 2013. However, so far, these companies have not been very successful.
The third option is multiple personal devices (such as phone, desktop, hardware wallet). This can work, but it is also challenging to set up and manage for inexperienced users. There is also a risk of multiple devices being lost or stolen, especially when they are in the same location.
Recently, we have started to see more solutions based on passkeys. Passkeys can be backed up only on your devices, which makes them a kind of personal device solution, or they can be backed up in the cloud, in which case their security depends on a complex mix of password security, institutional assumptions, and trusted hardware assumptions. Realistically, passkeys are a valuable security gain for the average user, but on their own they are not enough to protect a user's life savings.
Fortunately, with ZK-SNARKs, we also have a fourth option: ZK-wrapped centralized ID. Examples of this type include zk-email, Anon Aadhaar, Myna Wallet, and more. Essentially, you can take various forms of centralized ID (corporate or governmental) and convert them into Ethereum addresses that can only be transacted by generating a proof of owning the centralized ID with ZK-SNARKs.
With this addition, we now have a wide range of options, and ZK-wrapped centralized IDs are uniquely friendly to new users.
To achieve this, it needs to be done through a simplified and integrated UI: you should be able to simply specify that you want "example@gmail.com" as a guardian, and it should automatically generate the corresponding zk-email Ethereum address under the hood. Advanced users should be able to input their email (and possibly a privacy salt stored in that email) into an open-source third-party app and confirm that the generated address is correct. This should be the case for any other supported guardian types as well.
It is important to note that a practical challenge zk-email faces today is its reliance on DKIM signatures, which use keys that are rotated every few months and are not themselves signed by any other authority. This means that zk-email today carries some trust requirement beyond the provider itself; this can be mitigated if zk-email uses TLSNotary inside trusted hardware to verify updated keys, but that is not ideal. Ideally, email providers would start signing their DKIM keys directly. Today, I would recommend zk-email for one guardian, but not for a majority of your guardians: do not store funds in a setup where zk-email breaking means you cannot access your funds.
New users typically do not want to enter a large number of guardians during their initial registration. Therefore, the wallet should offer them a very simple option. One natural approach is a 2-of-3 setup consisting of zk-email on their email address, a key stored locally on the user's device (potentially a passkey), and a backup key held by the provider. As users gain more experience or accumulate more assets, they should be prompted at certain points to add more guardians.
Wallet integration into the application is inevitable as apps trying to onboard non-crypto users do not want users to download two new apps at once (the app itself plus an Ethereum wallet), leading to a confusing user experience. However, users of many in-app wallets should be able to link all their wallets together so they only have to worry about one "access control issue." The simplest way is to adopt a layered scheme where there is a quick "linking" process that allows users to set their main wallet as the guardian of all in-app wallets. The Farcaster client Warpcast already supports this:
By default, your Warpcast account recovery is controlled by the Warpcast team. However, you can "claim" your Farcaster account and change the recovery to your own address.
In addition to account security, today's wallets also do a lot of work to identify fake addresses, phishing, scams, and other external threats, and to protect users from them. At the same time, many safeguards are still quite primitive: for example, requiring the same click-through to send ETH or other tokens to any new address, whether you are sending $100 or $100,000. There is no one-size-fits-all solution here. It is a series of slow, continuous fixes and improvements targeting different categories of threat. However, there is a lot of value in continuing to improve in this area.
It is now time to take Ethereum's privacy more seriously. ZK-SNARK technology is now very advanced, privacy technologies that do not rely on backdoors to reduce regulatory risk (such as privacy pools) are becoming more mature, and second-layer infrastructures like Waku and ERC-4337 mempools are slowly becoming more stable. However, to date, conducting private transfers on Ethereum requires users to explicitly download and use a "privacy wallet," such as Railway (or Umbra for stealth addresses). This adds great inconvenience and reduces the number of people willing to transact privately. The solution is that private transfers need to be integrated directly into wallets.
A simple implementation is as follows. The wallet can store a portion of the user's assets as a "private balance" in a privacy pool. When the user makes a transfer, the wallet automatically withdraws from the privacy pool first. If the user needs to receive funds, the wallet can automatically generate a stealth address.
In addition, the wallet can automatically generate a new address for each application the user interacts with (e.g., a DeFi protocol). Deposits will come from the privacy pool, and withdrawals will go directly into the privacy pool. This allows the user's activity in one application to be unlinkable from their activity in another application.
One advantage of this technique is that it is a natural pathway not only for privacy-preserving asset transfers but also for privacy-preserving identity. Identity already happens on-chain: any application that uses proof-of-identity gating (e.g., Gitcoin Grants), any token-gated chat, the Ethereum Follow Protocol, and so on are all forms of on-chain identity. We want this ecosystem to preserve privacy as well. This means that a user's on-chain activity should not be aggregated in one place: each item should be stored separately, and the user's wallet should be the only thing with a "global view" that can see all of your attestations at once. A native ecosystem where each user has multiple accounts helps achieve this goal, as do off-chain attestation protocols like EAS and Zupass.
This represents a pragmatic vision for Ethereum privacy in the medium term. While some features could be introduced on L1 and L2 to make privacy-preserving transactions more efficient and reliable, it can be implemented today. Some privacy advocates argue that the only acceptable outcome is complete privacy for everything: encrypting the entire EVM. I think that might be the ideal long-term outcome, but it requires a more fundamental rethink of the programming model and is not yet mature enough to be deployed on Ethereum. We do need privacy by default in order to get a large enough anonymity set. However, focusing first on (i) transfers between accounts and (ii) identity and identity-related use cases (such as private attestations) is a practical first step that is easier to achieve and that wallets can start on now.
One consequence of any effective privacy solution is that users need to store off-chain data, whether for payments, identity, or other use cases. This is evident in Tornado Cash, which requires users to hold a "note" for each deposit of 0.1-100 ETH. More modern privacy protocols sometimes store encrypted data on-chain and decrypt it using a single private key. This is risky because if the key is leaked, or if quantum computers become feasible, the data becomes entirely public. Off-chain attestation protocols like EAS and Zupass have an even more obvious need for off-chain data storage.
A wallet not only needs to be software that stores on-chain access privileges but also software that stores your private data. The non-crypto world is also increasingly recognizing this, e.g., please refer to Tim Berners-Lee's recent work on personal data stores. All the issues we need to address regarding robustly ensuring access control, we also need to address regarding robustly ensuring data accessibility and non-leakage. Perhaps these solutions can be combined: if you have N guardians, use M-of-N secret sharing among these N guardians to store your data. Data is inherently harder to protect because you cannot revoke someone's data share, but we should propose as secure a decentralized hosting solution as possible.
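To make the M-of-N idea concrete, here is a minimal sketch of Shamir secret sharing over a prime field. It is illustrative only: the choice of field, the function names, and the idea of sharing a data-encryption key (rather than the data itself) are assumptions, and a production wallet would want an audited library rather than this code.

```python
import secrets

# A Mersenne prime large enough for secrets below 2**127 (assumed field).
PRIME = 2**127 - 1


def split_secret(secret: int, m: int, n: int):
    """Split `secret` into n shares so that any m of them recover it."""
    if not 0 <= secret < PRIME:
        raise ValueError("secret out of range for the chosen field")
    if not 1 <= m <= n:
        raise ValueError("need 1 <= m <= n")
    # Random polynomial of degree m - 1 whose constant term is the secret.
    coeffs = [secret] + [secrets.randbelow(PRIME) for _ in range(m - 1)]

    def evaluate(x: int) -> int:
        acc = 0
        for c in reversed(coeffs):  # Horner's rule
            acc = (acc * x + c) % PRIME
        return acc

    # Guardian i receives the point (i, f(i)).
    return [(x, evaluate(x)) for x in range(1, n + 1)]


def recover_secret(shares):
    """Recover f(0) by Lagrange interpolation from at least m shares."""
    secret = 0
    for i, (xi, yi) in enumerate(shares):
        num, den = 1, 1
        for j, (xj, _) in enumerate(shares):
            if i != j:
                num = (num * -xj) % PRIME
                den = (den * (xi - xj)) % PRIME
        secret = (secret + yi * num * pow(den, -1, PRIME)) % PRIME
    return secret
```

This also shows why revocation is hard: once a guardian holds a point on the polynomial, the only way to invalidate it is to re-encrypt the data under a new key and distribute fresh shares.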
Today, wallets trust their RPC providers to inform them of any information about the chain. This is a vulnerability that has two aspects:
1. RPC providers may try to steal funds by providing them with false information, e.g., about market prices.
2. RPC providers can extract private information about the application the user is interacting with and other accounts.
Ideally, we want to plug both of these vulnerabilities. To address the first issue, we need standardized light clients for L1 and L2 that can directly validate blockchain consensus. Helios has done this for L1 and has been doing some groundwork to support some specific L2s. To properly cover all L2s, we need a standard through which the configuration contract representing an L2 (also used for chain-specific addresses) can declare a function, perhaps in a manner similar to ERC-3668, that contains the logic for getting recent state roots and for verifying proofs of state and receipts against those state roots. This way, we can have a universal light client that allows wallets to securely verify any state or event on L1 and L2.
For privacy, the only realistic approach today is to run your own full node. However, now that L2s are entering the picture, running full nodes for everything is becoming increasingly difficult. The equivalent of a light client here is private information retrieval (PIR). PIR involves servers that hold copies of all the data and clients that send encrypted requests to the servers. The servers perform a computation over all the data that returns the data the client asked for, encrypted to the client's key, without revealing to the server which piece of data the client accessed.
To keep the servers honest, each individual database item is itself a Merkle branch, so that clients can verify it using a light client.
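The check a client performs on a returned item can be sketched as follows. This is a simplified binary SHA-256 Merkle tree chosen for brevity, not Ethereum's actual state trie format, and the function names are hypothetical; the point is only that a branch plus a trusted root is enough to detect a dishonest server.

```python
import hashlib


def h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def _next_level(level):
    # Duplicate the last node when a level has an odd number of entries.
    if len(level) % 2:
        level = level + [level[-1]]
    return level, [h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]


def merkle_root(leaves):
    """Root of a non-empty list of leaf byte strings."""
    level = [h(leaf) for leaf in leaves]
    while len(level) > 1:
        _, level = _next_level(level)
    return level[0]


def merkle_branch(leaves, index):
    """Sibling hashes from leaf `index` up to the root (server side)."""
    level = [h(leaf) for leaf in leaves]
    branch = []
    while len(level) > 1:
        padded, parents = _next_level(level)
        branch.append(padded[index ^ 1])  # sibling of the current node
        level, index = parents, index // 2
    return branch


def verify_branch(leaf, index, branch, root):
    """Client side: recompute the root from the leaf and its branch."""
    node = h(leaf)
    for sibling in branch:
        node = h(node + sibling) if index % 2 == 0 else h(sibling + node)
        index //= 2
    return node == root
```

Here the root stands in for a state root that the wallet has already obtained from its light client, so the server only needs to be trusted for availability, not correctness.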
PIR (Private Information Retrieval) involves a very large amount of computation. There are several approaches to solving this problem:
· Brute Force: Improvements in algorithms or specialized hardware may make PIR run fast enough. These techniques may rely on precomputation: the server can store encrypted and shuffled data for each client, which the client can then query. Precomputation makes real-time computation cheaper but likely increases overall computation and storage costs. The main challenge in the Ethereum environment is adapting these techniques to rapidly changing datasets (as the state is).
· Relaxing Privacy Requirements: For example, limiting each lookup to only have 1 million 'mixins' so that the server knows a million possible values the client can access but not any finer granularity.
· Multi-Server PIR: If you use multiple servers and assume 1-of-N honesty between these servers, the PIR algorithm is usually faster.
· Anonymity Over Confidentiality: Requests can be sent through a mix network to hide the sender of the request rather than the content of the request. However, doing this effectively inevitably adds latency, worsening the user experience.
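To give a feel for why the multi-server setting is cheaper, here is a sketch of the classic two-server XOR scheme, which needs no encryption at all. It assumes the two servers hold identical databases of equal-length records and do not collude; the function names are hypothetical and this is a textbook construction, not one proposed in this article.

```python
import secrets


def make_queries(db_size: int, index: int):
    """Client: two bit vectors that differ only at the wanted index."""
    q1 = [secrets.randbelow(2) for _ in range(db_size)]
    q2 = list(q1)
    q2[index] ^= 1
    return q1, q2


def server_answer(database, query_bits) -> bytes:
    """Server: XOR together every record selected by the query bits."""
    acc = bytes(len(database[0]))
    for record, bit in zip(database, query_bits):
        if bit:
            acc = bytes(a ^ b for a, b in zip(acc, record))
    return acc


def combine(answer1: bytes, answer2: bytes) -> bytes:
    """Client: XOR the two answers; all records but the wanted one cancel."""
    return bytes(x ^ y for x, y in zip(answer1, answer2))


# Example: each server sees a uniformly random-looking bit vector.
database = [b"acct0000", b"acct0001", b"acct0002", b"acct0003"]
q1, q2 = make_queries(len(database), 2)
record = combine(server_answer(database, q1), server_answer(database, q2))
```

Each server still touches a large fraction of the database per query, which is why the other approaches in the list above remain relevant even in the multi-server setting.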
Identifying the right combination of technologies in the Ethereum environment to maximize privacy while maintaining usability is an open research problem, and I welcome cryptographers to attempt to do so.
Aside from transaction and state access, another key workflow that needs to work seamlessly across L2 contexts is changing an account's verification configuration: whether changing its keys (e.g., recovery) or making deeper changes to the account's entire logic. Here are three layers of solutions, listed in increasing order of difficulty:
1. Replaying Updates: When a user changes their configuration, the message authorizing this change is replayed on every chain where the wallet detects that the user holds assets. Message formats and validation rules could potentially be made chain-agnostic, enabling automatic replay across as many chains as possible.
2. Key Store on L1: Configuration information lives on L1, and wallets on L2 read it using L1SLOAD or remote static calls. This way, updating the configuration on L1 automatically takes effect everywhere.
3. Key Store on L2: Configuration information lives on an L2, and wallets on L1 and other L2s read it using ZK-SNARKs. This is similar to (2), except that key store updates may be cheaper while reads may be more expensive.
Solution (3) is particularly powerful because it combines well with privacy. In a typical "privacy solution," a user holds a secret s, publishes a "leaf value" L on-chain, and later proves that L = hash(s, 1) and N = hash(s, 2) for some (never revealed) secret s that they control. The nullifier N is published, which ensures that future attempts to spend the same leaf will fail, without ever revealing L. This relies on the user keeping s secure. A recovery-friendly privacy solution would instead say: s is a location on-chain (e.g., an address and storage slot), and the user must prove a state query: L = hash(sload(s), 1).
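The leaf and nullifier relationship can be illustrated with a toy model. Two loud caveats: a real protocol never sees s, because the user supplies a ZK-SNARK proving the two hash relations instead, and it would typically use a SNARK-friendly hash rather than SHA-256. The `Pool` class and function names below are hypothetical and pass the secret in the clear purely to show the bookkeeping.

```python
import hashlib


def hash_with_domain(secret: bytes, domain: int) -> bytes:
    """hash(s, domain), with a one-byte domain tag appended (assumed encoding)."""
    return hashlib.sha256(secret + domain.to_bytes(1, "big")).digest()


def leaf(s: bytes) -> bytes:
    return hash_with_domain(s, 1)  # L = hash(s, 1)


def nullifier(s: bytes) -> bytes:
    return hash_with_domain(s, 2)  # N = hash(s, 2)


class Pool:
    """Toy pool: tracks published leaves and spent nullifiers."""

    def __init__(self):
        self.leaves = set()
        self.spent = set()

    def deposit(self, leaf_value: bytes) -> None:
        self.leaves.add(leaf_value)

    def spend(self, s: bytes) -> bool:
        # Stand-in for verifying a proof that some published leaf equals
        # hash(s, 1) and that the revealed nullifier equals hash(s, 2).
        n = nullifier(s)
        if leaf(s) not in self.leaves or n in self.spent:
            return False
        self.spent.add(n)
        return True
```

In the recovery-friendly variant, the value fed into `leaf` would be the result of a proven storage read, sload(s), rather than a secret held only by the user, so changing what is stored at that location changes who can spend.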
The weakest link in user security is often the dapp. Most of the time, users interact with applications through websites, where the website implicitly downloads real-time user interface code from the server and then executes it in the browser. If the server is hacked, or DNS is compromised, users will receive a fake copy of the interface that may deceive them into performing arbitrary actions. Wallet features like transaction simulation are very helpful in mitigating risks, but they are far from perfect.
Ideally, we would transition the ecosystem to on-chain content versioning: users would access a dapp through its ENS name, which would contain the IPFS hash of the interface. Updating the interface would require an on-chain transaction from a multisig or DAO. The wallet would show users whether they are interacting with a more secure on-chain UI or a less secure Web2 UI. The wallet could also indicate whether users are interacting with a secure chain (e.g., stage 1+, multiple security audits).
For privacy-conscious users, wallets could also offer a paranoid mode, requiring users to click to allow HTTP requests, not just web3 operations:
Paranoid Mode Possible Interface Model
A more advanced approach is to go beyond HTML + Javascript and write the dapp's business logic in a dedicated language (possibly a thin layer on top of Solidity or Vyper). Then, the browser can automatically generate any UI for the required functionality. OKContract has already been doing this.
Another direction is cryptoeconomic information defense: dapp developers, security firms, chain deployers, and others can put up a bond that pays out to affected users (as determined by an on-chain arbitration DAO) if the dapp is hacked or harms users in a highly deceptive manner. Wallets can show users a score based on the size of the bond.
All of the above has taken place in the context of a traditional interface, where there is pointing, clicking on things, and inputting things into text fields. However, we are also at the cusp of a paradigm shift:
· Artificial Intelligence that may lead us from a click-type paradigm to a "say what you want to do, and the robot figures it out" paradigm.
· Brain-Machine Interfaces: There are both "gentle" methods like eye-tracking and more direct or even invasive technologies (see: this year's first Neuralink patient).
· Client-Side Active Defense: The Brave browser actively protects users from ads, trackers, and many other malign actors. Many browsers, extensions, and crypto wallets have whole teams actively working to shield users from various security and privacy threats. These "active guardians" will only grow stronger in the next decade.
These three trends together will lead to a deeper reimagining of how interfaces work. Through natural language input, eye tracking, or eventually more direct brain-machine interfaces, coupled with your history (perhaps including text messages, as long as all data is processed locally), a "wallet" can understand clearly and intuitively what you want to do. Then, AI can translate that intuition into a concrete "action plan": a series of on-chain and off-chain interactions to accomplish your intent. This can significantly reduce the need for third-party user interfaces. If users do engage with third-party apps (or other users), AI should engage in adversarial thinking on behalf of the user, identify any threats, and propose action plans to mitigate them. Ideally, these AIs should have an open ecosystem generated by different groups with varying biases and incentive structures.
These more radical ideas rely on technology that is very immature today, so I wouldn't put my assets into a wallet that depends on them today. However, similar things seem to be an obvious trend for the future, so it's worth starting to explore in this direction more actively.