Detailed explanation of Merkel tree and Merkel root

Science Learning

2023-11-18 22:51 Update

Table of Contents

What is a Merkle tree?
How does a Merkle tree work?
Why is Merkel root used in Bitcoin?
- Mining
- Verification
Summary

What is a Merkle tree?

In the early 1980s, Ralph Merkle, a well-known computer scientist in the field of public key cryptography, proposed the concept of Merkle tree.

The Merkle tree structure can effectively verify the integrity of the data set, and is even more effective in peer-to-peer networks that require participants to share and independently verify information.

The hash function is the core of the Merkle tree structure. Therefore, we recommend that you understand what is hashing before continuing reading this article.

How does the Merkle tree work?

Suppose you want to download a large file. With open source software downloads, it is often necessary to check whether the hash of the downloaded file matches the one made public by the developer. If they match, the two documents are consistent.

If the hashes don't match, you're in trouble. Either you downloaded a malicious file disguised as software, or you downloaded it incorrectly, and the end result is that the file is unusable. If the download is incorrect, you will definitely feel annoyed because you have been waiting for a long time for the file to download. Now if you start over again, you have to hope that the same problem won't happen again.

Have you considered Is there an easier way to solve this problem? This is where the Merkle tree comes in. Merkle trees can break files into multiple data blocks. For example, a 50GB file can be broken into 100 copies, each of which is 0.5GB in size. Then, you can download them one by one. This is how torrents work.

The file source at this time is a hash value, called Merkle root. This single hash value represents all the data blocks that make up the file. Moreover, Merkle roots make data validation easier.

In order to facilitate understanding, we give an example. Below, an 8GB file is divided into eight parts, and each fragment is named from A to H. Each fragment is then plugged into a hash function, resulting in eight different hash values.

通过哈希函数，计算出八个片段的哈希值。

Use the hash function to calculate the hash of eight fragments Hope value.

Hopefully the above example explanation is easy to understand. We have the hash values of all the fragments. If one of them is wrong, can we find the problem by comparing it with the source file? Maybe, but it's still extremely inefficient. If the file has tens of thousands of fragments, do we need to hash all the fragments and compare the results in detail?

No need. We just need to combine a pair of hash values and perform a combined hash operation. That is, we performed withhA + hB,hC + hD,hE + hF, andhG + hH Hash operation. The result will be four hash values. We then proceed to the next round of merged hashes until we obtain two hash values. These two hash values are then combined and the main hash value is finally obtained, which is the Merkle root (also called the root hash value).

这个结构看起来像一棵倒置的树。底部一排叶子，相互结合产生节点，最后生成根。

This structure looks like an upside-down tree. The bottom row of leaves combines with each other to create nodes, and finally roots.

Now we have the Merkle root representing the downloaded file. Compare the root hash with the value of the source file, and if it matches, everyone is happy! Once the hash values are different, it proves that the data has been tampered with. In other words, one or more fragments generated a different hash value. Therefore, even small data modifications can completely change the Merkel root.

Fortunately, it's easy to check for errors. Assume that the error ishE. First, we ask others for two hashes (hABCD and hEFGH) to generate the Merkle root. OurhABCDvalue should match others, proving that the subtree is error-free. IfhEFGHdoes not match, we can correct the error from here. Then ask others for theirhEF andhGHhash values and compare them with your own. IfhGH is OK, then indicatehEF is the culprit. Finally, we compare the hash values of hE and hF, and once we find that the source of the error is hE, we can re-download the data block.

To sum up, the function of the Merkle tree is to divide the data into multiple parts, and then repeat the hash operation to finally form the Merkel root, so that you can effectively verify where the error occurred. data. In the next section, we will introduce other interesting applications.

Want to start your cryptocurrency journey? Go to Binance and buy Bitcoin now!

Why is Merkel root used in Bitcoin?

The Merkle tree has many use cases, but this article focuses on its important role in the blockchain. Bitcoin and many cryptocurrencies are inseparable from Merkle trees. The Merkle tree is an integral part of every block, usually located at the block header. Through the transaction hash value (TXID) of each transaction in the block, we can obtain the leaves.

In this case, the Merkel root serves several purposes. Let’s take a look at the application of Merkle Root in cryptocurrency mining and transaction verification.

Mining

Bitcoin blocks are composed of two parts. The first part is the block header, which is fixed in size and contains block metadata. The second part is the block body, which is variable in size but usually much larger than the block header and contains the transaction list.

Miners need to repeatedly hash the data until a result that meets specific conditions is produced before they can dig out a valid block. In order to get the correct result, they need to try trillions of times. With each attempt, the miner changes the random number in the block header, the Nonce value, to generate different results. However, the rest of the block remains the same, and the thousands of transactions within it still need to be hashed each time.

Merkel has greatly simplified this process. When mining begins, all transaction queues are packed and constructed into a Merkle tree, and the generated 32-bit root hash value is placed in the block header. Then, there is no need to hash the entire block, just the block header.

This method prevents data tampering and is therefore effective, allowing all transactions in a block to be efficiently summarized in a compact form. The transaction list of valid block headers cannot be modified, otherwise the Merkle root will be changed. After the block is sent to other nodes, the root hash is calculated from the transaction list. If it does not match the value in the block header, the block can be rejected.

Verification

We can also use another interesting attribute of Merkle roots, which is related to lightweight Applications for clients (nodes that do not maintain a complete copy of the blockchain). If you're running a node on a device with limited resources, you don't want to download and hash all the transactions in a block. Instead, you simply ask for a Merkle proof, which is proof provided by a full node that a transaction was included in a specific block. This proof is better known as "Simple Payment Verification," or SPV, and was detailed in the Bitcoin white paper by Satoshi Nakamoto.

要想检查hD，只需验证红色的哈希值即可。

To check hD, just verify the red The hash value is enough.

Suppose we want to obtain transaction information about the TXID of hD. IfhCis known,hCDcan be calculated. Then, throughhAB, you can calculatehABCD. Finally, by referring to hEFGH, you can check whether the calculated Merkle root is consistent with the root hash value in the block header. A successful match proves that the transaction was included in the block, as generating the same hash using different data is nearly impossible.

In the above example, we only hashed three times. Without Merkel's certification, it would take seven. Today's blocks contain thousands of transactions, and Merkle proofs save us a lot of time and computing power.

Summary

Merkle trees have proven themselves to be important in computer science applications, as As we've seen, it's also quite valuable in blockchain. Merkle trees make information verification in distributed systems more convenient and avoid congestion of redundant data in the network.

Without Merkle trees and Merkle roots, Bitcoin and other cryptocurrency blocks would not be as compact as they are today. Although lightweight clients lack advantages in terms of privacy and security, with Merkle proofs, users can verify that transactions are included in blocks with minimal fees.