A Merkle Tree is a binary tree data structure where every leaf node contains the hash of a data element, and every non-leaf node contains the hash of its two child nodes creating a hierarchical structure that allows efficient and secure verification of large datasets. Merkle Trees are fundamental to blockchain design, enabling Bitcoin and Ethereum to organise block transactions in a way that allows efficient verification without downloading entire blocks.
BUILDING A MERKLE TREE STEP BY STEP
Imagine a block with four transactions: TX1, TX2, TX3, TX4.
Step 1 — Hash the leaves: H1 = hash(TX1), H2 = hash(TX2), H3 = hash(TX3), H4 = hash(TX4).
Step 2 — Hash the pairs: H12 = hash(H1 + H2), H34 = hash(H3 + H4).
Step 3 — Hash to root: Merkle Root = hash(H12 + H34).
Each layer halves the number of nodes. For a block with 2,048 transactions, the tree has 11 levels (log₂ 2048). The Merkle Root at the top summarises all 2,048 transactions in a single 32-byte hash stored in the block header.
ODD NUMBER OF TRANSACTIONS
If a block has an odd number of transactions at any level, the last hash is duplicated to create a pair ensuring every level has an even number of nodes to pair up.
MERKLE PROOFS: THE KEY APPLICATION
A Merkle Proof allows anyone to verify that a specific transaction is included in a block using only O(log n) hashes a logarithmic fraction of all transaction data. To prove TX1 is in the block: provide H2 (TX1's sibling), H34 (the other branch), verify that hash(hash(TX1) + H2) combined with H34 produces the known Merkle Root. This tiny proof just two hashes in our example is as cryptographically trustworthy as downloading the entire block.
MERKLE TREES BEYOND BLOCKCHAIN
Merkle Trees are used throughout computer science: Git version control uses Merkle Trees to track code changes. Certificate Transparency logs use Merkle Trees to verify SSL certificate issuance. ZK-SNARK proofs use Merkle Trees for efficient state membership proofs. IPFS uses content-addressed Merkle DAGs for distributed file storage.