Merge changes to docs (#41)

* Merge changes to docs

* Fix typo

* Correct SUMMARY so it compiles; update .gitignore

* Clean up statements.md

Make syntax and notation consistent with Rust source code.

* Fix statements for Merkle trees and compound types

* First draft of custom statements and small updates to signedpod.md

* Update book/src/merkletree.md

Co-authored-by: Ahmad Afuni <root@ahmadafuni.com>

* merklestatements correct typo

Co-authored-by: Ahmad Afuni <root@ahmadafuni.com>

* add todo for gadget ids

Co-authored-by: Ahmad Afuni <root@ahmadafuni.com>

* Remove custom statements, will do on separate branch

* Restore Merkle examples and statements table

---------

Co-authored-by: Ahmad Afuni <root@ahmadafuni.com>
tideofwords 2025-02-10 10:06:45 -08:00 committed by GitHub
parent 34a223ac76
commit dc6b5553e8
GPG key ID: B5690EEEBB952194
10 changed files with 467 additions and 37 deletions


@ -1,12 +1,99 @@
# MerkleTree
In the POD system, MerkleTrees are used to store the key-values of the POD. From the high level, we can think of it as a 'hashmap' storage, that allows us to generate proofs of inclusion and non-inclusion of the key-values stored into it.
In the POD2 backend, a MerkleTree is used to store an unordered set of key-value pairs. The frontend compound types `Array`, `Set`, and `Dictionary` are all represented as `MerkleTree`s on the backend.
At a high level, we can think of a `MerkleTree` as a hashmap-like store that allows us to generate proofs of inclusion and non-inclusion for the key-value pairs stored in it.
## Leaves
Each leaf position is determined by the `key` content in binary representation (little-endian).
A `MerkleTree` is represented in-circuit as its Merkle root; in the Plonky2 backend, this root is a tuple of four field elements. This makes a `MerkleTree` the same size in-circuit as the atomic types `Integer` and `String`. (In general, regardless of the proof system used on the backend, all three types are represented in-circuit by the same number of field elements; this number is determined by the security requirement of the hash function.)
### Example 1
The encoding of the `MerkleTree` is a recursive process:
- Encode all keys and values in the `MerkleTree`.
- Put all keys and values into a sparse Merkle tree.
- The `MerkleTree` is encoded in-circuit as the root of this sparse Merkle tree.
This document explains the construction of the sparse Merkle tree.
## The branching rule
A sparse Merkle tree is implemented as a binary tree. The insertion path of any key is given by a deterministic rule: given ```key``` and a nonnegative integer ```depth```, the rule determines that ```key``` belongs to either the ```left``` or ```right``` branch at depth ```depth```.
The precise rule is as follows. In-circuit, compute a Poseidon hash of ```key``` to obtain a 4-tuple of field elements:
```
Poseidon(key) = (k_0, k_1, k_2, k_3).
```
Write the field elements in binary (in little-endian order):
```
k_0 = b_0 b_1 ... b_63
k_1 = b_64 b_65 ... b_127
....
```
At the root, ```key``` goes to the left subtree if ```b_0 = 0```, otherwise the right subtree. At depth 1, ```key``` goes to the left subtree if ```b_1 = 0```, otherwise the right subtree, and similarly for higher depth.
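As a rough illustration of this rule, the following Python sketch reads bit `b_depth` from a 4-tuple of 64-bit limbs. It assumes the Poseidon hash of the key has already been computed elsewhere and is passed in as `key_hash`; the function names `path_bit` and `branch` are hypothetical and are not taken from the POD2 source.

```python
def path_bit(key_hash, depth):
    """Return bit b_depth of a 4-limb hash (limbs and bits both little-endian)."""
    # b_0..b_63 live in k_0, b_64..b_127 in k_1, and so on.
    limb, offset = divmod(depth, 64)
    return (key_hash[limb] >> offset) & 1


def branch(key_hash, depth):
    """Return which subtree the key belongs to at the given depth."""
    return "right" if path_bit(key_hash, depth) else "left"


# Made-up hash value for illustration: k_0 = 0b110, k_1 = 1, k_2 = k_3 = 0.
example_hash = (0b110, 0b1, 0, 0)
directions = [branch(example_hash, d) for d in range(3)]
# directions == ["left", "right", "right"]
```

Because the bits are read in little-endian order, bit `b_0` is the least significant bit of `k_0`, which is why the sketch shifts right by the offset rather than indexing from the top of the limb.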
## The tree structure
A Merkle tree with no entry at all is represented by the hash value
```root = hash(0).```
(With the Plonky2 backend, the hash function ```hash``` will output a 4-tuple of field elements.)
A Merkle tree with a single entry ```(key, value)``` is called a "leaf". It is represented by the hash value
```root = hash((key, value, 1)).```
A Merkle tree ```tree``` with more than one entry is required to have two subtrees, ```left``` and ```right```. It is then represented by the hash value
```root = hash((left_root, right_root, 2)).```
(The role of the constants 1 and 2 is to prevent collisions between leaves and non-leaf Merkle roots. If the constants were omitted, a large Merkle tree could be dishonestly interpreted as a leaf, leading to security vulnerabilities.)
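The three cases above (empty tree, leaf, and internal node) can be sketched as a short recursive function. This is only an illustration under loud assumptions: SHA-256 from Python's standard library stands in for the in-circuit hash (it is not Poseidon), the byte encoding of the hash inputs is invented for the example, and the names `h`, `path_bit`, and `merkle_root` are hypothetical.

```python
import hashlib


def h(*parts):
    """Stand-in hash (SHA-256, NOT Poseidon) over an ad hoc encoding of its inputs."""
    return hashlib.sha256(repr(parts).encode()).digest()


def path_bit(key, depth):
    """Bit `depth` of the hash of `key`, read in little-endian order."""
    digest = h("key", key)
    return (digest[depth // 8] >> (depth % 8)) & 1


def merkle_root(entries, depth=0):
    """Compute the root of the tree holding the dict `entries`."""
    if not entries:
        return h(0)  # empty tree
    if len(entries) == 1:
        ((key, value),) = entries.items()
        return h(key, value, 1)  # leaf, tagged with the constant 1
    # More than one entry: split by the branching bit at this depth.
    left = {k: v for k, v in entries.items() if path_bit(k, depth) == 0}
    right = {k: v for k, v in entries.items() if path_bit(k, depth) == 1}
    # Internal node, tagged with the constant 2.
    return h(merkle_root(left, depth + 1), merkle_root(right, depth + 1), 2)
```

Since the position of every entry depends only on the hash of its key, the resulting root does not depend on the order in which entries are supplied.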
## Example 1
Suppose we want to build a Merkle tree from the following `Dictionary` with three key-value pairs:
```
{
4: "even",
5: "odd",
6: "even"
}
```
The keys are integers, so they are represented in-circuit by themselves. Let's suppose that, in little-endian order, the first three bits of the hashes of the keys are:
```
hash(4) = 000...
hash(5) = 010...
hash(6) = 001...
```
The resulting tree looks like:
```
                  root
                   /\
                  /  \
                 /    \
                /      \
           L_root      R_root = hash(0)
             /\
            /  \
           /    \
          /      \
     LL_root      LR_root = hash((5, "odd", 1))
        /\
       /  \
      /    \
     /      \
LLL_root     LLR_root = hash((6, "even", 1))
   ||
hash((4, "even", 1))
```
The intermediate roots are computed as hashes of their subroots:
```
LL_root = hash((LLL_root, LLR_root, 2))
L_root = hash((LL_root, LR_root, 2))
root = hash((L_root, R_root, 2)).
```
The full `Dictionary` is then represented in the backend as `root` (four field elements in the Plonky2 backend).
### Example 2
So for example, imagine we have 8 key-value pairs, where the keys are just an enumeration from 0 to 7; then the leaf positions in the tree would look like:
![](img/merkletree-example-1-a.png)
@ -15,9 +102,9 @@ Now let's change the key of the leaf `key=1`, and set it as `key=13`. Then, thei
![](img/merkletree-example-1-b.png)
### Example2
### Example 3
Suppose we have 4 key-values, where the keys are `0000`, `0100`, `1010` and `1011`. The tree would look like:
Suppose we have 4 key-values, where the first four bits of the hashes of the keys are `0000`, `0100`, `1010` and `1011`. The tree would look like:
![](img/merkletree-example-2-a.png)
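The leaf positions for these four bit prefixes can be worked out mechanically. The sketch below assumes, following the tree structure described earlier, that any subtree holding a single entry collapses to a leaf, so each key ends up at the shallowest depth at which its path is unique. The function name `leaf_paths` is hypothetical, and the input prefixes must be long enough to distinguish every pair of keys.

```python
def leaf_paths(prefixes, path=""):
    """Map each key's bit prefix to the path ("0" = left, "1" = right) of its leaf."""
    if len(prefixes) <= 1:
        # Zero or one entry: the subtree is empty or is itself a leaf.
        return {p: path for p in prefixes}
    depth = len(path)
    left = [p for p in prefixes if p[depth] == "0"]
    right = [p for p in prefixes if p[depth] == "1"]
    return {**leaf_paths(left, path + "0"), **leaf_paths(right, path + "1")}


positions = leaf_paths(["0000", "0100", "1010", "1011"])
# positions == {"0000": "00", "0100": "01", "1010": "1010", "1011": "1011"}
```

Under this assumption the first two keys separate at depth 2, while the last two share their first three bits and only separate at depth 4.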
To iterate this example, suppose we have the following data in a POD:
@ -36,7 +123,7 @@ To iterate this example, suppose we have the following data in a POD:
The Merkle tree will contain the key-value pairs from the `kvs` field.
Suppose that the binary representation of the key `userPk` is `1011...`. This uniquely defines the leaf position that contains the public key of the authenticated user. Similarly for the other key-values:
Suppose that the binary representation of the hash of the key `userPk` is `1011...`. This uniquely defines the leaf position that contains the public key of the authenticated user. Similarly for the other key-values:
![](img/merkletree-example-2-b.png)