Merge changes to docs (#41)

* Merge changes to docs

* Fix typo

* Correct SUMMARY so it compiles; update .gitignore

* Clean up statements.md

Make syntax and notation consistent with Rust source code.

* Fix statements for Merkle trees and compound types

* First draft of custom statements and small updates to signedpod.md

* Update book/src/merkletree.md

Co-authored-by: Ahmad Afuni <root@ahmadafuni.com>

* merklestatements correct typo

Co-authored-by: Ahmad Afuni <root@ahmadafuni.com>

* add todo for gadget ids

Co-authored-by: Ahmad Afuni <root@ahmadafuni.com>

* Remove custom statements, will do on separate branch

* Restore Merkle examples and statements table

---------

Co-authored-by: Ahmad Afuni <root@ahmadafuni.com>
tideofwords 2025-02-10 10:06:45 -08:00 committed by GitHub
parent 34a223ac76
commit dc6b5553e8
No known key found for this signature in database
GPG key ID: B5690EEEBB952194
10 changed files with 467 additions and 37 deletions

2
.gitignore vendored

@ -1,2 +1,4 @@
/target
Cargo.lock
.DS_Store
aardnotes.md


@ -3,14 +3,18 @@
- [Introduction](./introduction.md)
# Specification
- [Backend types](./backendtypes.md)
- [The frontend structure of a POD]()
- [Frontend POD value types](./values.md)
- [Anchored keys](./anchoredkeys.md)
- [The backend structure of a POD]()
- [Backend types](./backendtypes.md)
- [MerkleTree](./merkletree.md)
- [Deductions](./deductions.md)
- [Statements](./statements.md)
- [Statements involving compound types and Merkle trees](./merklestatements.md)
- [Operations](./operations.md)
- [POD types](./podtypes.md)
- [SignedPOD](./signedpod.md)
- [MainPOD](./mainpod.md)
- [MockPOD](./mockpod.md)
- [POD value types](./values.md)
- [Examples](./examples.md)

22
book/src/anchoredkeys.md Normal file

@ -0,0 +1,22 @@
# Anchored keys
Rather than dealing with just keys, we introduce the notion of an *anchored key*, which is a pair consisting of an origin specifier and a key, i.e.
```
type AnchoredKey = (Origin, Key)
type Key = String
```
An *origin* is a triple consisting of a numeric identifier called the *origin ID*, a string called the *origin name* (omitted in the backend) and another numeric identifier called the *gadget ID*, which identifies the means by which the value corresponding to a given key is produced.
The origin ID is defined to be 0 for 'no origin' and 1 for 'self origin'; otherwise it is the content ID[^content-id] of the POD to which it refers. The origin name is not cryptographically significant and is merely a convenience for the frontend.
The gadget ID takes on the values in the following table:
| Gadget ID | Meaning |
|-----------|-------------------------------------------------------------------------------------------|
| 0 | no gadget |
| 1 | `SignedPOD` gadget: The key-value pair was produced in the construction of a `SignedPOD`. |
| 2 | `MainPOD` gadget: The key-value pair was produced in the construction of a `MainPOD`. |
For example, a gadget ID of 1 implies that the key-value pair in question was produced in the process of constructing a `SignedPOD`.
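As a rough illustration, the definitions above could be modelled in Rust as follows. This is a sketch, not the POD2 source: the `GadgetId` enum, the `ORIGIN_ID_*` constants, the `is_self` helper, and the use of `u64` for the origin ID are all assumptions made for the example (a real content ID is a hash and would be wider).

```rust
/// Gadget IDs, mirroring the table above (enum name is hypothetical).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
#[repr(u64)]
pub enum GadgetId {
    None = 0,
    SignedPod = 1,
    MainPod = 2,
}

/// Reserved origin IDs; any other value stands for the content ID of a POD.
pub const ORIGIN_ID_NONE: u64 = 0;
pub const ORIGIN_ID_SELF: u64 = 1;

#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Origin {
    /// Assumption: `u64` is only a stand-in for the real ID type.
    pub id: u64,
    /// Frontend convenience only; not cryptographically significant.
    pub name: String,
    pub gadget_id: GadgetId,
}

impl Origin {
    /// True if this origin refers to the POD currently being built.
    pub fn is_self(&self) -> bool {
        self.id == ORIGIN_ID_SELF
    }
}

pub type Key = String;
pub type AnchoredKey = (Origin, Key);
```

With these types, an anchored key for an entry `price` of the POD under construction would be written `(Origin { id: ORIGIN_ID_SELF, name: "self".to_string(), gadget_id: GadgetId::MainPod }, "price".to_string())`.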
[^content-id]: <font color="red">TODO</font> Refer to this when it is documented.

1
book/src/custom.md Normal file

@ -0,0 +1 @@
# Custom statements and custom operations


@ -0,0 +1,197 @@
# Statements involving compound types and Merkle trees
The front end has three compound types
- `Dictionary`
- `Array`
- `Set`,
all of which are represented as `MerkleTree` on the back end.
The frontend compound types and their implementation as Merkle trees are explained under [POD value types](./values.md#dictionary-array-set). The backend structure of a `MerkleTree` is explained on [the Merkle tree page](./merkletree.md).
The POD2 interface provides statements for working with Merkle trees and compound types at all layers of the stack:
- Primitive statements for Merkle trees
- General derived statements for Merkle trees
- Specialized `ContainsHashedKey`, `NotContainsHashedKey`, and `ContainsValue` statements for the three front-end types.
## Primitive statements for Merkle trees
```
Branches(parent: AnchoredKey::MerkleTree, left: AnchoredKey::MerkleTree, right: AnchoredKey::MerkleTree)
Leaf(node: AnchoredKey::MerkleTree, key: AnchoredKey, value: AnchoredKey)
IsNullTree(node: AnchoredKey::MerkleTree)
GoesLeft(key: AnchoredKey, depth: Value::Integer)
GoesRight(key: AnchoredKey, depth: Value::Integer)
```
These five statements expose the inner workings of a Merkle tree. Their implementations depend on the implementation details of POD2's sparse Merkle trees. In-circuit, verifying these statements requires low-level computation: either a hash or a binary decomposition.
Every Merkle root either:
- is a special type of Merkle tree called a "null tree", which has no elements,
- is a special type of Merkle tree called a "leaf", which just has a single element, or
- has two branches, left and right -- each of which is itself a Merkle tree. Such a tree is called a "non-leaf" Merkle tree.
### `Branches`
```
Branches(parent, left, right)
```
means that ```parent``` is a non-leaf Merkle node, and ```left``` and ```right``` are its branches.
A `Branches` statement is proved by computing a hash, as specified on [the Merkle tree page](./merkletree.md).
### `Leaf`
```
Leaf(node, key, value)
```
means that ```node``` is a leaf Merkle node, whose single item is the key-value pair ```(key, value)```.
A `Leaf` statement is proved by computing a hash, as specified on [the Merkle tree page](./merkletree.md).
### `IsNullTree`
```
IsNullTree(node)
```
means that ```node``` is a null Merkle tree.
An `IsNullTree` statement is proved by comparing the value of `node` to `hash(0)`.
### `GoesLeft` and `GoesRight`
```
GoesLeft(key, depth)
```
means that if ```key``` is contained in a sparse Merkle tree, then at depth ```depth```, it must be in the left branch.
```GoesRight``` is similar.
A `GoesLeft` or `GoesRight` statement is proved by computing a binary decomposition of `key` and extracting the bit at index `depth`, as specified on [the Merkle tree page](./merkletree.md).
## General derived statements for Merkle trees
```
MerkleSubtree(root: AnchoredKey::MerkleTree, node: AnchoredKey::MerkleTree)
MerkleCorrectPath(root: AnchoredKey::MerkleTree, node: AnchoredKey::MerkleTree, key: AnchoredKey, depth: Value::Integer)
Contains(root: AnchoredKey::MerkleTree, key: AnchoredKey, value: AnchoredKey)
NotContains(root: AnchoredKey::MerkleTree, key: AnchoredKey)
```
### `MerkleSubtree`
```
MerkleSubtree(root, node)
```
means that there is a valid Merkle path (of any length) from `root` to `node`.
A `MerkleSubtree` statement is proved as follows:
```
MerkleSubtree(root, root)
```
is automatically true.
Otherwise, `MerkleSubtree(root, node)` can be deduced from either
```
MerkleSubtree(root, parent)
Branches(parent, node, other)
```
or
```
MerkleSubtree(root, parent)
Branches(parent, other, node).
```
### `MerkleCorrectPath`
```
MerkleCorrectPath(root, node, key, depth)
```
means that there is a valid Merkle path of length `depth` from `root` to `node`, and if `key` appears as a key in the Merkle tree with root `root`, then `key` must be in the subtree under `node`.
A `MerkleCorrectPath` statement is proved as follows:
```
MerkleCorrectPath(root, root, key, 0)
```
is automatically true.
Otherwise, `MerkleCorrectPath(root, node, key, depth)` can be deduced from either:
```
MerkleCorrectPath(root, parent, key, depth-1)
Branches(parent, node, other)
GoesLeft(key, depth-1)
```
or
```
MerkleCorrectPath(root, parent, key, depth-1)
Branches(parent, other, node)
GoesRight(key, depth-1).
```
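The deduction chain above can be pictured as a walk down from the root. The sketch below is only an illustration: it treats each `Branches` statement as an already-proved fact, uses `u64` as a stand-in for a Merkle root, and takes the `GoesLeft`/`GoesRight` outcomes as a precomputed slice of bits (`true` meaning right). The names `Branches` and `correct_path` are hypothetical.

```rust
/// An already-established `Branches(parent, left, right)` fact.
/// Assumption: `u64` stands in for a Merkle root.
pub struct Branches {
    pub parent: u64,
    pub left: u64,
    pub right: u64,
}

/// Follows `steps` from `root`, choosing a side at each depth from `key_bits`.
/// Returns `Some((node, depth))` if `MerkleCorrectPath(root, node, key, depth)`
/// can be deduced from the given facts, and `None` if the chain is broken.
pub fn correct_path(root: u64, key_bits: &[bool], steps: &[Branches]) -> Option<(u64, usize)> {
    // Base case: MerkleCorrectPath(root, root, key, 0) holds automatically.
    let mut node = root;
    for (depth, step) in steps.iter().enumerate() {
        // The next Branches fact must be about the node we are standing on.
        if step.parent != node {
            return None;
        }
        // GoesRight(key, depth) selects the right branch, GoesLeft the left.
        let goes_right = *key_bits.get(depth)?;
        node = if goes_right { step.right } else { step.left };
    }
    Some((node, steps.len()))
}
```

Dropping the `key_bits` check, so that either branch may be taken at each step, gives the corresponding walk for `MerkleSubtree`.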
### `Contains`
```
Contains(root, key, value)
```
means that the key-value pair ```(key, value)``` is contained in the Merkle tree with Merkle root ```root```.
A `Contains` statement can be deduced from the following two statements.
```
MerkleSubtree(root, node)
Leaf(node, key, value)
```
### `NotContains`
```
NotContains(root, key)
```
means that the key ```key``` is not contained in the sparse Merkle tree with Merkle root ```root```.
The statement `NotContains(root, key)` can be deduced from either
```
MerkleCorrectPath(root, node, key, depth)
Leaf(node, otherkey, value)
NotEqual(otherkey, key)
```
or
```
MerkleCorrectPath(root, node, key, depth)
IsNullTree(node).
```
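The two cases can be summarised in a small check. This is a sketch under stated assumptions: `path_node` is taken to be the node for which `MerkleCorrectPath(root, path_node, key, depth)` has already been established, `u64` stands in for roots and keys, and the names `Terminal` and `not_contains` are hypothetical.

```rust
/// What is known about the node at the end of the correct path.
pub enum Terminal {
    /// `Leaf(node, key, value)` holds; the value is irrelevant here.
    Leaf { node: u64, key: u64 },
    /// `IsNullTree(node)` holds.
    NullTree { node: u64 },
}

/// Returns true if `NotContains(root, key)` follows from the terminal fact,
/// given that `path_node` lies on the correct path for `key`.
pub fn not_contains(path_node: u64, key: u64, terminal: &Terminal) -> bool {
    match terminal {
        // A leaf holding a different key occupies the slot where `key` would be.
        Terminal::Leaf { node, key: other } => *node == path_node && *other != key,
        // An empty subtree sits where `key` would be.
        Terminal::NullTree { node } => *node == path_node,
    }
}
```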
## Specialized statements for front-end compound types
```
ContainsHashedKey(root: AnchoredKey::DictOrSet, key: AnchoredKey)
NotContainsHashedKey(root: AnchoredKey::DictOrSet, key: AnchoredKey)
ContainsValue(root: AnchoredKey::Array, value: AnchoredKey)
```
When a dictionary or set is converted to a Merkle tree, each of its keys is hashed -- see the [POD2 values page](./values.md#dictionary-array-set).
```ContainsHashedKey(root, key)``` is deduced from
```
Contains(root, keyhash, value)
keyhash = hash(key).
```
```NotContainsHashedKey(root, key)``` is deduced from
```
NotContains(root, keyhash)
keyhash = hash(key)
```
```ContainsValue(root, value)``` is deduced from
```
Contains(root, idx, value).
```


@ -1,12 +1,99 @@
# MerkleTree
In the POD system, MerkleTrees are used to store the key-values of the POD. From the high level, we can think of it as a 'hashmap' storage, that allows us to generate proofs of inclusion and non-inclusion of the key-values stored into it.
In the POD2 backend, a MerkleTree is used to store an unordered set of key-value pairs. The frontend compound types `Array`, `Set`, and `Dictionary` are all represented as `MerkleTree`s on the backend.
At a high level, we can think of a `MerkleTree` as a 'hashmap' store that allows us to generate proofs of inclusion and non-inclusion for the key-value pairs stored in it.
## Leaves
Each leaf position is determined by the `key` content in binary representation (little-endian).
A `MerkleTree` is represented in-circuit as its Merkle root; in the Plonky2 backend, this root is a tuple of four field elements. This makes a `MerkleTree` the same size in-circuit as the atomic types `Integer` and `String`. (In general, regardless of the proof system used on the backend, all three types are represented in-circuit by the same number of field elements; this number is determined by the security requirement of the hash function.)
### Example 1
The encoding of the `MerkleTree` is a recursive process:
- Encode all keys and values in the `MerkleTree`.
- Put all keys and values into a sparse Merkle tree.
- The `MerkleTree` is encoded in-circuit as the root of this sparse Merkle tree.
This document explains the construction of the sparse Merkle tree.
## The branching rule
A sparse Merkle tree is implemented as a binary tree. The insertion path of any key is given by a deterministic rule: given ```key``` and a nonnegative integer ```depth```, the rule determines that ```key``` belongs to either the ```left``` or ```right``` branch at depth ```depth```.
The precise rule is as follows. In-circuit, compute a Poseidon hash of ```key``` to obtain a 4-tuple of field elements
```
Poseidon(key) = (k_0, k_1, k_2, k_3).
```
Write the field elements in binary (in little-endian order):
```
k_0 = b_0 b_1 ... b_63
k_1 = b_64 b_65 ... b_127
....
```
At the root, ```key``` goes to the left subtree if ```b_0 = 0```, otherwise the right subtree. At depth 1, ```key``` goes to the left subtree if ```b_1 = 0```, otherwise the right subtree, and similarly for higher depth.
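The rule can be sketched in Rust as a bit lookup. This assumes each of the four field elements of `Poseidon(key)` is already available as a canonical 64-bit integer (here a plain `[u64; 4]`); the function names are hypothetical, and the Poseidon hash itself is not computed.

```rust
/// Returns true if the key with hash `key_hash = (k_0, k_1, k_2, k_3)` goes to
/// the right subtree at `depth`, i.e. if bit `b_depth` is 1.
/// Assumption: each field element is given as a canonical `u64`.
pub fn goes_right(key_hash: &[u64; 4], depth: usize) -> bool {
    assert!(depth < 256, "only 4 * 64 bits are available");
    // b_0..b_63 come from k_0, b_64..b_127 from k_1, and so on.
    let limb = key_hash[depth / 64];
    // Little-endian: b_0 is the least significant bit of k_0.
    (limb >> (depth % 64)) & 1 == 1
}

/// Returns true if the key goes to the left subtree at `depth`.
pub fn goes_left(key_hash: &[u64; 4], depth: usize) -> bool {
    !goes_right(key_hash, depth)
}
```

For instance, with `k_0 = 0b110` the key goes left at the root (bit 0 is 0) and right at depths 1 and 2.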
## The tree structure
A Merkle tree with no entry at all is represented by the hash value
```root = hash(0).```
(With the Plonky2 backend, the hash function ```hash``` will output a 4-tuple of field elements.)
A Merkle tree with a single entry ```(key, value)``` is called a "leaf". It is represented by the hash value
```root = hash((key, value, 1)).```
A Merkle tree ```tree``` with more than one entry is required to have two subtrees, ```left``` and ```right```. It is then represented by the hash value
```root = hash((left_root, right_root, 2)).```
(The role of the constants 1 and 2 is to prevent collisions between leaves and non-leaf Merkle roots. If the constants were omitted, a large Merkle tree could be dishonestly interpreted as a leaf, leading to security vulnerabilities.)
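The three cases translate directly into a recursive root computation. The sketch below is illustrative only: it uses the standard library's `DefaultHasher` as a stand-in for the real hash (POD2 uses Poseidon in the Plonky2 backend), a single `u64` as a stand-in for the 4-tuple of field elements, and hypothetical names `Tree`, `Digest`, and `toy_hash`.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Assumption: one `u64` stands in for a 4-tuple of field elements.
pub type Digest = u64;

/// Stand-in hash; NOT Poseidon and not suitable for real proofs.
fn toy_hash<T: Hash>(input: &T) -> Digest {
    let mut hasher = DefaultHasher::new();
    input.hash(&mut hasher);
    hasher.finish()
}

pub enum Tree {
    /// A tree with no entries.
    Null,
    /// A tree with exactly one (already encoded) key-value pair.
    Leaf { key: Digest, value: Digest },
    /// A tree with more than one entry, split into two subtrees.
    Branch { left: Box<Tree>, right: Box<Tree> },
}

impl Tree {
    pub fn root(&self) -> Digest {
        match self {
            // root = hash(0)
            Tree::Null => toy_hash(&0u64),
            // root = hash((key, value, 1))
            Tree::Leaf { key, value } => toy_hash(&(*key, *value, 1u64)),
            // root = hash((left_root, right_root, 2))
            Tree::Branch { left, right } => toy_hash(&(left.root(), right.root(), 2u64)),
        }
    }
}
```

Because the leaf and branch cases hash different trailing constants, a branch whose subtree roots happen to equal some `(key, value)` pair still produces a different root from the corresponding leaf.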
## Example 1
Suppose we want to build a Merkle tree from the following `Dictionary` with three key-value pairs:
```
{
4: "even",
5: "odd",
6: "even"
}
```
The keys are integers, so they are represented in-circuit by themselves. Let's suppose that in little-endian order, the first three bits of the hashes of the keys are:
```
hash(4) = 000...
hash(5) = 010...
hash(6) = 001...
```
The resulting tree looks like:
```
                 root
                  /\
                 /  \
                /    \
               /      \
          L_root      R_root = hash(0)
            /\
           /  \
          /    \
         /      \
    LL_root     LR_root = hash((5, "odd", 1))
       /\
      /  \
     /    \
    /      \
LLL_root   LLR_root = hash((6, "even", 1))
   ||
hash((4, "even", 1))
```
The intermediate roots are computed as hashes of their subroots:
```
LL_root = hash((LLL_root, LLR_root, 2))
L_root = hash((LL_root, LR_root, 2))
root = hash((L_root, R_root, 2)).
```
The full `Dictionary` is then represented in the backend as `root` (four field elements in the Plonky2 backend).
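To see where each key lands under the branching rule, the insertion can be simulated without any hashing. The sketch below is a toy model, not the POD2 implementation: the bits of each key's hash are supplied by hand (using the three-bit prefixes assumed in this example), and `Tree`, `insert`, and `path_of` are hypothetical names. It assumes distinct keys whose supplied bits differ somewhere; otherwise it panics on an out-of-range index.

```rust
#[derive(Debug)]
pub enum Tree {
    Null,
    Leaf { key: u64, bits: Vec<bool> },
    Branch(Box<Tree>, Box<Tree>),
}

/// Inserts `key` (whose hash bits are `bits`, `true` = right) below `depth`.
pub fn insert(tree: Tree, key: u64, bits: Vec<bool>, depth: usize) -> Tree {
    match tree {
        // An empty slot becomes a leaf.
        Tree::Null => Tree::Leaf { key, bits },
        // An occupied slot is split: push the old leaf down, then retry.
        Tree::Leaf { key: old_key, bits: old_bits } => {
            let empty = Tree::Branch(Box::new(Tree::Null), Box::new(Tree::Null));
            let split = insert(empty, old_key, old_bits, depth);
            insert(split, key, bits, depth)
        }
        // At a branch, the bit at this depth selects the side.
        Tree::Branch(left, right) => {
            if bits[depth] {
                Tree::Branch(left, Box::new(insert(*right, key, bits, depth + 1)))
            } else {
                Tree::Branch(Box::new(insert(*left, key, bits, depth + 1)), right)
            }
        }
    }
}

/// Returns the path to `key` as a string of `L`/`R` steps, if present.
pub fn path_of(tree: &Tree, key: u64) -> Option<String> {
    match tree {
        Tree::Null => None,
        Tree::Leaf { key: k, .. } => {
            if *k == key { Some(String::new()) } else { None }
        }
        Tree::Branch(left, right) => {
            if let Some(p) = path_of(left, key) {
                return Some(format!("L{}", p));
            }
            path_of(right, key).map(|p| format!("R{}", p))
        }
    }
}
```

Inserting key 4 with bits `000`, key 5 with bits `010`, and key 6 with bits `001` places them at paths `LLL`, `LR`, and `LLR` respectively, and leaves the right child of the root empty.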
### Example 2
So for example, imagine we have 8 key-pairs, where the keys are just an enumeration from 0 to 7, then the tree leaves positions would look like:
![](img/merkletree-example-1-a.png)
@ -15,9 +102,9 @@ Now let's change the key of the leaf `key=1`, and set it as `key=13`. Then, thei
![](img/merkletree-example-1-b.png)
### Example2
### Example 3
Suppose we have 4 key-values, where the keys are `0000`, `0100`, `1010` and `1011`. The tree would look like:
Suppose we have 4 key-values, where the first four bits of the hashes of the keys are `0000`, `0100`, `1010` and `1011`. The tree would look like:
![](img/merkletree-example-2-a.png)
To iterate this example, suppose we have the following data in a POD:
@ -36,7 +123,7 @@ To iterate this example, suppose we have the following data in a POD:
The merkletree will contain the key values from the `kvs` field.
Suppose that the binary representation of the key `userPk` is `1011...`. This uniquely defines the leaf position that contains the public key of the authenticated user. Similarly for the other key-values:
Suppose that the binary representation of the hash of the key `userPk` is `1011...`. This uniquely defines the leaf position that contains the public key of the authenticated user. Similarly for the other key-values:
![](img/merkletree-example-2-b.png)


@ -1 +1,4 @@
# POD types
- SignedPod
- MainPod


@ -6,6 +6,8 @@ A SignedPod consists of the following fields:
- the Signer's public key is one of the key-values in the `kvs`.
- `id`: the Root of the `kvs` MerkleTree
- `signature`: a signature over the `id`
- `signer`: the public key attached to the digital signature `signature`
- `type`: the constant `SIGNATURE`
<br>


@ -1,29 +1,25 @@
# Statements
The claims asserted by a POD are referred to as its *statements*. These statements introduce values and express relations between them, where the values may or may not be part of the same POD. The mechanism for referring to values in arbitrary PODs is furnished by *anchored keys*.
## Anchored keys
Rather than dealing with just keys, we introduce the notion of an *anchored key*, which is a pair consisting of an origin specifier and a key, i.e.
A _statement_ is any sort of claim about the values of entries: for example, that two values are equal, or that one entry is contained in another.
```
type AnchoredKey = (Origin, Key)
type Key = String
```
Statements come in two types: _built-in_ and _custom_. There is a short list of built-in statements (see below). [^builtin]
In addition, users can freely define custom statements.
An *origin* is a triple consisting of a numeric identifier called the *origin ID*, a string called the *origin name* (omitted in the backend) and another numeric identifier called the *gadget ID*, which identifies the means by which the value corresponding to a given key is produced.
From the user (front-end) perspective, a statement represents a claim about the values of some number of entries -- the statement can only be proved if the claim is true. On the front end, a statement is identified by its _name_ (`ValueOf`, `Equal`, etc.).
The origin ID is defined to be 0 for 'no origin' and 1 for 'self origin', otherwise it is the content ID[^content-id] of the POD to which it refers. The origin name is not cryptographically significant and is merely a convenience for the frontend.
From the circuit (back-end) perspective, a statement can be proved either:
- by direct in-circuit verification, or
- by an operation (aka deduction rule).
On the back end, a statement is identified by a unique numerical _identifier_.
The gadget ID takes on the values in the following table:
## Built-in statements
| Gadget ID | Meaning |
|-----------|-------------------------------------------------------------------------------------------|
| 0 | no gadget |
| 1 | `SignedPOD` gadget: The key-value pair was produced in the construction of a `SignedPOD`. |
| 2 | `MainPOD` gadget: The key-value pair was produced in the construction of a `MainPOD`. |
The POD system has several built-in statements. These statements are associated with a reserved set of statement IDs.
For example, a gadget ID of 1 implies that the key-value pair in question was produced in the process of constructing a `SignedPOD`.
### Backend statements
<font color="red">TODO: update table of backend statements </font>
## Statement types
A statement is a code (or, in the frontend, string identifier) followed by 0 or more arguments. These arguments may consist of up to three anchored keys and up to one POD value.
The following table summarises the natively-supported statements, where we write `value_of(ak)` for 'the value anchored key `ak` maps to', which is of type `PODValue`, and `key_of(ak)` for the key part of `ak`:
@ -42,4 +38,116 @@ The following table summarises the natively-supported statements, where we write
| 9 | `ProductOf` | `ak1`, `ak2`, `ak3` | `value_of(ak1) = value_of(ak2) * value_of(ak3)` |
| 10 | `MaxOf` | `ak1`, `ak2`, `ak3` | `value_of(ak1) = max(value_of(ak2), value_of(ak3))` |
[^content-id]: <font color="red">TODO</font> Refer to this when it is documented.
### Frontend statements
```
ValueOf(key: AnchoredKey, value: ScalarOrVec)
Equal(ak1: AnchoredKey, ak2: AnchoredKey)
NotEqual(ak1: AnchoredKey, ak2: AnchoredKey)
Gt(ak1: AnchoredKey::Integer, ak2: AnchoredKey::Integer)
Lt(ak1: AnchoredKey::Integer, ak2: AnchoredKey::Integer)
GEq(ak1: AnchoredKey::Integer, ak2: AnchoredKey::Integer)
LEq(ak1: AnchoredKey::Integer, ak2: AnchoredKey::Integer)
SumOf(sum: AnchoredKey::Integer, arg1: AnchoredKey::Integer, arg2: AnchoredKey::Integer)
ProductOf(prod: AnchoredKey::Integer, arg1: AnchoredKey::Integer, arg2: AnchoredKey::Integer)
MaxOf(max: AnchoredKey::Integer, arg1: AnchoredKey::Integer, arg2: AnchoredKey::Integer)
```
The following statements relate to Merkle trees and compound types; they are explained in detail on a [separate page](./merklestatements.md).
```
Branches(parent: AnchoredKey::MerkleTree, left: AnchoredKey::MerkleTree, right: AnchoredKey::MerkleTree)
Leaf(node: AnchoredKey::MerkleTree, key: AnchoredKey, value: AnchoredKey)
IsNullTree(node: AnchoredKey::MerkleTree)
GoesLeft(key: AnchoredKey, depth: Value::Integer)
GoesRight(key: AnchoredKey, depth: Value::Integer)
MerkleSubtree(root: AnchoredKey::MerkleTree, node: AnchoredKey::MerkleTree)
MerkleCorrectPath(root: AnchoredKey::MerkleTree, node: AnchoredKey::MerkleTree, key: AnchoredKey, depth: Value::Integer)
Contains(root: AnchoredKey::MerkleTree, key: AnchoredKey, value: AnchoredKey)
NotContains(root: AnchoredKey::MerkleTree, key: AnchoredKey)
ContainsHashedKey(root: AnchoredKey::DictOrSet, key: AnchoredKey)
NotContainsHashedKey(root: AnchoredKey::DictOrSet, key: AnchoredKey)
ContainsValue(root: AnchoredKey::Array, value: AnchoredKey)
```
In the future, we may also reserve statement IDs for "precompiles" such as:
```
PoseidonHashOf(A.hash, B.preimage) // perhaps a hash_of predicate can be parametrized by an enum representing the hash scheme; rather than having a bunch of specific things like SHA256_hash_of and poseidon_hash_of etc.
```
```
EcdsaPrivToPubOf(A.pubkey, B.privkey)
```
### Built-in statements for entries of any type
A ```ValueOf``` statement asserts that an entry has a certain value.
```
ValueOf(A.name, "Arthur")
```
An ```Equal``` statement asserts that two entries have the same value. (Technical note: The circuit only proves equality of field elements; no type checking is performed. For strings or Merkle roots, collision-resistance of the hash gives a cryptographic guarantee of equality. However, note both Arrays and Sets are implemented as dictionaries in the backend; the backend cannot type-check, so it is possible to prove an equality between an Array or Set and a Dictionary.)
```
Equal(A.name, B.name)
```
A ```NotEqual``` statement asserts that two entries have different values.
```
NotEqual (for arbitrary types)
```
### Built-in statements for numerical types
A ```Gt(x, y)``` statement asserts that ```x``` is an entry of type ```Integer```, ```y``` is an entry or constant of type ```Integer```, and ```x > y```.
```
Gt (for numerical types only)
Gt(A.price, 100)
Gt(A.price, B.balance)
```
The statements ```Lt```, ```GEq```, ```LEq``` are defined analogously.
```SumOf(x, y, z)``` asserts that ```x```, ```y```, ```z``` are entries of type ```Integer```, and [^fillsum]
```ProductOf``` and ```MaxOf``` are defined analogously.
The two items below may be added in the future:
```
poseidon_hash_of(A.hash, B.preimage) // perhaps a hash_of predicate can be parametrized by an enum representing the hash scheme; rather than having a bunch of specific things like SHA256_hash_of and poseidon_hash_of etc.
```
```
ecdsa_priv_to_pub_of(A.pubkey, B.privkey)
```
### Primitive built-in statements for Merkle roots
[See separate page](./merklestatements.md).
[^builtin]: <font color="red">TODO</font> List of built-in statements is not yet complete.
[^fillsum]: <font color="red">TODO</font> Does sum mean x+y = z or x = y+z?


@ -1,10 +1,14 @@
# POD value types
From the frontend perspective, POD values may be one of three[^type] types:
From the frontend perspective, POD values may be one of the following[^type] types: two atomic types
- `Integer`
- `String`
- `Dictionary`, `array`, `set`
From the backend perspective, however, these types will all be encoded as a fixed number of field elements, the number being chosen so as to accommodate the `Integer` type as well as hashes to represent the `String` and `MerkleTree` types with the appropriate level of security.
and three compound types
- `Dictionary`
- `Array`
- `Set`.
From the backend perspective, however, these types will all be encoded as a fixed number of field elements, the number being chosen so as to accommodate the `Integer` type as well as hashes to represent the `String` and compound types with the appropriate level of security.
In the case of the Plonky2 backend with 100 bits of security, all of these types are represented as 4 field elements, the output of the Poseidon hash function used there being