In ‘Anatomy of A Zcash Transaction’ we gave a general overview of Zcash Transactions. The purpose of this post is to provide a simplified explanation of how privacy-preserving transactions work in Zcash, and where exactly Zero-Knowledge proofs come into the picture. In the terminology of that post, we are focusing exclusively here on transactions between shielded addresses (commonly referred to as z-addrs).
To focus on understanding the privacy-preserving aspect, let’s put aside everything having to do with reaching consensus using Proof of Work and the blockchain, and focus on a particular node that has obtained the correct list of unspent transaction outputs.
Let’s recall first how this list looks on the bitcoin blockchain. Each unspent transaction output (UTXO) can be thought of as an unspent ‘note’ that is described by the address/public key of its owner and the amount of BTC it contains. For simplicity of presentation, let’s assume each such note contains exactly 1 BTC, and there is at most one note per address. Thus, at a given time, a node’s database consists of a list of unspent notes, where each note can be described simply by the address of the owner. For example, the database may look like this.
:math:`\mathsf{Note}_1=` :math:`(\mathsf{PK}_1)`, :math:`\mathsf{Note}_2=` :math:`(\mathsf{PK}_2)`, :math:`\mathsf{Note}_3=` :math:`(\mathsf{PK}_3)`
Suppose :math:`\mathsf{PK}_1` is Alice’s address and she wishes to send her 1 BTC note to Bob’s address :math:`\mathsf{PK}_4.` She sends a message which essentially says “Move 1 BTC from :math:`\mathsf{PK}_1` to :math:`\mathsf{PK}_4`” to all nodes. She signs this message with the secret key :math:`\mathsf{sk}_1` corresponding to :math:`\mathsf{PK}_1,` and this convinces the node she has the right to move money from :math:`\mathsf{PK}_1.` After the node checks the signature, and checks that there is indeed a 1 BTC note with address :math:`\mathsf{PK}_1,` it will update its database accordingly:
:math:`\mathsf{Note}_4=` :math:`(\mathsf{PK}_4)`, :math:`\mathsf{Note}_2=` :math:`(\mathsf{PK}_2)`, :math:`\mathsf{Note}_3=` :math:`(\mathsf{PK}_3)`
Now suppose each note also contains a random ‘serial number’ (aka unique identifier) :math:`r.` We will soon see this is helpful for obtaining privacy. Thus, the database may look like this.
:math:`\mathsf{Note}_1=` :math:`(\mathsf{PK}_1,r_1)`, :math:`\mathsf{Note}_2=` :math:`(\mathsf{PK}_1,r_2)`, :math:`\mathsf{Note}_3=` :math:`(\mathsf{PK}_2,r_3)`
A natural first step towards privacy would be to have the node store only “encryptions”, or simply hashes, of the notes, rather than the notes themselves.
:math:`\mathsf{H}_1=` :math:`\mathbf{HASH}(\mathsf{Note}_1)`, :math:`\mathsf{H}_2=` :math:`\mathbf{HASH}(\mathsf{Note}_2)`, :math:`\mathsf{H}_3=` :math:`\mathbf{HASH}(\mathsf{Note}_3)`
As a second step to preserve privacy, the node will continue to store the hash of a note even after it has been spent. Thus, it is no longer a database of unspent notes, but rather a database of all notes that ever existed.
The main question now is how to distinguish, without destroying privacy, between notes that have been spent and notes that haven’t. This is where the nullifier set comes in. This is a list of hashes of all serial numbers of notes that have been spent. Each node stores the nullifier set, in addition to the set of hashed notes. For example, after :math:`\mathsf{Note}_2` has been spent, the node’s database might look like this.
Hashed notes | Nullifier set |
---|---|
:math:`\mathsf{H}_1=` :math:`\mathbf{HASH}(\mathsf{Note}_1)` | :math:`\mathsf{nf}_1=` :math:`\mathbf{HASH}(\mathsf{r}_2)` |
:math:`\mathsf{H}_2=` :math:`\mathbf{HASH}(\mathsf{Note}_2)` | |
:math:`\mathsf{H}_3=` :math:`\mathbf{HASH}(\mathsf{Note}_3)` |
How a transaction is made
Now suppose Alice owns :math:`\mathsf{Note}_1` and wishes to send it to Bob, whose public key is :math:`\mathsf{PK}_4.` We will assume for simplicity that Alice and Bob have a private channel between them although this is not actually necessary in Zcash. Basically, Alice will invalidate her note by publishing its nullifier, and at the same time create a new note that is controlled by Bob.
More precisely, she does the following.
- She randomly chooses a new serial number :math:`r_4` and defines the new note :math:`\mathsf{Note}_4=` :math:`(\mathsf{PK}_4,r_4).`
- She sends :math:`\mathsf{Note}_4` to Bob privately.
- She sends the nullifier of :math:`\mathsf{Note}_1,` :math:`\mathsf{nf}_2=` :math:`\mathbf{HASH}(\mathsf{r}_1)` to all nodes.
- She sends the hash of the new note :math:`\mathsf{H}_4=` :math:`\mathbf{HASH}(\mathsf{Note}_4)` to all nodes.
Now, when a node receives :math:`\mathsf{nf}_2` and :math:`\mathsf{H}_4,` it will check whether the note corresponding to :math:`\mathsf{nf}_2` has already been spent, simply by checking if :math:`\mathsf{nf}_2` already exists in the nullifier set. If it doesn’t, the node adds :math:`\mathsf{nf}_2` to the nullifier set and adds :math:`\mathsf{H}_4` to the set of hashed notes; thereby validating the transaction between Alice and Bob.
Hashed notes | Nullifier set |
---|---|
:math:`\mathsf{H}_1=` :math:`\mathbf{HASH}(\mathsf{Note}_1)` | :math:`\mathsf{nf}_1=` :math:`\mathbf{HASH}(\mathsf{r}_2)` |
:math:`\mathsf{H}_2=` :math:`\mathbf{HASH}(\mathsf{Note}_2)` | :math:`\mathsf{nf}_2=` :math:`\mathbf{HASH}(\mathsf{r}_1)` |
:math:`\mathsf{H}_3=` :math:`\mathbf{HASH}(\mathsf{Note}_3)` | |
:math:`\mathsf{H}_4=` :math:`\mathbf{HASH}(\mathsf{Note}_4)` |
..but wait a second, we checked that :math:`\mathsf{Note}_1` wasn’t spent before.. but we didn’t check whether it belongs to Alice. Actually, we didn’t check that it was a ‘real’ note at all, real in the sense that its hash was in the node’s table of hashed notes. The simple way to fix this would be for Alice to simply publish :math:`\mathsf{Note}_1,` rather than its hash; but of course, this would undermine the privacy we are trying to achieve.
This is where Zero-Knowledge proofs come to the rescue:
In addition to the steps above, Alice will publish a proof-string :math:`\pi` convincing the nodes that whomever published this transaction knows values :math:`\mathsf{PK}_1,` :math:`\mathsf{sk}_1,` and :math:`r_1` such that
- The hash of the note :math:`\mathsf{Note}_1=` (:math:`\mathsf{PK}_1,` :math:`r_1)` exists in the set of hashed notes.
- :math:`\mathsf{sk}_1` is the private key corresponding to :math:`\mathsf{PK}_1` (and thus, whomever knows it is the rightful owner of :math:`\mathsf{Note}_1`).
- The hash of :math:`r_1` is :math:`\mathsf{nf}_2`, (and thus, if :math:`\mathsf{nf}_2` – that we now know is the nullifier of :math:`\mathsf{Note}_1` – is not currently in the nullifier set, :math:`\mathsf{Note}_1` still hasn’t been spent).
The properties of Zero-Knowledge proofs will ensure no information about :math:`\mathsf{PK}_1,` :math:`\mathsf{sk}_1,` or :math:`r_1` are revealed by :math:`\pi`.
The main places above where we cheated or omitted details
We emphasize that this has been an oversimplified description, and recommend the protocol spec for full details.
Here are some of the main things that were overlooked:
- The hashed notes need to be stored not just as a list, but in a Merkle tree. This plays an important role in making the Zero-Knowledge proofs efficient. Moreover, we need to store a computationally hiding and binding commitment of the note, rather than just its hash.
- The nullifier needs to be defined in a slightly more complex way to ensure future privacy of the receiver in relation to the sender.
- We did not go into details on how to eliminate the need for a private channel between sender and recipient.