Are SNARKs always for computation over finite fields? Turns out no.
Today, we will explore the techniques presented in our recent preprint
Fully-Succinct Arguments over the Integers from First Principles,
which investigates the construction of SNARKs for circuits over the integers.
This work provides a simple but novel approach to building efficient proof systems
for computations involving whole numbers, one that sidesteps most of the usual complications of dealing with integers.
The techniques enable us to “compile” existing (multilinear)
SNARKs into arguments over the integers using a new tool: polynomial commitments with modular remainder.
Applications
Existing SNARKs are (usually) designed for computations over finite fields:
satisfiability of circuits or execution of programs over finite fields.
A few works have also explored SNARKs over other finite rings, usually $\mathbb{Z}_{2^k}$, i.e. $k$-bit integers.
In both cases, every value in the computation has an a priori bounded size.
In this recent work, we introduce the first (fully succinct) SNARK for computations over the integers:
a SNARK over the integers allows the circuit assignment (or memory in the case of a program) to take any value in $\mathbb{Z}$.
The proof remains sound and succinct no matter how large the values in the computation are.
Computation over the integers has a number of potentially interesting applications,
including emulating (even very small) prime fields.
The usual reader of our blog is well aware of all the circuit “tricks” that apply over finite fields,
but might be less acquainted with computation/circuits over the integers,
so let us start by exploring a few examples of applications where SNARKs over the integers might be particularly useful.
Efficient Range Checks using Sum-of-Squares
Range checks over the integers do not require decomposing the value into bits,
meaning the cost of the range check is independent of the size of the range.
The trick is Lagrange's four-square theorem: every non-negative integer can be represented as the sum of four squares.
In other words, for any $n \geq 0$ the prover can (efficiently) find $a, b, c, d$ such that:
$$
n = a^2 + b^2 + c^2 + d^2
$$
Using this technique you can prove that $n \in [0, B]$ by proving you know $a_1, b_1, c_1, d_1$ and $a_2, b_2, c_2, d_2$ such that:
$$
n = a_1^2 + b_1^2 + c_1^2 + d_1^2
$$
$$
B - n = a_2^2 + b_2^2 + c_2^2 + d_2^2
$$
In other words, $n \geq 0$ and $B - n \geq 0$.
This can be optimized: Couteau, Peters and Pointcheval showed that $n \in [\mathsf{low}, \mathsf{high}]$
is equivalent to the existence of $a, b, c$ such that:
$$
4 \cdot (n - \mathsf{low}) \cdot (\mathsf{high} - n) + 1 = a^2 + b^2 + c^2
$$
So you get range checks for a constant cost of 4 R1CS constraints, or a single PlonKish gate.
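To make this concrete, here is a small Python sketch of the basic four-square version (not the three-square optimization above): the prover finds the two decompositions, and the verifier checks the corresponding constraints. All names are mine, and the brute-force `four_squares` search is only for illustration; it is not the efficient decomposition algorithm a real prover would use.

```python
from itertools import product
from math import isqrt

def four_squares(n):
    """Brute-force a decomposition n = a^2 + b^2 + c^2 + d^2 for n >= 0.
    Fine for tiny demo values; a real prover would use an efficient
    decomposition algorithm instead of exhaustive search."""
    r = isqrt(n)
    for a, b, c in product(range(r + 1), repeat=3):
        rest = n - a * a - b * b - c * c
        if rest >= 0 and isqrt(rest) ** 2 == rest:
            return a, b, c, isqrt(rest)
    raise ValueError("unreachable for n >= 0 (Lagrange)")

def range_check_witness(n, B):
    """Prover: the witness for n in [0, B] is a pair of four-square decompositions."""
    assert 0 <= n <= B
    return four_squares(n), four_squares(B - n)

def verify_range(n, B, witness):
    """Verifier/circuit: eight squarings and two sums, independent of the size of B."""
    (a1, b1, c1, d1), (a2, b2, c2, d2) = witness
    return (n == a1**2 + b1**2 + c1**2 + d1**2
            and B - n == a2**2 + b2**2 + c2**2 + d2**2)

w = range_check_witness(42, 100)
assert verify_range(42, 100, w)
```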
Efficient Mixed Field Emulation
We can emulate prime fields inside integer circuits very easily: the values $\bar{x} \in \mathbb{F}_p$
are represented as integers $x \in \mathbb{Z}$ such that $x \equiv \bar{x} \mod p$. Given two values $x, y \in \mathbb{Z}$,
we can constrain their product to be $z \equiv x \cdot y \mod p$ by requiring that:
$$
z = x \cdot y - q \cdot p
$$
for some $q \in \mathbb{Z}$ chosen by the prover. Addition works similarly.
Note that we do not require the prover to fully reduce the result:
because there is no “modulus to overflow”, the circuit behaves correctly even if the prover uses a large representative
of the equivalence class; they just make everything slower for themselves.
If a unique representative is required, you can simply use the (very efficient) range check from above to check that $z \in [0, p - 1]$.
This even allows emulating “dynamically” chosen fields $\mathbb{F}_p$ e.g. where $p$ is provided as a public input.
It also allows emulating fields of any size with the same number of constraints
and to emulate (and switch between) any number of different fields in the same circuit.
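As a minimal illustration, here is a Python sketch of the multiplication constraint (function names are mine): the prover supplies $z$ and $q$, and the circuit checks the single integer identity $z = x \cdot y - q \cdot p$. For simplicity the prover here fully reduces, although as noted above any representative works.

```python
def prove_mul_mod(x, y, p):
    """Prover: choose a representative z and quotient q with z = x*y - q*p.
    Here we fully reduce, but any (z, q) satisfying the identity is accepted;
    a lazy prover just makes later gates operate on bigger integers."""
    q, z = divmod(x * y, p)
    return z, q

def check_mul_mod(x, y, z, q, p):
    """The circuit constraint: one integer identity, no bit decomposition."""
    return z == x * y - q * p

p = 2**61 - 1                      # the emulated prime; it could even be a public input
x, y = 123456789, 987654321
z, q = prove_mul_mod(x, y, p)
assert check_mul_mod(x, y, z, q, p)
assert z == (x * y) % p            # sanity check against Python's own reduction
```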
Verifying Computation in RSA Groups
In the example above, we chose the modulus $p$ to be a prime,
but it works equally well for composite moduli $N$, for instance RSA moduli $N = p \cdot q$.
This allows, for instance, verifying RSA signatures and RSA accumulators inside integer circuits efficiently.
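As a hypothetical sketch of what that could look like, the snippet below checks an RSA signature equation $\mathsf{sig}^{65537} \equiv \mathsf{msg} \pmod N$ using the emulated-multiplication gate from above: 17 multiplications via square-and-multiply, regardless of the size of $N$. The function names and the toy key are purely illustrative.

```python
def mul_mod_gate(x, y, N):
    """One emulated modular multiplication: prover picks q, constraint is z = x*y - q*N."""
    q, z = divmod(x * y, N)
    assert z == x * y - q * N      # the actual circuit constraint
    return z

def rsa_verify_in_circuit(sig, msg, N):
    """Check sig^65537 = msg (mod N) via square-and-multiply:
    16 squarings plus one multiplication, each a single integer constraint."""
    acc = sig
    for _ in range(16):
        acc = mul_mod_gate(acc, acc, N)   # acc = sig^(2^i)
    acc = mul_mod_gate(acc, sig, N)       # acc = sig^(2^16 + 1)
    return acc == msg % N

# Toy RSA key, far too small for real use.
p_, q_ = 61, 53
N = p_ * q_
d = pow(65537, -1, (p_ - 1) * (q_ - 1))   # signing exponent
msg = 42
sig = pow(msg, d, N)
assert rsa_verify_in_circuit(sig, msg, N)
```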
An Intellectual Curiosity: $\mathbb{Q}$-Circuits
It is fairly easy to use a proof system for $\mathbb{Z}$ to prove statements about rational numbers $\mathbb{Q}$:
we represent rational numbers as pairs of integers $(n, d)$ representing the fraction $n / d$.
Whenever needed we can force $d \neq 0$ by enforcing $d \cdot \mathsf{sign} \in [1, \infty)$ for some $\mathsf{sign} \in \mathbb{Z}$ chosen by the prover
and using the “sum of squares” technique to enforce the range check.
Note that many non-zero checks can be batched by computing $d_1 \cdot d_2 \cdot \ldots \cdot d_n$ and enforcing that the product is non-zero.
To enforce a product of fractions $n_1 / d_1 \cdot n_2 / d_2 = n_3 / d_3$, we can require:
$$
n_1 \cdot n_2 \cdot d_3 = n_3 \cdot d_1 \cdot d_2
$$
I am not aware of any practical use of $\mathbb{Q}$-circuits,
but it is kind of interesting that you can have a SNARK for a field of characteristic zero.
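Here is a tiny Python sketch of both constraints (names are mine, for illustration); note that the product identity holds even for non-reduced representatives.

```python
def check_fraction_mul(n1, d1, n2, d2, n3, d3):
    """Constraint for (n1/d1) * (n2/d2) = (n3/d3) over Z, assuming the d_i are nonzero."""
    return n1 * n2 * d3 == n3 * d1 * d2

def check_nonzero_batch(*ds):
    """Batched nonzero check: the product is nonzero iff every factor is.
    In the circuit, "product != 0" would itself be enforced with the
    sum-of-squares range check described above."""
    prod = 1
    for d in ds:
        prod *= d
    return prod != 0

# 2/3 * 9/4 = 3/2, here written with the non-reduced representative 6/4:
assert check_fraction_mul(2, 3, 9, 4, 6, 4)
assert check_nonzero_batch(3, 4, 4)
```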
A Detour: Fingerprinting
Before we get into the details of how to construct SNARKs for the integers, let’s take a small detour into “fingerprinting”:
suppose you have a circuit $C$ over the integers, consisting of addition and multiplication
gates with fan-in two and arbitrary fan-out.
For instance, it might look something like this:
Suppose you want to convince someone that you know a witness $w = (X, Y, Z)$ such that $C(w) = 0$.
For the circuit above, a simple protocol suffices: the prover simply sends $w$ and the verifier checks that $C(w) = 0$.
Notice however, that the multiplications increase the size of the internal values inside the circuit:
the product $(X \cdot Y) \cdot Z$ has a size that is the sum of the sizes of $X$, $Y$, and $Z$.
In fact, because you can successively multiply intermediate results at every layer of the circuit, the size of the internal values grows exponentially in the depth of the circuit.
For instance, the circuit computing $x^{2^k}$ has depth $k$, computing $x^{2^{i}} = x^{2^{i-1}} \cdot x^{2^{i-1}}$ at every layer,
but $x^{2^k}$ has size $O(2^k \cdot |x|)$.
Hence verifying the computation of the circuit directly is tractable only for circuits of small depth…
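To see the blowup concretely, here is a tiny Python experiment (Python integers are arbitrary precision, so nothing overflows): repeated squaring roughly doubles the bit-length at every layer.

```python
x = 3
sizes = []
for _ in range(5):          # five multiplication layers: x -> x^2 -> x^4 -> ...
    x = x * x
    sizes.append(x.bit_length())
print(sizes)                # [4, 7, 13, 26, 51]: roughly doubles every layer
```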
Can we do better?
Of course we can! Namely using a technique called “fingerprinting” which relies on randomness.
The technique is described in
Computational Complexity: A Modern Approach
by Sanjeev Arora and Boaz Barak.
The idea is pretty simple: rather than naively recomputing the circuit over the integers,
the verifier, after receiving the witness $w$,
picks a random prime $p$ and computes the circuit modulo $p$.
If the result is zero,
then the verifier is convinced that the witness is correct.
Slightly informally:
$$
\Pr_{p}\left[C(w, x) \text{ mod } p = 0\right] > \epsilon \iff C(w, x) = 0
$$
Why does this work?
Well, suppose $y = C(w, x)$ and $y \neq 0$, i.e. the circuit is not satisfied, yet the prover convinces the verifier that it is.
The magnitude of $y$ is bounded by:
$$|y| \leq \max \{ |x|, |w| \}^{2^d}$$
Where $d$ is the depth of the circuit.
Now $y = 0 \text{ mod } p$ iff $p$ divides $y$.
How many distinct primes can divide $y$? Well, a very rough estimate: fewer than $\log_2(|y|)$, since every prime factor is at least $2$.
Also, there are roughly $B / (\log B)$ primes less than $B$.
So, if we pick a uniformly random prime $p$ less than $B$, the probability that $p$ happens to divide $y$ is roughly at most:
$$
\epsilon \approx \frac{\log_2 |y|}{B / \log B} \leq \frac{2^d \cdot \log_2 \max \{ |x|, |w| \}}{B / \log B}
$$
So how big should our random prime $p$ be for $\epsilon$ to be (very) small?
It is clear from the equation above that if we pick $B \gg 2^d$ then $\epsilon$ will be small.
In other words, the prime needs to have size roughly proportional to the (multiplicative) depth of the circuit.
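Putting the protocol together, here is a minimal Python sketch of the verifier's check for a toy circuit $C(X, Y, Z) = X \cdot Y \cdot Z - 8$; the circuit, the bound $B$, and the use of sympy's `randprime` as the prime sampler are all illustrative choices. Evaluating gate by gate modulo $p$ keeps every intermediate value below $p$.

```python
from sympy import randprime   # any sampler of random primes below B would do

def C_mod(w, p):
    """Toy circuit C(X, Y, Z) = X*Y*Z - 8, evaluated modulo p gate by gate."""
    X, Y, Z = w
    return ((X % p) * (Y % p) % p * (Z % p) % p - 8) % p

def fingerprint_check(w, B):
    """Verifier: sample a random prime p < B and accept iff C(w) = 0 mod p.
    B must be large compared to 2^depth for the soundness error to be small."""
    p = randprime(B // 2, B)
    return C_mod(w, p) == 0

assert fingerprint_check((1, 2, 4), B=2**64)        # 1*2*4 - 8 = 0: always accepted
assert not fingerprint_check((1, 2, 5), B=2**64)    # off by 2: rejected (w.h.p.)
```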
Taking an Exponential Step Down
The crucial idea in our paper is to “minify” this technique by making the circuit have only constant depth, so that the random prime can remain small.
Luckily, constant depth circuits are the norm in ZK proof systems:
a circuit of greater depth is flattened into a constant depth circuit by introducing new variables,
for instance, the circuit checking an R1CS relation has multiplicative depth one.
The use of a prime modulus
allows us to prove satisfiability of the circuit efficiently
using existing SNARK techniques: since $\mathbb{Z}_p$ is a field, we simply have to prove satisfiability of a circuit over a prime field.
Polynomial Commitment with Modular Remainder
In the previous section, the verifier received the entire witness $w \in \mathbb{Z}^{n}$ from the prover.
That’s a problem: we are trying to build a SNARK and we want the communication to be small.
The solution is to have the prover commit to $w$ and send us the commitment instead.
Since we will need to compute “$C(w) \text{ mod } p$” afterwards,
it is useful to have a polynomial commitment scheme for multilinear polynomials $f(X_1, \ldots, X_k) \in \mathbb{Z}[X_1, \ldots, X_k]$
which allows opening the polynomial at a point $x_1, \ldots, x_k$ modulo arbitrary primes $p$,
showing that:
$$y = f(x_1, \ldots, x_k) \text{ mod } p$$
The crucial point is that you can choose $p$ after committing to the polynomial.
This is in contrast to the usual polynomial commitment schemes
in which the prime is fixed at the time of commitment.
This new primitive is clearly more powerful than the usual polynomial commitment schemes:
you can always just fix the prime $p$ – in which case you have a regular polynomial commitment scheme.
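To pin down the interface (and only the interface: the toy code below has none of the succinctness, hiding, or binding properties of a real scheme), here is what “commit now, choose $p$ later” looks like in Python for a multilinear polynomial with integer coefficients. All names are illustrative.

```python
class ToyIntegerMultilinear:
    """Stand-in for a committed multilinear polynomial over Z, keyed by which
    variables appear in each monomial. A real commitment would be succinct and
    binding; here the "commitment" is just the coefficient table itself."""

    def __init__(self, coeffs):
        self.coeffs = coeffs          # {(bit, ..., bit): integer coefficient}

    def open_mod(self, point, p):
        """Return y = f(point) mod p, where p is chosen *after* committing."""
        acc = 0
        for mono, c in self.coeffs.items():
            term = c
            for xi, bit in zip(point, mono):
                if bit:
                    term *= xi
            acc += term
        return acc % p

# f(X1, X2) = 3*X1*X2 + 5*X2 + 7 over the integers
f = ToyIntegerMultilinear({(1, 1): 3, (0, 1): 5, (0, 0): 7})
assert f.open_mod((2, 4), p=11) == 7          # (24 + 20 + 7) mod 11
assert f.open_mod((2, 4), p=13) == 51 % 13    # same commitment, different prime
```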
$\mathbb{Z}$aratan
As for the sea turtle, it is of such huge size that people on
shipboard take it for an island. One merchant has reported:
‘Rising out of the sea we discovered an island with
green plants, and we went ashore and dug pits for a cooking fire, and the island began to move and the sailors said:
“Back to the ship! It’s a turtle! The heat of the fires has
wakened him and we’ll be lost!” '
– Book of Animals, Zakariya al-Qazwini (13th Century)
Zaratan is our new SNARK for the integers.
It combines a polynomial commitment with modular remainders constructed from DARK-like techniques
with a standard multi-linear SNARK over prime fields: in our case Spartan.
For technical reasons, we cannot base our construction upon DARK commitments:
DARK commitments are binding only for the class of rational functions $f(X_1, \ldots, X_k) / N$ where $N$ is a small integer.
This is fine for SNARKs over finite fields, because $N^{-1} \text{ mod } p$ is a field element (except if $p \mid N$),
which means that $f(X_1, \ldots, X_k) / N \text{ mod } p$ is a polynomial over the field;
however, this does not suffice for our application, since we need the committed function to be a polynomial over the integers.
Instead we need to derive a scheme from the much less efficient polynomial commitment scheme of Block et al.
However, since all we require is a multilinear polynomial commitment scheme with modular remainder
to compile a standard off-the-shelf multilinear SNARK (e.g. Spartan or HyperPlonk),
we have some ideas for how to create far more efficient instantiations from very different techniques. So stay tuned!