A challenge on the Jolt zkVM

Written by Giorgio Dell'Immagine on Sep 24, 2024

Last weekend I took part in creating some challenges for a CTF event during the MOCA italian hacker camp. One of the tasks I authored was a cryptography challenge called “2+2=5” featuring the Jolt zkVM: it involved crafting a proof for an invalid execution of a RISC-V program exploiting a modified version of the Jolt library.

This post will go over the challenge statement and solution, if you want to try and tackle the challenge by yourself you can download the original attachments here!

What is Jolt?

Jolt is a zkVM which targets the RISC-V ISA. To use Jolt, we need to write two Rust programs:

The guest is the program for which we want to prove the execution. In Jolt, this program will be compiled to a RISC-V ELF file and is assumed to be known both to the prover and the verifier.
The host is the component that interacts with the Jolt SDK for emitting a proof of execution of the guest, or verifying a previously generated proof given a target guest program.

The guest program is compiled and emulated to compute its execution trace, which is a structure that holds all the ordered instructions that have been executed, together with concrete values for the instruction executions, memory reads, memory writes and so on. The Jolt zkVM takes an execution trace, and emits a proof for its validity. Roughly speaking, the proof will be accepted by the verifier if the input trace encodes a correct execution of the guest program, following the RISC-V semantics.

An overview of the architecture of the Jolt proof system is given in the documentation. There are four main components in the Jolt zkVM:

Read-Write memory which uses a memory checking argument to check that reads and writes in memory are correct. Read and write operations on registers are implemented using memory operations at special addresses.
Instruction lookup which uses a custom lookup argument called Lasso to check that the executions of the instructions are correct, e.g., that the result of the execution of an add instruction really is the sum of the operands.
Bytecode which uses a read-only memory checking argument to perform reads into the decoded instructions of the guest program.
R1CS which handle program counter updates, and serves as a glue for all the other modules, imposing for example constraints over values spanning across the other components.

Challenge description

The challenge lets users interact with a server, and the attachments contain the sources of the remote service.

.
├── jolt         # The modified Jolt crate
├── server       # The host server
│   └── guest    # The guest program
├── diff.patch   # The patch applied to Jolt
├── Dockerfile   # Dockerfile to run the server
└── readme.txt   # Readme

To run it locally just build the Docker container and run it!

$ docker build . -t two_plus_two
$ docker run -p 5555:5555 -e FLAG="flag{some_secret_value}" -t two_plus_two

The server

The server is a Rust program which will serve as the host program for the Jolt zkVM. It builds the guest program, it takes as input from the user an execution proof for the guest programs and verifies it. Apart from the actual verification, the server does some additional checks:

it checks that the guest program has no inputs,
it checks that the output value is 5, and
it checks that the program has not panicked during the execution.

If the user provides a valid proof that satisfies those checks, the server will return the flag and the challenge will be solved!

use jolt_sdk::RV32IHyraxProof;

pub fn main() {
    let (_prove_two_plus_two, verify_two_plus_two) = guest::build_two_plus_two();

    println!("Can you prove that 2+2=5?");

    let line = std::io::stdin().lines().next().unwrap().unwrap();
    if line.len() == 0 {
        println!("k thx bye");
        return;
    }

    let proof = RV32IHyraxProof::deserialize_from_bytes(
            &hex::decode(line).unwrap()
        ).unwrap();

    let inputs = &proof.proof.program_io.inputs;
    println!("inputs: {:?}", inputs);
    assert_eq!(inputs.len(), 0);

    let outputs = &proof.proof.program_io.outputs;
    println!("outputs: {:?}", outputs);
    assert_eq!(outputs.len(), 1);
    assert_eq!(outputs[0], 5); // 2+2 is 5!

    let panics = &proof.proof.program_io.panic;
    println!("panics: {:?}", panics);
    assert!(!panics);

    println!("Verifying the proof...");
    let is_valid = verify_two_plus_two(proof);
    if is_valid {
        println!("The proof is valid!");
        println!("FLAG: {}", std::env::var("FLAG").unwrap());
    } else {
        println!("The proof is invalid! :(");
    }
}

The guest program

The guest program is a Rust program that computes 2+2, and returns the result. The #[jolt::provable] attibute attached to the two_plus_two function specifies that this function implements a guest program for which we want to prove the execution. This attribute will automatically generate some additional function that are visible in the host (e.g., build_two_plus_two) which will provide the host with the proving and verifying functions.

The body of the two_plus_two function may seem complicated, it simply loads the value 2 into a register, adding it to itself (i.e., computing 2+2) and then returning the result. The guest was written using inline assembly to avoid optimizations peformed by the Rust compiler. The goal was to retain the add instruction in the final guest executable, but this has little impact on the core of the challenge, apart from making the intended solution slightly easier.

#![cfg_attr(feature = "guest", no_std)]
#![no_main]

#[jolt::provable]
fn two_plus_two() -> u16 {
    let mut n: u16 = 2;

    #[cfg(any(target_arch = "riscv32", target_arch = "riscv64"))]
    unsafe {
        core::arch::asm!(
            "li {n}, 2",
            "add {n}, {n}, {n}",
            n = inout(reg) n,
        );
    }

    #[cfg(target_arch = "x86_64")]
    unsafe {
        core::arch::asm!(
            "mov {n}, 2",
            "add {n}, {n}, {n}",
            n = inout(reg) n,
        );
    }
    n
}

Following the ISA semantics, this program will always return 4, which is the result of the computation of 2+2. The point of the challenge is to craft a proof of execution with output 5!

The modified Jolt library

The readme.txt file explains the steps which have been taken to obtain the modified Jolt library.

git clone [email protected]:a16z/jolt.git
cd jolt

# just the latest commit at time of writing
git checkout 0cc7aa31981ff8503fe256706d2aa9c320abd1cd
git apply ../diff.patch

The Jolt repository is taken as-is at last commit (at the time of writing of the challenge), and a patch is applied to it. The patch can be found in the diff.patch file: it removes a few of lines from the jolt-core/src/r1cs/jolt_constraints.rs file.

diff --git a/jolt-core/src/r1cs/jolt_constraints.rs b/jolt-core/src/r1cs/jolt_constraints.rs
index 5fb0d871..295dce32 100644
--- a/jolt-core/src/r1cs/jolt_constraints.rs
+++ b/jolt-core/src/r1cs/jolt_constraints.rs
@@ -289,13 +289,8 @@ impl<F: JoltField> R1CSConstraintBuilder<F> for UniformJoltConstraints {
 
         // if (rd != 0 && update_rd_with_lookup_output == 1) constrain(rd_val == LookupOutput)
         // if (rd != 0 && is_jump_instr == 1) constrain(rd_val == 4 * PC)
-        let rd_nonzero_and_lookup_to_rd =
+        let _rd_nonzero_and_lookup_to_rd =
             cs.allocate_prod(JoltIn::Bytecode_RD, JoltIn::OpFlags_LookupOutToRd);
-        cs.constrain_eq_conditional(
-            rd_nonzero_and_lookup_to_rd,
-            JoltIn::RD_Write,
-            JoltIn::LookupOutput,
-        );
         let rd_nonzero_and_jmp = cs.allocate_prod(JoltIn::Bytecode_RD, JoltIn::OpFlags_IsJmp);
         let lhs = JoltIn::Bytecode_ELFAddress + (PC_START_ADDRESS - PC_NOOP_SHIFT);
         let rhs = JoltIn::RD_Write;

TL;DR

To solve the challenge we need to provide a proof of execution of the guest program to the server with output 5. In order to do this, we need to exploit the fact that Jolt has been modified and is possibly not sound anymore.

Patch analysis

The patch removes a constraint in the R1CS component, let’s take a closer look at the constraint that was removed.

cs.constrain_eq_conditional(
    rd_nonzero_and_lookup_to_rd,
    JoltIn::RD_Write,
    JoltIn::LookupOutput,
);

The function constrain_eq_conditional adds to the R1CS constraint system an equality constraint between the second and third argument, if the first argument is set to 1. Roughly, the emitted constraint is equivalent to the following equation.

cs.constrain_eq_conditional(condition, left, right);
// condition  * (left - right) == 0

There is also a useful comment a few lines above explaining the constraint.

// if (rd != 0 && update_rd_with_lookup_output == 1) constrain(rd_val == LookupOutput)
cs.constrain_eq_conditional(
    rd_nonzero_and_lookup_to_rd,
    JoltIn::RD_Write,
    JoltIn::LookupOutput,
);

The RD_Write value comes from the Read-Write memory component, and it represents the memory value written in the output register. Instead, the LookupOutput value comes from the Instruction lookup component, and represents the output value “returned” by the lookup argument, which performs the instruction execution check. The goal of the constraint is to provide glue between these two values. An intuitive explaination is the following.

If the instruction is supposed to write its result into a register and if the output register is not zero, then the value written in the output register should be equal to the result of the execution lookup argument.

With this constraint removed the exploitation idea is simple: craft an execution trace in which the add instruction which sums 2 and 2, instead of writing back in the output register the value 4 writes the value 5. In the generated trace:

the lookup argument will return 4, as the instruction is an add and the operands values are 2 and 2, but
the value written back in the output register will be 5.

The constraint which imposes that these two values need to be equal has been removed, so the trace will be accepted. The program will then continue to run normally and return the computed value, so in the end the output of the modified execution trace will be exactly 5!

Crafting the proof

Once we understand all the moving parts, the actual exploitation is quite easy: we just need to patch Jolt in the tracer module, which is the module that generates the execution trace. We modify the jolt/tracer/src/emulator/cpu.rs file, changing the semantics of the ADD operation emulation: if the operands are both 2 then write back in the output register the value 5.

Instruction {
    mask: 0xfe00707f,
    data: 0x00000033,
    name: "ADD",
    operation: |cpu, word, _address| {
        let f = parse_format_r(word);

        // If the operands are both 2, then write 5 in the output register
        // otherwise apply normal addition semantics
        if cpu.x[f.rs1] == 2 && cpu.x[f.rs2] == 2 {
            cpu.x[f.rd] = cpu.sign_extend(5);
        } else {
            cpu.x[f.rd] = cpu.sign_extend(cpu.x[f.rs1].wrapping_add(cpu.x[f.rs2]));
        }
        Ok(())
    },
    disassemble: dump_format_r,
    trace: Some(trace_r),
},

One insteresting thing we can print out is the execution trace at the exact step in which the add {n}, {n}, {n} instruction is executed. The main bit to notice here are the memory operations: the instruction is an ADD with both operands equal to 2, so there are two read operations, but then there is a write operation to the same register with value 5!

Trace[5]
JoltTraceStep {
    instruction_lookup: Some(ADD(ADDInstruction(2, 2))),
    bytecode_row: BytecodeRow {
        address: 2147483672,
        bitflags: 17448304640,
        rd: 10,
        rs1: 10,
        rs2: 10,
        imm: 0,
        virtual_sequence_remaining: None
    },
    memory_ops: [Read(10), Read(10), Write(10, 5), Read(0), Read(0), Read(0), Read(0)]
}

We can also print directly the inputs to the R1CS system, which are stored in the R1CSInputs structure. As we can see, the lookup correctly returns 4, but the value written in the output register is 5.

pc[7] = 9
bytecode_a[7] = 9
bytecode_v[7] = 9
memreg_a_rw[7] = 0
memreg_v_reads[7] = 2
memreg_v_writes[7] = 5
chunks_x[7] = 0
chunks_y[7] = 0
chunks_query[7] = 0
lookup_outputs[7] = 4
circuit_flags_bits[7] = 0
instruction_flags_bits[7] = 1

Putting it all together, the main solver Rust program is quite straight-forward (using the modified Jolt library). Running this program will output a proof, which given to the server will return the flag!

use jolt_sdk::RV32IHyraxProof;

pub fn main() {
    let (prove_two_plus_two, verify_two_plus_two) = guest::build_two_plus_two();
    let (output, proof_gen) = prove_two_plus_two();
    println!("Proof generated! {}", output);
    let proof_bytes = proof_gen.serialize_to_bytes().unwrap();
    println!("{}", hex::encode(&proof_bytes));
}

This is one way of approaching the challenge, but actually the removal of that particular constraint gives a much more powerful primitive: we can write arbitrary values into registers for each instruction which writes back its result into a register!

Conclusions

Authoring and solving challenges is essentially about tackling a focused what-if experiment. This approach serves as an excellent proxy for gaining a security-oriented understanding of the underlying technologies and libraries, like the Jolt zkVM in this case. Moreover, the hands-on experience of writing a solver program or script forces us to concretely demonstrate the impact of bugs, a crucial skill while working day-to-day in cryptography audits.