Clox Bytecode Interpreter

June 27, 2025
After the Rust tree-walk interpreter, I implemented Part II of Crafting Interpreters in C: Clox, a bytecode virtual machine with a Pratt-style parser and a compiler that emits opcodes directly—no AST walk at runtime.

Clox architecture

flowchart LR
  SRC[Source Code] --> COMPILER[Compiler]
  COMPILER -->|Calls| SCANNER[Scanner]
  SCANNER -->|Returns Token| PARSER[Parser]
  PARSER -->|Parses Token| COMPILER
  subgraph vm["Virtual Machine"]
    BYTECODES["Byte Codes + Operands"]
    CHUNKS[Chunks] -->|Contain| BYTECODES
    IP[Instruction Pointer] -->|Points to| BYTECODES
    BYTECODES -->|Operate on| STACK[Stack]
    STACK --> OUTPUT[Output]
  end
  COMPILER -->|Emits| BYTECODES
  COMPILER -->|Creates| CHUNKS

| Component | In c_lox | Classical definition |
| --- | --- | --- |
| Scanner | Scans and tokenizes source. | Lexical analysis: tokenizes symbols from the source. |
| Parser | Parsing lives in compiler.c. Uses a Pratt-style parser. The parser struct holds tokens while they are in progress. | Builds a structured representation of the code from tokens (often an AST). Verifies structure and reports errors when it is not valid. |
| Compiler | Translates Lox source into virtual machine bytecode. Reports parsing errors. | Translates one language into another; generates code. |
| Virtual machine (VM) | Runs bytecode instructions: decodes and dispatches. | Software-based emulation of a computer (stack, instruction pointer). Reports runtime errors. |
| Chunk | Storage for bytecode instructions, driven by the instruction pointer and the stack. | |

How it works

In the Clox architecture, compilation is a single pass, and execution starts once the bytecode has been emitted:

  • The scanner produces one token at a time, on demand. It turns each lexeme in the source into a typed token.

  • The parser consumes each token as it is produced (with lookahead when necessary).

  • The compiler emits bytecode for each expression or statement as soon as it is parsed, without building an AST first.

  • The cycle repeats for the next token until the source is exhausted.

  • The VM then executes the finished chunk.

Phases

scan → tokenize → parse → compile / emit → run


Benchmark: bytecode VM vs tree-walk

The same nested-loop program runs on Clox (bytecode VM) and on the Rust tree-walk interpreter: 280×280 nested loops, 78,400 inner iterations in total. The measured time is the full round trip to the backend (network plus server execution), not execution alone.

// 280 x 280 nested loops (78,400 inner iterations)
for (var outer = 0; outer < 280; outer = outer + 1) {
  for (var inner = 0; inner < 280; inner = inner + 1) {
    outer + inner;
  }
}
print "finished";

The stack: how does the Clox stack work?

The Clox VM works with chunks that the compiler creates. The VM loads the chunk during initialization:

vm.chunk = &chunk;
vm.ip = vm.chunk->code;

From there the VM reads the chunk one instruction at a time: an opcode byte followed by its operands, if it has any. Operands are either 8 or 16 bits.

Chunks are lists of bytecodes:

OP_POP
OP_JUMP 0xA131
OP_CONSTANT

The rest of the operation is straightforward: the instruction pointer advances, and the stack pushes and pops values:

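void push(Value value) {
  *vm.stackTop = value; // store the value in the first free slot, just past the last element
  vm.stackTop++;        // advance stackTop to the next free slot
}

Value pop() {
  vm.stackTop--;        // step back to the most recently pushed value
  return *vm.stackTop;  // the slot is not cleared; the next push overwrites it
}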

Clearing up confusing terminology

Q: What is an interpreted language?

Often, when people say “interpreted” language, they mean one that runs on top of a virtual machine or runtime environment.

Q: What is the difference between an interpreter and a compiler?

An interpreter scans and executes source code directly. That is what c_lox does at the pipeline level: tokens are parsed and bytecode is emitted and run without building a separate executable for the host CPU. A compiler in the classical sense usually translates source into another form (here, Lox → bytecode) that something else executes; Clox’s compiler.c is that translation step, and the VM is the executor.

Compared with the Rust tree-walk interpreter, there is no long-lived AST evaluation loop—bytecode and the stack carry runtime behavior.

Error handling in Lox

Q: What kinds of errors are there? How are errors handled?

Error quality matters: when should a program panic versus continue?

Lox has two kinds of errors: parsing errors, reported by the compiler, and runtime errors, reported by the VM. In both cases the goal is to report the error and return control cleanly rather than crash the interpreter.

Highlights

  • Scanner, Pratt parser, and bytecode compiler in C (compiler.c).

  • Chunk-based VM with stack, instruction pointer, and opcode dispatch.

  • Single-pass compile-and-run model (contrast with tree-walk evaluation in Rust).

  • Parsing and runtime error reporting aligned with the book’s Lox semantics.

Why this belongs in the portfolio

Clox is the bytecode half of Crafting Interpreters: real compiler construction, VM design, and C systems programming in one project. It pairs naturally with the earlier Rust tree-walk implementation and with smaller C exercises like the stack calculator that rehearsed scanner/parser structure before Part II.


Written by Michael Barakat, a front end developer living in Seattle.