JIT Compilation
PetaPerl includes a JIT (Just-In-Time) compiler built on Cranelift, the same code generator used by Wasmtime. The JIT compiles hot loops into native machine code. In the Mandelbrot benchmark below, this makes pperl 76x faster than perl5 with the JIT alone and 431x faster with the JIT plus parallel dispatch.
What Gets JIT-Compiled
The JIT targets loops — for and while — that contain arithmetic and comparison operations. Specifically:
Integer For-Loops
my $sum = 0;
for my $i (1..1_000_000) {
    $sum += $i;
}
The JIT detects the integer accumulator pattern and compiles the loop body to native code. The loop variable ($i) and accumulators ($sum) are held in CPU registers.
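As a rough illustration of what holding the loop variable and accumulator in registers amounts to, the loop above behaves like the Rust function below once compiled: two machine integers and no per-iteration interpreter work. This is a sketch of the equivalent native logic, not PetaPerl's generated code. The name `sum_range` is invented for the example, and wrapping on overflow is an assumption, since the source does not say how the JIT handles integer overflow.

```rust
/// Hypothetical native equivalent of:
///   my $sum = 0; for my $i ($lo..$hi) { $sum += $i }
/// Both the loop counter and `sum` are plain i64 locals, so an optimizing
/// backend can keep them in registers for the whole loop.
pub fn sum_range(lo: i64, hi: i64) -> i64 {
    let mut sum: i64 = 0;
    for i in lo..=hi {
        // Assumption: wrap on overflow. A real JIT might instead promote
        // to floating point or hand control back to the interpreter.
        sum = sum.wrapping_add(i);
    }
    sum
}
```

Calling `sum_range(1, 1_000_000)` computes the same value as the Perl loop; an empty range such as `sum_range(5, 4)` leaves the accumulator at 0.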
While-Loops with Float Arithmetic
while ($y < $max_y) {
    my $cx = $x * $scale;
    my $cy = $y * $scale;
    # ... computation ...
    $y += $step;
}
Floating-point arithmetic (+, -, *, /), comparisons (<, >, <=, >=), and control flow (if/last) within while-loops are compiled to native SSE/AVX instructions.
Nested Loops
The JIT handles nested loops up to 5 levels deep. Each nesting level generates its own set of Cranelift basic blocks, and variables that flow between levels are passed as block parameters (Cranelift's equivalent of phi nodes).
String Operations (via Extern Calls)
The JIT supports .= (concat-assign) and $x = "" (clear) operations on string variables through extern function calls from JIT-compiled code back into the Rust runtime.
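The extern-call mechanism can be pictured as a pair of C-ABI helpers that native code calls with raw pointers. The sketch below is a minimal illustration under stated assumptions: the names `rt_concat_assign` and `rt_clear`, the use of `Vec<u8>` as the string buffer, and the pointer-plus-length argument layout are all hypothetical, not PetaPerl's actual runtime interface.

```rust
/// Hypothetical runtime helper for `$x .= <bytes>`.
///
/// # Safety
/// `dst` must point to a live `Vec<u8>` owned by the runtime, and
/// `src` must be valid for reading `len` bytes.
pub unsafe extern "C" fn rt_concat_assign(dst: *mut Vec<u8>, src: *const u8, len: usize) {
    // Rebuild safe references from the raw pointers passed by native code.
    let buf: &mut Vec<u8> = unsafe { &mut *dst };
    let bytes: &[u8] = unsafe { std::slice::from_raw_parts(src, len) };
    buf.extend_from_slice(bytes);
}

/// Hypothetical runtime helper for `$x = ""`.
///
/// # Safety
/// `dst` must point to a live `Vec<u8>` owned by the runtime.
pub unsafe extern "C" fn rt_clear(dst: *mut Vec<u8>) {
    let buf: &mut Vec<u8> = unsafe { &mut *dst };
    // Keeps the allocation so later appends can reuse the capacity.
    buf.clear();
}
```

A JIT built this way would register the helper addresses with the code generator and emit a call instruction wherever the loop body contains a string operation; the arithmetic around the call stays in native code.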
What Is Not JIT-Compiled
- Subroutine calls: call overhead dominates, so the benefit of JIT compilation is marginal
- Regex operations: the regex engine has its own optimization path
- I/O operations: I/O-bound code doesn’t benefit from JIT compilation
- Complex data structure access: hash and array operations with dynamic keys
- String-heavy computation: string building is handled by the PvBuf optimization instead
Code that isn’t JIT-compiled still runs on the interpreter, which has its own fast paths for common operations.
Performance
Ackermann Function
The interpreter fast path (not JIT) handles recursive integer arithmetic:
| Runtime | Time | Speedup |
|---|---|---|
| perl5 | 630ms | 1.0x |
| pperl (interpreter) | 14ms | 45x faster |
Mandelbrot Set (1000x1000)
JIT compilation of nested while-loops with float arithmetic:
| Runtime | Time | Speedup |
|---|---|---|
| perl5 | 12,514ms | 1.0x |
| pperl (JIT only) | 163ms | 76x faster |
| pperl (JIT + parallel) | 29ms | 431x faster |
How It Achieves This
- Register allocation: Loop variables in CPU registers instead of interpreter stack
- Type specialization: Variables proven to be integer or float use native instructions directly
- Branch elimination: Constant conditions removed at compile time
- No dispatch overhead: Native code replaces the interpreter dispatch loop entirely within JIT’d regions
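The difference between the generic path and a type-specialized one can be sketched in a few lines of Rust. The `Scalar` enum and both functions are hypothetical stand-ins rather than PetaPerl types, and the numeric coercion is deliberately simplified (integer overflow wraps, non-numeric strings count as 0).

```rust
/// Hypothetical dynamically typed scalar, standing in for an interpreter value.
#[derive(Clone, Debug, PartialEq)]
pub enum Scalar {
    Int(i64),
    Num(f64),
    Str(String),
}

/// Simplified numeric coercion used by the generic path.
fn to_num(s: &Scalar) -> f64 {
    match s {
        Scalar::Int(i) => *i as f64,
        Scalar::Num(n) => *n,
        // Simplification: strings that do not parse as a number count as 0.
        Scalar::Str(t) => t.trim().parse().unwrap_or(0.0),
    }
}

/// Generic path: every addition inspects the tags of both operands.
pub fn add_dynamic(a: &Scalar, b: &Scalar) -> Scalar {
    match (a, b) {
        (Scalar::Int(x), Scalar::Int(y)) => Scalar::Int(x.wrapping_add(*y)),
        _ => Scalar::Num(to_num(a) + to_num(b)),
    }
}

/// Specialized path: once analysis has proven both operands are integers,
/// the addition needs no tag checks and no boxing.
pub fn add_int(a: i64, b: i64) -> i64 {
    a.wrapping_add(b)
}
```

Inside a JIT-compiled region only the second form is needed, which is where the savings listed above come from: no tag inspection, no value boxing, and no trip through the dispatch loop per operation.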
CLI Control
# Default: JIT enabled
pperl script.pl
# Disable JIT (interpreter only)
pperl --no-jit script.pl
The test harness runs with --no-jit by default to test interpreter correctness. JIT-specific tests in t/62-jit/ override this.
Architecture
Compilation Pipeline
Loop detected → Analyze variables and types → Build JIT IR
→ Compile via Cranelift → Cache compiled function → Execute native code
Caching
Compiled functions are cached by the enterloop op’s ID in the op arena. A CachedWhileLoop stores:
- The compiled native function pointer
- Variable type information (float vs integer vs string)
- Constant string pool (for string operations)
- Metadata for parallel dispatch eligibility
Subsequent executions of the same loop reuse the cached compilation instead of compiling again.
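A cache of this shape can be sketched as a map from op ID to compiled entry. Everything named below (`OpId`, `CachedLoop`, `JitCache`, `get_or_compile`) is hypothetical and reduced to the minimum; a plain Rust `fn` pointer stands in for the native function pointer, and the real `CachedWhileLoop` carries more fields than shown.

```rust
use std::collections::HashMap;

/// Hypothetical ID of an enterloop op in the op arena.
pub type OpId = u32;

/// Reduced stand-in for a cached compiled loop.
pub struct CachedLoop {
    /// Stand-in for the native function pointer; operates on a variable buffer.
    pub entry: fn(&mut [f64]),
    /// Whether the loop was judged eligible for parallel dispatch.
    pub parallel_ok: bool,
}

pub struct JitCache {
    loops: HashMap<OpId, CachedLoop>,
    /// Number of times the compile step actually ran.
    pub compilations: usize,
}

impl JitCache {
    pub fn new() -> Self {
        JitCache { loops: HashMap::new(), compilations: 0 }
    }

    /// Returns the cached entry for `op`, running `compile` only on a miss.
    pub fn get_or_compile(
        &mut self,
        op: OpId,
        compile: impl FnOnce() -> CachedLoop,
    ) -> &CachedLoop {
        if !self.loops.contains_key(&op) {
            self.compilations += 1;
            self.loops.insert(op, compile());
        }
        &self.loops[&op]
    }
}

/// Example "compiled" body: doubles every variable in the buffer.
pub fn double_all(vars: &mut [f64]) {
    for v in vars.iter_mut() {
        *v *= 2.0;
    }
}
```

Entering the same loop three times with `get_or_compile(7, ...)` runs the compile closure once; the later two calls only look up the entry and invoke it.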
JIT IR
The JIT uses its own intermediate representation (JitIr) that maps Perl operations to Cranelift operations:
- JitIr::WhileLoop { condition_ir, body_ir }: recursive for nesting
- JitIr::FloatVar / JitIr::IntVar: typed variable access
- JitIr::BinOp: arithmetic and comparison
- JitIr::ExitIfFalse / JitIr::ExitIfTrue: loop exit and last
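To make the recursive structure concrete, here is a sketch of how such an IR might be declared and walked in Rust. Only the variant names come from this page; the field types, the `Const` variant, the `Op` enum, and the helper functions are assumptions made for illustration. The depth check mirrors the 5-level nesting limit stated earlier.

```rust
/// Nesting limit stated in this document.
pub const MAX_LOOP_DEPTH: usize = 5;

/// Hypothetical operator set for `BinOp`.
pub enum Op { Add, Sub, Mul, Div, Lt, Gt, Le, Ge }

/// Hypothetical layout; payload types are assumptions.
pub enum JitIr {
    FloatVar(usize), // slot index of a float variable
    IntVar(usize),   // slot index of an integer variable
    Const(f64),      // not named in the source; added so the sketch is usable
    BinOp { op: Op, lhs: Box<JitIr>, rhs: Box<JitIr> },
    ExitIfFalse(Box<JitIr>),
    ExitIfTrue(Box<JitIr>),
    WhileLoop { condition_ir: Box<JitIr>, body_ir: Vec<JitIr> },
}

/// Deepest chain of nested `WhileLoop` nodes under `node`.
pub fn loop_depth(node: &JitIr) -> usize {
    match node {
        JitIr::WhileLoop { condition_ir, body_ir } => {
            let inner = body_ir.iter().map(loop_depth).max().unwrap_or(0);
            1 + inner.max(loop_depth(condition_ir))
        }
        JitIr::BinOp { lhs, rhs, .. } => loop_depth(lhs).max(loop_depth(rhs)),
        JitIr::ExitIfFalse(c) | JitIr::ExitIfTrue(c) => loop_depth(c),
        JitIr::FloatVar(_) | JitIr::IntVar(_) | JitIr::Const(_) => 0,
    }
}

/// True when the tree respects the nesting limit.
pub fn within_depth_limit(node: &JitIr) -> bool {
    loop_depth(node) <= MAX_LOOP_DEPTH
}

/// Builds `levels` nested empty loops (at least one) for experimentation.
pub fn nested_loops(levels: usize) -> JitIr {
    let cond = || Box::new(JitIr::Const(1.0));
    let mut ir = JitIr::WhileLoop { condition_ir: cond(), body_ir: vec![] };
    for _ in 1..levels {
        ir = JitIr::WhileLoop { condition_ir: cond(), body_ir: vec![ir] };
    }
    ir
}
```

Because `body_ir` can itself contain `WhileLoop` nodes, one recursive walk covers every nesting level; a loop nest that fails `within_depth_limit` would simply stay on the interpreter.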
Variable Types
The JIT tracks two variable types:
- JitType::F64: floating-point values held in an f64 buffer
- JitType::Ptr: string values accessed through extern calls to the Rust runtime
Variables are typed based on their usage pattern in the loop body. Mixed-type variables fall back to the interpreter.
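Usage-based typing with an interpreter fallback can be sketched as follows. The `Usage` enum and `infer_type` function are hypothetical; only the two `JitType` variant names come from this page, and the actual analysis is certainly more involved.

```rust
#[derive(Clone, Copy, Debug, PartialEq)]
pub enum JitType {
    F64,
    Ptr,
}

/// Hypothetical summary of how a variable is used inside a loop body.
#[derive(Clone, Copy, Debug, PartialEq)]
pub enum Usage {
    Arithmetic,
    Comparison,
    ConcatAssign,
    ClearString,
}

/// Returns the single type implied by all uses, or `None` when the variable
/// is unused or used in conflicting ways (meaning: fall back to the interpreter).
pub fn infer_type(uses: &[Usage]) -> Option<JitType> {
    let mut ty: Option<JitType> = None;
    for u in uses {
        let implied = match u {
            Usage::Arithmetic | Usage::Comparison => JitType::F64,
            Usage::ConcatAssign | Usage::ClearString => JitType::Ptr,
        };
        match ty {
            None => ty = Some(implied),
            // Mixed numeric and string use: give up on this variable.
            Some(prev) if prev != implied => return None,
            Some(_) => {}
        }
    }
    ty
}
```

A variable that is only added to and compared resolves to `JitType::F64`, one that is only appended to or cleared resolves to `JitType::Ptr`, and one that is both incremented and appended to yields `None`.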