JIT Compilation
PetaPerl includes a JIT (Just-In-Time) compiler built on Cranelift, the code generator that Wasmtime also uses. The JIT compiles hot loops into native machine code. In the Mandelbrot benchmark below, this runs 76x faster than perl5 with the JIT alone and 431x faster with parallel dispatch.
What Gets JIT-Compiled
The JIT targets loops — for and while — that contain arithmetic and comparison operations. Specifically:
Integer For-Loops
my $sum = 0;
for my $i (1..1_000_000) {
    $sum += $i;
}
The JIT detects the integer accumulator pattern and compiles the loop body to native code. The loop variable ($i) and accumulators ($sum) are held in CPU registers.
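As a rough mental model (not PetaPerl's actual generated code), the compiled loop behaves like the Rust function below, where the counter and the accumulator are plain locals that a compiler can keep in registers. The name `sum_range` is hypothetical, and the source does not say how the JIT treats integer overflow, so the wrapping addition here is only an assumption made to keep the sketch well-defined.

```rust
/// Hypothetical stand-in for the native code produced for
/// `for my $i (first..last) { $sum += $i }`.
fn sum_range(first: i64, last: i64) -> i64 {
    // Both values are ordinary locals: no interpreter stack, no boxed scalars.
    let mut sum: i64 = 0;
    for i in first..=last {
        // Assumption: overflow wraps. The real JIT may behave differently.
        sum = sum.wrapping_add(i);
    }
    sum
}
```

Calling `sum_range(1, 1_000_000)` corresponds to the Perl loop above.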
While-Loops with Float Arithmetic
while ($y < $max_y) {
    my $cx = $x * $scale;
    my $cy = $y * $scale;
    # ... computation ...
    $y += $step;
}
Floating-point arithmetic (+, -, *, /), comparisons (<, >, <=, >=), and control flow (if/last) within while-loops are compiled to native SSE/AVX instructions.
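To illustrate the kind of loop this covers, here is a small Rust sketch of a while-loop that combines float arithmetic, a comparison, and an early exit (Perl's `last`). It is an illustrative escape-time iteration in the spirit of the Mandelbrot benchmark, not the computation elided in the snippet above, and `escape_steps` is a hypothetical name.

```rust
/// Counts how many steps of z = z*z + c run before |z|^2 exceeds 4,
/// capped at `max_iter`.
fn escape_steps(cx: f64, cy: f64, max_iter: u32) -> u32 {
    let (mut zx, mut zy) = (0.0_f64, 0.0_f64);
    let mut n = 0;
    while n < max_iter {
        let zx2 = zx * zx;
        let zy2 = zy * zy;
        if zx2 + zy2 > 4.0 {
            break; // plays the role of `last` inside the loop body
        }
        // Update the imaginary part first: it needs the old zx.
        zy = 2.0 * zx * zy + cy;
        zx = zx2 - zy2 + cx;
        n += 1;
    }
    n
}
```

Every operation in the loop body is an f64 multiply, add, subtract, or compare, which is the shape of code the paragraph above describes as eligible for native compilation.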
Nested Loops
The JIT handles nested loops up to 5 levels deep. Each nesting level generates its own set of Cranelift basic blocks, and variables that flow between levels are passed as block parameters (Cranelift's equivalent of phi nodes).
String Operations (via Extern Calls)
The JIT supports .= (concat-assign) and $x = "" (clear) operations on string variables through extern function calls from JIT-compiled code back into the Rust runtime.
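A minimal sketch of what such extern helpers could look like on the Rust side is shown below. Everything here is an assumption for illustration: the function names `rt_concat_assign` and `rt_clear`, the opaque-pointer calling convention, and the use of a plain `String` as a stand-in for the runtime's real string type.

```rust
use std::ffi::c_void;

/// Hypothetical runtime helper for `$x .= "..."`.
///
/// # Safety
/// `dst` must point to a live `String`; `src`/`len` must describe readable bytes.
pub unsafe extern "C" fn rt_concat_assign(dst: *mut c_void, src: *const u8, len: usize) {
    let target: &mut String = unsafe { &mut *dst.cast::<String>() };
    let bytes: &[u8] = unsafe { std::slice::from_raw_parts(src, len) };
    // Lossy decoding keeps the sketch well-defined for non-UTF-8 input.
    target.push_str(&String::from_utf8_lossy(bytes));
}

/// Hypothetical runtime helper for `$x = ""`.
///
/// # Safety
/// `dst` must point to a live `String`.
pub unsafe extern "C" fn rt_clear(dst: *mut c_void) {
    let target: &mut String = unsafe { &mut *dst.cast::<String>() };
    target.clear();
}
```

The `extern "C"` ABI gives the functions a fixed calling convention, so generated machine code can call them through a plain function pointer while the string logic stays in the runtime.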
What Is Not JIT-Compiled
Subroutine calls: call overhead dominates, so the JIT benefit is marginal
Regex operations: the regex engine has its own optimization path
I/O operations: I/O-bound code doesn’t benefit from JIT
Complex data structure access: hash/array operations with dynamic keys
String-heavy computation: string building is handled by the PvBuf optimization instead
Code that isn’t JIT-compiled still runs on the interpreter, which has its own fast paths for common operations.
Performance
Ackermann Function
The interpreter fast path (not JIT) handles recursive integer arithmetic:
| Runtime | Time | Speedup |
|---|---|---|
| perl5 | 630ms | 1.0x |
| pperl (interpreter) | 14ms | 45x faster |
Mandelbrot Set (1000x1000)
JIT compilation of nested while-loops with float arithmetic:
| Runtime | Time | Speedup |
|---|---|---|
| perl5 | 12,514ms | 1.0x |
| pperl (JIT only) | 163ms | 76x faster |
| pperl (JIT + parallel) | 29ms | 431x faster |
How It Achieves This
Register allocation: Loop variables in CPU registers instead of interpreter stack
Type specialization: Variables proven to be integer or float use native instructions directly
Branch elimination: Constant conditions removed at compile time
No dispatch overhead: Native code replaces the interpreter dispatch loop entirely within JIT’d regions
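The dispatch-overhead point can be made concrete with a toy comparison. The sketch below is not PetaPerl's interpreter: the `Op` enum and both functions are hypothetical, and the "native" version is ordinary Rust standing in for JIT output. It only shows that the interpreted form pays a fetch-and-match step per operation, while the compiled form is a straight loop over typed locals.

```rust
/// Toy op set for a loop that sums 1..=limit (hypothetical).
#[derive(Clone, Copy)]
enum Op {
    AddIToSum,         // sum += i
    IncI,              // i += 1
    LoopWhileILe(i64), // jump back to op 0 while i <= limit, else stop
}

/// Interpreter style: every operation goes through fetch + match.
/// The program runs the body once before the first check (do-while shape).
fn run_interpreted(ops: &[Op]) -> i64 {
    let (mut sum, mut i, mut pc) = (0_i64, 1_i64, 0_usize);
    while pc < ops.len() {
        match ops[pc] {
            Op::AddIToSum => {
                sum += i;
                pc += 1;
            }
            Op::IncI => {
                i += 1;
                pc += 1;
            }
            Op::LoopWhileILe(limit) => {
                pc = if i <= limit { 0 } else { ops.len() };
            }
        }
    }
    sum
}

/// Compiled style: the same work with no dispatch at all.
fn run_native(limit: i64) -> i64 {
    let mut sum = 0_i64;
    for i in 1..=limit {
        sum += i;
    }
    sum
}
```

Both functions compute the same result for a limit of at least 1; the difference is the three dispatch steps per iteration in `run_interpreted`.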
CLI Control
# Default: JIT enabled
pperl script.pl
# Disable JIT (interpreter only)
pperl --no-jit script.pl
The test harness runs with --no-jit by default to test interpreter correctness. JIT-specific tests in t/62-jit/ override this.
Architecture
Compilation Pipeline
Loop detected → Analyze variables and types → Build JIT IR
→ Compile via Cranelift → Cache compiled function → Execute native code
Caching
Compiled functions are cached by the enterloop op’s ID in the op arena. A CachedWhileLoop stores:
The compiled native function pointer
Variable type information (float vs integer vs string)
Constant string pool (for string operations)
Metadata for parallel dispatch eligibility
Later executions of the same loop reuse the cached compilation instead of recompiling.
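The compile-once, reuse-afterwards behavior can be sketched as a map from op ID to cached entry. This is an assumption-laden illustration: the `OpId` alias, the `JitCache` type, its `get_or_compile` method, and the reduced field set of `CachedWhileLoop` are all hypothetical, and a plain Rust function pointer stands in for the compiled native code.

```rust
use std::collections::HashMap;

type OpId = u32; // assumption: the real op arena ID type is not given
type NativeLoop = fn(&mut [f64]); // stand-in for a compiled function pointer

/// Reduced, hypothetical version of the cached entry described above.
struct CachedWhileLoop {
    func: NativeLoop,
    parallel_ok: bool, // metadata for parallel dispatch eligibility
}

struct JitCache {
    entries: HashMap<OpId, CachedWhileLoop>,
    compilations: usize, // counts how often the compile step actually ran
}

impl JitCache {
    fn new() -> Self {
        JitCache { entries: HashMap::new(), compilations: 0 }
    }

    /// Returns the cached entry for `id`, compiling only on first use.
    fn get_or_compile(
        &mut self,
        id: OpId,
        compile: impl FnOnce() -> CachedWhileLoop,
    ) -> &CachedWhileLoop {
        if !self.entries.contains_key(&id) {
            self.compilations += 1;
            self.entries.insert(id, compile());
        }
        &self.entries[&id]
    }
}

/// Pretend "compiled loop" used to exercise the cache.
fn double_all(vars: &mut [f64]) {
    for v in vars.iter_mut() {
        *v *= 2.0;
    }
}
```

Entering the same loop three times with `cache.get_or_compile(7, || CachedWhileLoop { func: double_all, parallel_ok: false })` runs the compile closure once and serves the next two lookups from the map.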
JIT IR
The JIT uses its own intermediate representation (JitIr) that maps Perl operations to Cranelift operations:
JitIr::WhileLoop { condition_ir, body_ir }: recursive for nesting
JitIr::FloatVar / JitIr::IntVar: typed variable access
JitIr::BinOp: arithmetic and comparison
JitIr::ExitIfFalse / JitIr::ExitIfTrue: loop exit and last
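A simplified sketch of how such a recursive IR might be shaped is given below. Only the variant names `WhileLoop`, `FloatVar`, `BinOp`, and `ExitIfTrue` (and the field names `condition_ir` and `body_ir`) come from the list above; the payloads, the `FloatConst` and `Assign` variants, `BinKind`, and the tree-walking `exec`/`eval_expr` functions are assumptions. The real JIT lowers the IR to Cranelift rather than interpreting it; the evaluator is here only to show what the nodes mean.

```rust
#[derive(Clone, Copy)]
enum BinKind { Add, Mul, Lt, Gt }

enum JitIr {
    FloatVar(usize),                        // index into the f64 variable buffer
    FloatConst(f64),                        // hypothetical
    BinOp(BinKind, Box<JitIr>, Box<JitIr>),
    Assign(usize, Box<JitIr>),              // hypothetical
    ExitIfTrue(Box<JitIr>),                 // models `last if ...`
    WhileLoop { condition_ir: Box<JitIr>, body_ir: Vec<JitIr> },
}

/// Evaluates an expression node to an f64 (comparisons yield 1.0 or 0.0).
fn eval_expr(ir: &JitIr, vars: &[f64]) -> f64 {
    match ir {
        JitIr::FloatVar(slot) => vars[*slot],
        JitIr::FloatConst(c) => *c,
        JitIr::BinOp(kind, lhs, rhs) => {
            let (a, b) = (eval_expr(lhs, vars), eval_expr(rhs, vars));
            match kind {
                BinKind::Add => a + b,
                BinKind::Mul => a * b,
                BinKind::Lt => if a < b { 1.0 } else { 0.0 },
                BinKind::Gt => if a > b { 1.0 } else { 0.0 },
            }
        }
        _ => panic!("statement node used as an expression"),
    }
}

/// Executes a statement node; returns true if the enclosing loop should exit.
fn exec(ir: &JitIr, vars: &mut [f64]) -> bool {
    match ir {
        JitIr::Assign(slot, value) => {
            let v = eval_expr(value, vars);
            vars[*slot] = v;
            false
        }
        JitIr::ExitIfTrue(cond) => eval_expr(cond, vars) != 0.0,
        JitIr::WhileLoop { condition_ir, body_ir } => {
            'running: while eval_expr(condition_ir, vars) != 0.0 {
                for stmt in body_ir {
                    if exec(stmt, vars) {
                        break 'running;
                    }
                }
            }
            false // an exit only leaves the innermost loop
        }
        expr => {
            eval_expr(expr, vars);
            false
        }
    }
}
```

Because `body_ir` can itself contain a `WhileLoop`, nesting falls out of the recursion in `exec`, which is the property the first list item describes.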
Variable Types
The JIT tracks two variable types:
JitType::F64: floating-point values held in an f64 buffer
JitType::Ptr: string values accessed through extern calls to the Rust runtime
Variables are typed based on their usage pattern in the loop body. Mixed-type variables fall back to the interpreter.
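The usage-based typing rule can be sketched as follows. The `JitType` variant names come from the list above; the `Usage` enum and the `infer_type` function are hypothetical, and the real analysis may look at more than these four usage kinds.

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
enum JitType {
    F64,
    Ptr,
}

/// Hypothetical summary of how a variable is used inside a loop body.
#[derive(Debug, Clone, Copy, PartialEq)]
enum Usage {
    Arithmetic,
    Comparison,
    Concat,      // `.=`
    ClearString, // `$x = ""`
}

/// Returns the inferred type, or None when the variable is unused or its
/// usages disagree (the mixed-type case that falls back to the interpreter).
fn infer_type(usages: &[Usage]) -> Option<JitType> {
    let mut inferred: Option<JitType> = None;
    for usage in usages {
        let ty = match usage {
            Usage::Arithmetic | Usage::Comparison => JitType::F64,
            Usage::Concat | Usage::ClearString => JitType::Ptr,
        };
        match inferred {
            None => inferred = Some(ty),
            Some(prev) if prev != ty => return None, // mixed usage
            Some(_) => {}
        }
    }
    inferred
}
```

A variable that is only added and compared gets `JitType::F64`, one that is only concatenated gets `JitType::Ptr`, and one that is both added to and concatenated yields `None`.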