# Alternatives to ithreads

pperl does not implement Perl's interpreter threads. That sounds like a loss until you look at *why* most programs reach for threads. Almost every upstream ithreaded program falls into one of three shapes:

- **Parallel compute over data** — "apply this function to each element of this large array, as fast as you can."
- **Parallel I/O** — "perform these N independent network or filesystem calls concurrently."
- **Background work with isolated state** — "run this task to the side, with no shared memory, and wait for the result."

pperl has a dedicated, better-performing answer for each.

## Decision table

| Your goal                                    | Upstream tool                      | pperl-native answer                      |
|----------------------------------------------|------------------------------------|------------------------------------------|
| Parallelise a compute-heavy loop             | `threads` + work-crew pattern      | Auto-parallelisation (JIT + Rayon)       |
| Parallelise `map` / `grep` over large data   | Work-crew thread pool              | Parallel `map` / `grep`                  |
| Run I/O-bound tasks concurrently             | `threads` + `Thread::Queue`        | [`fork`](../../p5/core/perlfunc/fork) with pipes, or event-driven I/O |
| Isolate a subtask with its own state         | Detached thread                    | [`fork`](../../p5/core/perlfunc/fork)    |
| Produce/consume pipeline                     | `threads` + `Thread::Queue` chain  | [`fork`](../../p5/core/perlfunc/fork) + pipes, or sequential iterator |
| Synchronised counter                         | `:shared` + [`lock`](../../p5/core/perlfunc/lock) | Sequential accumulator + parallel reduction |

## Auto-parallelisation for compute loops

pperl's JIT detects loops whose bodies are free of I/O, global writes, and impure calls, then dispatches the loop body across a Rayon work-stealing pool. A scalar accumulator like `$sum += ...` is recognised as a reduction and combined after all iterations finish.

```perl
my $sum = 0;
for my $i (1 .. 10_000_000) {
    $sum += sqrt($i);
}
print $sum, "\n";
```

Under pperl this compiles to a single Rayon-dispatched loop body.
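Conceptually, the reduction the runtime performs is equivalent to this hand-rolled chunked form — a sketch only: the real chunk sizes, scheduling, and combine step are internal to the JIT, and here the chunks run sequentially so the illustration stays plain Perl.

```perl
use strict;
use warnings;

# Split the iteration space into fixed-size chunks, compute one
# partial sum per chunk, then combine the partials at the end.
# pperl would run the per-chunk step on Rayon workers; the point
# of the shape is that no chunk touches shared state, so no lock
# is ever needed.
my $n      = 10_000_000;
my $chunks = 8;
my $size   = int($n / $chunks);

my @partials;
for my $c (0 .. $chunks - 1) {
    my $lo = $c * $size + 1;
    my $hi = $c == $chunks - 1 ? $n : ($c + 1) * $size;
    my $local = 0;
    $local += sqrt($_) for $lo .. $hi;
    push @partials, $local;    # each chunk owns its accumulator
}

my $sum = 0;
$sum += $_ for @partials;      # the combine step of the reduction
print $sum, "\n";
```

Floating-point addition is not associative, so the chunked total can differ from the sequential total in the last few bits — the same caveat applies to the automatic reduction.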
On a typical 8-core machine the speedup is in the 5-6x range over the sequential JIT, with no user-visible thread objects, no synchronisation code, and no shared state.

See [Parallel Execution](parallel) for the full story — what qualifies, how reduction detection works, and the CLI flags (`--no-parallel`, `--threads=N`, `--parallel-threshold=N`) that control the behaviour.

Porting a work-crew ithreaded loop typically means **deleting the threading code**. The sequential form is often the pperl-optimal form already.

Before:

```perl
use threads;
use threads::shared;

my $sum :shared = 0;
my @chunks  = chunk_data(\@big);
my @workers = map {
    my $c = $_;
    threads->create(sub {
        my $local = 0;
        $local += process($_) for @$c;
        lock $sum;
        $sum += $local;
    });
} @chunks;
$_->join for @workers;
```

After:

```perl
my $sum = 0;
$sum += process($_) for @big;
```

The chunking, the per-thread accumulator, the lock — all gone. The JIT does the chunking. The reduction detector handles the accumulator. There is no shared variable to lock.

## Parallel `map` and `grep`

For list-shaped transformations, the built-ins are already the natural idiom:

```perl
my @results  = map  { expensive($_) } @input;
my @filtered = grep { test($_) } @input;
```

When the callback has no detectable side effects and the input size exceeds `--parallel-threshold`, pperl dispatches the callback in parallel. Result order is preserved.

Equivalent ithreaded code would involve a thread pool, an input queue, an output queue, a sentinel value to signal end-of-input, and per-thread accumulators. The pperl version is one line.

## fork — process-level concurrency

When work is **not** a pure compute loop — when it involves I/O, subprocess management, or genuinely independent state — reach for [`fork`](../../p5/core/perlfunc/fork). A forked child has:

- A full copy of the parent's memory, copy-on-write at the OS level.
- Its own file descriptor table (with shared underlying file descriptions).
- Complete isolation at the interpreter level — no shared Perl heap, ever.

```perl
my $pid = fork;
die "fork: $!" unless defined $pid;

if ($pid == 0) {
    # child
    exec 'processing-tool', @args or die "exec: $!";
}

# parent
waitpid $pid, 0;
```

Communication uses pipes, sockets, or the filesystem — the same inter-process primitives you would use between unrelated programs. That sounds heavier than in-process threading, and at the raw syscall level it is; but for program structure it is often simpler, because there is no shared memory to guard.

### Forking a work crew

The upstream `threads::shared` work crew in the previous section translates to a fork-based equivalent when the work involves I/O:

```perl
my @pids;
for my $chunk (@chunks) {
    my $pid = fork // die "fork: $!";
    if ($pid == 0) {
        process_chunk($chunk);
        exit 0;
    }
    push @pids, $pid;
}
waitpid $_, 0 for @pids;
```

Results flow back through pipes, files, or a named collection point in `/tmp` — whatever you would have used between separate programs.

### Pipeline via fork and pipes

For pipeline-shaped workloads, Perl's `open` with a `|-` or `-|` form spawns a child with a pipe already attached:

```perl
open my $producer, '-|', 'find', '/data', '-type', 'f'
    or die "fork: $!";
while (my $path = <$producer>) {
    chomp $path;
    # filter / process / forward
}
close $producer;
```

Each pipeline stage is its own process, scheduled by the OS, with no shared Perl state.

## Isolated background work

The upstream pattern of `threads->create(sub { ... })->detach` — fire and forget — maps to a double fork:

```perl
my $pid = fork // die "fork: $!";
if ($pid == 0) {
    # first child: fork again and exit immediately, orphaning
    # the grandchild to init so the parent does not have to wait
    my $grand = fork // die "fork: $!";
    exit 0 if $grand != 0;
    background_work();
    exit 0;
}
waitpid $pid, 0;    # reap the first child, not the grandchild
```

The grandchild runs to completion independently of the original program, no zombie is left behind, and state isolation is absolute.

## Choosing between auto-parallelisation and fork

- **Auto-parallelisation** wins for compute-bound loops over in-memory data: no process startup cost, no serialisation of results, JIT-compiled body.
- **fork** wins for I/O, subprocess work, and anything where the task should not inherit the parent's global state changes.
- **Neither** is the right answer for low-cost task dispatch in tight loops — the usual culprit there is a sequential loop that does not actually benefit from concurrency. Measure before adding either layer.

## See also

- [Parallel Execution](parallel) — the auto-parallelisation chapter: what qualifies, CLI flags, reduction detection
- [ithreads basics](ithreads-basics) — the upstream model, for context on what these alternatives replace
- [Shared data](shared-data) — the upstream sharing primitives, for reading existing ithreaded code
- [`fork`](../../p5/core/perlfunc/fork) — full reference for the process-level primitive
- [`wait`](../../p5/core/perlfunc/wait) — reap a child process
- [`lock`](../../p5/core/perlfunc/lock) — the no-op ithreads primitive under pperl
- [Reference · P5](../../p5/index) and [Reference · PP](../../pp/index) — threading support is a runtime concern