ithreads basics#

This chapter describes the perl5 ithreads model as it exists in upstream Perl 5.42. It is the vocabulary you need to read existing threaded Perl code. pperl does not run this code — see the landing page for the reason and Alternatives for the pperl-native equivalents.

What is a thread#

A thread is a flow of control through a program with a single execution point. Every process has at least one thread — the main thread — and Perl’s threads module lets a program spawn additional threads that run inside the same OS process.

Each ithread has its own Perl interpreter. When a new thread is created, the entire data state of the spawning thread is copied into the new one, like a fork but within the same process. After that moment the two interpreters are independent: modifying a variable in one does not affect the other, unless the variable is explicitly shared through threads::shared.

This distinguishes Perl ithreads from POSIX threads, Java threads, Win32 threads, and most other threading systems you may have met. Shared-everything is the default in those systems; shared-nothing is the default in Perl.
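
The difference between a cloned variable and a shared one can be seen in a minimal sketch like the following. The variable names are illustrative only; threads::shared and its :shared attribute and lock function are the upstream mechanisms referred to above.

```perl
use strict;
use warnings;
use threads;
use threads::shared;

my $private = 'parent';     # ordinary variable: cloned into each new thread
my $counter :shared = 0;    # explicitly shared: one value seen by all threads

my $thr = threads->create(sub {
    $private = 'child';     # changes only the thread's own copy
    lock($counter);         # serialise access to the shared variable
    $counter++;
    return $private;
});

my $seen_in_thread = $thr->join;

print "thread saw:   $seen_in_thread\n";
print "parent has:   $private\n";
print "shared count: $counter\n";
```

After the join, the parent still holds its original value of $private, while the increment of $counter is visible because that variable was shared.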

Checking for thread support#

Thread support is a Perl build-time option. A program that needs threads should fail early if the interpreter was built without it:

use Config;
$Config{useithreads}
    or die "Recompile Perl with threads to run this program.\n";

Under pperl, $Config{useithreads} is false and the threads module does not load. The defensive die fires, which is the correct outcome — better than silently running the single-threaded code path on a program that needed concurrency.

Creating a thread#

Load the threads module and call threads->create, passing a code reference to the subroutine the new thread should run:

use threads;

my $thr = threads->create(\&worker);

sub worker {
    print "in the thread\n";
}

create takes a code reference, either a reference to a named subroutine or an anonymous sub, followed by optional arguments that are passed to the subroutine. Control then continues in both the caller and the new thread. new is a synonym for create.

Passing arguments:

my $thr = threads->create(\&worker, 'first', 'second', 42);

sub worker {
    my @args = @_;
    # ...
}

Each argument is copied into the new thread. Modifications inside the thread are invisible to the caller, because the thread has its own interpreter state.
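
The copy also applies to data reached through a reference. The sketch below assumes a hypothetical %config hash and passes a reference to it; the thread changes its own clone and the caller's hash is untouched.

```perl
use strict;
use warnings;
use threads;

my %config = (retries => 3);    # hypothetical data, for illustration only

my ($thr) = threads->create(sub {
    my ($cfg) = @_;
    $cfg->{retries} = 99;       # modifies the thread's clone of the hash
    return $cfg->{retries};
}, \%config);

my ($in_thread) = $thr->join;

print "inside thread: $in_thread\n";
print "in caller:     $config{retries}\n";
```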

Joining a thread#

A thread’s subroutine can return a value. To wait for the thread to finish and collect its return, call join:

use threads;

my ($thr) = threads->create(\&worker);
my @result = $thr->join;
print "thread returned @result\n";

sub worker { return ('ok', 2, 'done'); }

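join blocks until the thread exits, collects the return value, and performs the OS-level cleanup for the thread. The context in which threads->create is called (list, scalar, or void) becomes the context in which the thread’s subroutine runs, so the list-context assignment my ($thr) = ... is the idiom for spawning a thread whose return will be consumed as a list.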
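
How the calling context of create reaches the thread can be checked with wantarray, as in this sketch. The context helper is hypothetical; the { context => 'list' } option is the explicit form documented by the upstream threads module.

```perl
use strict;
use warnings;
use threads;

# Report the context the thread's entry subroutine was called in.
sub context {
    return wantarray ? 'list' : defined(wantarray) ? 'scalar' : 'void';
}

my ($t_list)   = threads->create(\&context);                         # list context
my $t_scalar   = threads->create(\&context);                         # scalar context
my $t_explicit = threads->create({ context => 'list' }, \&context);  # forced list

my ($from_list)     = $t_list->join;
my $from_scalar     = $t_scalar->join;
my ($from_explicit) = $t_explicit->join;

print "$from_list $from_scalar $from_explicit\n";
```

The explicit option is useful when the thread object has to be stored in a scalar expression but the return value is still wanted as a list.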

If several threads are outstanding, join them in turn:

my @thr = map { threads->create(\&worker, $_) } 1..4;
my @results = map { $_->join } @thr;

Detaching a thread#

A detached thread runs until it finishes, then is cleaned up automatically. Its return value is discarded. Use detach when you do not care about the return and do not want to track the thread object:

my $thr = threads->create(\&background_work);
$thr->detach;

Once detached, a thread cannot be joined. A thread may also detach itself from inside its own body:

sub background_work {
    threads->detach;
    # ... long-running work ...
}
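
A small sketch of what "cannot be joined" means in practice: is_detached reports the state, and a join attempt on a detached thread raises an exception that eval can trap. The one-second sleep is a crude assumption made only to let the detached thread finish before the main thread exits.

```perl
use strict;
use warnings;
use threads;

my $thr = threads->create(sub { return 'ignored' });
$thr->detach;

my $detached = $thr->is_detached ? 1 : 0;

# join on a detached thread dies; trap the error instead of crashing.
my $joined = eval { $thr->join; 1 } ? 1 : 0;
my $error  = $@;

print "detached: $detached, joined: $joined\n";
print "error: $error";

sleep 1;    # crude: give the detached thread time to finish before exit
```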

Process and thread termination#

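exit in any thread terminates the whole process: every other thread stops mid-execution, with no chance to finish. An uncaught die in the main thread has the same effect, whereas an uncaught die in any other thread ends only that thread, with a warning. Perl also calls exit implicitly when the main thread falls off the end of the program, even if other threads are still running.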

The canonical hang-up:

use threads;

my $thr1 = threads->create(\&thrsub, 'one');
my $thr2 = threads->create(\&thrsub, 'two');

sub thrsub {
    my ($label) = @_;
    sleep 1;
    print "thread $label\n";
}
# Falls off the end. Perl warns "Perl exited with active threads:
# 2 running and unjoined" and aborts both threads.

Fix by joining before the main thread exits:

$thr1->join;
$thr2->join;

Three structural patterns#

Most threaded programs fall into one of three shapes.

Boss/worker#

A single boss thread collects or generates tasks and hands them to worker threads. The boss does little computation itself — its job is dispatch. Typical in GUI and server programs, where the main thread must stay responsive to events.
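
A minimal boss/worker sketch follows. It assumes the core Thread::Queue module as the hand-off mechanism, which this chapter does not otherwise cover, and a hypothetical task of squaring numbers. The main thread acts as the boss.

```perl
use strict;
use warnings;
use threads;
use Thread::Queue;

my $tasks = Thread::Queue->new;    # assumption: Thread::Queue carries the tasks

# Workers: take tasks until the queue is ended and drained.
my @workers = map {
    threads->create(sub {
        my @done;
        while (defined(my $n = $tasks->dequeue)) {
            push @done, $n * $n;
        }
        return @done;
    });
} 1 .. 3;

# Boss: dispatch only, then signal that no more work is coming.
$tasks->enqueue($_) for 1 .. 10;
$tasks->end;

my @squares = sort { $a <=> $b } map { $_->join } @workers;
print "@squares\n";
```

Which worker handles which task is not deterministic, so the results are sorted before use.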

Work crew#

Several threads run the same subroutine on different pieces of data. Closely maps to data-parallel problems: render tiles of an image, hash slices of a file, fold a large array. This is the pattern pperl’s auto-parallelisation targets — see Alternatives.
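
A work crew can be sketched as below: every thread runs the same subroutine on its own slice of the data and the main thread combines the partial results. The crew size of 4 and the summing task are assumptions chosen for illustration.

```perl
use strict;
use warnings;
use threads;
use List::Util qw(sum0);

my @data = 1 .. 100;
my $crew = 4;    # hypothetical crew size

my @threads;
for my $i (0 .. $crew - 1) {
    # Every $crew-th element, starting at index $i, goes to worker $i.
    my @slice = @data[ grep { $_ % $crew == $i } 0 .. $#data ];
    push @threads, threads->create(sub { return sum0(@_) }, @slice);
}

# Combine the partial sums returned by the crew.
my $total = sum0(map { $_->join } @threads);
print "total: $total\n";
```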

Pipeline#

Threads form a chain. Each one receives data, performs one step of processing, and forwards the result to the next. Handy for prime sieves and stream-processing workloads where each stage has its own state.
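
A two-stage pipeline sketch, again assuming the core Thread::Queue module as the link between stages. The stage bodies (doubling, then adding one) are hypothetical placeholders for real processing steps.

```perl
use strict;
use warnings;
use threads;
use Thread::Queue;

my $raw     = Thread::Queue->new;    # input to stage 1
my $doubled = Thread::Queue->new;    # stage 1 output, stage 2 input

# Stage 1: double each item and forward it; close the next queue when done.
my $stage1 = threads->create(sub {
    while (defined(my $n = $raw->dequeue)) {
        $doubled->enqueue($n * 2);
    }
    $doubled->end;
});

# Stage 2: add one to each item and collect the results.
my ($stage2) = threads->create(sub {
    my @out;
    while (defined(my $n = $doubled->dequeue)) {
        push @out, $n + 1;
    }
    return @out;
});

$raw->enqueue(1 .. 5);
$raw->end;

$stage1->join;
my @result = $stage2->join;
print "@result\n";
```

With one thread per stage and FIFO queues between them, items leave the pipeline in the order they entered.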

The three patterns are not exclusive; a non-trivial program often combines several of them for different parts.

Thread identity#

Every thread has a numeric thread ID (TID). The main thread has TID 0; each subsequent thread gets the next integer. From inside a thread, ask for its own object with threads->self and its own TID with threads->tid:

my $me = threads->self;
my $id = threads->tid;
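
The TID numbering can be observed with a short sketch in which each thread returns its own TID and the caller compares it with the TID reported by the thread object. It assumes a fresh program in which no other threads have been created yet.

```perl
use strict;
use warnings;
use threads;

my $main_tid = threads->tid;    # called in the main thread

# Each thread reports the TID it sees for itself.
my @thr      = map { threads->create(sub { threads->tid }) } 1 .. 3;
my @expected = map { $_->tid } @thr;     # TIDs as seen from the caller
my @seen     = map { $_->join } @thr;    # TIDs as seen from inside

print "main: $main_tid\n";
print "objects: @expected\n";
print "threads: @seen\n";
```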

threads->list returns every thread that has been neither joined nor detached, whether it is still running or has already finished. A common clean-up idiom at end of main:

foreach my $thr (threads->list) {
    $thr->join;
}
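
The upstream threads module also lets list be filtered with the threads::running and threads::joinable constants. The sketch below waits for a single thread to finish and then counts what each form returns; the polling interval is an arbitrary assumption.

```perl
use strict;
use warnings;
use threads;

my $thr = threads->create(sub { return 'finished' });

# Poll until the thread's subroutine has returned (10 ms between checks).
select(undef, undef, undef, 0.01) until $thr->is_joinable;

my $unjoined = () = threads->list;                       # not joined, not detached
my $running  = () = threads->list(threads::running);     # still executing
my $joinable = () = threads->list(threads::joinable);    # finished, awaiting join

my @returns = map { $_->join } threads->list;
my $after   = () = threads->list;

print "unjoined=$unjoined running=$running joinable=$joinable after=$after\n";
```

A thread that has finished but has not been joined still appears in the plain list, which is why the clean-up loop above reaches it.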

yield — usually a no-op#

threads->yield;

yield hints to the OS scheduler that this thread is willing to give up the CPU. It is a hint only: what actually happens depends on the hardware, the OS, and the threading library. On many operating systems yield is a no-op, and well-behaved code does not depend on it.

pperl status#

Every snippet on this page that loads the threads module fails at compile time under pperl because use threads fails; the Config check dies at run time instead. The cleanest migration path is not to translate the snippets but to identify which of the three structural patterns the original uses, then follow the matching section of Alternatives.

Thread-safety of modules#

An upstream Perl note worth repeating: a module is not thread-safe unless its documentation says so. That caveat applies to the ithreaded program’s code, to every CPAN module it uses, and to the C libraries behind any XS modules.

pperl’s auto-parallelisation sidesteps this problem entirely: the parallel dispatch is opt-in per loop, the analyser refuses to parallelise any body with side effects, and there is no user-visible shared heap to serialise. See Parallel Execution.