Processes

fork

Create a new process running the same program at the same point.

fork issues the fork(2) system call. After it returns, two processes are executing the same Perl program at the same statement — the original (parent) and a near-identical copy (child). The two are distinguished only by fork’s return value. File descriptors and, on some systems, the locks held on them are shared between parent and child; everything else is copied. On Linux the copy is cheap: data pages are shared copy-on-write, so the actual duplication happens only as each side writes to memory.

Synopsis

my $pid = fork;

What you get back

Three return values, one per role:

  • Parent: the child’s PID (a positive integer).

  • Child: 0.

  • Failure: undef, with $! set to the errno from the underlying fork(2) (typically EAGAIN — resource limit hit — or ENOMEM).

The canonical dispatch is a three-way branch. Test for undef first, then for 0, so the parent path is the fall-through:

my $pid = fork // die "fork failed: $!";
if ($pid == 0) {
    # child
    exec $cmd, @args;
    die "exec failed: $!";
}
# parent continues here, with $pid = child's PID

The one-line “fork or die” idiom:

defined(my $pid = fork) or die "fork: $!";

Writing fork or die (without defined) is a bug: it treats the child’s legitimate 0 return as failure and dies in every child.

Flush before you fork

Perl attempts to flush all files opened for output before the fork(2) call, but do not rely on that alone: any buffered data that does slip through is duplicated in both processes, and the same bytes are written twice when each side flushes. The canonical defence is to enable autoflush on any handle you intend to use across a fork:

STDOUT->autoflush(1);           # or: $| = 1 while STDOUT is selected
STDERR->autoflush(1);

print "about to fork\n";        # now actually on the wire
my $pid = fork // die "fork: $!";

Loading IO::Handle explicitly is not required for autoflush in Perl 5.14 and later: the method is available on all filehandles because IO::File is loaded on demand. Setting $| on the currently selected handle has the same effect for that one handle.

Reaping children, or $SIG{CHLD}

Every child that terminates before its parent reaps it becomes a zombie: a kernel process-table entry held open so the parent can read the exit status. Ignore them and the process table fills up.

Two ways to keep it clean:

  • Actively reap with wait or waitpid:

    my $pid = fork // die "fork: $!";
    if ($pid == 0) {
        # ... child work ...
        exit 0;
    }
    my $reaped = waitpid $pid, 0;     # block until this child exits
    my $status = $?;                  # exit status of the child
    
  • Tell the kernel you do not care:

    $SIG{CHLD} = 'IGNORE';            # children auto-reaped, no status
    

    Useful for fire-and-forget workers. You lose the exit status: wait and waitpid can no longer collect these children, and the calls return -1 with $! set to ECHILD.
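
When you do reap actively, the value left in $? packs several fields into one integer and is usually decoded before use. A minimal sketch, assuming a hypothetical helper named describe_status (it is not part of Perl or of this page):

```perl
use strict;
use warnings;
use POSIX ();

# Hypothetical helper: split a wait status (as found in $?) into its parts.
sub describe_status {
    my ($status) = @_;
    return {
        exit_code => $status >> 8,              # value the child passed to exit
        signal    => $status & 127,             # signal that killed the child, or 0
        core_dump => ($status & 128) ? 1 : 0,   # true if a core file was written
    };
}

my $pid = fork // die "fork: $!";
if ($pid == 0) {
    POSIX::_exit(3);                            # child: exit code 3, skipping END blocks
}
waitpid $pid, 0;
my $info = describe_status($?);
print "exit=$info->{exit_code} signal=$info->{signal}\n";
```

Checking the signal field before the exit code avoids mistaking a killed child for one that exited with 0.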

For long-running parents that do care about exit status, install a reaper handler that drains all pending children without blocking:

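use POSIX ":sys_wait_h";              # provides WNOHANG

$SIG{CHLD} = sub {
    local ($!, $?);                   # keep the interrupted code's $! and $? intact
    while ((my $pid = waitpid(-1, WNOHANG)) > 0) {
        # optionally inspect $? here
    }
};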
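
A fuller sketch of the same pattern, under the assumption that the parent wants to keep each child's status for later inspection. The %exit_status hash and the worker count are illustrative names, not part of any API:

```perl
use strict;
use warnings;
use POSIX qw(:sys_wait_h);

my %exit_status;    # pid => raw wait status, filled in by the handler

$SIG{CHLD} = sub {
    local ($!, $?);                             # do not clobber the main program's values
    while ((my $pid = waitpid(-1, WNOHANG)) > 0) {
        $exit_status{$pid} = $?;                # record the status for later inspection
    }
};

my $workers = 3;
for my $n (1 .. $workers) {
    my $pid = fork // die "fork: $!";
    if ($pid == 0) {
        POSIX::_exit($n);                       # child: exit code identifies the worker
    }
}

# Parent: a real program would do useful work here; this one naps until
# the handler has reaped every child (a signal cuts the sleep short).
sleep 1 while keys(%exit_status) < $workers;

my @codes = sort { $a <=> $b } map { $_ >> 8 } values %exit_status;
print "exit codes: @codes\n";
```

The loop inside the handler matters because several children exiting close together may produce only one SIGCHLD delivery.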

See perlipc for fuller patterns.

What the child inherits

  • Open file descriptors: shared. A print in either process advances the same underlying file offset. Close the descriptors the child does not need, and reopen handles the child should not share with the parent, particularly STDIN/STDOUT/STDERR when they are connected to a pipe or socket driving the parent’s caller. A CGI script that forks a background child and then exits will leave the HTTP client hanging unless the child gives up its inherited STDOUT, because the child still holds the socket open. In the child, reopen the standard handles:

    open STDIN,  '<', '/dev/null' or die $!;
    open STDOUT, '>', '/dev/null' or die $!;
    open STDERR, '>', '/dev/null' or die $!;
    
  • Memory: logically copied, physically copy-on-write. Mutating a large data structure in one side materialises only the touched pages.

  • Process ID: changes. $$ (also $PID / $PROCESS_ID under English) is re-read from the kernel on access after a fork, so both parent and child see their own PID.

  • Signal handlers, %ENV, working directory, umask, process group, controlling terminal: all copied.

  • Pending alarms and timers: not inherited by the child (per POSIX).

  • Locks held via flock: shared with the parent. Both processes hold the same lock on the same open file description. fcntl record locks behave differently: they belong to the process that took them and are not inherited by the child; consult fcntl(2).
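
The shared file offset described in the list above can be observed directly. A minimal sketch, assuming a scratch file from File::Temp and unbuffered syswrite/sysseek so that PerlIO buffering does not obscure the result:

```perl
use strict;
use warnings;
use Fcntl qw(SEEK_CUR);
use File::Temp qw(tempfile);
use POSIX ();

my ($fh, $path) = tempfile(UNLINK => 1);        # scratch file for the demonstration

my $pid = fork // die "fork: $!";
if ($pid == 0) {
    # Child: write 6 bytes through the inherited descriptor.
    my $ok = defined syswrite($fh, "child\n");
    POSIX::_exit($ok ? 0 : 1);
}
waitpid $pid, 0;
die "child failed: $?" if $?;

# Parent: it never wrote anything, yet the offset it sees has moved.
my $offset = sysseek($fh, 0, SEEK_CUR);
print "parent sees offset $offset\n";
```

The parent and child share one open file description, so the child's write moves the position that the parent reads back.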

Examples

Classic fork/exec — launch an external command without going through the shell:

my $pid = fork // die "fork: $!";
if ($pid == 0) {
    exec '/usr/bin/gzip', '-9', $file;
    die "exec: $!";                   # only reached if exec fails
}
waitpid $pid, 0;
die "gzip failed: $?" if $?;

Fan out N workers, then reap them all:

my @kids;
for my $i (1 .. 4) {
    my $pid = fork // die "fork: $!";
    if ($pid == 0) {
        do_work($i);
        exit 0;
    }
    push @kids, $pid;
}
waitpid $_, 0 for @kids;

Parent/child communication through a pipe. Open the pipe before forking so both sides inherit the two ends:

pipe(my $reader, my $writer) or die "pipe: $!";
my $pid = fork // die "fork: $!";
if ($pid == 0) {
    close $reader;
    $writer->autoflush(1);
    print $writer "hello from $$\n";
    exit 0;
}
close $writer;
chomp(my $line = <$reader>);
waitpid $pid, 0;
print "got: $line\n";

Note the close on the unused end in each process — leave both ends open in both processes and the reader will never see EOF.
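
The fan-out and pipe examples combine naturally when each worker has to hand a result back. A sketch under stated assumptions: the "work" is squaring a number, and %reader_for is an illustrative name for the parent's bookkeeping:

```perl
use strict;
use warnings;
use POSIX ();

my %reader_for;     # pid => read end of that worker's pipe

for my $n (1 .. 3) {
    pipe(my $reader, my $writer) or die "pipe: $!";
    my $pid = fork // die "fork: $!";
    if ($pid == 0) {
        close $reader;
        close $_ for values %reader_for;        # read ends inherited from earlier workers
        my $result = $n * $n;                   # stand-in for real work
        print {$writer} "$result\n";
        close $writer or POSIX::_exit(1);       # close flushes the buffered line
        POSIX::_exit(0);
    }
    close $writer;                              # parent keeps only the read end
    $reader_for{$pid} = $reader;
}

my @results;
for my $pid (keys %reader_for) {
    my $fh   = $reader_for{$pid};
    my $line = <$fh>;
    close $fh;
    waitpid $pid, 0;
    die "worker $pid failed" if $? || !defined $line;
    chomp $line;
    push @results, $line;
}
@results = sort { $a <=> $b } @results;
print "results: @results\n";
```

Each worker gets its own pipe so results cannot interleave, and the child closes its write end explicitly because POSIX::_exit does not flush Perl's buffers.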

Detach a daemon-style child and let the kernel reap it:

$SIG{CHLD} = 'IGNORE';
my $pid = fork // die "fork: $!";
if ($pid == 0) {
    # child: will be auto-reaped on exit
    run_background_task();
    exit 0;
}
# parent moves on, never calls wait

Edge cases

  • fork inside a thread or after DESTROY-sensitive code: only the calling thread survives in the child. Every other thread is gone, including any that held a mutex or that the caller was waiting to join, so such locks are never released in the child. Place fork calls before you spawn threads.

  • Buffered filehandles in the child: any data still buffered at the moment of fork is present in both processes’ buffers. When each side eventually flushes, the bytes are written twice. Autoflush before forking.

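  • exit in the child runs END blocks and object destructors (DESTROY methods) that the parent set up. That is rarely what you want: prefer POSIX::_exit in the child after an exec failure, or make END blocks compare $$ against the PID recorded at program start. POSIX::_exit also skips flushing of Perl’s output buffers, so flush or close any handle whose data matters first.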

  • Error vs child confusion: fork returns 0 legitimately in the child and undef on failure. Use // and defined, never || / truth tests.

  • EAGAIN: usually transient. A process limit, typically the per-user limit (RLIMIT_NPROC), has been reached. Retry after a short sleep, or lower your concurrency.

  • A chdir in the child does not affect the parent’s working directory. The same is true of %ENV, umask, and signal handlers: the child has its own copy from the moment of fork on.

  • File offsets on shared descriptors are shared: if parent and child both write to the same inherited STDOUT, their output interleaves in whatever order the OS scheduler decides. Coordinate with explicit locking or give each child its own output handle.
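
The EAGAIN advice above can be wrapped in a small helper. A sketch only: fork_with_retry, and its tries and delay options, are hypothetical names chosen for illustration, and the defaults are arbitrary:

```perl
use strict;
use warnings;
use Errno qw(EAGAIN);
use POSIX ();

# Hypothetical helper: retry fork a few times when it fails with EAGAIN.
sub fork_with_retry {
    my (%opt) = @_;
    my $tries = $opt{tries} // 5;
    my $delay = $opt{delay} // 1;               # seconds between attempts

    for my $attempt (1 .. $tries) {
        my $pid = fork;
        return $pid if defined $pid;            # 0 in the child, child's PID in the parent
        die "fork: $!" unless $! == EAGAIN;     # other errors are not worth retrying
        sleep $delay if $attempt < $tries;
    }
    die "fork: still EAGAIN after $tries attempts";
}

my $pid = fork_with_retry(tries => 3, delay => 1);
if ($pid == 0) {
    POSIX::_exit(0);                            # child: nothing to do in this demo
}
waitpid $pid, 0;
my $status = $?;
print "child $pid exited with status $status\n";
```

The helper uses defined rather than a truth test, so the child's legitimate 0 return is passed straight through to the caller.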

Differences from upstream

Fully compatible with upstream Perl 5.42.

pperl targets Linux only, so fork is always a real fork(2). Unlike traditional perl5 on Windows — where fork is emulated with interpreter-level pseudo-processes sharing a single OS process, with the quirks documented in perlfork — the pperl fork creates an independent OS process with its own PID, address space, and process table entry. Code that assumed pseudo-process semantics (shared global state across “forked” sides) will not port to pperl.

See also

  • exec — replace the current process image; the typical next call in a child right after fork

  • wait — block until any child exits and return its PID and status in $?

  • waitpid — reap a specific child, optionally non-blocking via WNOHANG

  • exit — terminate the current process; use in the child to end it without falling through into the parent’s code

  • pipe — create a pair of connected filehandles before forking to give parent and child a communication channel

  • $$ — the current process ID, re-read after fork so each side sees its own PID

  • %SIG — in particular $SIG{CHLD}, controlling whether children are auto-reaped or must be collected with wait / waitpid

  • perlipc — fuller treatment of forking, signals, pipes, and reaping moribund children