fork#
Create a new process running the same program at the same point.
fork issues the fork(2)
system call. After it returns, two processes are executing the
same Perl program at the same statement — the original (parent) and
a near-identical copy (child). The two are distinguished only by
fork’s return value. File descriptors and, on some systems, the
locks held on them are shared between parent and child; everything
else is copied. On Linux the copy is cheap: data pages are shared
copy-on-write, so the actual duplication happens only as each side
writes to memory.
Synopsis#
my $pid = fork;
What you get back#
Three return values, one per role:
Parent: the child’s PID (a positive integer).
Child: 0.
Failure: undef, with $! set to the errno from the underlying fork(2) (typically EAGAIN — resource limit hit — or ENOMEM).
The canonical dispatch is a three-way branch. Test for undef first,
then for 0, so the parent path is the fall-through:
my $pid = fork // die "fork failed: $!";
if ($pid == 0) {
# child
exec $cmd, @args;
die "exec failed: $!";
}
# parent continues here, with $pid = child's PID
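When both the parent and the child carry real work, the same dispatch reads more clearly as an explicit three-way branch (a sketch; the work in each branch is illustrative):

```perl
use strict;
use warnings;

my $pid = fork;
if (!defined $pid) {        # failure: $! holds the errno from fork(2)
    die "fork failed: $!";
}
elsif ($pid == 0) {         # child: fork returned 0
    # ... child-side work ...
    exit 0;
}
else {                      # parent: fork returned the child's PID
    waitpid $pid, 0;
}
```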
The two-line “fork or die” idiom:
defined(my $pid = fork) or die "fork: $!";
Writing fork or die (without defined) is a bug: it treats the
child’s legitimate 0 return as failure and dies in every child.
Flush before you fork#
Perl attempts to flush all output handles before the
fork(2) call, but you
should not rely on that alone — any unflushed buffered data that does
slip through is duplicated in both processes, so the same bytes get
written twice. The canonical defence is to disable buffering on any
handle you intend to use across a fork:
STDOUT->autoflush(1); # or: $| = 1 while STDOUT is selected
STDERR->autoflush(1);
print "about to fork\n"; # now actually on the wire
my $pid = fork // die "fork: $!";
Loading IO::Handle is not required for autoflush
in modern Perl — the method is available on all filehandles. Setting
$| on the currently selected handle has the same effect for
that one handle.
Reaping children — or $SIG{CHLD}#
Every child that terminates before its parent reaps it becomes a zombie: a kernel process-table entry held open so the parent can read the exit status. Ignore them and the process table fills up.
Two ways to keep it clean:
Actively reap with wait or waitpid:
my $pid = fork // die "fork: $!";
if ($pid == 0) {
    # ... child work ...
    exit 0;
}
my $reaped = waitpid $pid, 0;  # block until this child exits
my $status = $?;               # exit status of the child
Tell the kernel you do not care:
$SIG{CHLD} = 'IGNORE'; # children auto-reaped, no status
Useful for fire-and-forget workers. You lose the exit status —
wait and waitpid will also no longer see the children.
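The cost of 'IGNORE' is observable directly: the kernel auto-reaps the child, so a later waitpid blocks until the child is gone and then returns -1, because there is nothing left to collect (a sketch of the behaviour on Linux):

```perl
use strict;
use warnings;

$SIG{CHLD} = 'IGNORE';          # children auto-reaped by the kernel

my $pid = fork // die "fork: $!";
if ($pid == 0) {
    exit 7;                     # this status is discarded; nobody can read it
}

my $reaped = waitpid $pid, 0;   # waits for the child to vanish, then fails
# $reaped is -1: the child was already reaped, its exit status is lost
```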
For long-running parents that do care about exit status, install a reaper handler that drains all pending children without blocking:
use POSIX ();   # for POSIX::WNOHANG

$SIG{CHLD} = sub {
    while ((my $pid = waitpid -1, POSIX::WNOHANG) > 0) {
        # optionally inspect $? here
    }
};
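In a long-running parent the handler usually also records each child's status; localising $! and $? keeps the handler from clobbering values the interrupted code was about to read (a sketch — %exit_status is an illustrative name):

```perl
use strict;
use warnings;
use POSIX ();

my %exit_status;                # PID => $?, consumed by the main loop

$SIG{CHLD} = sub {
    local ($!, $?);             # don't disturb the interrupted code
    while ((my $pid = waitpid -1, POSIX::WNOHANG) > 0) {
        $exit_status{$pid} = $?;
    }
};
```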
See perlipc for fuller patterns.
What the child inherits#
Open file descriptors: shared. A print in either process advances the same underlying file offset. Close the descriptors the child does not need, and reopen handles the child should not share with the parent — particularly STDIN/STDOUT/STDERR when they are connected to a pipe or socket driving the parent’s caller. A backgrounded CGI script that forks and exits without closing its inherited STDOUT will leave the HTTP client hanging, because the child still holds the socket open:
open STDIN,  '<', '/dev/null' or die $!;
open STDOUT, '>', '/dev/null' or die $!;
open STDERR, '>', '/dev/null' or die $!;
Memory: logically copied, physically copy-on-write. Mutating a large data structure in one side materialises only the touched pages.
Process ID: changes.
$$ (also $PID / $PROCESS_ID under English) is re-read from the kernel on access after a fork, so both parent and child see their own PID.
Signal handlers, %ENV, working directory, umask, process group, controlling terminal: all copied.
Pending alarms and timers: not inherited by the child (per POSIX).
Locks held via flock: shared with the parent — both processes hold the same lock on the same open file description. fcntl record locks behave differently (the child does not inherit them); consult fcntl(2).
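The shared file offset is easy to demonstrate: parent and child append through the same open file description, so neither overwrites the other (a sketch using a core-module temp file; the child calls POSIX::_exit so its exit skips File::Temp's END-time cleanup):

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);
use POSIX ();

my ($fh, $path) = tempfile(UNLINK => 1);
$fh->autoflush(1);

my $pid = fork // die "fork: $!";
if ($pid == 0) {
    print $fh "child\n";        # advances the shared offset
    POSIX::_exit(0);            # skip END blocks in the child
}
waitpid $pid, 0;
print $fh "parent\n";           # continues where the child left off

open my $in, '<', $path or die "open: $!";
my @lines = <$in>;              # both lines are present, in order
```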
Examples#
Classic fork/exec — launch an external command without going through the shell:
my $pid = fork // die "fork: $!";
if ($pid == 0) {
exec '/usr/bin/gzip', '-9', $file;
die "exec: $!"; # only reached if exec fails
}
waitpid $pid, 0;
die "gzip failed: $?" if $?;
Fan out N workers, then reap them all:
my @kids;
for my $i (1 .. 4) {
my $pid = fork // die "fork: $!";
if ($pid == 0) {
do_work($i);
exit 0;
}
push @kids, $pid;
}
waitpid $_, 0 for @kids;
Parent/child communication through a pipe. Open the pipe before forking so both sides inherit the two ends:
pipe(my $reader, my $writer) or die "pipe: $!";
my $pid = fork // die "fork: $!";
if ($pid == 0) {
close $reader;
$writer->autoflush(1);
print $writer "hello from $$\n";
exit 0;
}
close $writer;
chomp(my $line = <$reader>);
waitpid $pid, 0;
print "got: $line\n";
Note the close on the unused end in each process — leave both ends
open in both processes and the reader will never see EOF.
Detach a daemon-style child and let the kernel reap it:
$SIG{CHLD} = 'IGNORE';
my $pid = fork // die "fork: $!";
if ($pid == 0) {
# child: will be auto-reaped on exit
run_background_task();
exit 0;
}
# parent moves on, never calls wait
Edge cases#
fork inside a thread or after DESTROY-sensitive code: only the calling thread survives in the child; anything waiting on a mutex or a join in another thread is gone. Keep fork points before you spawn threads.
Buffered filehandles in the child: any data still buffered at the moment of fork is present in both processes’ buffers. When each side eventually flushes, the bytes are written twice. Autoflush before forking.
exit in the child runs END blocks and DESTROY-ers written by the parent. That is rarely what you want — prefer POSIX::_exit in the child after an exec failure, or ensure END blocks check $$ against the PID recorded at program start.
Error vs child confusion: fork returns 0 legitimately in the child and undef on failure. Use // and defined, never || / truth tests.
EAGAIN: transient — the kernel’s per-user process limit is reached. Retry after a short sleep, or lower your concurrency.
Working directory and chdir in the child do not affect the parent. The same is true of %ENV, umask, and signal handlers: the child has its own copy from the moment of fork on.
Seeks on shared descriptors are shared: if parent and child both write to the same inherited STDOUT, their output interleaves at whatever points the OS scheduler decides. Coordinate with explicit locking or give each child its own output handle.
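The END-block guard mentioned above is short: record the PID before any fork, and make the block a no-op in every process except the original (a sketch; the cleanup body is illustrative):

```perl
use strict;
use warnings;
use POSIX ();

my $start_pid = $$;                     # recorded before any fork

END {
    return unless $$ == $start_pid;     # no-op in forked children
    # ... parent-only cleanup: remove pidfile, etc. ...
}

my $pid = fork // die "fork: $!";
if ($pid == 0) {
    # exit here would still run the END block, but the guard skips it;
    # POSIX::_exit avoids running END blocks at all
    POSIX::_exit(0);
}
waitpid $pid, 0;
```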
Differences from upstream#
Fully compatible with upstream Perl 5.42.
pperl targets Linux only, so fork is always a real
fork(2). Unlike
traditional perl5 on Windows — where fork is emulated with
interpreter-level pseudo-processes sharing a single OS process,
with the quirks documented in perlfork — the pperl fork creates
an independent OS process with its own PID, address space, and
process table entry. Code that assumed pseudo-process semantics
(shared global state across “forked” sides) will not port to pperl.
See also#
exec — replace the current process image; the typical next call in a child right after fork
wait — block until any child exits and return its PID and status in $?
waitpid — reap a specific child, optionally non-blocking via WNOHANG
exit — terminate the current process; use in the child to end it without falling through into the parent’s code
pipe — create a pair of connected filehandles before forking to give parent and child a communication channel
$$ — the current process ID, re-read after fork so each side sees its own PID
%SIG — in particular $SIG{CHLD}, controlling whether children are auto-reaped or must be collected with wait/waitpid
perlipc — fuller treatment of forking, signals, pipes, and reaping moribund children