select#

Either set the default output filehandle, or call the select(2) syscall for I/O multiplexing. Same name, two unrelated jobs — disambiguated by argument count.

The one-argument form (or no-argument form) is about where print, printf, say, and write send their output when you don’t name a filehandle. The four-argument form is a direct wrapper around the BSD select(2) syscall for waiting on sets of file descriptors.

Synopsis#

select FILEHANDLE              # set default output handle, return old
select                         # return current default output handle
select RBITS, WBITS, EBITS, TIMEOUT   # select(2) syscall

What you get back#

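select FILEHANDLE returns the previously selected filehandle (as a fully qualified name string such as main::STDOUT, or as a glob reference when the handle is not installed in a symbol table, for example a lexical handle), so the idiomatic save/restore pattern works: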

my $old = select $fh;
$| = 1;
select $old;

select (no arguments) returns the currently selected filehandle. At program start this is STDOUT.

select RBITS, WBITS, EBITS, TIMEOUT returns different things in different contexts:

  • In scalar context: the number of filehandles ready, or -1 on error with $! set.

  • In list context: ($nfound, $timeleft) — the same $nfound plus the remaining time from TIMEOUT. Many kernels do not update $timeleft; on those systems it is always equal to the input TIMEOUT.

On timeout expiry $nfound is 0. The three bit masks (RBITS, WBITS, EBITS) are updated in place to indicate which descriptors are ready — pass copies if you want to preserve the original masks.
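The return values and the in-place mask update can be seen with a pipe. The following is a minimal sketch, assuming a Unix-like system where pipe descriptors work with select; the variable names ($rd, $wr, $rout1, $rout2) are illustrative only.

```perl
use strict;
use warnings;

pipe(my $rd, my $wr) or die "pipe: $!";

# Original mask: one bit set for the read end of the pipe.
my $rin = '';
vec($rin, fileno($rd), 1) = 1;

# Nothing has been written yet, so this call times out.
# $rout1 is a copy of $rin; select modifies the copy, not $rin.
my ($n1, $left1) = select(my $rout1 = $rin, undef, undef, 0.1);

syswrite($wr, "x") // die "syswrite: $!";

# Now the read end has data, so one descriptor is reported ready.
my ($n2, $left2) = select(my $rout2 = $rin, undef, undef, 0.1);

my $ready     = vec($rout2, fileno($rd), 1);   # bit set in the returned mask
my $still_set = vec($rin,   fileno($rd), 1);   # original mask is untouched
```

Because the copies are made on the call line, $rin can be reused for every call without being rebuilt.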

Global state it touches#

The one-argument form changes one piece of interpreter-global state: the currently selected output filehandle. While that handle is selected, these variables refer to it, not to STDOUT:

  • $| — autoflush flag

  • $%, $=, $-, $~, $^ — format-related variables

That is why my $old = select($fh); $| = 1; select($old); works: $| is a property of the selected handle, not a global scalar. By contrast, $, and $\ (the output field and record separators) are ordinary globals that apply to every handle, so select does not change which values they refer to.

The four-argument form touches no per-interpreter state. It may leave $! set on error.
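To illustrate that $| follows the selected handle, here is a small sketch. It assumes an in-memory filehandle as a stand-in for a real file or socket; the variable names are illustrative.

```perl
use strict;
use warnings;

# In-memory handle, used here only so the example needs no files.
open my $fh, '>', \my $buf or die "open: $!";

my $default_before = $|;     # autoflush flag of the current default handle

my $old = select $fh;
my $fh_before = $|;          # now reads the flag of $fh, which starts off
$| = 1;                      # turns autoflush on for $fh only
my $fh_after = $|;
select $old;

my $default_after = $|;      # back to the default handle's own flag
```

The same variable name reads two different flags depending on which handle is selected at the time.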

Examples#

Enable autoflush on a handle without disturbing the default:

my $old = select $fh; $| = 1; select $old;

Redirect print temporarily, restoring the previous handle before the block ends:

{
    my $old = select $log_fh;
    print "entry: $msg\n";       # goes to $log_fh
    select $old;
}
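The block above leaves $log_fh selected if anything between the two select calls dies, because select is not undone automatically. One possible die-safe variant is sketched below; with_selected is a hypothetical helper name, and the in-memory handle stands in for a real log file.

```perl
use strict;
use warnings;

# Hypothetical helper: run $code with $fh as the default output handle and
# restore the previous handle even if $code dies.
sub with_selected {
    my ($fh, $code) = @_;
    my $old = select $fh;
    my $ok  = eval { $code->(); 1 };
    my $err = $@;
    select $old;             # runs whether or not $code died
    die $err unless $ok;     # re-raise the original error
    return;
}

my $before = select;         # remember the current default handle

open my $log_fh, '>', \my $log or die "open: $!";
eval { with_selected($log_fh, sub { print "entry\n"; die "boom\n" }) };
my $caught = $@;
close $log_fh;

my $after = select;          # should match $before again
```

Wrapping the call in eval and re-raising keeps the caller's error handling unchanged while guaranteeing the restore.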

Sleep for 250 milliseconds using the four-argument form:

select undef, undef, undef, 0.25;

Wait up to 5 seconds for STDIN to become readable:

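my $rin = '';
vec($rin, fileno(STDIN), 1) = 1;
my $nfound = select my $rout = $rin, undef, undef, 5;
if ($nfound > 0) {               # -1 means error, 0 means timeout
    # sysread, not <STDIN>: select cannot see data held in PerlIO buffers
    sysread STDIN, my $buf, 4096;
}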

Build a read mask over multiple handles:

sub fhbits {
    my $bits = '';
    vec($bits, fileno($_), 1) = 1 for @_;
    return $bits;
}
my $rin = fhbits(\*STDIN, $sock);
my ($nfound, $timeleft) =
    select my $rout = $rin, undef, undef, 10;
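After the call returns, the ready handles can be recovered by testing each watched handle's bit in the returned mask. The sketch below assumes two pipes in place of STDIN and $sock so that it runs on its own; @watched and @ready are illustrative names.

```perl
use strict;
use warnings;

sub fhbits {
    my $bits = '';
    vec($bits, fileno($_), 1) = 1 for @_;
    return $bits;
}

pipe(my $rd_a, my $wr_a) or die "pipe: $!";
pipe(my $rd_b, my $wr_b) or die "pipe: $!";
syswrite($wr_b, "hello") // die "syswrite: $!";   # only pipe B has data

my @watched = ($rd_a, $rd_b);
my $rin     = fhbits(@watched);
my ($nfound, $timeleft) = select(my $rout = $rin, undef, undef, 1);

# Keep the handles whose bit is still set in the returned mask.
my @ready = grep { vec($rout, fileno($_), 1) } @watched;
```

Keeping the handles in an array alongside the mask avoids having to map descriptor numbers back to handles afterwards.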

Edge cases#

  • Method-style alternative. $fh->autoflush(1) sets the autoflush flag on $fh without changing the default handle. Prefer it when you only need to tweak one property of one handle — no global state involved.

  • Typeglob localization is cleaner for scoped redirects. local *STDOUT = $fh rebinds STDOUT itself for the dynamic scope; on die the original is restored automatically. select does not restore on die. The two are not equivalent: with select an explicit print STDOUT ... still reaches the real STDOUT; with the local *STDOUT form, even print STDOUT ... goes to the rebound handle.

  • Each mask must be a packed bit string, not a list of fileno integers. Build it with vec; passing a list of integers or a reference silently produces wrong results (or an empty mask).

  • undef for a mask means “don’t wait on this set.” Pass undef for masks you don’t care about. An empty string '' technically also works but is less idiomatic.

  • Fractional timeouts work. TIMEOUT is a float in seconds. 0.25 is 250 ms; 0 returns immediately (polling).

  • TIMEOUT of undef means block indefinitely until at least one descriptor is ready or a signal arrives.

  • Masks are modified in place. On return, each mask holds only the bits for descriptors that are ready. If you need the originals afterwards, pass copies — the idiom my $rout = $rin on the call line does exactly that.

  • Do not mix buffered I/O with the four-argument form. select reports readiness at the kernel level; readline, read, and the diamond operator <> read through PerlIO buffers that may already hold unread data the kernel doesn’t know about. Use sysread for data once select says a descriptor is ready.

  • Spurious readiness on sockets. On some Unixes select(2) may report a socket as “ready for reading” when a following read would still block. Set O_NONBLOCK on the socket if this matters.

  • Restart-after-signal behaviour is system-dependent. A signal (e.g. SIGALRM) may cause select to return early with $! == EINTR, or the kernel may restart the call transparently. Portable code checks for EINTR explicitly.

  • Argument count disambiguates the two forms. One argument (or none) is the default-handle form; exactly four is the syscall form. Other counts are a syntax error.
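Several of the points above (copying the mask, checking for EINTR, and reading with sysread) can be combined in one loop. The following is a sketch under stated assumptions: wait_readable is a hypothetical helper, the deadline arithmetic uses Time::HiRes, and a pipe stands in for a socket.

```perl
use strict;
use warnings;
use Time::HiRes ();

# Hypothetical helper: wait until $fh is readable or $timeout seconds pass,
# retrying when a signal interrupts the call. Returns true if readable.
sub wait_readable {
    my ($fh, $timeout) = @_;
    my $deadline = Time::HiRes::time() + $timeout;
    my $rin = '';
    vec($rin, fileno($fh), 1) = 1;
    while (1) {
        my $left = $deadline - Time::HiRes::time();
        $left = 0 if $left < 0;
        my $nfound = select(my $rout = $rin, undef, undef, $left);
        return $nfound > 0 if $nfound >= 0;   # ready (1) or timed out (0)
        next if $!{EINTR};                    # interrupted: retry with time left
        die "select: $!";
    }
}

pipe(my $rd, my $wr) or die "pipe: $!";
syswrite($wr, "ping\n") // die "syswrite: $!";

my $data = '';
if (wait_readable($rd, 1)) {
    # sysread bypasses PerlIO buffering, matching what select reported.
    my $got = sysread($rd, $data, 4096);
    die "sysread: $!" unless defined $got;
}
```

Recomputing the remaining time on each pass keeps the overall wait bounded even when the call is interrupted repeatedly, without relying on the kernel to update the timeout.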

Differences from upstream#

Fully compatible with upstream Perl 5.42.

See also#

  • print — writes to the selected handle when no filehandle is given; select is how you change which handle that is

  • printf — same default-handle rule as print

  • write — format-based output; the top-of-form and page-length variables follow the selected handle, not STDOUT

  • fileno — maps a filehandle to the integer descriptor number you need when building a mask

  • vec — packs descriptor numbers into the bit string the four-argument form expects

  • sysread — the unbuffered read you use after the four-argument select says a descriptor is ready

  • IO::Select — object wrapper over the four-argument form that hides the bit-mask arithmetic