binmode

Set the I/O layer stack on a filehandle — typically to make it deliver raw bytes, or to attach a character encoding.

binmode reconfigures FILEHANDLE so that subsequent reads and writes go through a specific PerlIO layer stack. In the one-argument form it puts the handle into raw binary mode (no CRLF translation, no character decoding). In the two-argument form it pushes (or, for a handful of pseudo-layers, replaces or pops) the named I/O layers onto the handle’s stack. Call it after open and before any I/O on the handle, except for :encoding, which may be applied mid-stream.

Synopsis

binmode FILEHANDLE
binmode FILEHANDLE, LAYER

What you get back

1 on success, undef on failure, with $! set. Failure is rare in practice — passing a malformed layer string or an unknown layer name is the usual cause. The return value is worth checking when the layer argument is user-supplied:

binmode $fh, ":encoding($enc)"
    or die "bad encoding '$enc': $!";

If FILEHANDLE is an expression rather than a bareword or simple scalar, its value is taken as the name of the filehandle, matching the filehandle-resolution rules of open and print.

Global state it touches

binmode itself reads no special variables, but the layers it installs change how later I/O on the handle interacts with them. The layer stack governs how $/ (input record separator) and $\ (output record separator) are matched and emitted against the external byte stream — on platforms where text mode translates \n to a multi-byte sequence, the translation happens inside the layer, not in the variable.
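As a small illustration of the point above, the sketch below reads CRLF-terminated data through a :crlf layer while $/ stays at its default of "\n". It uses an in-memory filehandle so that it behaves the same on any platform; the data and the names $wire and @records are hypothetical.

```perl
use strict;
use warnings;

# Hypothetical sample data: two records with CRLF line endings, as stored on disk.
my $wire = "one\r\ntwo\r\n";

# In-memory handle, used here only so the sketch needs no real file.
open my $in, "<", \$wire or die "cannot open in-memory handle: $!";
binmode $in, ":crlf"     or die "cannot push :crlf: $!";

my @records;
while (my $line = <$in>) {
    chomp $line;             # removes "\n": the layer has already mapped \r\n to \n
    push @records, $line;
}
close $in;

print join("|", @records), "\n";
```

Because the translation happens inside the layer, chomp needs no knowledge of the on-disk line ending; the script should print one|two.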

PerlIO layers

A layer is a string beginning with : that names a filter in the PerlIO stack — see PerlIO for the full catalogue. Multiple layers may be given in one call by concatenating them:

binmode $fh, ":raw:utf8";

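Layers are applied left to right. :raw runs first and clears any translating or decoding layers already on the handle; :utf8 is applied next, so it ends up at the top of the stack, furthest from the OS file descriptor. Reads traverse the stack from bottom to top; writes traverse it from top to bottom.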
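A minimal sketch of stacking two layers in one call, assuming an in-memory scalar ($buffer, a hypothetical name) as the output target so that the written bytes can be inspected directly:

```perl
use strict;
use warnings;

my $buffer = '';   # hypothetical in-memory target standing in for a file

open my $out, ">", \$buffer or die "cannot open in-memory handle: $!";

# Applied left to right: first reset the handle to bytes, then add the encoder.
binmode $out, ":raw:encoding(UTF-8)" or die "cannot set layers: $!";

print {$out} "caf\x{e9}\n";    # four characters plus a newline
close $out or die "close failed: $!";

# Show what actually reached the bottom of the stack.
printf "%d bytes: %s\n",
    length($buffer),
    join " ", map { sprintf "%02X", ord } split //, $buffer;
```

The character U+00E9 passes through the encoder on its way down the stack and arrives as the two bytes C3 A9, so the script should report 6 bytes: 63 61 66 C3 A9 0A.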

The commonly used layers:

  • :raw — strip the handle down to byte-for-byte I/O. Turns off CRLF translation, removes any character-encoding layers, and marks the handle as bytes. Despite occasional claims to the contrary, :raw is not simply the inverse of :crlf; it also disables any other layer that would alter the binary nature of the stream. This is what the one-argument form installs.

  • :crlf — translate \r\n sequences to \n on input and \n to \r\n on output. This is the default on Windows / DOS-family systems for text-mode handles; applying it explicitly forces CRLF translation on any platform.

  • :utf8 — mark data on the handle as UTF-8. No validation on input: bytes are accepted and flagged as characters. On output, characters are encoded to UTF-8 bytes. Fast, but trusts the source.

  • :encoding(NAME) — decode bytes to characters on input and encode characters to bytes on output, using the named encoding (UTF-8, iso-8859-1, shiftjis, etc.). Input that is not valid for the encoding triggers a warning and substitution. :encoding implicitly pushes :utf8 on top of itself because Perl works on UTF-8 internally. See PerlIO::encoding for details and tuning options.

  • :bytes — the inverse of :utf8; mark data on the handle as bytes, disabling any character-semantics flag that a higher layer set.

One pseudo-layer changes the shape of the stack rather than filtering data:

  • :pop — remove the topmost layer from the handle.

Custom layers written in Perl are attached with :via(MODULE); see PerlIO::via.

:raw in the two-argument form acts as a reset: rather than pushing a new layer on top, it pops every layer that would translate or decode the data, leaving only layers that pass bytes through unchanged.
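The effect of :pop and :raw on the stack can be observed with PerlIO::get_layers, which is built into perl. The following is a sketch only: open_decoded is a hypothetical helper, the data is made up, and the exact layer names printed depend on the platform and on how the handle was opened.

```perl
use strict;
use warnings;

# Hypothetical helper: open an in-memory handle and add a decoding layer.
sub open_decoded {
    my ($bytes_ref) = @_;
    open my $fh, "<", $bytes_ref    or die "cannot open in-memory handle: $!";
    binmode $fh, ":encoding(UTF-8)" or die "cannot push :encoding: $!";
    return $fh;
}

my $first  = "caf\xC3\xA9\n";     # UTF-8 encoded bytes (six of them)
my $second = $first;              # separate copy for the second handle

my $fh_pop    = open_decoded(\$first);
my @before    = PerlIO::get_layers($fh_pop);
binmode $fh_pop, ":pop";          # drop only the topmost layer
my @after_pop = PerlIO::get_layers($fh_pop);

my $fh_raw    = open_decoded(\$second);
binmode $fh_raw, ":raw";          # pop everything that is not binary-clean
my @after_raw = PerlIO::get_layers($fh_raw);
my $line      = <$fh_raw>;        # read as bytes, not characters

print "before:    @before\n";
print "after pop: @after_pop\n";
print "after raw: @after_raw\n";
print "length read through :raw: ", length($line), "\n";
```

In this sketch both calls remove the encoding layer because it is the only layer above the base one. On a handle with several layers stacked, :pop would remove just the top one, while :raw would keep popping.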

Examples

The one-argument form, for reading a binary file portably:

open my $img, "<", "photo.jpg" or die $!;
binmode $img;
my $bytes = do { local $/; <$img> };

Read a UTF-8 text file with validation:

open my $fh, "<", "notes.txt" or die $!;
binmode $fh, ":encoding(UTF-8)";
while (my $line = <$fh>) {
    # $line contains decoded characters, not bytes
}

Write Windows-style line endings on any platform:

open my $out, ">", "dos.txt" or die $!;
binmode $out, ":crlf";
print $out "one\ntwo\nthree\n";    # written as one\r\ntwo\r\nthree\r\n

Strip a previously-applied encoding layer and drop back to bytes:

binmode $fh, ":pop";               # remove topmost layer
binmode $fh, ":raw";               # or: flatten to bare bytes

Validate a user-supplied encoding before committing to it:

my $enc = $ENV{INPUT_ENCODING} // "UTF-8";
binmode $fh, ":encoding($enc)"
    or die "unsupported encoding '$enc': $!";
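An alternative sketch, assuming the Encode module is available (it ships with perl): look the name up with Encode::find_encoding before handing it to binmode, so that an unknown name is rejected without touching the handle. The helper name encoding_layer and the in-memory sample data are hypothetical.

```perl
use strict;
use warnings;
use Encode ();

# Hypothetical helper: return a layer string for $name, or undef if Encode
# does not recognise the encoding.
sub encoding_layer {
    my ($name) = @_;
    return undef unless defined $name && length $name;
    return undef unless Encode::find_encoding($name);
    return ":encoding($name)";
}

my $enc   = $ENV{INPUT_ENCODING} // "UTF-8";
my $layer = encoding_layer($enc)
    or die "unsupported encoding '$enc'\n";

# In-memory handle standing in for the real input file.
my $sample = "hello\n";
open my $fh, "<", \$sample or die "cannot open in-memory handle: $!";
binmode $fh, $layer        or die "cannot apply $layer: $!";
my $first_line = <$fh>;
close $fh;
```

The return value of binmode is still checked, since the lookup only confirms that the encoding is known, not that the layer could be pushed.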

Edge cases

  • Call after open, before I/O. Calling binmode normally flushes any pending buffered output (and possibly pending buffered input) on the handle, so data already in the buffers has been processed under the old layers. :encoding is the documented exception: it sometimes has to be pushed mid-stream, and pushing it does not flush.

  • One-argument form is :raw, not “do nothing”. On Unix the observable effect is often nil because handles are already in byte-oriented mode, but on Windows it disables CRLF translation. Write portable code as if binmode $fh is always meaningful.

  • CRLF on Windows / DOS. Without binmode, text-mode handles on those platforms convert \r\n to \n on input and the reverse on output. For binary formats (images, compressed data, serialised structures) this silently corrupts the stream, so binmode is mandatory there.

  • Ctrl-Z as end-of-file. On Microsoft-family systems, text-mode handles treat \cZ (byte 0x1A) as end-of-file. Binary data that happens to contain that byte will appear truncated unless binmode is used.

  • :utf8 is trust-only. An input handle with :utf8 but no :encoding(UTF-8) will happily hand you invalid UTF-8 flagged as characters. Use :encoding(UTF-8) when the source is untrusted.

  • :encoding pulls in :utf8. After binmode $fh, ":encoding(shiftjis)" the handle has both :encoding(shiftjis) and :utf8 in its stack — perl’s internal form is UTF-8 regardless of the wire encoding.

  • Filehandle expressions. binmode $handles[0], ":utf8" works; any expression that yields a filehandle value is accepted as the first argument and is evaluated to name the handle.

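  • Not every I/O primitive goes through the layers. The layer stack governs the buffered primitives: read, readline, print, seek, and tell. sysread and syswrite bypass the buffered layers and work on the underlying file descriptor. Since Perl 5.30 they die with an "isn't allowed on :utf8 handles" error when the handle carries the :utf8 flag, whether set directly or through :encoding. Use read and print on handles that decode or encode.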

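  • Interaction with the open pragma. use open ":encoding(UTF-8)" sets default layers for handles opened afterwards within its lexical scope. binmode with a layer argument pushes on top of those defaults rather than replacing them; to discard them for a given handle, use the one-argument form or start the layer string with :raw, as in binmode $fh, ":raw:encoding(iso-8859-1)".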
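The difference between :utf8 and :encoding(UTF-8) noted in the list above can be sketched as follows. The input string and the helper read_through are hypothetical, and warnings are collected rather than printed because both layers may warn about the bad byte.

```perl
use strict;
use warnings;

# Hypothetical input: the byte 0xFF can never appear in well-formed UTF-8.
my $bad = "ok \xFF end\n";

my @warnings;
local $SIG{__WARN__} = sub { push @warnings, @_ };   # collect instead of printing

# Read one line from a copy of $bad through the given layer.
sub read_through {
    my ($layer) = @_;
    my $copy = $bad;
    open my $fh, "<", \$copy or die "cannot open in-memory handle: $!";
    binmode $fh, $layer      or die "cannot push $layer: $!";
    my $line = <$fh>;
    close $fh;
    return $line;
}

my $trusted = read_through(":utf8");             # flagged as characters, unchecked
my $checked = read_through(":encoding(UTF-8)");  # decoded, bad byte substituted

printf ":utf8            well-formed? %s\n", utf8::valid($trusted) ? "yes" : "no";
printf ":encoding(UTF-8) well-formed? %s\n", utf8::valid($checked) ? "yes" : "no";
printf "warnings collected: %d\n", scalar @warnings;
```

utf8::valid reports whether a string's internal representation is consistent. A string read through :utf8 can fail that check, which is why :encoding(UTF-8) is the safer choice for untrusted input.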

Differences from upstream

Fully compatible with upstream Perl 5.42.

See also

  • open — create the filehandle in the first place; accepts layers directly in the mode string (<:encoding(UTF-8)) to skip a follow-up binmode call

  • Encode — encode and decode strings in memory, independent of any filehandle

  • PerlIO — full layer catalogue, including less common layers like :scalar, :via, and :mmap

  • read / sysread — input primitives whose byte / character semantics depend on the layer stack

  • print / syswrite — output primitives whose encoding and line-ending behaviour depend on the layer stack

  • $/ — input record separator, matched against the post-layer byte stream

  • $\ — output record separator, written through the layer stack