binmode#
Set the I/O layer stack on a filehandle — typically to make it deliver raw bytes, or to attach a character encoding.
binmode reconfigures FILEHANDLE so that subsequent reads and writes
go through a specific PerlIO layer stack. In the one-argument form it
puts the handle into raw binary mode (no CRLF translation, no character
decoding). In the two-argument form it pushes (or, for a handful of
pseudo-layers, replaces or pops) the named I/O layers onto the handle’s
stack. Call it after open and before any I/O on the handle,
except for :encoding, which may be applied mid-stream.
Synopsis#
binmode FILEHANDLE
binmode FILEHANDLE, LAYER
What you get back#
1 on success, undef on failure, with $!
set. Failure is rare in practice — passing a malformed layer string or
an unknown layer name is the usual cause. The return value is worth
checking when the layer argument is user-supplied:
binmode $fh, ":encoding($enc)"
or die "bad encoding '$enc': $!";
If FILEHANDLE is an expression rather than a bareword or simple
scalar, its value is taken as the name of the filehandle, matching the
filehandle-resolution rules of open and print.
Global state it touches#
binmode itself reads no special variables, but the layers it installs
change how later I/O on the handle interacts with them. The layer stack
governs how $/ (input record separator) and
$\ (output record separator) are matched and emitted
against the external byte stream — on platforms where text mode
translates \n to a multi-byte sequence, the translation happens
inside the layer, not in the variable.
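A minimal sketch of that division of labour, assuming an in-memory handle (so the emitted bytes can be inspected) and an explicit :crlf layer to stand in for a text-mode platform. The variable names are illustrative only.

```perl
use strict;
use warnings;

# Write to an in-memory buffer so the emitted bytes can be inspected.
my $buffer = '';
open my $out, '>', \$buffer or die "cannot open in-memory handle: $!";
binmode $out, ':crlf' or die "cannot push :crlf: $!";

{
    local $\ = "\n";    # the variable itself still holds a single "\n"
    print {$out} 'one';
    print {$out} 'two';
}
close $out or die "close failed: $!";

# Show the bytes that reached the buffer, with CR and LF made visible.
(my $visible = $buffer) =~ s/\r/\\r/g;
$visible =~ s/\n/\\n/g;
print "$visible\n";
```

The separator is appended by print as a plain "\n"; the :crlf layer then expands it on the way out, which is why $\ never needs to contain "\r\n" itself.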
PerlIO layers#
A layer is a string beginning with : that names a filter in the
PerlIO stack — see PerlIO for the full catalogue.
Multiple layers may be given in one call by concatenating them:
binmode $fh, ":raw:utf8";
The order is bottom-up: :raw is applied first (closest to the OS
file descriptor), then :utf8 sits on top of it. Reads traverse the
stack from bottom to top; writes traverse it from top to bottom.
The commonly used layers:
- :raw - strip the handle down to byte-for-byte I/O. Turns off CRLF translation, removes any character-encoding layers, and marks the handle as bytes. Despite occasional claims to the contrary, :raw is not simply the inverse of :crlf; it also disables any other layer that would alter the binary nature of the stream. This is what the one-argument form installs.
- :crlf - translate \r\n sequences to \n on input and \n to \r\n on output. This is the default on Windows / DOS-family systems for text-mode handles; applying it explicitly forces CRLF translation on any platform.
- :utf8 - mark data on the handle as UTF-8. No validation on input: bytes are accepted and flagged as characters. On output, characters are encoded to UTF-8 bytes. Fast, but trusts the source.
- :encoding(NAME) - decode bytes to characters on input and encode characters to bytes on output, using the named encoding (UTF-8, iso-8859-1, shiftjis, etc.). Input that is not valid for the encoding triggers a warning and substitution. :encoding implicitly pushes :utf8 on top of itself because Perl works on UTF-8 internally. See PerlIO::encoding for details and tuning options.
- :bytes - the inverse of :utf8; mark data on the handle as bytes, disabling any character-semantics flag that a higher layer set.
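The validation behaviour of :encoding(NAME) can be observed with a small experiment. This is a sketch under assumptions: the input bytes are made up, the handle is an in-memory one, and the default PerlIO::encoding fallback settings are in effect. The exact warning text and substitution form depend on those settings, so the sketch only counts warnings and prints what was decoded.

```perl
use strict;
use warnings;

# Hypothetical input: "a", an invalid byte 0xFF, "b", newline.
my $bytes = "a\xFFb\n";

my @warnings;
my $line;
{
    # Collect warnings instead of printing them to STDERR.
    local $SIG{__WARN__} = sub { push @warnings, @_ };

    open my $in, '<', \$bytes or die "cannot open in-memory handle: $!";
    binmode $in, ':encoding(UTF-8)' or die "cannot push layer: $!";
    $line = <$in>;
    close $in;
}

printf "warnings: %d\n", scalar @warnings;
print  "decoded:  $line";
```

Swapping :encoding(UTF-8) for :utf8 in this sketch removes the decoding step that performs the check, which is the practical difference between the two layers.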
Two pseudo-layers control stack shape rather than filtering:
- :pop - remove the topmost layer from the handle.
- :push - used with :via(...) to stack a custom layer module.
:raw in the two-argument form acts as a reset: it pops layers until
the handle is in its most minimal state, rather than pushing a new
layer on top.
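One way to watch the stack change shape is PerlIO::get_layers, which lists a handle's layers from the bottom up. The sketch below assumes a scratch file from File::Temp; layer_stack is a hypothetical helper name, and the exact layer names printed vary by platform, so no output is shown.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Helper (hypothetical name): the handle's layers, bottom layer first.
sub layer_stack {
    my ($handle) = @_;
    return join ' ', PerlIO::get_layers($handle);
}

# A scratch file; its name and contents are irrelevant here.
my ($fh, $path) = tempfile(UNLINK => 1);

my $default = layer_stack($fh);

# Pushing :encoding adds the encoding layer plus :utf8 on top of it.
binmode $fh, ':encoding(UTF-8)' or die "cannot push :encoding: $!";
my $encoded = layer_stack($fh);

# :raw pops back down to a byte-oriented stack instead of pushing.
binmode $fh, ':raw' or die "cannot reset to :raw: $!";
my $raw = layer_stack($fh);

print "default: $default\n";
print "encoded: $encoded\n";
print "raw:     $raw\n";
```

Comparing the "encoded" and "raw" lines shows the reset: the encoding-related entries disappear rather than gaining a new entry on top.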
Examples#
The one-argument form, for reading a binary file portably:
open my $img, "<", "photo.jpg" or die $!;
binmode $img;
my $bytes = do { local $/; <$img> };
Read a UTF-8 text file with validation:
open my $fh, "<", "notes.txt" or die $!;
binmode $fh, ":encoding(UTF-8)";
while (my $line = <$fh>) {
# $line contains decoded characters, not bytes
}
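To make the "characters, not bytes" distinction concrete, here is a small round-trip sketch. It assumes a scratch file created with File::Temp and a made-up one-line payload; the same file is read once through :encoding(UTF-8) and once in raw mode.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Scratch file for the demonstration.
my ($tmp, $path) = tempfile(UNLINK => 1);
close $tmp;

# Write one line containing a non-ASCII character (U+00E9).
open my $out, '>', $path or die "cannot write $path: $!";
binmode $out, ':encoding(UTF-8)' or die "cannot push layer: $!";
print {$out} "caf\x{E9}\n";
close $out or die "close failed: $!";

# Read it back as decoded characters.
open my $chars, '<', $path or die "cannot read $path: $!";
binmode $chars, ':encoding(UTF-8)' or die "cannot push layer: $!";
my $decoded = <$chars>;
close $chars;

# Read it back as raw bytes.
open my $raw, '<', $path or die "cannot read $path: $!";
binmode $raw;
my $octets = <$raw>;
close $raw;

printf "characters: %d, bytes: %d\n", length($decoded), length($octets);
# prints "characters: 5, bytes: 6"
```

The extra byte is the second octet of the UTF-8 encoding of U+00E9, which only the raw read sees as a separate unit.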
Write Windows-style line endings on any platform:
open my $out, ">", "dos.txt" or die $!;
binmode $out, ":crlf";
print $out "one\ntwo\nthree\n"; # written as one\r\ntwo\r\nthree\r\n
Strip a previously-applied encoding layer and drop back to bytes:
binmode $fh, ":pop"; # remove topmost layer
binmode $fh, ":raw"; # or: flatten to bare bytes
Validate a user-supplied encoding before committing to it:
my $enc = $ENV{INPUT_ENCODING} // "UTF-8";
binmode $fh, ":encoding($enc)"
or die "unsupported encoding '$enc': $!";
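A slightly more defensive variant looks the name up with Encode::find_encoding first and passes only the canonical name to binmode, so arbitrary user text is never interpolated into the layer string. This is a sketch, not a prescribed pattern; set_encoding_layer is a hypothetical helper and the in-memory handle is just for demonstration.

```perl
use strict;
use warnings;
use Encode ();

# Hypothetical helper: attach an :encoding layer only if Encode knows the
# requested name. Returns the canonical encoding name, or nothing on failure.
sub set_encoding_layer {
    my ($fh, $requested) = @_;
    my $encoding = Encode::find_encoding($requested) or return;
    my $name = $encoding->name;
    binmode($fh, ":encoding($name)") or return;
    return $name;
}

my $input = "hello\n";
open my $fh, '<', \$input or die "cannot open in-memory handle: $!";

for my $candidate ('UTF-8', 'no-such-encoding') {
    my $name = set_encoding_layer($fh, $candidate);
    print defined $name
        ? "$candidate accepted as $name\n"
        : "$candidate rejected\n";
}
```

The helper reports failure by returning nothing, so the caller can decide whether to die, fall back to a default, or ask again.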
Edge cases#
- Call after open, before I/O. binmode on a handle that has already been read from or written to produces defined but implementation-dependent results; most layers flush pending buffers on install, which may lose data that crossed the layer boundary. :encoding is the documented exception: it may be pushed mid-stream without a flush.
- One-argument form is :raw, not “do nothing”. On Unix the observable effect is often nil because handles are already in byte-oriented mode, but on Windows it disables CRLF translation. Write portable code as if binmode $fh is always meaningful.
- CRLF on Windows / DOS. Without binmode, the C runtime on those platforms converts \r\n to \n on input and the reverse on output. For binary formats (images, compressed data, serialised structures) this silently corrupts the stream. binmode is mandatory there.
- Ctrl-Z as end-of-file. On Microsoft-family systems, text-mode handles treat \cZ (byte 0x1A) as end-of-file. Binary data that happens to contain that byte will appear truncated unless binmode is used.
- :utf8 is trust-only. An input handle with :utf8 but no :encoding(UTF-8) will happily hand you invalid UTF-8 flagged as characters. Use :encoding(UTF-8) when the source is untrusted.
- :encoding pulls in :utf8. After binmode $fh, ":encoding(shiftjis)" the handle has both :encoding(shiftjis) and :utf8 in its stack; Perl's internal form is UTF-8 regardless of the wire encoding.
- Filehandle expressions. binmode $handles[0], ":utf8" works; any expression that yields a filehandle value is accepted as the first argument and is evaluated to name the handle.
- Not every I/O primitive goes through the stack. The layer stack governs the buffered primitives read, readline, print, seek, and tell alike. sysread and syswrite bypass the buffered layers, and since Perl 5.30 calling either on a handle that carries :utf8 (including one set up by :encoding) is a fatal error rather than a source of partial characters.
- Interaction with the open pragma. use open ":encoding(UTF-8)" installs a default layer set for subsequently-opened handles. Explicit binmode overrides that default for a given handle.
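The CRLF corruption described above can be reproduced on any platform by applying :crlf explicitly. The sketch below assumes a made-up four-byte payload and an in-memory handle; bytes_written is a hypothetical helper that reports what actually reached the buffer.

```perl
use strict;
use warnings;

# Hypothetical 4-byte binary payload that happens to contain 0x0A twice.
my $payload = "\x01\x0A\x02\x0A";

# Helper (hypothetical name): write $data through the given layer and
# return the bytes that actually reached the buffer.
sub bytes_written {
    my ($layer, $data) = @_;
    my $buffer = '';
    open my $out, '>', \$buffer or die "cannot open in-memory handle: $!";
    binmode $out, $layer or die "cannot apply $layer: $!";
    print {$out} $data;
    close $out or die "close failed: $!";
    return $buffer;
}

my $text_mode   = bytes_written(':crlf', $payload);
my $binary_mode = bytes_written(':raw',  $payload);

printf "text mode:   %d bytes\n", length($text_mode);    # 6 bytes
printf "binary mode: %d bytes\n", length($binary_mode);  # 4 bytes
```

Each 0x0A in the payload gains a preceding 0x0D under :crlf, so a length field or checksum computed over the original four bytes would no longer match what was written.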
Differences from upstream#
Fully compatible with upstream Perl 5.42.
See also#
- open - create the filehandle in the first place; accepts layers directly in the mode string (<:encoding(UTF-8)) to skip a follow-up binmode call
- Encode - encode and decode strings in memory, independent of any filehandle
- PerlIO - full layer catalogue, including less common layers like :scalar, :via, and :mmap
- read / sysread - input primitives whose byte / character semantics depend on the layer stack
- print / syswrite - output primitives whose encoding and line-ending behaviour depend on the layer stack
- $/ - input record separator, matched against the post-layer byte stream
- $\ - output record separator, written through the layer stack