unpack#
Extract typed values from a binary or fixed-width string according to a template.
unpack is the inverse of pack. It walks EXPR left to
right, consuming the bytes described by each directive in TEMPLATE
and turning them into Perl values. The result is a list — one value
per directive, or per repetition when a directive has a count. If
EXPR is omitted, unpack reads from $_.
This page is a directive reference. For a narrative introduction with worked protocol and file-format examples, start at the pack/unpack tutorial.
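As a quick illustration of the inverse relationship described above, the following sketch packs three values and recovers them with the same template. The sample values are invented for the example and do not come from any real protocol.

```perl
use strict;
use warnings;

# Pack a 16-bit version, a 32-bit length and a raw payload,
# then recover them with the same template.
my $packed = pack "n N a*", 1, 5, "hello";
my ($ver, $len, $payload) = unpack "n N a*", $packed;

print "$ver $len $payload\n";    # prints "1 5 hello"
```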
Synopsis#
unpack TEMPLATE, EXPR
unpack TEMPLATE # reads from $_
my @fields = unpack "A10 A10 A*", $line;
my ($ver, $len, $payload) = unpack "n N a*", $msg;
What you get back#
A list of values, one per directive (or per repetition when a directive has a count). In scalar context, only the first value produced is returned:
my @all = unpack "A4 A4 A*", $rec; # three values
my $first = unpack "A4 A4 A*", $rec; # just the first — not a count
Each directive determines the Perl type of its result — see the “Result type” column in the directive table below.
Global state it touches#
unpack TEMPLATE with no EXPR reads $_. No other
interpreter globals are consulted. The C0 / U0 directives switch
template byte-vs-character interpretation locally within the template
but do not mutate any outside state.
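A minimal sketch of the `$_` default, assuming two made-up fixed-width records (a 10-column user name followed by a role). The record contents and variable names are illustrative only.

```perl
use strict;
use warnings;

# Build two fixed-width records: the name is padded to 10 columns.
my @records = (
    sprintf("%-10s%s", "alice", "admin"),
    sprintf("%-10s%s", "bob",   "guest"),
);

my @rows;
for (@records) {
    # No EXPR argument, so unpack reads the current record from $_.
    my ($user, $role) = unpack "A10 A*";
    push @rows, "$user:$role";
}

print "$_\n" for @rows;    # prints "alice:admin" then "bob:guest"
```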
Template syntax#
Identical to pack: a sequence of directive letters,
optionally followed by repeat counts N / * / […] and modifiers
! / < / >. Parentheses form groups; # introduces a template
comment. See pack for the full syntax rules — this page
covers only the unpack-specific semantics.
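The pieces of template syntax listed above can be combined as in this sketch, which uses a repeat count on a parenthesised group and `#` comments inside the template. The record layout (a count byte followed by two pairs) is an assumption made for the example.

```perl
use strict;
use warnings;

# Whitespace and "#" comments inside a template are ignored.
my $template = join "\n",
    "C        # number of pairs",
    "(n C)2   # a group repeated twice";

my $buf = pack $template, 2, 10, 1, 20, 2;
my ($count, @pairs) = unpack $template, $buf;

print "$count: @pairs\n";    # prints "2: 10 1 20 2"
```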
Directive table#
Every unpack directive in one place. W is the width in bytes consumed from the input per scalar produced.
| Directive | W | Signed | Endian | Result type | Modifiers | Notes |
|---|---|---|---|---|---|---|
| a | count | — | — | string | — | Returns bytes untouched |
| A | count | — | — | string | — | Strips trailing whitespace and NUL |
| Z | count | — | — | string | — | Returns everything up to the first NUL |
| b | count/8 | — | — | bit string | — | “0”/“1” characters, LSB of each byte first |
| B | count/8 | — | — | bit string | — | “0”/“1” characters, MSB of each byte first |
| h | count/2 | — | — | hex string | — | Low nybble first |
| H | count/2 | — | — | hex string | — | High nybble first |
| c | 1 | yes | — | integer | — | Signed char |
| C | 1 | no | — | integer | — | Unsigned char |
| W | 1 | no | — | integer | — | Unsigned char; yields codepoint in |
| s | 2 | yes | native | integer | ! < > | |
| S | 2 | no | native | integer | ! < > | |
| l | 4 | yes | native | integer | ! < > | |
| L | 4 | no | native | integer | ! < > | |
| i | native | yes | native | integer | ! < > | |
| I | native | no | native | integer | ! < > | |
| q | 8 | yes | native | integer | < > | Requires 64-bit-integer Perl |
| Q | 8 | no | native | integer | < > | Requires 64-bit-integer Perl |
| n | 2 | no | big | integer | ! | Network order; |
| N | 4 | no | big | integer | ! | Network order; |
| v | 2 | no | little | integer | ! | “VAX” order; |
| V | 4 | no | little | integer | ! | “VAX” order; |
| j | IV size | yes | native | integer | < > | Perl-internal |
| J | UV size | no | native | integer | < > | Perl-internal |
| f | 4 | — | native | number | < > | Single-precision IEEE 754 |
| d | 8 | — | native | number | < > | Double-precision IEEE 754 |
| F | NV size | — | native | number | < > | Perl-internal float |
| D | varies | — | native | number | < > | Long double |
| p | ptr size | — | native | string | < > | Dereferences pointer to NUL-terminated string |
| P | ptr size | — | native | string | < > | Dereferences pointer; count = bytes to read |
| u | varies | — | — | string | — | Uudecoded bytes |
| U | varies | — | — | integer | — | Unicode codepoint number |
| w | varies | no | — | integer | — | BER-compressed integer |
| x | 1 | — | — | (none) | ! | Skip one byte forward; |
| X | −1 | — | — | (none) | ! | Back up one byte; |
| @ | absolute | — | — | (none) | ! | Jump to position N within the innermost group |
| . | — | — | — | integer | ! | Return current position (relative to start of group / string) |
| ( ) | — | — | — | — | < > | Group: repeat count and endianness propagate inside |
| / | — | — | — | — | — | See “Length-prefixed payloads” below |
| %N | varies | — | — | integer | — | Prefix. N-bit checksum of the following directive’s values |
The modifiers have the same meaning as in pack: ! selects
native sizes, alignment semantics, signedness for n / N / v / V,
or byte offsets for @ / .; < / > force endianness.
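To make the effect of the modifiers concrete, here is a small sketch that reads the same two bytes four ways. The byte values are chosen only for illustration.

```perl
use strict;
use warnings;

my $bytes = "\xFF\xFE";

my $unsigned = unpack "n",  $bytes;   # big-endian, unsigned: 65534
my $signed   = unpack "n!", $bytes;   # same bytes read as signed: -2
my $little   = unpack "S<", $bytes;   # forced little-endian: 0xFEFF = 65279
my $big      = unpack "S>", $bytes;   # forced big-endian: 0xFFFE = 65534

print "$unsigned $signed $little $big\n";    # prints "65534 -2 65279 65534"
```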
Differences from pack#
- a strips nothing (returns bytes raw); A strips trailing whitespace and NUL; Z stops at the first NUL. With pack all three pad; with unpack all three un-pad differently.
- x skips forward by W, X backs up by W. X before the start of the string is a fatal error.
- / reads a count from the data, then applies it to the next directive:

    unpack("W/a", "\004Gurusamy") # ("Guru")
    unpack("a3/A A*", "007 Bond  J ") # (" Bond", "J"); note the two spaces after "Bond"
    unpack("n/a*", "\x00\x0chello, world") # ("hello, world")

  With pack the form is length-item/item and the length is computed from the value’s length. With unpack you write only /item: the count comes from the value just unpacked by the preceding directive, and that length value is not itself returned in the result list.
- %N is unpack-only. It replaces the following directive’s normal output with the N-bit sum of the values that directive would otherwise have produced. %32W* is the System V sum checksum. %32b* counts set bits.
- . returns the current byte position rather than zero-filling to one. Useful after x / X navigation to learn where you are.
- p and P dereference pointers read from the input. This is almost always unsafe unless you know the input was produced by pack "p" / pack "P" in the same process.
- Under-run / over-run behaviour: if the template asks for more data than EXPR contains, the result is not well-defined (it may yield empty strings, zeros, a decreased repeat count, or an exception). If EXPR is longer than the template consumes, trailing bytes are silently ignored.
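The different un-padding rules of a, A and Z can be seen side by side in this sketch, which applies each directive to the same five bytes. The field content is made up for the example.

```perl
use strict;
use warnings;

# Five bytes: "ab", one space, two NULs.
my $field = "ab \0\0";

my ($raw)      = unpack "a5", $field;   # "ab \0\0", returned unchanged
my ($stripped) = unpack "A5", $field;   # "ab", trailing space and NULs removed
my ($cstring)  = unpack "Z5", $field;   # "ab ", stops at the first NUL

printf "%d %d %d\n", length($raw), length($stripped), length($cstring);   # prints "5 2 3"
```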
Examples#
Parse a minimal binary protocol header — a 16-bit version, a 32-bit length, then a variable-length payload, all network byte order:
my ($ver, $len, $payload) = unpack "n N a*", $msg; # a* keeps the payload bytes intact; A* would strip trailing whitespace and NUL
Split a fixed-width text record — faster and clearer than chained
substr calls:
my ($date, $desc, $amount) = unpack "A10 x A26 A*", $line;
# Layout: 10-column date, 1 separator column, 26-column description, amount.
# $line = "2026-04-22 coffee at the station     3.50"
# $date = "2026-04-22", $desc = "coffee at the station",
# $amount = "3.50"
Extract a System V-style checksum with the % prefix. %32W* yields the 32-bit
sum of every byte, which is then reduced modulo 65535:
my $checksum = do {
    local $/; # slurp
    unpack "%32W*", readline $fh;
} % 65535;
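A self-contained variant of the checksum computation, as a sketch: an in-memory filehandle stands in for a real file, and the three-byte input is chosen so the sum is easy to verify by hand.

```perl
use strict;
use warnings;

my $data = "abc";    # byte values 97, 98, 99

# An in-memory filehandle stands in for a real file here.
open my $fh, '<', \$data or die "cannot open in-memory handle: $!";
my $checksum = do {
    local $/;    # slurp the whole input
    unpack("%32W*", readline $fh) % 65535;
};
close $fh;

print "$checksum\n";    # prints 294 (97 + 98 + 99)
```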
Count set bits in a bit vector — %32b* reads the whole mask as
bits and returns their sum:
my $setbits = unpack "%32b*", $selectmask;
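For a runnable check of the bit-counting idiom, the sketch below uses a two-byte mask with a known number of set bits and also prints the b and B renderings of the same bytes. The mask value is arbitrary.

```perl
use strict;
use warnings;

my $mask = "\x0F\x81";    # binary 00001111 10000001

my $setbits = unpack "%32b*", $mask;    # 4 + 2 = 6 bits set
my $lsb     = unpack "b*", $mask;       # LSB of each byte first
my $msb     = unpack "B*", $mask;       # MSB of each byte first

print "$setbits\n";    # prints 6
print "$lsb\n";        # prints 1111000010000001
print "$msb\n";        # prints 0000111110000001
```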
Read a sequence of little-endian 32-bit integers off a raw buffer:
my @ints = unpack "V*", $buf;
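A small sketch with a hand-built buffer shows how the byte order maps to values. The buffer contents are illustrative.

```perl
use strict;
use warnings;

# Three little-endian 32-bit integers, written out byte by byte.
my $buf = "\x01\x00\x00\x00" . "\xFF\xFF\xFF\xFF" . "\x00\x01\x00\x00";

my @ints = unpack "V*", $buf;

print "@ints\n";    # prints "1 4294967295 256"
```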
Use x to skip filler bytes, X to back up after peeking:
# 2-byte tag, skip 2 reserved bytes, then 4 32-bit big-endian values
my ($tag, @vals) = unpack "n x2 N4", $frame;
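The sketch below builds a sample frame for the template above and then shows the peek-and-back-up pattern with X. The frame contents and the idea of treating the first byte as a type are assumptions made for illustration.

```perl
use strict;
use warnings;

# 2-byte tag (0x012A), 2 reserved bytes, then four 32-bit big-endian values.
my $frame = "\x01\x2A\x00\x00" . pack("N4", 1, 2, 3, 4);

my ($tag, @vals) = unpack "n x2 N4", $frame;

# Peek at the first byte, back up one byte with X, then re-read
# the same position as a 16-bit value.
my ($type, $tag_again) = unpack "C X n", $frame;

print "$tag: @vals\n";          # prints "298: 1 2 3 4"
print "$type $tag_again\n";     # prints "1 298"
```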
Edge cases#
- Scalar context returns only the first value. Assigning unpack to a scalar is almost never what you want. To count fields, assign to a list first, or use () = unpack ... to force list context:

    my $n = () = unpack "A10 A10 A*", $line; # 3

- * is greedy, not a placeholder. A* consumes all remaining bytes as one string; there is no backtracking. A * directive must be the last one in its sequence, or the directives after it have no data left to read.
- Templates are not regular expressions. Whitespace is ignored, but there is no alternation, no |, no lookahead. If data shape depends on earlier fields, unpack in stages: decode a header, then call unpack again on the payload using a template derived from what you just read.
- Byte vs. character semantics. By default unpack processes the string character by character: on a byte string each character is one byte, while on a string with the UTF-8 flag set each character is a code point. C0 selects character mode from that point; U0 selects UTF-8 mode, in which the string is processed byte by byte in its UTF-8-encoded form. Applying numeric directives like N to a UTF-8-flagged string reads characters, not bytes, which is usually not what you want. Call utf8::encode first, or operate on byte-level data.
- X past the start of the string is fatal.
- p / P on an arbitrary input is undefined behaviour. Never use them on data you did not pack yourself in the same process.
- No EXPR argument: unpack TEMPLATE reads $_. Useful inside while (<$fh>) { ... } loops over fixed-width records.
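Staged unpacking, as recommended above for data whose shape depends on earlier fields, might look like the following sketch. The wire format (a 1-byte type, a 2-byte length, then a type-dependent payload) and the %template_for mapping are hypothetical and exist only for this example.

```perl
use strict;
use warnings;

# Hypothetical mapping from message type to payload template.
my %template_for = (
    1 => "a*",    # type 1: opaque bytes
    2 => "n*",    # type 2: list of 16-bit big-endian integers
);

# Sample message: type 2, payload length 6, three 16-bit values.
my $msg = pack("C n", 2, 6) . pack("n3", 7, 8, 9);

# Stage 1: decode the fixed 3-byte header.
my ($type, $len) = unpack "C n", $msg;

# Stage 2: pick a template based on the header and decode the payload.
my $payload = substr $msg, 3, $len;
my @fields  = unpack $template_for{$type}, $payload;

print "type=$type len=$len fields=@fields\n";    # prints "type=2 len=6 fields=7 8 9"
```

Keeping the header template fixed and small means the second call only ever sees bytes whose layout is already known.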
Differences from upstream#
Fully compatible with upstream Perl 5.42.
See also#
- pack: the inverse operation; same template language
- substr: extract a single fixed slice; simpler for one field, slower for many
- sprintf: round-trip between binary and human-readable
- read: read raw bytes off a filehandle into a buffer for unpack
- vec: indexed access to a bit vector without a template
- pack/unpack tutorial: task-oriented walkthrough with worked protocol and file-format examples