
unpack#

Extract typed values from a binary or fixed-width string according to a template.

unpack is the inverse of pack. It walks EXPR left to right, consuming the bytes described by each directive in TEMPLATE and turning them into Perl values. The result is a list — one value per directive, or per repetition when a directive has a count. If EXPR is omitted, unpack reads from $_.

This page is a directive reference. For a narrative introduction with worked protocol and file-format examples, start at the pack/unpack tutorial.

Synopsis#

unpack TEMPLATE, EXPR
unpack TEMPLATE                             # reads from $_
my @fields = unpack "A10 A10 A*", $line;
my ($ver, $len, $payload) = unpack "n N a*", $msg;

What you get back#

A list of values. Numeric directives yield one value per repetition; string directives such as a, A, and Z yield a single value each, with the count giving the field width. In scalar context, only the first value produced is returned:

my @all   = unpack "A4 A4 A*", $rec;    # three values
my $first = unpack "A4 A4 A*", $rec;    # just the first — not a count

Each directive determines the Perl type of its result — see the “Result type” column in the directive table below.

Global state it touches#

unpack TEMPLATE with no EXPR reads $_. No other interpreter globals are consulted. The C0 / U0 directives switch template byte-vs-character interpretation locally within the template but do not mutate any outside state.
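
As an illustration of the implicit $_ form, here is a minimal sketch. The six-character records and the @rows variable are made up for the example and do not come from any real format.

```perl
use strict;
use warnings;

# Hypothetical fixed-width records: 2-char code, then 4-char quantity.
my @records = ("AB  12", "CD 345");
my @rows;

for (@records) {
    # No EXPR is given, so unpack reads the current record from $_.
    my ($code, $qty) = unpack "A2 A4";
    push @rows, [$code, $qty + 0];   # "+ 0" turns "  12" into the number 12
}

printf "%s => %d\n", @$_ for @rows;
```

Only $_ is read here; nothing outside the loop is modified by unpack itself.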

Template syntax#

Identical to pack: a sequence of directive letters, optionally followed by repeat counts N / * / […] and modifiers ! / < / >. Parentheses form groups; # introduces a template comment. See pack for the full syntax rules — this page covers only the unpack-specific semantics.

Directive table#

Every unpack directive in one place. W is the width in bytes consumed from the input per scalar produced.

| Directive | W        | Signed | Endian | Result type | Modifiers | Notes |
|-----------|----------|--------|--------|-------------|-----------|-------|
| a         | count    |        |        | string      |           | Returns bytes untouched |
| A         | count    |        |        | string      |           | Strips trailing whitespace and NUL |
| Z         | count    |        |        | string      |           | Returns everything up to the first NUL |
| b         | count/8  |        |        | bit string  |           | “0”/“1” characters, LSB of each byte first |
| B         | count/8  |        |        | bit string  |           | “0”/“1” characters, MSB of each byte first |
| h         | count/2  |        |        | hex string  |           | [0-9a-f] characters, low nybble first |
| H         | count/2  |        |        | hex string  |           | [0-9a-f] characters, high nybble first |
| c         | 1        | yes    |        | integer     |           | Signed char |
| C         | 1        | no     |        | integer     |           | Unsigned char |
| W         | 1        | no     |        | integer     |           | Unsigned char; yields codepoint in U0 mode |
| s         | 2        | yes    | native | integer     | ! < >     | s! = native short |
| S         | 2        | no     | native | integer     | ! < >     | S! = native unsigned short |
| l         | 4        | yes    | native | integer     | ! < >     | l! = native long |
| L         | 4        | no     | native | integer     | ! < >     | L! = native unsigned long |
| i         | native   | yes    | native | integer     | ! < >     | sizeof(int) |
| I         | native   | no     | native | integer     | ! < >     | sizeof(unsigned int) |
| q         | 8        | yes    | native | integer     | < >       | Requires 64-bit-integer Perl |
| Q         | 8        | no     | native | integer     | < >       | Requires 64-bit-integer Perl |
| n         | 2        | no     | big    | integer     | !         | Network order; n! = signed |
| N         | 4        | no     | big    | integer     | !         | Network order; N! = signed |
| v         | 2        | no     | little | integer     | !         | “VAX” order; v! = signed |
| V         | 4        | no     | little | integer     | !         | “VAX” order; V! = signed |
| j         | IV size  | yes    | native | integer     | < >       | Perl-internal IV |
| J         | UV size  | no     | native | integer     | < >       | Perl-internal UV |
| f         | 4        |        | native | number      | < >       | Single-precision IEEE 754 |
| d         | 8        |        | native | number      | < >       | Double-precision IEEE 754 |
| F         | NV size  |        | native | number      | < >       | Perl-internal float |
| D         | varies   |        | native | number      | < >       | Long double |
| p         | ptr size |        | native | string      | < >       | Dereferences pointer to NUL-terminated string |
| P         | ptr size |        | native | string      | < >       | Dereferences pointer; count = bytes to read |
| u         | varies   |        |        | string      |           | Uudecoded bytes |
| U         | varies   |        |        | integer     |           | Unicode codepoint number |
| w         | varies   | no     |        | integer     |           | BER-compressed integer |
| x         | 1        |        |        | (none)      | !         | Skip one byte forward; x!N aligns forward |
| X         | −1       |        |        | (none)      | !         | Back up one byte; X!N aligns backward |
| @         | absolute |        |        | (none)      | !         | Jump to position N within the innermost group |
| .         |          |        |        | integer     | !         | Return current position (relative to start of group / string) |
| ( )       |          |        |        |             | < > !     | Group: repeat count and endianness propagate inside |
| /         |          |        |        |             |           | See “Length-prefixed payloads” below |
| %N        | varies   |        |        | integer     |           | Prefix. N-bit checksum of the following directive’s values |

The modifiers have the same meaning as in pack: ! selects native sizes, alignment semantics, signedness for n / N / v / V, or byte offsets for @ / .; < / > force endianness.
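
To make the effect of the modifiers concrete, the following sketch decodes the same two bytes with several directive and modifier combinations. The byte values are arbitrary and chosen only so that signed and unsigned readings differ.

```perl
use strict;
use warnings;

my $bytes = "\xFF\xFE";   # two raw bytes: 0xFF, then 0xFE

my $be_unsigned = unpack "n",  $bytes;   # big-endian unsigned: 0xFFFE
my $be_signed   = unpack "n!", $bytes;   # same bytes, read as signed
my $s_big       = unpack "s>", $bytes;   # signed 16-bit, forced big-endian
my $s_little    = unpack "s<", $bytes;   # signed 16-bit, forced little-endian (0xFEFF)
my $S_little    = unpack "S<", $bytes;   # unsigned 16-bit, forced little-endian

print "$be_unsigned $be_signed $s_big $s_little $S_little\n";
# prints: 65534 -2 -2 -257 65279
```

Using an explicit < or > rather than bare s / S keeps the result independent of the machine the code runs on.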

Differences from pack#

  • a strips nothing (returns bytes raw); A strips trailing whitespace and NUL; Z stops at the first NUL. With pack all three pad; with unpack only A and Z remove padding, and each does so differently.

  • x skips forward by its count (one byte by default) and X backs up by its count. Moving before the start of the string with X, or past the end with x, is a fatal error.

  • / reads a count from the data, then applies it to the next directive:

    unpack("W/a", "\004Gurusamy")          # ("Guru")
    unpack("a3/A A*", "007 Bond  J ")      # (" Bond", "J")
    unpack("n/a*", "\x00\x0chello, world") # ("hello, world")
    

    With pack the form is length-item/item, and the length is computed from the value being packed. With unpack, / takes its count from the value unpacked just before it, so that directive (W, a3, and n in the examples above) acts as the length-item and is not itself returned.

  • %N is unpack-only. It replaces the following directive’s normal output with the N-bit sum of the values that directive would otherwise have produced. %32W* is the System V sum checksum. %32b* counts set bits.

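  • . returns the current byte offset instead of moving to a position as it does in pack, where it null-fills or truncates the output to an offset taken from the argument list. Useful after x / X navigation to learn where you are.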

  • p and P dereference pointers read from the input — almost always unsafe unless you know the input was produced by pack "p" / pack "P" in the same process.

  • Under-run / over-run behaviour: if the template asks for more data than EXPR contains, the result is not well-defined (may yield empty strings, zeros, a decreased repeat count, or an exception). If EXPR is longer than the template consumes, trailing bytes are silently ignored.
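
Several of the differences listed above can be observed in a short round trip. This is only a sketch: the field width of 6 and the strings being packed are arbitrary choices made for the illustration.

```perl
use strict;
use warnings;

# pack pads string directives out to the requested 6-byte width...
my $spaces = pack "A6", "ab";          # "ab" followed by four spaces
my $nuls   = pack "a6", "ab";          # "ab" followed by four NULs

# ...but unpack treats that padding differently per directive.
my $raw      = unpack "a6", $spaces;   # "ab    " (nothing stripped)
my $stripped = unpack "A6", $spaces;   # "ab"
my $z        = unpack "Z6", $nuls;     # "ab" (stops at the first NUL)
my $a_nul    = unpack "a6", $nuls;     # "ab\0\0\0\0"

# Length-prefixed strings: pack computes each length, unpack reads it back.
my $wire    = pack   "n/a* n/a*", "hello", "world!";
my @strings = unpack "n/a* n/a*", $wire;

# "." reports the offset reached so far: 2 length bytes + 5 string bytes.
my ($first, $offset) = unpack "n/a* .", $wire;

print "@strings / $first ends at offset $offset\n";
```

Note that the 16-bit lengths never appear in @strings: the / construct consumes them.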

Examples#

Parse a minimal binary protocol header — a 16-bit version, a 32-bit length, then a variable-length payload, all network byte order:

my ($ver, $len, $payload) = unpack "n N a*", $msg;   # a*, not A*: keep payload bytes intact
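
A round-trip sketch of that header layout follows. The message contents are invented for the test, and the length check is an assumption about how a caller might want to handle truncated input rather than anything unpack does by itself. It uses a rather than A for the payload, because A would strip trailing spaces and NULs from binary data.

```perl
use strict;
use warnings;

# Build a test message: version 2, then a payload whose trailing space
# and embedded NUL must survive the round trip.
my $body = "data \0 ";
my $msg  = pack "n N a*", 2, length($body), $body;

# Decode the fixed 6-byte header first, then take exactly $len bytes.
my ($ver, $len) = unpack "n N", $msg;
die "truncated message\n" if length($msg) < 6 + $len;
my $payload = unpack "x6 a$len", $msg;

# Alternative: let "/" apply the 32-bit length to the payload directly.
# The length itself is consumed and not returned.
my ($ver2, $payload2) = unpack "n N/a", $msg;
```

Decoding in two steps makes it possible to validate the declared length before trusting it; the N/a form is shorter when no such check is needed.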

Split a fixed-width text record — faster and clearer than chained substr calls:

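my ($date, $desc, $amount) = unpack "A11 A27 A*", $line;
# Field widths: 11 (date plus separator), 27 (description), rest (amount).
# A strips only trailing whitespace, so each width includes the padding
# that follows the field.
# "2026-04-22 coffee at the station      3.50"
# $date = "2026-04-22", $desc = "coffee at the station",
# $amount = "3.50"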
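
The sketch below builds such a record with sprintf so that the column widths are explicit, then splits it again. The layout (11 columns for the date, 27 for the description, the rest for the amount) is an assumption made for this example. Because A strips only trailing whitespace, each width includes the padding that follows the field.

```perl
use strict;
use warnings;

# Hypothetical layout: 11-column date, 27-column description, then amount.
my $line = sprintf "%-11s%-27s%s",
    "2026-04-22", "coffee at the station", "3.50";

my ($date, $desc, $amount) = unpack "A11 A27 A*", $line;

# The equivalent substr call needs a manual offset and explicit trimming.
(my $desc_substr = substr($line, 11, 27)) =~ s/\s+\z//;

print "[$date] [$desc] [$amount]\n";
```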

Compute a System V-style checksum with the % prefix. %32W* yields the 32-bit sum of every byte, which is then reduced modulo 65535:

my $checksum = do {
    local $/;                           # slurp
    unpack "%32W*", readline $fh;
} % 65535;

Count set bits in a bit vector — %32b* reads the whole mask as bits and returns their sum:

my $setbits = unpack "%32b*", $selectmask;
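
Both % forms can be checked against small in-memory values. In this sketch the input strings are arbitrary, and the manual loop is included only to show what the checksum directive is summing.

```perl
use strict;
use warnings;

my $data = "abc";                        # byte values 97, 98, 99

# Sum of all byte values, accumulated in 32 bits, then reduced modulo 65535.
my $checksum = unpack("%32W*", $data) % 65535;

# The same sum computed by hand, for comparison.
my $manual = 0;
$manual += $_ for unpack "C*", $data;
$manual %= 65535;

# Bit count: 0x0F has four bits set and 0x01 has one.
my $setbits = unpack "%32b*", "\x0F\x01";

print "$checksum $manual $setbits\n";    # prints: 294 294 5
```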

Read a sequence of little-endian 32-bit integers off a raw buffer:

my @ints = unpack "V*", $buf;

Use x to skip filler bytes (X moves the other way, backing up over bytes already read):

# 2-byte tag, skip 2 reserved bytes, then 4 32-bit big-endian values
my ($tag, @vals) = unpack "n x2 N4", $frame;
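
The following sketch exercises both navigation directives. The frame values are made up, and the second call shows one possible use of X: reading a byte, backing up over it, and reading the same position again as part of a wider field.

```perl
use strict;
use warnings;

# Hypothetical frame: 2-byte tag, 2 reserved bytes, four 32-bit values.
my $frame = pack "n x2 N4", 7, 10, 20, 30, 40;
my ($tag, @vals) = unpack "n x2 N4", $frame;

# Peek at the first byte, back up one byte with X, then read the same
# position again as the high byte of a 16-bit big-endian value.
my ($hi_byte, $word) = unpack "C X n", "\x01\x02";

print "$tag: @vals; $hi_byte, $word\n";   # prints: 7: 10 20 30 40; 1, 258
```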

Edge cases#

  • Scalar context returns only the first value. Assigning unpack to a scalar is almost never what you want. To count fields, assign to an array and take its size, or force list context with the () = idiom:

    my $n = () = unpack "A10 A10 A*", $line;   # 3
    
  • * is greedy, not a placeholder. A* consumes all remaining bytes as one string; there is no backtracking. A directive with * should therefore be the last data-consuming directive in its sequence: anything after it has no input left to read.

  • Templates are not regular expressions. Whitespace is ignored, but there is no alternation, no |, no lookahead. If data shape depends on earlier fields, unpack in stages: decode a header, then call unpack again on the payload using a template derived from what you just read.

  • Byte vs. character semantics. By default unpack works in character mode: on a byte string each character is one byte, while on a string with the UTF-8 flag set each character is one code point. C0 selects character mode from that point; U0 selects UTF-8 byte mode, in which the string is processed byte by byte in its UTF-8-encoded form. Applying numeric directives like N to a UTF-8-flagged string in character mode reads characters, not encoded bytes, which is usually not what you want. Call utf8::encode first, or operate on byte-level data.

  • X past the start of the string is fatal.

  • p / P on an arbitrary input is undefined behaviour. Never use them on data you did not pack yourself in the same process.

  • No EXPR argument: unpack TEMPLATE reads $_. Useful inside while (<$fh>) { ... } loops over fixed-width records.
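
Staged decoding, as described in the edge cases above, might look like the sketch below. The record layout, the type codes, and the decode_record helper are all hypothetical and exist only to show a template being built from a header that was just read.

```perl
use strict;
use warnings;

# Hypothetical record layout, invented for this sketch:
#   byte 0: item type (1 = 16-bit big-endian, 2 = 32-bit big-endian)
#   byte 1: item count
#   rest:   that many items
my %item_directive = (1 => 'n', 2 => 'N');

sub decode_record {
    my ($rec) = @_;

    # Stage 1: the fixed two-byte header.
    my ($type, $count) = unpack "C C", $rec;
    my $item = $item_directive{$type}
        or die "unknown item type $type\n";

    # Stage 2: a template derived from the header; x2 skips the header.
    my @items = unpack "x2 $item$count", $rec;
    die "short record\n" if @items != $count;

    return @items;
}

my @short = decode_record(pack "C C n*", 1, 3, 100, 200, 300);
my @long  = decode_record(pack "C C N*", 2, 2, 70000, 80000);

print "@short | @long\n";   # prints: 100 200 300 | 70000 80000
```

Comparing the number of items returned with the declared count is one way to detect input that is shorter than the header claims.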

Differences from upstream#

Fully compatible with upstream Perl 5.42.

See also#

  • pack — the inverse operation; same template language

  • substr — extract a single fixed slice; simpler for one field, slower for many

  • sprintf — round-trip between binary and human-readable

  • read — read raw bytes off a filehandle into a buffer for unpack

  • vec — indexed access to a bit vector without a template

  • pack/unpack tutorial — task-oriented walkthrough with worked protocol and file-format examples