---
name: unpack
signature: 'unpack TEMPLATE,EXPR'
signatures:
  - 'unpack TEMPLATE, EXPR'
  - 'unpack TEMPLATE'
since: 5.0
status: documented
categories: ["Lists", "Fixed-length data"]
---

```{index}
single: unpack; Perl built-in
```

*[Lists](../perlfunc-by-category) · [Fixed-length data](../perlfunc-by-category)*

# unpack

Extract typed values from a binary or fixed-width string according to a template.

`unpack` is the inverse of [`pack`](pack). It walks `EXPR` left to right, consuming the bytes described by each directive in `TEMPLATE` and turning them into Perl values. The result is a list: one value per directive, or per repetition when a directive has a count. If `EXPR` is omitted, `unpack` reads from [`$_`](../perlvar).

This page is a **directive reference**. For a narrative introduction with worked protocol and file-format examples, start at the [pack/unpack tutorial](../../../tutorial/pack-unpack/index).

## Synopsis

```perl
unpack TEMPLATE, EXPR
unpack TEMPLATE            # reads from $_

my @fields = unpack "A10 A10 A*", $line;
my ($ver, $len, $payload) = unpack "n N A*", $msg;
```

## What you get back

A list of values, one per directive (or per repetition when a directive has a count). In **scalar context**, only the **first** value produced is returned:

```perl
my @all   = unpack "A4 A4 A*", $rec;    # three values
my $first = unpack "A4 A4 A*", $rec;    # just the first, not a count
```

Each directive determines the Perl type of its result; see the "Result type" column in the directive table below.

## Global state it touches

`unpack TEMPLATE` with no `EXPR` reads [`$_`](../perlvar). No other interpreter globals are consulted. The `C0` / `U0` directives switch template byte-vs-character interpretation locally within the template but do not mutate any outside state.

## Template syntax

Identical to [`pack`](pack): a sequence of directive letters, optionally followed by repeat counts `N` / `*` / `[…]` and modifiers `!` / `<` / `>`.
Parentheses form groups; `#` introduces a template comment. See [`pack`](pack) for the full syntax rules; this page covers only the unpack-specific semantics.

## Directive table

Every unpack directive in one place. **W** is the width in bytes consumed from the input per scalar produced.

| Directive | W        | Signed | Endian   | Result type | Modifiers   | Notes |
|-----------|----------|--------|----------|-------------|-------------|---------------------------------------------------------------------|
| `a`       | count    | —      | —        | string      | —           | Returns bytes untouched |
| `A`       | count    | —      | —        | string      | —           | Strips trailing whitespace and `NUL` |
| `Z`       | count    | —      | —        | string      | —           | Returns everything up to the first `NUL` |
| `b`       | count/8  | —      | —        | bit string  | —           | "0"/"1" characters, LSB of each byte first |
| `B`       | count/8  | —      | —        | bit string  | —           | "0"/"1" characters, MSB of each byte first |
| `h`       | count/2  | —      | —        | hex string  | —           | `[0-9a-f]` characters, low nybble first |
| `H`       | count/2  | —      | —        | hex string  | —           | `[0-9a-f]` characters, high nybble first |
| `c`       | 1        | yes    | —        | integer     | —           | Signed char |
| `C`       | 1        | no     | —        | integer     | —           | Unsigned char |
| `W`       | 1        | no     | —        | integer     | —           | Unsigned char; yields codepoint in `U0` mode |
| `s`       | 2        | yes    | native   | integer     | `!` `<` `>` | `s!` = native `short` |
| `S`       | 2        | no     | native   | integer     | `!` `<` `>` | `S!` = native `unsigned short` |
| `l`       | 4        | yes    | native   | integer     | `!` `<` `>` | `l!` = native `long` |
| `L`       | 4        | no     | native   | integer     | `!` `<` `>` | `L!` = native `unsigned long` |
| `i`       | native   | yes    | native   | integer     | `!` `<` `>` | `sizeof(int)` |
| `I`       | native   | no     | native   | integer     | `!` `<` `>` | `sizeof(unsigned int)` |
| `q`       | 8        | yes    | native   | integer     | `<` `>`     | Requires 64-bit-integer Perl |
| `Q`       | 8        | no     | native   | integer     | `<` `>`     | Requires 64-bit-integer Perl |
| `n`       | 2        | no     | big      | integer     | `!`         | Network order; `n!` = signed |
| `N`       | 4        | no     | big      | integer     | `!`         | Network order; `N!` = signed |
| `v`       | 2        | no     | little   | integer     | `!`         | "VAX" order; `v!` = signed |
| `V`       | 4        | no     | little   | integer     | `!`         | "VAX" order; `V!` = signed |
| `j`       | IV size  | yes    | native   | integer     | `<` `>`     | Perl-internal `IV` |
| `J`       | UV size  | no     | native   | integer     | `<` `>`     | Perl-internal `UV` |
| `f`       | 4        | —      | native   | number      | `<` `>`     | Single-precision IEEE 754 |
| `d`       | 8        | —      | native   | number      | `<` `>`     | Double-precision IEEE 754 |
| `F`       | NV size  | —      | native   | number      | `<` `>`     | Perl-internal float |
| `D`       | varies   | —      | native   | number      | `<` `>`     | Long double |
| `p`       | ptr size | —      | native   | string      | `<` `>`     | Dereferences pointer to NUL-terminated string |
| `P`       | ptr size | —      | native   | string      | `<` `>`     | Dereferences pointer; count = bytes to read |
| `u`       | varies   | —      | —        | string      | —           | Uudecoded bytes |
| `U`       | varies   | —      | —        | integer     | —           | Unicode codepoint number |
| `w`       | varies   | no     | —        | integer     | —           | BER-compressed integer |
| `x`       | 1        | —      | —        | (none)      | `!`         | Skip one byte forward; `x!N` aligns forward |
| `X`       | −1       | —      | —        | (none)      | `!`         | Back up one byte; `X!N` aligns backward |
| `@`       | absolute | —      | —        | (none)      | `!`         | Jump to position N within the innermost group |
| `.`       | —        | —      | —        | integer     | `!`         | Return current position (relative to start of group / string) |
| `( … )`   | —        | —      | —        | —           | `<` `>` `!` | Group: repeat count and endianness propagate inside |
| `/`       | —        | —      | —        | —           | —           | See the `/` entry under "Differences from `pack`" below |
| `%N`      | varies   | —      | —        | integer     | —           | **Prefix.** N-bit checksum of the following directive's values |

The modifiers have the same meaning as in [`pack`](pack): `!` selects native sizes, alignment semantics, signedness for `n` / `N` / `v` / `V`, or byte offsets for `@` / `.`; `<` / `>` force endianness.

## Differences from `pack`

- **`a` strips nothing** (returns bytes raw); **`A` strips trailing whitespace and NUL**; **`Z` stops at the first NUL**. With `pack` all three pad; with `unpack` each one treats that padding differently.
- **`x` skips forward and `X` backs up, one byte per unit of count.** `X` before the start of the string is a fatal error.
- **`/` reads a count from the data, then applies it to the next directive**:

  ```perl
  unpack("W/a", "\004Gurusamy")               # ("Guru")
  unpack("a3/A A*", "007 Bond  J ")           # (" Bond", "J")
  unpack("n/a*", "\x00\x0chello, world")      # ("hello, world")
  ```

  With `pack` the form is *length-item*`/`*item* and the length is computed from the value's length. With `unpack` the same template is read in the other direction: the directive before `/` is unpacked first, and the value it yields becomes the repeat count for the directive after `/`. The count itself is not returned.
- **`%N` is unpack-only.** It replaces the following directive's normal output with the `N`-bit sum of the values that directive would otherwise have produced. `%32W*` sums every character value; reduced modulo 65535, that sum matches the System V `sum` checksum. `%32b*` counts set bits.
- **`.` returns the current position** rather than null-filling or truncating to a position as it does in `pack`. Useful after `x` / `X` navigation to learn where you are.
- **`p` and `P` dereference pointers read from the input.** This is almost always unsafe unless you know the input was produced by `pack "p"` / `pack "P"` in the same process.
- **Under-run / over-run behaviour:** if the template asks for more data than `EXPR` contains, the result is not well-defined (may yield empty strings, zeros, a decreased repeat count, or an exception). If `EXPR` is longer than the template consumes, trailing bytes are silently ignored.
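
The `/`, `%N`, and `.` behaviours listed above can be checked with a short script. This is a minimal sketch: the buffer contents and variable names are invented for illustration, and the values in the comments follow from the rules in the list rather than from any real data format.

```perl
use strict;
use warnings;

# Hypothetical buffer: two strings, each preceded by a 16-bit
# big-endian byte count.
my $buf = pack "n/a* n/a*", "hello", "world!";

# '/' takes the count read by 'n' and uses it as the length for 'a*';
# the count itself does not appear in the result list.
my ($first, $second) = unpack "n/a* n/a*", $buf;
# $first is "hello", $second is "world!"

# '%32W*' replaces the individual character values with their 32-bit sum.
my $sum = unpack "%32W*", "ABC";          # 65 + 66 + 67 = 198

# '%32b*' sums the individual bits, which counts the set bits.
my $bits = unpack "%32b*", "\x0f\x01";    # 4 + 1 = 5

# '.' reports the offset reached so far: 2 bytes for 'n', 2 for 'x2'.
my ($tag, $pos) = unpack "n x2 .", "\x00\x07\x00\x00rest";
# $tag is 7, $pos is 4
```

Assigning the checksum forms to a scalar works here because each of those templates produces exactly one value.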
## Examples

Parse a minimal binary protocol header (a 16-bit version, a 32-bit length, then a variable-length payload, all in network byte order):

```perl
my ($ver, $len, $payload) = unpack "n N A*", $msg;
```

Split a fixed-width text record, which is faster and clearer than chained [`substr`](substr) calls:

```perl
my ($date, $desc, $amount) = unpack "A10 A27 A*", $line;
# $line   = "2026-04-22coffee at the station      3.50"
#           (10-column date, 27-column space-padded description, amount)
# $date   = "2026-04-22", $desc = "coffee at the station",
# $amount = "3.50"
```

Extract a System V-style checksum with the `%` prefix: the 32-bit sum of every byte, reduced modulo 65535:

```perl
my $checksum = do {
    local $/;                        # slurp
    unpack "%32W*", readline $fh;
} % 65535;
```

Count set bits in a bit vector. `%32b*` reads the whole mask as bits and returns their sum:

```perl
my $setbits = unpack "%32b*", $selectmask;
```

Read a sequence of little-endian 32-bit integers off a raw buffer:

```perl
my @ints = unpack "V*", $buf;
```

Use `x` to skip filler bytes (`X` moves the other way, for example to re-read a field after peeking at it):

```perl
# 2-byte tag, skip 2 reserved bytes, then 4 32-bit big-endian values
my ($tag, @vals) = unpack "n x2 N4", $frame;
```

## Edge cases

- **Scalar context returns only the first value.** Assigning `unpack` to a scalar is almost never what you want. To count fields, assign to a list first, or use the `() = unpack ...` idiom, which forces list context:

  ```perl
  my $n = () = unpack "A10 A10 A*", $line;    # 3
  ```
- **`*` is greedy, not a placeholder.** `A*` consumes all remaining bytes as one string; there is no backtracking. A `*` directive should be the last data-consuming directive in its sequence, because anything after it finds no input left.
- **Templates are not regular expressions.** Whitespace is ignored, but there is no alternation, no `|`, no lookahead. If data shape depends on earlier fields, unpack in stages: decode a header, then call `unpack` again on the payload using a template derived from what you just read.
- **Byte vs. character semantics.** `unpack` works in character mode by default (unless the template starts with `U`): on a byte string each character is a single byte, and on a string with the UTF-8 flag set each character is a codepoint. `C0` switches to character mode from that point; `U0` switches to UTF-8 mode, which processes the string byte by byte in its UTF-8-encoded form. Applying numeric directives like `N` to a UTF-8-flagged string reads codepoints, not bytes, which is usually not what you want. Call `utf8::encode` first, or operate on byte-level data.
- **`X` past the start of the string is fatal.**
- **`p` / `P` on an arbitrary input is undefined behaviour.** Never use them on data you did not pack yourself in the same process.
- **No `EXPR` argument**: `unpack TEMPLATE` reads [`$_`](../perlvar). Useful inside `while (<$fh>) { ... }` loops over fixed-width records.

## Differences from upstream

Fully compatible with upstream Perl 5.42.

## See also

- [`pack`](pack): the inverse operation; same template language
- [`substr`](substr): extract a single fixed slice; simpler for one field, slower for many
- [`sprintf`](sprintf): round-trip between binary and human-readable
- [`read`](read): read raw bytes off a filehandle into a buffer for `unpack`
- [`vec`](vec): indexed access to a bit vector without a template
- [pack/unpack tutorial](../../../tutorial/pack-unpack/index): task-oriented walkthrough with worked protocol and file-format examples
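
As a closing illustration of the staged decoding recommended under "Edge cases", the sketch below decodes a header first and then picks the payload template from what the header says. The message layout, the `%payload_template` table, and the `decode_message` helper are all hypothetical, invented for this example; the length checks are there so the template never asks for more bytes than the input holds.

```perl
use strict;
use warnings;

# Hypothetical message layout (not from any real protocol):
#   1 byte   record type
#   2 bytes  payload length, big-endian
#   N bytes  payload whose template depends on the record type
my %payload_template = (
    1 => "N N",      # two 32-bit big-endian integers
    2 => "n/a*",     # one length-prefixed string
);

sub decode_message {
    my ($msg) = @_;

    # Stage 1: the fixed 3-byte header.
    return unless length($msg) >= 3;
    my ($type, $len) = unpack "C n", $msg;

    # Reject unknown types and truncated payloads before unpacking.
    my $template = $payload_template{$type};
    return unless defined $template && length($msg) >= 3 + $len;

    # Stage 2: slice out the payload and decode it with the
    # type-specific template.
    my $payload = substr $msg, 3, $len;
    return ($type, unpack $template, $payload);
}

my $msg     = pack "C n/a*", 1, pack("N N", 10, 20);
my @decoded = decode_message($msg);    # (1, 10, 20)
```

Checking lengths up front keeps the code out of the under-run case, where the result of `unpack` is not well-defined.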