---
name: vec
signature: 'vec EXPR, OFFSET, BITS'
since: 5.0
status: documented
categories: ["Fixed-length data"]
---

```{index} single: vec; Perl built-in
```

*[Fixed-length data](../perlfunc-by-category)*

# vec

Read or write a fixed-width slot inside a string treated as a packed bit vector.

`vec` views `EXPR` as an array of unsigned integers, each `BITS` wide, packed back-to-back from the start of the string. `OFFSET` indexes into that array: not into bytes, not into bits, but into `BITS`-sized elements. The read form returns the integer at that slot; the lvalue form writes one. Widths are 1, 2, 4, 8, 16, 32, and on 64-bit builds 64.

## Synopsis

```perl
my $n = vec($buf, $offset, $bits);
vec($buf, $offset, $bits) = $n;
```

## What you get back

In rvalue context, an unsigned integer: the contents of the selected slot, zero-extended to a Perl number.

In lvalue context, an assignable slot; assigning truncates the right-hand value to `BITS` bits and writes it into the string, extending the string with zero bytes if `OFFSET` lies past the current end.

The parentheses around `vec(...)` in the lvalue form are required. Without them, `vec $buf, $o, $b = 3` parses the `= 3` as part of the last argument, that is, as `vec($buf, $o, ($b = 3))`.

## How the bits are laid out

The layout depends on `BITS`, and is chosen so code is portable across big- and little-endian machines:

- **`BITS == 8`**: each slot is one byte of the string. `vec($s, $i, 8)` equals `ord(substr($s, $i, 1))`.
- **`BITS == 16`, `32`, `64`**: bytes of the string are grouped into chunks of `BITS/8` and interpreted in **big-endian** order, equivalent to [`unpack`](unpack) with `n`, `N`, or (on 64-bit builds) `Q>`. `vec($s, 0, 32)` reads the first four bytes as a big-endian `uint32`.
- **`BITS == 4, 2, 1`**: the string is broken into bytes, and each byte is split into `8/BITS` groups, numbered **little-endian-ish** within the byte, lowest bits first.
  The bit values from low to high are `0x01`, `0x02`, `0x04`, `0x08`, `0x10`, `0x20`, `0x40`, `0x80`. So for `chr(0x36)` (`0b00110110`):
  - `BITS == 4` gives the two nibbles `(0x6, 0x3)`.
  - `BITS == 2` gives the four 2-bit groups `(0x2, 0x1, 0x3, 0x0)`.
  - `BITS == 1` gives the eight bits `(0, 1, 1, 0, 1, 1, 0, 0)`.

A slot entirely off the end of the string reads as `0`. Writing past the end grows the string with zero bytes to reach the slot. A negative `OFFSET` is a fatal error in lvalue context.

## Global state it touches

None. `vec` operates purely on its arguments.

## Examples

Read a byte at a given index:

```perl
my $s = "Perl";
print vec($s, 0, 8);    # 80  (== ord 'P')
print vec($s, 3, 8);    # 108 (== ord 'l')
```

Build a string by writing 32-bit big-endian words:

```perl
my $buf = '';
vec($buf, 0, 32) = 0x5065726C;    # "Perl"
vec($buf, 1, 32) = 0x50657270;    # "PerlPerp"
print $buf;                       # PerlPerp
```

Use `vec` as a compact boolean array, one bit per flag:

```perl
my $flags = '';
vec($flags, 17, 1) = 1;
vec($flags, 42, 1) = 1;
print vec($flags, 17, 1);    # 1
print vec($flags, 18, 1);    # 0 (slot never set, still zero)
print length $flags;         # 6 (string auto-extended to fit bit 42)
```

Count the set bits in a bit vector without looping bit by bit. The idiomatic pattern uses [`unpack`](unpack):

```perl
my $ones = unpack("%32b*", $flags);    # population count
```

Convert a bit vector into a string of `0`s and `1`s for display:

```perl
my $bits = unpack("b*", $flags);    # "00...010...010..."
```

Combine two bit vectors with the bitwise string operators, which treat string operands as bit vectors of the same shape `vec` reads and writes:

```perl
my $union        = $flags_a | $flags_b;
my $intersection = $flags_a & $flags_b;
my $diff         = $flags_a ^ $flags_b;    # symmetric difference
```

## Edge cases

- **Lvalue precedence**: `vec EXPR, O, B = N` does not assign to the slot. The `=` binds to the last argument, so it parses as `vec(EXPR, O, (B = N))` and fails to compile when `B` is a literal. Always write `vec(EXPR, O, B) = N`.
- **Off-the-end read**: `vec($short, 1_000_000, 8)` returns `0`, never dies, never warns.
- **Off-the-end write**: the string is zero-padded up to the slot. For `BITS == 1`, writing bit 94 grows the string to 12 bytes.
- **Negative `OFFSET`**: fatal when assigning, with `"Negative offset to vec in lvalue context"`. Upstream Perl raises no error for a read at a negative offset; it returns `0`.
- **`BITS` not a supported power of two**: fatal with `"Illegal number of bits in vec"`. Valid widths are 1, 2, 4, 8, 16, 32, and 64 on 64-bit builds.
- **UTF-8 encoded strings**: `vec` wants a byte string. If the scalar is flagged UTF-8, Perl first tries to downgrade it to a one-byte-per-character representation. If any character has a codepoint of 256 or higher, that fails fatally with `"Use of strings with code points over 0xFF as arguments to vec is forbidden"`. Call `utf8::downgrade` deliberately, or build the data with [`pack`](pack) `"C*"`, before reaching for `vec`.
- **Read on `undef`**: under `use warnings`, triggers an `uninitialized` warning on the string argument; returns `0`.
- **Assignment value wider than `BITS`**: the value is masked to the low `BITS` bits. `vec($s, 0, 4) = 0x1F` stores `0xF`.

## Differences from upstream

Fully compatible with upstream Perl 5.42.

## See also

- [`pack`](pack): build multi-field binary structures; `vec` is the random-access counterpart when every field has the same width
- [`unpack`](unpack): pull fields out of a binary string; use `unpack("b*", $v)` or `unpack("%32b*", $v)` to render or popcount a `vec` bit vector
- [`substr`](substr): byte-level random access when `BITS` would be `8` and you also want the lvalue to grow or shrink the string
- [`sprintf`](sprintf): format the integer a `vec` read returns, e.g. `sprintf "%08b", vec($s, $i, 8)`
- [`ord`](ord): `ord(substr($s, $i, 1))` is the one-shot equivalent of `vec($s, $i, 8)` when you only need the byte value and never assign back
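
## Worked example: checking the layout

The rules under "How the bits are laid out" can be checked directly. The following is a minimal sketch, not a canonical test: the variable names are illustrative, and it assumes only core Perl. It splits `chr(0x36)` at each sub-byte width and compares the results with the matching `unpack` templates.

```perl
use strict;
use warnings;

my $byte = chr(0x36);    # 0b00110110

# Read every slot of the single byte at each sub-byte width.
my @nibbles = map { vec($byte, $_, 4) } 0 .. 1;
my @pairs   = map { vec($byte, $_, 2) } 0 .. 3;
my @bits    = map { vec($byte, $_, 1) } 0 .. 7;

printf "4-bit: %s\n", join ' ', @nibbles;    # 6 3
printf "2-bit: %s\n", join ' ', @pairs;      # 2 1 3 0
printf "1-bit: %s\n", join '',  @bits;       # 01101100

# unpack "b*" lists bits in the same ascending order that vec uses.
print "matches unpack b*\n" if join('', @bits) eq unpack('b*', $byte);

# Widths of 16 and up read big-endian, like unpack "N" for 32 bits.
my $word = "Perl";
print "matches unpack N\n" if vec($word, 0, 32) == unpack('N', $word);
```

Both comparison lines print, which is why `unpack("b*", ...)` is a convenient way to inspect a bit vector built with 1-bit `vec` writes.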