vec#
Read or write a fixed-width slot inside a string treated as a packed bit vector.
vec views EXPR as an array of unsigned integers, each BITS wide,
packed back-to-back from the start of the string. OFFSET indexes into
that array — not into bytes, not into bits, but into BITS-sized
elements. The rvalue form returns the integer at that slot; the lvalue
form writes one. Widths are 1, 2, 4, 8, 16, 32, and, on 64-bit builds, 64.
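As a small illustration of element indexing, the sketch below reads one four-byte string at three widths. The sample string "ABCD" and the variable names are assumptions chosen for the example, not part of the vec interface.

```perl
use strict;
use warnings;

# Four bytes: 0x41 0x42 0x43 0x44
my $s = "ABCD";

# Same string, three element widths. OFFSET counts elements, not bytes.
my @as8  = map { vec($s, $_, 8) }  0 .. 3;   # four 8-bit elements
my @as16 = map { vec($s, $_, 16) } 0 .. 1;   # two 16-bit elements
my $as32 = vec($s, 0, 32);                   # one 32-bit element

printf "8-bit:  %s\n", join(' ', map { sprintf '0x%02X', $_ } @as8);
printf "16-bit: %s\n", join(' ', map { sprintf '0x%04X', $_ } @as16);
printf "32-bit: 0x%08X\n", $as32;
```

Here vec($s, 1, 16) covers bytes 2 and 3 of the string, because element 1 at width 16 starts two bytes in.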
Synopsis#
my $n = vec($buf, $offset, $bits);
vec($buf, $offset, $bits) = $n;
What you get back#
In rvalue context, vec returns an unsigned integer: the contents of the
selected slot, zero-extended to a Perl number. In lvalue context, it
yields an assignable slot; assigning truncates the right-hand value to
its low BITS bits and writes it into the string, extending the string
with zero bytes if OFFSET lies past the current end.
The parentheses around vec(...) in the lvalue form are required:
without them, vec $buf, $o, $b = 3 parses the assignment as part of the
last argument, as if written vec($buf, $o, ($b = 3)).
How the bits are laid out#
The layout depends on BITS, and is chosen so code is portable across
big- and little-endian machines:
- BITS == 8: each slot is one byte of the string. vec($s, $i, 8) is ord(substr($s, $i, 1)).
- BITS == 16, 32, 64: bytes of the string are grouped into chunks of BITS/8 and interpreted in big-endian order, equivalent to unpack with n, N, or (on 64-bit builds) Q>. vec($s, 0, 32) reads the first four bytes as a big-endian uint32.
- BITS == 4, 2, 1: the string is broken into bytes, and each byte is split into 8/BITS groups, numbered from the low-order end of the byte upward. The bit values from low to high are 0x01, 0x02, 0x04, 0x08, 0x10, 0x20, 0x40, 0x80. So for chr(0x36) (0b00110110):
  - BITS == 4 gives the two nibbles (0x6, 0x3).
  - BITS == 2 gives the four 2-bit groups (0x2, 0x1, 0x3, 0x0).
  - BITS == 1 gives the eight bits (0, 1, 1, 0, 1, 1, 0, 0).
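The layout rules above can be checked with a short sketch. It reuses the chr(0x36) byte from the text; the four-byte string in $word is an arbitrary value picked for the example.

```perl
use strict;
use warnings;

my $byte = chr(0x36);    # 0b00110110

# Sub-byte widths: slot 0 is the low-order end of the byte.
my @nibbles = map { vec($byte, $_, 4) } 0 .. 1;
my @pairs   = map { vec($byte, $_, 2) } 0 .. 3;
my @bits    = map { vec($byte, $_, 1) } 0 .. 7;

print "nibbles: @nibbles\n";   # 6 3
print "pairs:   @pairs\n";     # 2 1 3 0
print "bits:    @bits\n";      # 0 1 1 0 1 1 0 0

# Multi-byte widths: big-endian, matching unpack's N and n formats.
my $word   = "\x12\x34\x56\x78";
my $vec32  = vec($word, 0, 32);
my $vec16  = vec($word, 1, 16);                  # second 16-bit element
my $unpk32 = unpack('N', $word);
my $unpk16 = unpack('n', substr($word, 2, 2));   # same two bytes

printf "vec 32: 0x%08X  unpack N: 0x%08X\n", $vec32, $unpk32;
printf "vec 16: 0x%04X  unpack n: 0x%04X\n", $vec16, $unpk16;
```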
A slot entirely off the end of the string reads as 0. Writing past
the end grows the string with zero bytes to reach the slot. A negative
OFFSET is a fatal error in lvalue context.
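A minimal sketch of the boundary behavior, assuming behavior matching upstream Perl. The two-byte buffer and the offsets are arbitrary example values, and the negative-offset write is wrapped in eval so the script keeps running.

```perl
use strict;
use warnings;

my $buf = "AB";

# Reading far past the end returns 0 and leaves the string alone.
my $past     = vec($buf, 10, 8);
my $len_read = length $buf;

# Writing past the end pads with zero bytes up to the slot.
vec($buf, 5, 8) = 0x7A;            # 0x7A is "z"
my $len_write = length $buf;

# A negative OFFSET in lvalue context dies; trap it with eval.
my $neg = -1;
my $ok  = eval { vec($buf, $neg, 8) = 1; 1 };
my $err = $ok ? '' : $@;

print "read past end: $past (length still $len_read)\n";
print "length after write: $len_write\n";
print "negative write died: $err";
```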
Global state it touches#
None. vec operates purely on its arguments.
Examples#
Read a byte at a given index:
my $s = "Perl";
print vec($s, 0, 8); # 80 (== ord 'P')
print vec($s, 3, 8); # 108 (== ord 'l')
Build a string by writing 32-bit big-endian words:
my $buf = '';
vec($buf, 0, 32) = 0x5065726C; # "Perl"
vec($buf, 1, 32) = 0x50657270; # "PerlPerp"
print $buf; # PerlPerp
Use vec as a compact boolean array — one bit per flag:
my $flags = '';
vec($flags, 17, 1) = 1;
vec($flags, 42, 1) = 1;
print vec($flags, 17, 1); # 1
print vec($flags, 18, 1); # 0 (slot never set, still zero)
print length $flags; # 6 (string auto-extended to fit bit 42)
Count the set bits in a bit vector without looping bit by bit — the
idiomatic pattern uses unpack:
my $ones = unpack("%32b*", $flags); # population count
Convert a bit vector into a string of 0s and 1s for display:
my $bits = unpack("b*", $flags); # "00...010...010..."
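The two unpack idioms can be tried together on the flags from the earlier example. This is a sketch; the @set list is an extra added here to show that character positions in the "b*" output line up with vec's bit offsets.

```perl
use strict;
use warnings;

my $flags = '';
vec($flags, $_, 1) = 1 for 17, 42;

my $ones = unpack('%32b*', $flags);   # 32-bit checksum of the bits = population count
my $bits = unpack('b*', $flags);      # one character per bit, slot 0 first

# Positions of the "1" characters match the offsets passed to vec.
my @set = grep { substr($bits, $_, 1) eq '1' } 0 .. length($bits) - 1;

print "ones: $ones\n";
print "bits: $bits\n";
print "set:  @set\n";
```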
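Combine two bit vectors with the bitwise string operators. When both
operands are strings, these work on the same bit layout that vec reads
and writes. Under use feature 'bitwise', which use v5.28 and later
enable, spell them |., &., and ^. instead:
my $union = $flags_a | $flags_b;
my $intersection = $flags_a & $flags_b;
my $diff = $flags_a ^ $flags_b; # symmetric difference: bits set in exactly one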
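The following sketch runs the three operators on two small vectors and lists which bits come out set. The helpers make_vec and set_bits are hypothetical names introduced for this example only.

```perl
use strict;
use warnings;
# No "use v5.28" or "use feature 'bitwise'" here, so |, & and ^
# act as string operators when both operands are strings.

# Hypothetical helper: build a bit vector with the given bits set.
sub make_vec {
    my $v = '';
    vec($v, $_, 1) = 1 for @_;
    return $v;
}

# Hypothetical helper: list the offsets of the set bits.
sub set_bits {
    my ($v) = @_;
    return grep { vec($v, $_, 1) } 0 .. 8 * length($v) - 1;
}

my $flags_a = make_vec(1, 3);        # one byte long
my $flags_b = make_vec(3, 5, 12);    # two bytes long

my @union        = set_bits($flags_a | $flags_b);
my @intersection = set_bits($flags_a & $flags_b);
my @sym_diff     = set_bits($flags_a ^ $flags_b);

print "union:        @union\n";
print "intersection: @intersection\n";
print "sym diff:     @sym_diff\n";
```

The operands differ in length on purpose: | and ^ yield a result as long as the longer operand, while & yields one as long as the shorter.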
Edge cases#
- Lvalue precedence: vec EXPR, O, B = N does not assign to the slot; the = N binds to the last argument. Always write vec(EXPR, O, B) = N.
- Off-the-end read: vec($short, 1_000_000, 8) returns 0, never dies, never warns.
- Off-the-end write: the string is zero-padded up to the slot. For BITS == 1, writing bit 94 grows the string to 12 bytes.
- Negative OFFSET: fatal on assignment, with "Negative offset to vec in lvalue context". In upstream Perl, an rvalue read at a negative offset returns 0 rather than dying.
- BITS not a supported power of two: fatal with "Illegal number of bits in vec". Valid widths are 1, 2, 4, 8, 16, 32, and 64 on 64-bit builds.
- UTF-8 encoded strings: vec wants a byte string. If the scalar is flagged UTF-8, Perl first tries to downgrade it to a one-byte-per-character representation. If any character has a code point of 256 or higher, the downgrade fails and vec dies; upstream Perl reports "Use of strings with code points over 0xFF as arguments to vec is forbidden". Call utf8::downgrade deliberately, or pack the data with pack "C*" first, before reaching for vec.
- Read on undef: under use warnings, triggers an uninitialized warning on the string argument; returns 0.
- Assignment value wider than BITS: the value is masked to the low BITS bits. vec($s, 0, 4) = 0x1F stores 0xF.
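Two of these cases, masking on assignment and an unsupported width, can be exercised with a short sketch. The width 3 is just one example of a value that is not a power of two, and the failing call is wrapped in eval so the error text can be inspected.

```perl
use strict;
use warnings;

my $s = "\0";

# Only the low 4 bits of 0x1F survive the write into a 4-bit slot.
vec($s, 0, 4) = 0x1F;
my $stored   = vec($s, 0, 4);
my $neighbor = vec($s, 1, 4);    # the adjacent nibble is untouched

# A width that is not a power of two is rejected at run time.
my $bad_width = 3;
my $ok  = eval { my $x = vec($s, 0, $bad_width); 1 };
my $err = $ok ? '' : $@;

printf "stored: 0x%X, neighbor: 0x%X\n", $stored, $neighbor;
print  "bad width died: $err";
```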
Differences from upstream#
Fully compatible with upstream Perl 5.42.
See also#
- pack: build multi-field binary structures; vec is the random-access counterpart when every field has the same width
- unpack: pull fields out of a binary string; use unpack("b*", $v) or unpack("%32b*", $v) to render or popcount a vec bit vector
- substr: byte-level random access when BITS would be 8 and you also want the lvalue to grow or shrink the string
- sprintf: format the integer a vec read returns, e.g. sprintf "%08b", vec($s, $i, 8)
- ord: one-shot equivalent of vec($s, $i, 8) when you only need the byte value and never assign back