---
name: pos
signature: 'pos SCALAR'
signatures:
  - 'pos SCALAR'
  - 'pos'
since: 5.0
status: documented
categories: ["Regular expressions and pattern matching"]
---
```{index} single: pos; Perl built-in
```

*[Regular expressions and pattern matching](../perlfunc-by-category)*

# pos

Report or set where the next `/g` regex match will resume in a string.

`pos` reads (or, as an lvalue, writes) the offset that the regex engine stores on a scalar after a global match (`m//g`). The offset counts characters, not bytes, and is the position **after** the last successful match, which is where the next `/g` iteration will start scanning. With no argument `pos` operates on [`$_`](../perlvar).

## Synopsis

```perl
pos SCALAR
pos $str = N
pos
```

## What you get back

An integer offset, or [`undef`](undef) when no position is recorded. `0` is a valid offset and means "start of string"; it is not the same as [`undef`](undef), which means "no `/g` match has run, or the last one failed and reset the position." Always distinguish the two with `defined`:

```perl
if (defined pos $str) {
    # a /g scan is in progress
}
```

Used as an lvalue, `pos SCALAR` returns an assignable location:

```perl
pos($str) = 5;      # next /g match starts at char offset 5
```

## Global state it touches

`pos` reads and writes the per-scalar regex position attached to its operand. With no argument it targets [`$_`](../perlvar). The stored offset is what [`\G`](../perlre) anchors against in the next match, so every assignment to `pos` changes where `\G` binds.

## Examples

Walk every word in a string with `/g` in scalar context, using `pos` to report progress:

```perl
my $s = "one two three";
while ($s =~ /(\w+)/g) {
    printf "%-5s ends at %d\n", $1, pos $s;
}
# one   ends at 3
# two   ends at 7
# three ends at 13
```

Skip ahead before starting the scan.
The first match begins at offset `4`, not `0`:

```perl
my $s = "AAA BBB CCC";
pos($s) = 4;
$s =~ /(\w+)/g;
print $1, "\n";     # BBB
```

Anchor a follow-up match to the previous one with [`\G`](../perlre). Without `\G` the engine would scan forward past any gap; with it, the match must start exactly where the last one ended:

```perl
my $s = "12ab34cd";
while ($s =~ /\G(\d+)(\w+?)(?=\d|\z)/g) {
    print "num=$1 tail=$2\n";
}
# num=12 tail=ab
# num=34 tail=cd
```

Restart a scan by clearing the position:

```perl
pos($s) = undef;    # next /g starts from offset 0 again
```

## Edge cases

- **Bare `pos`** targets [`$_`](../perlvar). `pos` inside `while (<>) { ... }` therefore reports the position on the current input line.
- **Characters, not bytes.** For a string containing multi-byte characters, `pos` returns the character offset. The (discouraged) `use bytes` pragma switches to byte offsets; new code should not rely on it.
- **A failed `/g` match resets the position to [`undef`](undef)**, so the next `/g` starts over at offset `0`. Add the `/c` modifier (`m//gc`) to preserve the position on failure, which is the usual idiom when trying several alternative `\G`-anchored patterns against the same string.
- **Assignments during a match apply to the next match.** Assigning to `pos` changes the stored offset, and with it where `\G` binds, only for the next match. Expressions like `(?{ pos() = 5 })` or `s//pos() = 5/e` therefore cannot reposition the match that is currently running.
- **Zero-length match flag.** Setting `pos` also clears the internal *matched with zero-length* flag, so a subsequent zero-width match at the same position is allowed again. See [`perlre`](../perlre) under *Repeated Patterns Matching a Zero-length Substring*.
- **Non-lvalue operand.** `pos` requires a real scalar variable for the lvalue form; `pos("literal") = 3` is a compile-time error.
- **Offset `0` vs [`undef`](undef).** `pos` of `0` means the next `/g` starts at the beginning; [`undef`](undef) means no position is set.
  Both make the next `/g` match start scanning at offset `0`, so the first match behaves the same. They can differ afterwards: a `0` left behind by a zero-length match still carries the zero-length-match flag, while [`undef`](undef) does not.

## Differences from upstream

Fully compatible with upstream Perl 5.42.

## See also

- [`m`](m): the `/g` modifier creates and advances the position `pos` reads; the `/c` modifier preserves it when a match fails
- [`qr`](qr): precompile a pattern once, then reuse it in `\G`-anchored `/g` loops without reparsing
- [`split`](split): the other everyday way to walk a string in pieces; no `pos` involved, but often a cleaner choice when the delimiters are simple
- [`\G`](../perlre): zero-width anchor that binds to the current `pos`; the main reason to touch `pos` in the first place
- [`$_`](../perlvar): default target of bare `pos`
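
## Extended example

The edge cases describe `m//gc` as the usual idiom for trying several `\G`-anchored patterns against one string. The following is a minimal sketch of that idiom, not a definitive lexer. The `tokenize` helper, its `TYPE:text` output format, and the token classes are all hypothetical and chosen only for illustration. Because every pattern uses `/c`, a failed alternative leaves `pos` untouched and the next alternative retries at the same offset.

```perl
use strict;
use warnings;

# Hypothetical helper (not a Perl built-in): split a small expression
# string into "TYPE:text" tokens using \G-anchored m//gc matches.
sub tokenize {
    my ($src) = @_;
    my @tokens;

    pos($src) = 0;    # make the position defined before the first test

    while (pos($src) < length $src) {
        # /c keeps pos where it was when an alternative fails, so each
        # elsif below retries at exactly the same offset.
        if    ($src =~ m{\G\s+}gc)             { next }
        elsif ($src =~ m{\G(\d+)}gc)           { push @tokens, "NUM:$1" }
        elsif ($src =~ m{\G([A-Za-z_]\w*)}gc)  { push @tokens, "ID:$1" }
        elsif ($src =~ m{\G([-+*/=])}gc)       { push @tokens, "OP:$1" }
        else {
            # pos still holds the offset of the offending character.
            die sprintf "unexpected character at offset %d\n", pos $src;
        }
    }
    return @tokens;
}

print join(" ", tokenize("x = 42 + y1")), "\n";
# ID:x OP:= NUM:42 OP:+ ID:y1
```

Without `/c`, the first failed alternative would reset `pos` to [`undef`](undef), and the loop condition would compare an undefined value against the string length instead of resuming where the last token ended.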