---
name: regex modifiers
---

# Modifiers

Modifiers change how a pattern is matched without changing the pattern itself. They appear after the closing delimiter of a match, substitution, or `qr//`:

```perl
"Hello" =~ /hello/i;      # case-insensitive match
$x =~ s/foo/bar/g;        # global substitution
my $re = qr/\d+/i;        # compiled pattern with /i
```

The same modifiers can be embedded inside a pattern with `(?flags)` or `(?flags:…)`, localising their effect to part of the pattern.

## The common modifiers

| Modifier | Effect                                                 |
|----------|--------------------------------------------------------|
| `/i`     | case-insensitive matching                              |
| `/m`     | multi-line: `^` and `$` match at every `\n`            |
| `/s`     | single-line: `.` matches `\n` too                      |
| `/x`     | extended: ignore whitespace and `#` comments           |
| `/xx`    | also ignore whitespace inside `[…]`                    |
| `/g`     | global: match as many times as possible                |
| `/c`     | do not reset `pos` on failure (with `/g`)              |
| `/r`     | `s///r` returns the result instead of modifying        |
| `/e`     | `s///e` evaluates the replacement as Perl code         |
| `/ee`    | `s///ee` evaluates, then evaluates the result too      |
| `/n`     | non-capturing: `(…)` act like `(?:…)`                  |
| `/p`     | preserved (no-op since 5.20, accepted for compat)      |
| `/o`     | compile the pattern once (rarely needed)               |
| `/a`     | `\d`, `\w`, `\s` restricted to ASCII                   |
| `/u`     | Unicode semantics regardless of the string's internal encoding |
| `/l`     | use current locale                                     |
| `/d`     | default (pre-5.14) semantics — avoid in new code       |

The first eight are the everyday modifiers; the charset modifiers `/a`, `/u`, `/l`, `/d` are the subject of the Unicode chapter.

## /i — case-insensitive

```perl
"Hello" =~ /hello/i;      # matches
"HELLO" =~ /[A-Z]/i;      # matches — class is insensitive too
"Grüße" =~ /GRÜSSE/i;     # matches under Unicode semantics
```

Unicode case folding includes multi-character mappings such as `ß → ss` (the German Eszett). The full table lives in the Unicode standard; for ASCII, `/i` does what you expect.
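
The folding behaviour described above can be checked with a small script. This is a minimal sketch, not a definitive test: the variable names are illustrative, the non-ASCII characters are written as `\x{…}` escapes so the file encoding does not matter, and `/u` is added explicitly on the assumption that you want Unicode rules regardless of how the string is stored.

```perl
use strict;
use warnings;

# "Grüße" spelled with escapes: \x{FC} is ü, \x{DF} is ß.
my $word = "Gr\x{FC}\x{DF}e";

# With /i (and /u for Unicode rules), ß folds to "ss" and Ü folds to ü,
# so the upper-case spelling matches the whole word.
my $folded = $word =~ /\AGR\x{DC}SSE\z/iu ? 1 : 0;

# Without /i the same pattern is compared character for character and fails.
my $exact = $word =~ /\AGR\x{DC}SSE\z/u ? 1 : 0;

print "folded=$folded exact=$exact\n";
```

Anchoring with `\A` and `\z` shows that the two-character `SS` in the pattern accounts for the single `ß` in the string.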

## /m — multi-line

Changes where `^` and `$` match. Without `/m`, they match only at the outer ends of the string. With `/m`, they match at every embedded newline too.

```perl
my $x = "first\nsecond\nthird";
$x =~ /^second/;      # does not match
$x =~ /^second/m;     # matches — second is at start of a line
$x =~ /first$/m;      # matches
$x =~ /third$/m;      # matches
```

`\A`, `\z`, `\Z` remain absolute string anchors even under `/m` — see the anchors chapter.

## /s — single-line

Makes `.` match newline characters too.

```perl
my $x = "a\nb";
$x =~ /a.b/;      # does not match — . does not cross \n
$x =~ /a.b/s;     # matches
```

`/m` and `/s` are independent. They can both be used on the same match. Despite the names, they do not conflict:

```perl
$x =~ /^a.b$/sm;  # . matches newline AND ^,$ are line-aware
```

## /x — extended pattern

Ignores literal whitespace and lets `#` introduce end-of-line comments. Crucial for any pattern more than a line long.

Before `/x`:

```perl
/^[+-]?\d+(\.\d*)?([eE][+-]?\d+)?$/;
```

After:

```perl
/^
  [+-]?             # optional sign
  \d+               # integer part
  (\.\d*)?          # optional fraction
  ([eE][+-]?\d+)?   # optional exponent
$/x;
```

Whitespace in the *pattern* is ignored; whitespace you want to match becomes `\s`, `\ `, or `[ ]`:

```perl
/\w+ \s+ \w+/x;     # three tokens: word, space, word (literal spaces ignored)
/\w+\s+\w+/x;       # equivalent
/key:[ ]value/x;    # literal space via bracket class
/key:\ value/x;     # literal space via backslash
```

Inside `[…]` whitespace is *not* ignored — `[ab c]` matches `a`, `b`, `' '`, or `c`. To also ignore whitespace inside classes, use `/xx`:

```perl
/[ab c]/xx;         # matches 'a', 'b', or 'c' — space ignored
/[ab\ c]/xx;        # the \ is needed to match a literal space
```

`#` starts a comment that ends at the next newline. To match a literal `#` under `/x`, escape it or put it in a class.
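
Both ways of matching a literal `#` under `/x` can be sketched as follows. The input string and variable names are made up for illustration.

```perl
use strict;
use warnings;

my $line = "color: #ff8800";

# Escaped: \# is a literal hash sign; the later bare # starts a comment.
my ($escaped) = $line =~ /
    \# ([0-9a-f]{6})    # six hex digits after a literal hash
/x;

# Bracketed: inside [...] the # is taken literally, even under x.
my ($bracketed) = $line =~ /
    [#] ([0-9a-f]{6})   # same match, written with a class
/x;

print "$escaped $bracketed\n";
```

Both forms capture the same six digits. One practical caveat: a comment inside the pattern must not contain the pattern's own delimiter, because Perl finds the end of the pattern before it looks at comments.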

## /g — global

In scalar context, keeps a position in the string (`pos $x`) and advances each time the pattern matches, allowing iteration:

```perl
my $x = "cat dog house";
while ($x =~ /(\w+)/g) {
    print "$1 at ", pos($x), "\n";
}
# cat at 3
# dog at 7
# house at 13
```

In list context, returns all matches at once:

```perl
my @words = $x =~ /(\w+)/g;     # ('cat', 'dog', 'house')
```

If the pattern contains no captures, list context returns the whole matched text for each match:

```perl
my @digits = "abc123def456" =~ /\d+/g;  # ('123', '456')
```

With multiple captures, each iteration returns the tuple in order:

```perl
my @pairs = "a=1,b=2,c=3" =~ /(\w)=(\d)/g;
# ('a', '1', 'b', '2', 'c', '3')
```

## /c — preserve position on failure

By default, a failed `/g` match resets `pos` to undef. Under `/gc`, `pos` stays at its previous value — crucial for hand-rolled lexers:

```perl
my $s = "123abc";
while (1) {
    if ($s =~ /\G(\d+)/gc)    { print "num $1\n";  next; }
    if ($s =~ /\G([a-z]+)/gc) { print "word $1\n"; next; }
    last;   # nothing matched; exit
}
```

See the anchors chapter for `\G`.

## /r — non-destructive substitution

`s///` normally modifies the target string and returns the count. `s///r` leaves the target alone and returns the result:

```perl
my $name = "  Alice  ";
my $trimmed = $name =~ s/^\s+|\s+$//gr;
# $name is still "  Alice  "
# $trimmed is "Alice"
```

Enables substitution chains without intermediate variables:

```perl
my $clean = $input
    =~ s/\s+/ /gr               # collapse runs of whitespace
    =~ s/^ | $//gr              # trim ends
    =~ s/[^\x00-\x7f]//gr;      # drop non-ASCII
```

Each `s///r` returns the transformed string, which the next one receives.

## /e — evaluate the replacement

The replacement half of `s///e` is Perl code, not a double-quoted string. The return value of the code replaces the match:

```perl
my $x = "numbers: 1 2 3 4";
$x =~ s/(\d+)/$1 * 2/ge;
# $x is now "numbers: 2 4 6 8"
```

`/ee` evaluates twice: the code returns a string, which is then evaluated as Perl again.
Rarely useful and easy to misuse — only reach for it when you are sure.

## /n — non-capturing

Makes every ordinary `(…)` behave like `(?:…)`. Useful when a long pattern has parentheses purely for grouping and you want to keep `$1`, `$2`, … unset:

```perl
"hello" =~ /(hi|hello)/n;   # matches, but $1 is not set
```

Named captures `(?<name>…)` still capture under `/n`.

## /p — preserved (no-op)

Historically, `/p` enabled `${^PREMATCH}`, `${^MATCH}`, `${^POSTMATCH}`. Since Perl 5.20, those variables are always available, and `/p` is a silent no-op. Accepted for backward compatibility, but contributes nothing in new code.

## /o — compile once

Historically, `/o` disabled re-compilation of a pattern that interpolated variables. Modern Perl caches compiled patterns automatically and `qr//` gives you explicit control. `/o` is rarely needed today; `qr//` is clearer.

## Inline modifiers

`(?flags)` turns flags on for the rest of the enclosing group (or pattern):

```perl
/(?i)yes/;          # case-insensitive, same as /yes/i
```

`(?flags:…)` scopes the flags to the inner group only:

```perl
/Answer: ((?i)yes)/;    # only the 'yes' is case-insensitive
/Answer: ((?i:yes))/;   # clearer: scope is the group's contents
```

`(?-flags)` turns flags off. They can be combined:

```perl
/(?i-m:pattern)/;   # turn on /i, turn off /m, within this group
```

Inline modifiers are the right tool when different parts of a long pattern need different modifiers.

## Order of modifiers

On a match or substitution, the order of trailing modifiers is not significant. `/mgis` and `/sgmi` are identical. Pick a house style and stick to it.
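
A quick way to convince yourself that flag order does not matter is to run the same pattern with the flags shuffled. This is a minimal sketch; the sample text and variable names are illustrative only.

```perl
use strict;
use warnings;

my $text = "Alpha\nbeta\nALPHA";

# Same four flags in two different orders, list context with no captures.
my @first  = $text =~ /^alpha$/mgis;
my @second = $text =~ /^alpha$/sgmi;

# Compiled patterns with reordered flags also stringify to the same text.
my $same_qr = qr/^a.b$/sm eq qr/^a.b$/ms ? 1 : 0;

print "first:  @first\n";
print "second: @second\n";
print "same_qr: $same_qr\n";
```

Both lists contain the first and last lines of `$text`, since `/m` makes `^` and `$` line-aware and `/i` ignores case.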

## Summary

| If you want to… | Reach for… |
|---|---|
| ignore case | `/i` |
| treat the string as multiple lines | `/m` |
| let `.` match newline | `/s` |
| write a readable pattern with whitespace | `/x` |
| iterate through all matches | `/g` |
| keep a position across failed match | `/gc` |
| substitute without mutating | `/r` |
| use Perl code in the replacement | `/e` |
| suppress all capturing | `/n` |

## See also

- [`perlre`](../../p5/core/perlre) — complete modifier reference, including less-common modifiers.
- [`m`](../../p5/core/perlfunc/m), [`s`](../../p5/core/perlfunc/s), [`qr`](../../p5/core/perlfunc/qr) — operator reference pages.
- The [unicode](unicode) chapter — `/a`, `/u`, `/l`, `/d`.