Modifiers#
Modifiers change how a pattern is matched without changing the
pattern itself. They appear after the closing delimiter of a
match, substitution, or qr//:
"Hello" =~ /hello/i; # case-insensitive match
$x =~ s/foo/bar/g; # global substitution
my $re = qr/\d+/i; # compiled pattern with /i
The same modifiers can be embedded inside a pattern with (?flags)
or (?flags:…), localising their effect to part of the pattern.
The common modifiers#
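| Modifier | Effect |
|---|---|
| /i | case-insensitive matching |
| /m | multi-line: ^ and $ match at line boundaries |
| /s | single-line: . matches newline |
| /x | extended: ignore whitespace and # comments |
| /xx | also ignore whitespace inside […] |
| /g | global: match as many times as possible |
| /c | do not reset pos on a failed match |
| /r | return the substituted string, leave the target unchanged |
| /e | evaluate the replacement as Perl code |
| /ee | evaluate the replacement, then evaluate the result again |
| /n | non-capturing: (…) behaves like (?:…) |
| /p | preserved (no-op since 5.20, accepted for compat) |
| /o | compile the pattern once (rarely needed) |
| /a | ASCII-only matching for \d, \s, \w and POSIX classes |
| /u | Unicode semantics regardless of … |
| /l | use current locale |
| /d | default (pre-5.12) semantics; avoid in new code |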
The first eight are the everyday modifiers; the charset modifiers
/a, /u, /l, /d are the subject of the Unicode chapter.
/i — case-insensitive#
"Hello" =~ /hello/i; # matches
"hello" =~ /[A-Z]/i; # matches: the class is case-insensitive too
"Grüße" =~ /GRÜSSE/i; # matches under Unicode semantics
Unicode case folding includes multi-character mappings such as
ß → ss (the German Eszett), which is why the last example matches.
The full table lives in the Unicode standard; for ASCII, /i does
what you expect.
/m — multi-line#
Changes where ^ and $ match. Without /m, ^ matches only at the
start of the string and $ only at the end (or just before a final
newline). With /m, ^ also matches after every embedded newline and
$ before every newline.
my $x = "first\nsecond\nthird";
$x =~ /^second/; # does not match
$x =~ /^second/m; # matches — second is at start of a line
$x =~ /first$/m; # matches
$x =~ /third$/m; # matches
\A, \z, \Z remain absolute string anchors even under /m —
see the anchors chapter.
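As a small sketch of the difference between line anchors and string anchors, the snippet below pulls one key per line out of a multi-line string. The sample data and variable names are made up for illustration.

```perl
use strict;
use warnings;

my $log = "alpha: 1\nbeta: 2\ngamma: 3\n";

# With /m, ^ matches at the start of every line, so /g collects one key per line.
my @keys = $log =~ /^(\w+):/mg;

# \A stays anchored to the start of the string even under /m.
my @first_only = $log =~ /\A(\w+):/mg;

print "keys: @keys\n";          # keys: alpha beta gamma
print "first: @first_only\n";   # first: alpha
```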
/s — single-line#
Makes . match newline characters too.
my $x = "a\nb";
$x =~ /a.b/; # does not match — . does not cross \n
$x =~ /a.b/s; # matches
/m and /s are independent. They can both be used on the same
match. Despite the names, they do not conflict:
$x =~ /^a.b$/sm; # . matches newline AND ^,$ are line-aware
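One place where the combination is handy is pulling out a block that spans several lines. The following is only a sketch; the BEGIN/END markers and the sample text are hypothetical.

```perl
use strict;
use warnings;

my $text = "BEGIN\nline one\nline two\nEND\nignored\n";

# /m lets ^ and $ anchor the markers to their own lines;
# /s lets .*? run across the newlines in between.
my ($body) = $text =~ /^BEGIN\n(.*?)^END$/ms;

print $body;   # prints "line one" and "line two", each on its own line
```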
/x — extended pattern#
Ignores literal whitespace and lets # introduce end-of-line
comments. Crucial for any pattern more than a line long.
Before /x:
/^[+-]?\d+(\.\d*)?([eE][+-]?\d+)?$/;
After:
/^
    [+-]?            # optional sign
    \d+              # integer part
    (\.\d*)?         # optional fraction
    ([eE][+-]?\d+)?  # optional exponent
$/x;
Whitespace in the pattern is ignored; whitespace you want to
match becomes \s, \ , or [ ]:
/\w+ \s+ \w+/x; # three tokens: word, space, word (literal spaces ignored)
/\w+\s+\w+/x; # equivalent
/key:[ ]value/x; # literal space via bracket class
/key:\ value/x; # literal space via backslash
Inside […] whitespace is not ignored: [ab c] matches a,
b, ' ', or c. To also ignore spaces and tabs inside classes, use
/xx (Perl 5.26 and later):
/[ab c]/xx; # matches 'a', 'b', or 'c' — space ignored
/[ab\ c]/xx; # the \ is needed to match a literal space
# starts a comment that ends at the next newline. To match a
literal # under /x, escape it or put it in a class.
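A short sketch of the bracket-class approach: the pattern below matches a hex colour written with a leading #. The name $hex_colour and the sample inputs are illustrative only. Note that a comment inside a /x pattern must not contain the pattern's own delimiter.

```perl
use strict;
use warnings;

my $hex_colour = qr/
    \A
    [#]                  # literal '#', protected by the bracketed class
    ( [0-9a-fA-F]{6} )   # six hex digits, captured
    \z
/x;

my ($digits) = "#ff8800" =~ $hex_colour;
print "$digits\n";   # ff8800

# Without the leading '#', the pattern does not match.
my $plain_matches = "ff8800" =~ $hex_colour ? 1 : 0;
print "$plain_matches\n";   # 0
```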
/g — global#
In scalar context, keeps a position in the string (pos $x) and
advances each time the pattern matches, allowing iteration:
my $x = "cat dog house";
while ($x =~ /(\w+)/g) {
    print "$1 at ", pos($x), "\n";
}
# cat at 3
# dog at 7
# house at 13
In list context, returns all matches at once:
my @words = $x =~ /(\w+)/g; # ('cat', 'dog', 'house')
If the pattern contains no captures, list context returns the whole matched text for each match:
my @digits = "abc123def456" =~ /\d+/g; # ('123', '456')
With multiple captures, each match contributes its captures in order, flattened into a single list:
my @pairs = "a=1,b=2,c=3" =~ /(\w)=(\d)/g;
# ('a', '1', 'b', '2', 'c', '3')
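Because the flattened list alternates first capture, second capture, it can be assigned straight to a hash. The sketch below assumes a made-up key=value string and also repeats the scalar-context loop to record where each match ended.

```perl
use strict;
use warnings;

my $config = "host=example.org port=8080 mode=fast";

# List-context /g returns key, value, key, value, ... which is
# the shape a hash assignment expects.
my %setting = $config =~ /(\w+)=(\S+)/g;

# Scalar-context /g walks the same string one match at a time.
my @offsets;
while ($config =~ /(\w+)=/g) {
    push @offsets, pos($config);   # position just after each '='
}

print "$_ => $setting{$_}\n" for sort keys %setting;
print "offsets: @offsets\n";   # offsets: 5 22 32
```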
/c — preserve position on failure#
By default, a failed /g match resets pos to undef. Under /gc,
pos stays at its previous value — crucial for hand-rolled
lexers:
my $s = "123abc";
while (1) {
    if ($s =~ /\G(\d+)/gc)    { print "num $1\n";  next; }
    if ($s =~ /\G([a-z]+)/gc) { print "word $1\n"; next; }
    last;   # nothing matched; exit
}
See the anchors chapter for \G.
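To see the effect on pos directly, the sketch below runs the same failing match once with /c and once without. The variable names are illustrative.

```perl
use strict;
use warnings;

my $s = "123abc";

my $ok1 = $s =~ /\G\d+/g;      # matches "123"
my $after_match = pos($s);     # 3

my $ok2 = $s =~ /\G\d+/gc;     # fails at "abc", but /c keeps pos
my $kept = pos($s);            # still 3

my $ok3 = $s =~ /\G\d+/g;      # fails again; without /c, pos is reset
my $reset = defined pos($s) ? pos($s) : 'undef';

print "$after_match $kept $reset\n";   # 3 3 undef
```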
/r — non-destructive substitution#
s/// normally modifies the target string and returns the number of
substitutions made. s///r (Perl 5.14 and later) leaves the target
alone and returns the result:
my $name = " Alice ";
my $trimmed = $name =~ s/^\s+|\s+$//gr;
# $name is still " Alice "
# $trimmed is "Alice"
Enables substitution chains without intermediate variables:
my $clean = $input
    =~ s/\s+/ /gr              # collapse runs of whitespace
    =~ s/^ | $//gr             # trim ends
    =~ s/[^\x00-\x7f]//gr;     # drop non-ASCII
Each s///r returns the transformed string, which the next one
receives.
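Another common use, sketched here with made-up data, is inside map, where a plain s/// would edit the source list in place and return counts instead of strings.

```perl
use strict;
use warnings;

my @raw = ("  Alice ", "Bob  ", " Carol");

# /r makes each substitution return the new string and leaves @raw alone.
my @trimmed = map { s/^\s+|\s+$//gr } @raw;

print join(",", @trimmed), "\n";   # Alice,Bob,Carol
print "first original is still '$raw[0]'\n";
```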
/e — evaluate the replacement#
The replacement half of s///e is Perl code, not a double-quoted
string. The return value of the code replaces the match:
my $x = "numbers: 1 2 3 4";
$x =~ s/(\d+)/$1 * 2/ge;
# $x is now "numbers: 2 4 6 8"
/ee evaluates twice: the code returns a string, which is then
evaluated as Perl again. Rarely useful and easy to misuse: because
it runs text as code, never apply it to untrusted input.
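The two levels of evaluation can be compared side by side. This is only a sketch: the %expr table and the {name} placeholder syntax are hypothetical, and the /ee line is safe here only because the code string is a literal written in the program itself.

```perl
use strict;
use warnings;

# /e: the replacement is an expression; here sprintf pads each number.
my $report = "a=7 b=42";
my $padded = $report =~ s/(\d+)/sprintf("%03d", $1)/ger;

# /ee: the expression yields a string of Perl code ('1 + 2'),
# which is then evaluated a second time to give 3.
my %expr = (sum => '1 + 2');
my $line = "total: {sum}";
my $evaluated = $line =~ s/\{(\w+)\}/$expr{$1}/eer;

print "$padded\n";      # a=007 b=042
print "$evaluated\n";   # total: 3
```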
/n — non-capturing#
Makes every ordinary (…) behave like (?:…) (Perl 5.22 and
later). Useful when a long pattern has parentheses purely for
grouping and you want to keep $1, $2, … unset:
"hello" =~ /(hi|hello)/n; # matches, but $1 is not set
Named captures (?<name>…) still capture under /n.
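A sketch of how named and plain groups interact under /n, assuming Perl 5.22 or later. The date string is made up. Note that the named group is also numbered, so it becomes $1 even though it is the second pair of parentheses in the pattern.

```perl
use strict;
use warnings;

my $stamp = "2024-05-17";
my ($month, $first, $second);

# Under /n the plain (...) groups only group; the named group still captures.
if ($stamp =~ /\A(\d{4})-(?<month>\d{2})-(\d{2})\z/n) {
    $month  = $+{month};   # "05"
    $first  = $1;          # "05": the named group is the only numbered group
    $second = $2;          # undef: the plain groups did not capture
}

print "month=$month first=$first second=",
      (defined $second ? $second : 'undef'), "\n";
```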
/p — preserved (no-op)#
Historically, /p enabled ${^PREMATCH}, ${^MATCH},
${^POSTMATCH}. Since Perl 5.20, those variables are always
available, and /p is a silent no-op. Accepted for backward
compatibility, but contributes nothing in new code.
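For reference, a minimal sketch of the three variables in use. The /p is kept so the snippet also behaves the same on Perls older than 5.20; the sample string is illustrative.

```perl
use strict;
use warnings;

my $text = "key = value";
my ($pre, $match, $post);

if ($text =~ /\s*=\s*/p) {
    ($pre, $match, $post) = (${^PREMATCH}, ${^MATCH}, ${^POSTMATCH});
}

print "[$pre][$match][$post]\n";   # [key][ = ][value]
```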
/o — compile once#
Historically, /o disabled re-compilation of a pattern that
interpolated variables. Modern Perl caches compiled patterns
automatically and qr// gives you explicit control. /o is
rarely needed today; qr// is clearer.
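A sketch of the qr// alternative: the pattern is built from a variable once, stored, and reused. The stop-word list and variable names are hypothetical.

```perl
use strict;
use warnings;

my @stop_words = qw(the a an);

# Build the alternation once; quotemeta guards against metacharacters.
my $alternation = join '|', map { quotemeta($_) } @stop_words;
my $stop_re     = qr/\b(?:$alternation)\b/i;

# The compiled pattern is reused for every word without /o.
my @kept = grep { !/$stop_re/ } qw(The quick brown fox jumps over a dog);

print "@kept\n";   # quick brown fox jumps over dog
```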
Inline modifiers#
(?flags) turns flags on for the rest of the enclosing group (or
pattern):
/(?i)yes/; # case-insensitive, same as /yes/i
Placed inside a group, (?flags) lasts only until that group closes; (?flags:…) makes the scope explicit by applying the flags to its own contents:
/Answer: ((?i)yes)/; # only the 'yes' is case-insensitive
/Answer: ((?i:yes))/; # clearer: scope is the group's contents
(?-flags) turns flags off. They can be combined:
/(?i-m:pattern)/; # turn on /i, turn off /m, within this group
Inline modifiers are the right tool when different parts of a long pattern need different modifiers.
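As an illustration, the sketch below accepts a keyword in any case but insists on a lowercase unit suffix. The timeout=… format is invented for the example.

```perl
use strict;
use warnings;

# The keyword is case-insensitive; the unit suffix must be lowercase.
my $re = qr/\A(?i:timeout)=(\d+)(ms|s)\z/;

my @results;
for my $input ("TIMEOUT=30s", "Timeout=250ms", "timeout=30S") {
    push @results, ($input =~ $re ? "ok" : "no");
}

print "@results\n";   # ok ok no
```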
Order of modifiers#
On a match or substitution, the order of trailing modifiers is not
significant. /mgis and /sgmi are identical. Pick a house style
and stick to it.
Summary#
| If you want to… | Reach for… |
|---|---|
| ignore case | /i |
| treat the string as multiple lines | /m |
| let . match newline | /s |
| write a readable pattern with whitespace | /x |
| iterate through all matches | /g |
| keep a position across a failed match | /gc |
| substitute without mutating | /r |
| use Perl code in the replacement | /e |
| suppress all capturing | /n |