Regular expressions and pattern matching

qr//

Compile a pattern once and hand back a reusable regex object.

qr// quotes its STRING as a regular expression, interpolates it the same way m does, and returns a Regexp object you can store, pass around, and splice into other patterns. The compiled pattern carries its modifiers with it, so every later use sees the flags the pattern was built with.

Synopsis

my $re = qr/STRING/;
my $re = qr/STRING/msixpodualn;
my $re = qr{STRING};
my $re = qr'STRING';            # single-quote delimiter: no interpolation

What you get back

A blessed Regexp object. ref returns the string "Regexp". Stringifying the object yields a normalised form of the pattern with its modifiers encoded in a (?^flags:...) wrapper, the same form the regex engine embeds when the object is interpolated into another pattern:

my $re = qr/my.STRING/is;
print $re;                      # (?^si:my.STRING)

The object is opaque — do not dereference it. The stringified form is intended for interpolation and debugging, not for manual parsing.
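
When code needs the pattern source or flags, the core re module offers accessors that avoid parsing the stringified form by hand. The following is a minimal sketch, assuming Perl 5.10 or later; the pattern and variable names are illustrative only:

```perl
use strict;
use warnings;
use re qw(is_regexp regexp_pattern);

my $re = qr/colou?r/i;

# ref() reports the class the object is blessed into.
my $class = ref $re;

# is_regexp() still answers correctly if the object was reblessed elsewhere.
my $is_rx = is_regexp($re) ? 1 : 0;

# In list context regexp_pattern() returns the pattern source and its flags.
my ($source, $flags) = regexp_pattern($re);

print "$class $is_rx $source $flags\n";
```

Reading the flags through regexp_pattern keeps the calling code independent of the exact wrapper text that stringification produces.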

Global state it touches

  • $1, $2, …, $+, $&, $`, $', %+, %-: not set by qr// itself. Compilation does not match. These are populated only when the compiled pattern is later used by m, s, or split.

  • $@: set when a pattern compiled at run time (one that interpolates a variable) contains a syntax error and the error is trapped with eval. An untrapped compile error is fatal, with a message such as "Unmatched ( in regex", raised at the point of the qr//.

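  • Locale and Unicode pragmas in effect at the qr// site (use locale, use feature 'unicode_strings') are baked into the compiled object. Changing the pragma later does not change the object. use utf8 only declares the source encoding and does not select matching rules.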
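
The first two points above can be sketched as follows. This is an illustrative example, not a definitive test: $bad_input is a hypothetical stand-in for untrusted input, and the error text quoted in the comment is the usual upstream wording for an unbalanced parenthesis.

```perl
use strict;
use warnings;

# Compiling does not run a match, so the capture variables stay unset.
my $rx        = qr/(\d+)/;
my $set_by_qr = defined $1 ? 1 : 0;

# The first real match with the compiled pattern fills $1.
my $port = "port 8080" =~ $rx ? $1 : undef;

# $bad_input is a hypothetical stand-in for untrusted input.
my $bad_input = '(';
my $compiled  = eval { qr/$bad_input/ };
my $error     = $@;    # begins "Unmatched ( in regex"

print "set by qr//: $set_by_qr\n";
print "port: $port\n";
print "compile failed: $error" unless defined $compiled;
```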

Modifiers

All pattern modifiers that apply to m also apply to qr//. The match-operation modifiers g and c do not, because qr// only compiles. The pattern modifiers are captured in the returned object and propagate when the object is interpolated into another pattern.

Flag  Meaning
m     Multi-line: ^ and $ match at internal newlines.
s     Single-line: . matches a newline.
i     Case-insensitive match.
x     Extended: whitespace and # comments ignored in the pattern. xx also ignores whitespace inside [...].
p     Preserve-match (no-op on Perl 5.20+; ${^PREMATCH}, ${^MATCH}, ${^POSTMATCH} are always available).
o     Compile once; interpolated variables are frozen at first use. Rarely needed, since qr// already caches.
a     ASCII-restrict \d, \s, \w, and POSIX classes. aa further forbids ASCII-to-non-ASCII matches under i.
l     Use the current locale’s rules.
u     Use Unicode rules.
d     Default dual rules (legacy; usually picked up automatically).
n     Non-capture: unnamed (...) groups do not fill $1, $2, ….

When a compiled pattern is embedded in a larger pattern, the character-set and msixn flags it was built with stay in effect for that span only. The o modifier is the one exception — it is not propagated.
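
A short sketch of that scoping, using made-up strings: the /i flag travels with the inner object but does not leak into the surrounding text of the outer pattern.

```perl
use strict;
use warnings;

my $inner = qr/world/i;           # built case-insensitive
my $outer = qr/hello $inner/;     # surrounding text stays case-sensitive

# /i applies to the embedded "world" span ...
my $inner_folds = "hello WORLD" =~ $outer ? 1 : 0;

# ... but not to the literal "hello" that the outer pattern contributes.
my $outer_folds = "HELLO world" =~ $outer ? 1 : 0;

print "inner span ignores case: $inner_folds\n";
print "outer span ignores case: $outer_folds\n";
```

To make the whole pattern case-insensitive, the flag has to be given on the outer qr// as well.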

Examples

Compile once, use many times. Looping over patterns built ahead of time avoids recompiling on every iteration:

my @rules = map qr/$_/i, qw(error warn fail panic);
for my $line (@lines) {
    for my $rx (@rules) {
        print $line if $line =~ $rx;
    }
}

Interpolate a compiled pattern inside another pattern. The inner pattern keeps its own flags inside the outer pattern via the (?^flags:...) wrapper:

my $word = qr/\w+/;
my $csv  = qr/^$word(?:,$word)*$/;
"alpha,beta,gamma" =~ $csv;         # matches

Single-quote delimiter for a literal pattern with no interpolation — useful when the pattern contains $ or @ that must stay literal:

my $price = qr'\$\d+\.\d{2}';       # matches "$1.99", not a sigil
"total: \$9.95" =~ $price;          # matches

Pattern re-use across calls. Storing the compiled form in a closure amortises compilation across every call to the returned sub:

sub matcher_for {
    my $pat = shift;
    my $rx  = qr/\Q$pat\E/i;        # \Q...\E quotes regex metacharacters
    return sub { $_[0] =~ $rx };
}

my $is_error = matcher_for("ERROR:");
$is_error->($line);

Capture inside an interpolated qr// still populates $1 at match time, not at qr// time:

my $num = qr/(\d+)/;
"port:8080" =~ /:$num$/;
print $1;                           # 8080

Using /n to stop unnamed groups from capturing while named captures keep working. A named group is still numbered, so here it becomes $1 because (foo) no longer counts:

my $rx = qr/(foo)(?<kw>bar)/n;
"foobar" =~ $rx;
print $+{kw};                       # "bar"
print $1;                           # "bar": the named group is group 1

Edge cases

  • No match at compile time. qr// only compiles. To test a string, use the object with m or =~: $string =~ $rx.

  • Interpolated variables are captured at qr// time, not at use time. my $rx = qr/$pat/; $pat = "other"; leaves $rx holding the pattern built from the original $pat. To re-capture, rebuild the qr//.

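  • Pattern errors are fatal. A malformed pattern such as qr/(/ raises an exception: when the program is compiled if the pattern is a constant, or at the qr// site at run time if the pattern interpolates a variable. Only the run-time case can be trapped by a block eval. Wrap in eval when the pattern comes from user input and an error should be recoverable:

    # A compiled pattern is always true, so a false result means eval failed.
    my $rx = eval { qr/$user_input/ }
        or die "bad pattern: $@";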
    
  • ref($rx) returns "Regexp" — not "SCALAR", not "CODE". Use this to detect a compiled pattern in a polymorphic argument slot: ref($arg) eq 'Regexp'.

  • Stringification is lossy for re-parsing. The normalised form is legal as an interpolation target but is not guaranteed to be a human-readable copy of the original source. Do not re-qr// the stringified output.

  • Locale and Unicode bake-in. A qr// built inside use locale behaves with locale rules forever, even when used outside that scope. Rebuild under the desired pragma if the behaviour needs to change.

  • \Q...\E for literal text. To compile a pattern that matches a user-provided string verbatim, wrap it with \Q...\E: qr/\Q$literal\E/. Without \Q, metacharacters in the string are interpreted as regex syntax.

  • Empty pattern. qr// compiles to a pattern that matches the empty string at every position. The rule that an empty pattern reuses the last successful pattern is a match-time behaviour of m and s for a pattern whose text is itself empty; in current upstream Perl a qr// object does not trigger it. The trap remains for m/$var/ when $var holds an empty string.
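
Two of the points above, capture of interpolated variables at qr// time and \Q...\E quoting, can be seen in a small sketch. The strings are arbitrary test data chosen for illustration:

```perl
use strict;
use warnings;

# The pattern is fixed when qr// runs; later changes to $pat are not seen.
my $pat = 'cat';
my $rx  = qr/$pat/;
$pat = 'dog';
my $still_cat = "cat" =~ $rx ? 1 : 0;
my $sees_dog  = "dog" =~ $rx ? 1 : 0;

# Without \Q the dot is a metacharacter; with \Q it is a literal dot.
my $literal  = 'a.b';
my $loose    = qr/$literal/;
my $verbatim = qr/\Q$literal\E/;
my $loose_hit    = "axb" =~ $loose    ? 1 : 0;
my $verbatim_hit = "axb" =~ $verbatim ? 1 : 0;
my $exact_hit    = "a.b" =~ $verbatim ? 1 : 0;

print "$still_cat $sees_dog $loose_hit $verbatim_hit $exact_hit\n";
```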

Differences from upstream

Fully compatible with upstream Perl 5.42.

See also

  • m — match a compiled pattern or a literal pattern against a string; the primary consumer of a qr// object

  • s — substitute using a compiled pattern; accepts a qr// in the search slot

  • split — split on a compiled pattern; the qr// form is the most efficient

  • perlre — full regex syntax reference; read this for the meaning of the modifiers and escape sequences

  • ref — returns "Regexp" for a qr// object; the standard type check

  • eval — wrap qr// when the pattern source is untrusted and a compile error should be recoverable