# Modifiers

Modifiers change how a pattern is matched without changing the
pattern itself. They appear after the closing delimiter of a
match, substitution, or `qr//`:

```perl
"Hello" =~ /hello/i;        # case-insensitive match
$x =~ s/foo/bar/g;          # global substitution
my $re = qr/\d+/i;          # compiled pattern with /i
```

The same modifiers can be embedded inside a pattern with
`(?flags)` or `(?flags:…)`, localising their effect to part of
the pattern.

## The full modifier set

| Modifier   | Effect                                                  |
|------------|---------------------------------------------------------|
| `/i`       | case-insensitive matching                               |
| `/m`       | multi-line: `^` and `$` match at every `\n`             |
| `/s`       | single-line: `.` matches `\n` too                       |
| `/x`       | extended: ignore whitespace and `#` comments            |
| `/xx`      | extended-extended: also ignore whitespace inside `[…]`  |
| `/g`       | global: match as many times as possible                 |
| `/c`       | do not reset `pos` on failure (with `/g`)               |
| `/r`       | `s///r` returns the result instead of modifying         |
| `/e`       | `s///e` evaluates the replacement as Perl code          |
| `/ee`      | `s///ee` evaluates, then evaluates the result too       |
| `/n`       | non-capturing: `(…)` act like `(?:…)`                   |
| `/p`       | no-op; accepted for compat                              |
| `/o`       | compile the pattern once (rarely needed; prefer `qr//`) |
| `/a`       | `\d`, `\w`, `\s` restricted to ASCII                    |
| `/aa`      | `/a`, plus `/i` does not cross ASCII/non-ASCII boundary |
| `/u`       | Unicode semantics regardless of `use utf8`              |
| `/l`       | use current locale                                      |
| `/d`       | dual-mode semantics — avoid in new code                 |

The first eight are the everyday modifiers; the charset modifiers
`/a`, `/u`, `/l`, `/d` are covered in detail in the
[unicode](unicode.md) chapter and summarised again here.

## `/i` — case-insensitive

```perl
"Hello" =~ /hello/i;          # matches
"HELLO" =~ /[A-Z]/i;          # matches — class is insensitive too
"Grüße" =~ /GRÜSSE/i;         # matches under Unicode semantics
```

Unicode case folding includes multi-character mappings such as
`ß → ss` (the German Eszett). The full table lives in the
Unicode standard; for ASCII, `/i` does what you expect.

`/i` carries a negligible performance penalty for ASCII. The
implementation folds case at *compile time* — the pattern carries
the case-folded character set, the input is read once. Unicode
case-folding is more involved (sequences like `ß ↔ ss`), but the
extra work is bounded and rarely shows up in practice.

## `/m` — multi-line

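Changes where `^` and `$` match. Without `/m`, `^` matches only
at the start of the string and `$` only at the end (or just before
a single trailing newline). With `/m`, `^` also matches after every
embedded newline and `$` also matches before every newline.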

```perl
my $x = "first\nsecond\nthird";

$x =~ /^second/;    # does not match
$x =~ /^second/m;   # matches — second is at start of a line
$x =~ /first$/m;    # matches
$x =~ /third$/m;    # matches
```

`\A`, `\z`, `\Z` remain absolute string anchors even under `/m`
— see the [anchors and assertions](anchors-and-assertions.md)
chapter.
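
As a small illustration of the anchoring rules, the sketch below uses a made-up three-line string; the variable names are hypothetical and chosen only for this example.

```perl
use strict;
use warnings;

my $text = "first\nsecond\nthird\n";

# With /m and /g, ^ anchors at the start of every line.
my @line_starts = $text =~ /^(\w+)/mg;    # ('first', 'second', 'third')

# Without /m, $ still tolerates one trailing newline; \z does not.
my $dollar_ok = ("third\n" =~ /third$/)  ? 1 : 0;   # 1
my $z_ok      = ("third\n" =~ /third\z/) ? 1 : 0;   # 0
```

When a trailing newline must not be tolerated, `\z` is the safer anchor.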

## `/s` — single-line

Makes `.` match newline characters too.

```perl
my $x = "a\nb";
$x =~ /a.b/;        # does not match — . does not cross \n
$x =~ /a.b/s;       # matches
```

`/m` and `/s` are independent. They can both be used on the same
match. Despite the names, they do not conflict:

```perl
$x =~ /^a.b$/sm;    # . matches newline AND ^,$ are line-aware
```
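
A common place where `/s` matters is capturing a span that may run across lines. The following is a minimal sketch with an invented input string; it is not meant as a way to parse real markup.

```perl
use strict;
use warnings;

my $html = "<b>one\ntwo</b>";

# Without /s the dot stops at the newline, so the match fails.
my ($without_s) = $html =~ m{<b>(.*?)</b>};

# With /s the dot crosses the newline and the whole span is captured.
my ($with_s) = $html =~ m{<b>(.*?)</b>}s;   # "one\ntwo"
```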

## `/x` — extended pattern

Ignores literal whitespace and lets `#` introduce end-of-line
comments. Crucial for any pattern more than a line long.

Before `/x`:

```perl
/^[+-]?\d+(\.\d*)?([eE][+-]?\d+)?$/;
```

After:

```perl
/^
    [+-]?           # optional sign
    \d+             # integer part
    (\.\d*)?        # optional fraction
    ([eE][+-]?\d+)? # optional exponent
 $/x;
```

Whitespace in the *pattern* is ignored; whitespace you want to
match becomes `\s`, `\ `, or `[ ]`:

```perl
/\w+ \s+ \w+/x;      # three tokens: word, space, word (literal spaces ignored)
/\w+\s+\w+/x;        # equivalent
/key:[ ]value/x;     # literal space via bracket class
/key:\ value/x;      # literal space via backslash
```

`#` starts a comment that ends at the next newline. To match a
literal `#` under `/x`, escape it or put it in a class.
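
Both ways of matching a literal `#` can be sketched as follows; the input string and variable names are hypothetical.

```perl
use strict;
use warnings;

my $issue = "see #42";

my ($by_escape) = $issue =~ / \# (\d+) /x;    # "42"
my ($by_class)  = $issue =~ / [#] (\d+) /x;   # "42"

# Unescaped, the # starts a comment, so the pattern below is just "see".
my $only_see = ("see" =~ / see # (\d+) /x) ? 1 : 0;   # 1
```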

The «Pattern White Space» set under `/x` follows Unicode UAX#31:
SPACE, CHARACTER TABULATION, LINE FEED, LINE TABULATION, FORM
FEED, CARRIAGE RETURN, NEXT LINE, LEFT-TO-RIGHT MARK,
RIGHT-TO-LEFT MARK, LINE SEPARATOR, PARAGRAPH SEPARATOR. In
practice you only meet ASCII space, tab, and newline.

## `/xx` — extended in classes too

Inside `[…]` whitespace is *not* ignored under plain `/x` —
`[ab c]` matches `a`, `b`, `' '`, or `c`. To also ignore
whitespace inside classes, use `/xx`:

```perl
/[ab c]/xx;          # matches 'a', 'b', or 'c' — space ignored
/[ab\ c]/xx;         # the \ is needed to match a literal space
/[ ! @ ]/xx;         # matches '!' or '@' — visible spacing
```

`/xx` is a superset of `/x`. It is convenient for character
classes laid out for readability, especially with shorthand
classes:

```perl
/[ \d \s \-_,. ]+/xx;   # digits, whitespace, common punctuation
```

`/xx` was added in Perl 5.26. Patterns written for earlier
versions behave as before; code that can require 5.26 or later
can use it freely.

## `/g` — global

In scalar context, keeps a position in the string (`pos $x`)
and advances each time the pattern matches, allowing iteration:

```perl
my $x = "cat dog house";
while ($x =~ /(\w+)/g) {
    print "$1 at ", pos($x), "\n";
}
# cat at 3
# dog at 7
# house at 13
```

In list context, returns all matches at once:

```perl
my @words = $x =~ /(\w+)/g;   # ('cat', 'dog', 'house')
```

If the pattern contains no captures, list context returns the
whole matched text for each match:

```perl
my @digits = "abc123def456" =~ /\d+/g;   # ('123', '456')
```
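
List context also gives a compact way to count matches. The sketch below relies on the usual Perl idiom of assigning to an empty list and then taking that assignment in scalar context; the variable name is illustrative.

```perl
use strict;
use warnings;

# The inner list assignment forces list context on the match;
# the outer scalar assignment yields the number of elements returned.
my $count = () = "abc123def456" =~ /\d+/g;   # 2
```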

With multiple captures, each match contributes its captures in
order, so the result is one flat list:

```perl
my @pairs = "a=1,b=2,c=3" =~ /(\w)=(\d)/g;
# ('a', '1', 'b', '2', 'c', '3')
```
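
Because the result is a flat list of alternating captures, a two-capture pattern can populate a hash directly. A minimal sketch, with a hypothetical hash name:

```perl
use strict;
use warnings;

# Each match yields (key, value); the flat list becomes the hash contents.
my %value_of = "a=1,b=2,c=3" =~ /(\w)=(\d)/g;
# %value_of is (a => 1, b => 2, c => 3)
```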

## `/c` — preserve position on failure

By default, a failed `/g` match resets `pos` to undef. Under
`/gc`, `pos` stays at its previous value — crucial for
hand-rolled lexers:

```perl
my $s = "123abc";
while (1) {
    if ($s =~ /\G(\d+)/gc)    { print "num $1\n"; next; }
    if ($s =~ /\G([a-z]+)/gc) { print "word $1\n"; next; }
    last;   # nothing matched; exit
}
```

See the [anchors and assertions](anchors-and-assertions.md) chapter
for `\G`.
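
The effect of `/c` on `pos` can be observed directly. The sketch below uses an invented string and records `pos` after a failed match with and without `/c`.

```perl
use strict;
use warnings;

my $s = "abc123";

my $first_ok = ($s =~ /[a-z]+/g);      # succeeds; pos($s) is now 3
my $start    = pos($s);

my $gc_ok    = ($s =~ /\G[a-z]/gc);    # fails, but /c keeps pos
my $after_gc = pos($s);                # still 3

my $g_ok     = ($s =~ /\G[a-z]/g);     # fails, and without /c pos is reset
my $after_g  = pos($s);                # undef
```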

## `/r` — non-destructive substitution

`s///` normally modifies the target string and returns the
count. `s///r` leaves the target alone and returns the result:

```perl
my $name = "  Alice  ";
my $trimmed = $name =~ s/^\s+|\s+$//gr;
# $name is still "  Alice  "
# $trimmed is "Alice"
```

Enables substitution chains without intermediate variables:

```perl
my $clean = $input
    =~ s/\s+/ /gr          # collapse runs of whitespace
    =~ s/^ | $//gr         # trim ends
    =~ s/[^\x00-\x7f]//gr; # drop non-ASCII
```

Each `s///r` returns the transformed string, which the next one
receives.
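
`/r` also fits naturally inside `map`, where a plain `s///` would modify the source array through the `$_` alias and return counts instead of strings. A minimal sketch with made-up data:

```perl
use strict;
use warnings;

my @raw = ("  Alice ", "Bob  ", " Carol");

# s///r returns the trimmed copy; @raw itself is left untouched.
my @trimmed = map { s/^\s+|\s+$//gr } @raw;
# @trimmed is ('Alice', 'Bob', 'Carol')
```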

## `/e` — evaluate the replacement

The replacement half of `s///e` is Perl code, not a
double-quoted string. The return value of the code replaces the
match:

```perl
my $x = "numbers: 1 2 3 4";
$x =~ s/(\d+)/$1 * 2/ge;
# $x is now "numbers: 2 4 6 8"
```

`/ee` evaluates twice: the code returns a string, which is then
evaluated as Perl again. Rarely useful and easy to misuse: the
second step is a string `eval`, so only reach for it when that
string is fully under your control. The
[substitution](substitution.md) chapter has worked examples.
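
A typical single-`/e` use is a lookup in the replacement. The sketch below is a hypothetical placeholder expander (the `{name}` syntax and the `%vars` hash are assumptions made for this example); unknown placeholders are left as they are.

```perl
use strict;
use warnings;

my %vars     = (name => "World", count => 3);
my $template = "Hello {name}, you have {count} messages from {sender}";

# The replacement is Perl code: look the key up, or rebuild the placeholder.
my $out = $template
    =~ s/\{(\w+)\}/exists $vars{$1} ? $vars{$1} : '{' . $1 . '}'/ger;
# $out is "Hello World, you have 3 messages from {sender}"
```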

## `/n` — non-capturing

Makes every ordinary `(…)` behave like `(?:…)`. Useful when a
long pattern has parentheses purely for grouping and you want to
keep `$1`, `$2`, … unset:

```perl
"hello" =~ /(hi|hello)/n;    # matches, but $1 is not set
```

Named captures `(?<name>…)` still capture under `/n`. Added in
Perl 5.22.
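
The interaction between `/n` and named captures can be sketched as follows, assuming Perl 5.22 or later; the variable names are illustrative.

```perl
use v5.22;
use warnings;

my ($year, $second_group);
if ("2024-05" =~ /(?<year>\d{4})-(\d{2})/n) {
    $year         = $+{year};   # named group still captures: "2024"
    $second_group = $2;         # plain (...) does not capture under /n: undef
}
```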

## `/p` — no-op

`/p` is a silent no-op. `${^PREMATCH}`, `${^MATCH}`,
`${^POSTMATCH}` are always available. Accepted for backward
compatibility, but contributes nothing in new code.

## `/o` — compile once

`/o` compiles a pattern that interpolates variables only once,
the first time it is executed; later changes to those variables
are ignored. Compiled patterns are also cached automatically, and
`qr//` gives explicit control. `/o` is rarely the right tool;
`qr//` is clearer and composes.
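
The `qr//` alternative might look like the sketch below: the pattern is built once from a variable and then reused. The word list and names are made up for illustration.

```perl
use strict;
use warnings;

my $word = "cat";

# Compile once; \Q...\E keeps any metacharacters in $word literal.
my $re = qr/\b\Q$word\E\b/i;

my @hits = grep { $_ =~ $re } ("Cat nap", "concat", "the cat");
# @hits is ('Cat nap', 'the cat')
```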

## Charset modifiers — `/a`, `/aa`, `/u`, `/l`, `/d`

These control how the shorthand classes (`\d`, `\w`, `\s`),
case folding, and the POSIX class set behave under Unicode and
locale considerations. Brief summary here; the full treatment is
in the [unicode](unicode.md) chapter.

| Modifier   | Behaviour                                           |
|------------|-----------------------------------------------------|
| `/u`       | full Unicode semantics. Default under `use v5.12+`. |
| `/a`       | restrict `\d`, `\w`, `\s`, `[[:…:]]` to ASCII.      |
| `/aa`      | `/a` plus `/i` does not cross ASCII/non-ASCII.      |
| `/l`       | follow the current `use locale` POSIX locale.       |
| `/d`       | dual-mode semantics. Avoid in new code.             |

`/a`, `/u`, `/l`, `/d` are mutually exclusive — at most one can
be in effect at a time. `/aa` is a refinement of `/a`. They are
*set-once* modifiers: an inner `(?d:…)` cannot un-set an outer
`/u`.

The everyday choice in modern code: `use v5.12` (or later)
selects `/u` automatically, which is what you want for
Unicode-correct text. Add `/a` only when the pattern must reject
non-ASCII characters — typically because the input is known to
be ASCII-only and you want to enforce that.
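
The difference between the charset modifiers shows up as soon as the input contains non-ASCII characters. A minimal sketch, using two code points chosen for illustration:

```perl
use strict;
use warnings;

my $arabic_three = "\x{0663}";   # ARABIC-INDIC DIGIT THREE

my $u_digit = ($arabic_three =~ /\d/u) ? 1 : 0;   # 1: a digit under Unicode rules
my $a_digit = ($arabic_three =~ /\d/a) ? 1 : 0;   # 0: /a restricts \d to 0-9

my $kelvin = "\x{212A}";         # KELVIN SIGN, which case-folds to "k"

my $i_u  = ($kelvin =~ /k/iu)  ? 1 : 0;   # 1: /i folds across the ASCII boundary
my $i_aa = ($kelvin =~ /k/iaa) ? 1 : 0;   # 0: /aa forbids that crossing
```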

## Inline modifiers

`(?flags)` turns flags on for the rest of the enclosing group
(or pattern):

```perl
/(?i)yes/;         # case-insensitive, same as /yes/i
```

`(?flags:…)` scopes the flags to the inner group only and is
non-capturing:

```perl
/Answer: ((?i)yes)/;   # only the 'yes' is case-insensitive
/Answer: ((?i:yes))/;  # clearer: scope is the group's contents
```

`(?-flags)` turns flags off. They can be combined:

```perl
/(?i-m:pattern)/;   # turn on /i, turn off /m, within this group
```

Inline modifiers are the right tool when different parts of a
long pattern need different modifiers. They are also useful in
patterns built by interpolation, where the embedded `(?i)`
travels with the pattern fragment.
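
The interpolation case can be sketched like this: a plain string fragment (hypothetical, named `$frag` here) carries its own `(?i:…)`, while the surrounding pattern stays case-sensitive.

```perl
use strict;
use warnings;

my $frag = '(?i:yes|no)';   # the fragment brings its own case-insensitivity

my $frag_ci   = ("Answer: YES" =~ /^Answer: $frag\z/) ? 1 : 0;   # 1
my $prefix_ci = ("ANSWER: yes" =~ /^Answer: $frag\z/) ? 1 : 0;   # 0
```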

### Caret form: `(?^…)`

`(?^flags)` is shorthand for «reset all flags, then apply these».
The expansion is `d-imnsx` followed by the listed flags. So
`(?^x:…)` means «default flags except `/x` is on, regardless of
what flags were in scope outside».

The caret form is what Perl uses when stringifying compiled
patterns; you may see it in error messages or `qr//` output. In
hand-written code it is occasionally useful when interpolating a
pattern fragment that should not inherit modifiers from its
surroundings:

```perl
my $strict = qr/(?^:foo|bar)/;   # always default flags
# ... no matter how this is interpolated.
```

A negative flag is not legal after the caret (it would be
redundant — the caret already cleared everything).
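
Both points can be observed with a short sketch: the stringified form of a `qr//` object, and the way that form shields the fragment from an outer `/i`. The stringified values shown assume no `use v5.12` or `unicode_strings` feature is in effect, since that would add a `u` to the flags.

```perl
use strict;
use warnings;

my $word = qr/foo/;
my $ci   = qr/ab/i;

my $word_str = "$word";   # "(?^:foo)"
my $ci_str   = "$ci";     # "(?^i:ab)"

# The outer /i applies to "x " but not to the interpolated qr// object.
my $upper_inner = ("x FOO" =~ /x $word/i) ? 1 : 0;   # 0
my $upper_outer = ("X foo" =~ /x $word/i) ? 1 : 0;   # 1
```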

### Mutually-exclusive flags

`/a` and `/aa` override each other (last one wins); same for
`/x` and `/xx`. They are not additive. `(?xx-x:…)` turns *all*
`x` behaviour off, not «subtract one `x` from two».

The charset family `/a`, `/d`, `/l`, `/u` is mutually exclusive;
specifying one un-specifies the others. They cannot be turned
*off* with a leading `-`, only switched between. So `(?-d:…)`
is a fatal error; `(?dl:…)` is also fatal (two charset flags
together).

`/p` is special: its presence anywhere in a pattern has a global
effect (and that effect is a no-op).
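
These rules can be checked at run time by compiling the patterns from strings inside `eval`. The sketch below is illustrative only; the hash and variable names are hypothetical.

```perl
use strict;
use warnings;

# Compile each source string at run time and record whether it succeeded.
my %compiles;
for my $src ('(?-d:a)', '(?dl:a)', '(?i-m:a)') {
    $compiles{$src} = eval { my $re = qr/$src/; 1 } ? 1 : 0;
}
# (?-d:a) and (?dl:a) fail to compile; (?i-m:a) is fine.

# (?xx-x:...) turns all x behaviour off, so the space is literal again.
my $space_literal = ("a b" =~ /^(?xx-x:a b)\z/) ? 1 : 0;   # 1
my $space_ignored = ("ab"  =~ /^(?xx-x:a b)\z/) ? 1 : 0;   # 0
```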

## Order of modifiers

On a match or substitution, the order of trailing modifiers is
not significant. `/mgis` and `/sgmi` are identical. Pick a house
style and stick to it.

## How a regex is read — the four phases

To predict how modifiers and inline forms interact with
interpolation, it helps to know that Perl reads a regex in
*phases*. The phases:

1. **Phase A**: parser identifies the delimiter, finds the end
   of the pattern. `(?#…)` comments are removed here.
2. **Phase B**: pattern is parsed as a *double-quotish string*.
   Variables interpolate, escape sequences cook, `\Q…\E`
   translates to `quotemeta`-style escaping.
3. **Phase C**: under `/x` or `/xx`, unescaped whitespace and
   `#`-comments are stripped.
4. **Phase D**: the regex compiler reads the cooked, stripped
   pattern and turns it into the engine’s internal form.

The order matters because phases B and D operate on different
representations:

- `\Q$dir\E` cooks at Phase B, before the regex compiler sees
  it. By Phase D the variable’s contents have already been
  metaquoted; the regex compiler sees a literal pattern.
- `\U…\E` is interpreted by Phase B as a string-cook directive
  (uppercase the contents), which is almost certainly *not*
  what you want inside a regex. Use `\Q` and `\E` for regex
  purposes; `\U` only when you want the pattern’s *literal text*
  to be uppercased.
- `(?#comment)` is removed before Phase B sees the pattern at
  all. A literal `#` inside `(?#…)` is fine. A literal `)` is
  not — the comment ends at the first `)`.
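
The `\Q…\E` point can be illustrated with a variable that contains regex metacharacters. A minimal sketch; `$needle` and the test strings are made up for this example.

```perl
use strict;
use warnings;

my $needle = "a.b*c";   # contains the metacharacters . and *

# Unquoted, the contents are read as a pattern: a, any char, b*, c.
my $unquoted = ("aXbbbc"  =~ /$needle/)     ? 1 : 0;   # 1

# Quoted, the contents only match themselves.
my $quoted   = ("aXbbbc"  =~ /\Q$needle\E/) ? 1 : 0;   # 0
my $literal  = ("xa.b*cy" =~ /\Q$needle\E/) ? 1 : 0;   # 1

# \Q...\E has the same effect as calling quotemeta on the string.
my $escaped = quotemeta($needle);                      # 'a\.b\*c'
```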

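The `/x` modifier applies at Phase C, after interpolation and
before compilation. That means whitespace and `#` characters
introduced by an interpolated variable *are* subject to `/x`,
exactly as if they had been typed into the pattern:

```perl
my $sub = "a b";              # contains a literal space
"ab"  =~ /^ $sub \z/x;        # matches: the space in $sub is stripped
"a b" =~ /^ $sub \z/x;        # does not match
"a b" =~ /^ \Q$sub\E \z/x;    # matches: \Q escapes the space
```

If the interpolated content must match literally under `/x`, wrap
it in `\Q…\E` or escape its whitespace at the source.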

## `use re 'strict'`

`use re 'strict'` turns a number of normally tolerated sloppy
regex constructs into compile-time errors. It is lexically
scoped. In strict mode:

- A `{` in a non-quantifier position is an error, not a literal.
- `[a-]` (dash at end of class) is an error.
- A useless negation of an always-on flag is an error.
- A `(?-d)` (trying to turn off `/d`) is an error rather than a
  warning.
- Unmatched `[` in a string is an error.

Strict mode is recommended for new regex-heavy code, especially
generated patterns where small typos go unnoticed in a normal
build. It is not default because it would break working older
patterns.

```perl
use re 'strict';

/abc{,1/;           # error in strict; warning otherwise
/[a-]/;             # error in strict
```

## See also

- The [unicode](unicode.md) chapter — `/a`, `/aa`, `/u`, `/l`, `/d`
  in detail, plus what they do to character classes and case
  folding.
- The [substitution](substitution.md) chapter — `/e`, `/r`, `/g`,
  and `/c` in their substitution-specific roles.
- The [basics](basics.md) chapter — the four-phase parsing model
  in its first appearance.
- The [anchors and assertions](anchors-and-assertions.md) chapter —
  `\A`, `\z`, `\Z` interactions with `/m`.
- The [performance](performance.md) chapter — `/o` discussion and
  why `qr//` superseded it.
- [`m`](../../p5/core/perlfunc/m.md), [`s`](../../p5/core/perlfunc/s.md),
  [`qr`](../../p5/core/perlfunc/qr.md) — operator references.