---
name: m//
signature: 'm/PATTERN/msixpodualngc'
signatures:
  - 'm/PATTERN/msixpodualngc'
  - '/PATTERN/msixpodualngc'
  - 'm?PATTERN?msixpodualngc'
since: 5.0
status: documented
categories: ["Regular expressions and pattern matching"]
---

```{index} single: m//; Perl built-in
```

*[Regular expressions and pattern matching](../perlfunc-by-category)*

# m//

Search a string for a pattern and report whether — and what — it matched.

`m//` is the match operator. It compiles `PATTERN` as a regular expression (see [`perlre`](../perlre)), runs it against a target string, and returns a value shaped by calling context and by the modifiers you apply. The target is whatever sits on the left of [`=~`](../perlop) or [`!~`](../perlop); without a binding operator the target is [`$_`](../perlvar). The leading `m` is optional when the delimiter is `/`, so `/PATTERN/` and `m/PATTERN/` mean the same thing.

## Synopsis

```perl
$str =~ m/PATTERN/flags
$str =~ /PATTERN/flags
m/PATTERN/flags            # target is $_
/PATTERN/flags             # target is $_
$str =~ m{PATTERN}flags    # any paired non-word delimiters
```

## What you get back

Context decides the shape of the return value.

- **Scalar context, no `/g`**: `1` on match, the empty string on no match. Both are usable as booleans; the empty string is a dual-value false that also equals `0` numerically.
- **List context, no `/g`**: the list of capture values `($1, $2, $3, …)` on a successful match; if the pattern has no capture groups, the singleton `(1)`; on failure, the empty list. This is how `if (my ($x, $y) = $s =~ /(\w+)=(\w+)/)` works.
- **Scalar context, `/g`**: each call advances through the string, returning true for the next match and false once there are no more. [`pos`](pos) on the target tracks where the next attempt will begin.
- **List context, `/g`**: every match in one shot. With capture groups, the flat list of all captures from every match. Without captures, the list of every full match.

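The four return shapes listed above can be seen side by side in a small sketch. The sample string and all variable names here are made up for illustration.

```perl
use strict;
use warnings;

my $s = "a=1 b=22";    # hypothetical sample input

# Scalar context, no /g: 1 on match, empty string on failure.
my $hit  = $s =~ /b=/;
my $miss = $s =~ /zzz/;

# List context, no /g: the captures of the first match.
my ($key, $val) = $s =~ /(\w+)=(\d+)/;

# List context, no /g, no capture groups: the singleton (1).
my @flag = $s =~ /\d+/;

# List context with /g: every capture from every match, flattened.
my @pairs = $s =~ /(\w+)=(\d+)/g;

# List context with /g and no capture groups: every full match.
my @nums = $s =~ /\d+/g;

# Scalar context with /g: one match per call; pos() moves forward each time.
my @ends;
while ($s =~ /\d+/g) {
    push @ends, pos($s);
}

printf "hit=%s miss='%s' key=%s val=%s\n", $hit, $miss, $key, $val;
print "flag=@flag pairs=@pairs nums=@nums ends=@ends\n";
```

The list-context `/g` matches run to completion, which leaves [`pos`](pos) reset, so the `while` loop starts scanning from the beginning of the string.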
Successful matches also populate the regex special variables ([`$1`](../perlvar) through [`$9`](../perlvar), [`$&`](../perlvar), [`` $` ``](../perlvar), [`$'`](../perlvar), [`$+`](../perlvar), [`%+`](../perlvar), [`%-`](../perlvar)) for the enclosing dynamic scope. A failed match leaves them holding their previous values — always test the match itself, never a capture variable, to decide whether a match happened.

## Global state it touches

- [`$_`](../perlvar) — the default target when no [`=~`](../perlop) binding is given.
- [`$1`](../perlvar), [`$2`](../perlvar), … — numbered captures, set on success, unchanged on failure.
- [`$&`](../perlvar), [`` $` ``](../perlvar), [`$'`](../perlvar) — match, prematch, postmatch.
- [`$+`](../perlvar) — the highest-numbered capture that actually matched (useful with alternations).
- [`%+`](../perlvar), [`%-`](../perlvar) — named-capture hashes.
- [`${^LAST_SUCCESSFUL_PATTERN}`](../perlvar) — the last pattern that matched in the current dynamic scope; also the pattern the empty form `m//` reuses (see *Edge cases*).
- [`pos`](pos) on the target string — read and updated by `/g` matching; reset on failure unless `/c` is also set.
- Locale / Unicode rule sources when `/l`, `/u`, or `/d` are in effect.

## Delimiters

With `m`, any pair of non-whitespace characters works as the delimiter, and bracketing pairs nest:

```perl
m/pattern/
m{pattern}
m[pattern]
m(pattern)
m<pattern>
m!pattern!
m#pattern#
m,pattern,
```

Picking a delimiter that does not appear in the pattern avoids backslash-clutter — known as LTS, *leaning toothpick syndrome*. A path-matching pattern reads cleanly with `m{…}` or `m!…!` and badly with `m/…/`.

Two delimiter choices change semantics:

- **`'` (single quote)** — no variable interpolation inside `PATTERN`. `m'$foo'` uses the literal text `$foo` as the pattern instead of the value of the variable `$foo`.
- **`?`** — `m?PATTERN?` matches only once between calls to [`reset`](reset). The leading `m` is mandatory; since Perl 5.22 the bare `?…?` form is a syntax error.

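A minimal sketch of the two semantics-changing delimiters described above. The sub `first_a_word` and the sample data are hypothetical names chosen for the illustration.

```perl
use strict;
use warnings;

my $foo = "bar";

# With ordinary delimiters the pattern text comes from the value of $foo.
my $interp = "bar" =~ m/$foo/ ? 1 : 0;

# With ' as the delimiter nothing is interpolated, so the value "bar"
# never reaches the pattern and this match fails.
my $literal = "bar" =~ m'$foo' ? 1 : 0;

# The same applies to @: no array is looked up here.
my $at = 'user@list' =~ m'@list' ? 1 : 0;

# m?...? succeeds only once until reset() is called in the same package.
sub first_a_word {
    my @hits;
    for my $w (@_) {
        push @hits, $w if $w =~ m?^a?;
    }
    return @hits;
}

my @words  = qw(alpha beta apple);
my @first  = first_a_word(@words);   # the first match wins
my @second = first_a_word(@words);   # already used up, nothing matches
reset;                               # re-arm m?? for this package
my @third  = first_a_word(@words);

print "interp=$interp literal=$literal at=$at\n";
print "first=@first second=", scalar(@second), " third=@third\n";
```

Because the once-only state belongs to the `m??` operator itself, it persists across calls to the sub until `reset` clears it.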
When the delimiter is a word character (a letter or digit), a space is required after `m`: `m q foo q` is legal, `mqfooq` is not.

## Modifiers

Pattern-compile modifiers (also accepted by [`qr`](qr), [`s`](s), and [`split`](split)):

- **`m`** — multi-line: `^` and `$` match at every embedded newline, not only at string ends.
- **`s`** — single-line: `.` matches every character including newline.
- **`i`** — case-insensitive matching.
- **`x`** — ignore whitespace and `#`-comments in the pattern; `xx` extends this into character classes.
- **`p`** — preserve copies of the matched string. Since 5.20 this is a no-op — [`${^PREMATCH}`](../perlvar), [`${^MATCH}`](../perlvar), [`${^POSTMATCH}`](../perlvar) are always available after a successful match.
- **`a`**, **`u`**, **`l`**, **`d`** — character-set rules for `\d`, `\s`, `\w`, and the POSIX classes. `/a` restricts them to ASCII; `/aa` additionally forbids ASCII/non-ASCII matching under `/i`.
- **`n`** — non-capturing: `(…)` behaves like `(?:…)` and does not populate `$1`, `$2`, ….
- **`o`** — compile the pattern exactly once even if interpolated variables change. Almost always the wrong tool; use [`qr`](qr) to build a reusable compiled pattern instead.

Match-process modifiers (specific to `m//` and [`s///`](s)):

- **`g`** — global matching. Scalar-context behavior is iterative (advances [`pos`](pos) on each call); list-context behavior returns every match at once.
- **`c`** — only meaningful with `/g`. A failed `/g` match keeps [`pos`](pos) where it was instead of resetting to the start; required for `lex`-style scanners built around `\G`.

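The following sketch exercises several of the modifiers above on made-up input. Variable names are illustrative only.

```perl
use strict;
use warnings;

my $text = "first line\nSecond line\n";

# /m: ^ also matches after an embedded newline, so both lines are found.
my @starts = $text =~ /^(\w+)/mg;

# /s: . matches "\n" too, so .* can run across the line break.
my $spans = $text =~ /first.*Second/s ? 1 : 0;
my $plain = $text =~ /first.*Second/  ? 1 : 0;

# /i: case-insensitive matching.
my $ci = $text =~ /SECOND/i ? 1 : 0;

# /x: whitespace and #-comments inside the pattern are ignored.
my ($word) = $text =~ m{
    (\w+) \s+ line \n \z    # the word before the final "line"
}x;

# /n: parentheses only group, so list context yields (1), not captures.
my @n = "2024-05" =~ /(\d+)-(\d+)/n;

# /g and /c: a failed /gc match keeps pos(); a failed /g match resets it.
my $s = "abc123";
$s =~ /[a-z]+/g;            # succeeds, pos($s) is now 3
$s =~ /[A-Z]+/gc;           # fails, /c keeps pos($s) at 3
my $kept = pos($s);
$s =~ /[A-Z]+/g;            # fails without /c, pos($s) is reset
my $after = pos($s);        # undef

print "starts=@starts spans=$spans plain=$plain ci=$ci word=$word\n";
print "n=@n kept=$kept after=", (defined $after ? $after : 'undef'), "\n";
```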
## Examples

Test whether a string contains a pattern:

```perl
if ($line =~ /error/i) {
    warn "matched: $line";
}
```

Bind a capture in one go:

```perl
if (my ($key, $val) = $line =~ /^(\w+)\s*=\s*(.*)$/) {
    $config{$key} = $val;
}
```

Pull every number out of a string with list-context `/g`:

```perl
my @nums = "x=1 y=22 z=333" =~ /(\d+)/g;
# @nums = (1, 22, 333)
```

Iterate matches one at a time with scalar-context `/g`, using [`pos`](pos) to see where the engine is:

```perl
my $s = "foo 1 bar 22 baz 333";
while ($s =~ /(\d+)/g) {
    printf "matched %s at offset %d\n", $1, pos($s) - length($1);
}
```

Extended form with the `x` modifier and named captures:

```perl
if ($ts =~ m{
        ^ (?<year> \d{4}) - (?<mon> \d{2}) - (?<day> \d{2}) $
    }x) {
    printf "year=%s mon=%s day=%s\n", $+{year}, $+{mon}, $+{day};
}
```

Avoid LTS by picking a delimiter that does not appear in the pattern:

```perl
next if $path =~ m{^/usr/local/};
```

Use `\G` with `m//gc` to walk a string token-by-token without losing position on a failed arm:

```perl
while (1) {
    if    ($s =~ /\G(\d+)/gc) { push @tok, ['num',  $1] }
    elsif ($s =~ /\G(\w+)/gc) { push @tok, ['word', $1] }
    elsif ($s =~ /\G(\s+)/gc) { next }
    else                      { last }
}
```

## Edge cases

- **Empty pattern**: `//` and `m//` reuse the last successfully matched pattern in the current dynamic scope. If nothing has matched yet, an empty pattern matches everywhere. Passing user input straight into `m/$pat/` when `$pat` might be empty is a sharp edge — wrap it in a non-capturing group: `m/(?:$pat)/`. The last successful pattern is also readable as [`${^LAST_SUCCESSFUL_PATTERN}`](../perlvar).
- **Defined-or ambiguity**: Perl resolves `$x // $y` as the defined-or operator, never as two empty matches. In pathological positions (`print $fh //`) Perl still assumes defined-or; force a match by writing `m//` explicitly or spacing out the delimiters.
- **Failed match leaves captures stale**: `$1` after a failing `/…/` still holds the capture from the *previous* successful match — always gate capture use on the match result.
- **`/o` with changing variables**: `m/$x/o` locks the first value of `$x` into the compiled pattern. Later changes to `$x` are silently ignored. Reach for [`qr`](qr) instead when you want an explicit, reusable compilation.
- **Interpolation when the delimiter is `'`**: `m'$var'` hands the text `$var` to the regex engine uninterpolated. This is rarely what you want; to match text literally, `\Q…\E` or [`quotemeta`](quotemeta) works with any other delimiter.
- **`/g` plus target modification**: modifying the target between `/g` iterations resets [`pos`](pos) to the start. Iterate over a copy if you need to mutate the original as you go.
- **`\G` outside `/g`**: without `/g`, `\G` anchors at the [`pos`](pos) the target had at call time and matches at most once. On a string that has never had a `/g` applied, `\G` is equivalent to `\A`.
- **`m?…?` reset scope**: `reset` clears `m??` state only for the current package. A `m??` in one package is not affected by `reset` called from another.
- **Defined-or assignment near an empty regex**: `$x //= 1` is always the defined-or assignment; if you genuinely want the empty regex, write `m//` rather than `//`.

## Differences from upstream

Fully compatible with upstream Perl 5.42.

## See also

- [`qr`](qr) — compile a pattern once and reuse it; avoids `/o` and keeps the pattern a first-class value
- [`s`](s) — same pattern syntax, replaces what it matches
- [`tr`](tr) — character-by-character translation; a different tool with a superficially similar shape
- [`split`](split) — when you want the pieces between matches rather than the matches themselves
- [`pos`](pos) — read or set the position `/g` matching resumes from
- [`perlre`](../perlre) — the regex language itself: assertions, character classes, backreferences, named captures