Regex binding operators#

The binding operators =~ and !~ connect a string to a regular expression operation. They do not perform the match themselves; they only say “this operation’s input is that string”.

Operator   Reads as        Use
=~         matches         bind a regex op (m//, s///, tr///)
!~         doesn’t match   same, with the boolean result negated

$str =~ /pattern/;                   # match: true if $str contains a match
$str =~ s/foo/bar/;                  # substitution: returns the number of substitutions
$str =~ tr/a-z/A-Z/;                 # transliteration: returns the number of characters matched
$str !~ /pattern/;                   # negated match: true if there is NO match

Without an explicit binding, regex ops act on $_:

$_ = "hello";
print "match\n" if /h/;              # implicit $_ binding

=~ and !~ are how you redirect that input to a different variable.
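
As a minimal sketch of the difference (the array and variable names here are illustrative, not taken from any particular program), the same filter can be written with the implicit $_ binding or with an explicit one:

```perl
use strict;
use warnings;

my @lines = ("alpha 1", "beta", "gamma 3");   # hypothetical sample data

# Implicit binding: the loop aliases each element to $_, and m// reads $_.
my @implicit;
for (@lines) {
    push @implicit, $_ if /\d/;
}

# Explicit binding: =~ points the same match at a named variable instead.
my @explicit;
for my $line (@lines) {
    push @explicit, $line if $line =~ /\d/;
}
```

Both loops should select the same two elements; the explicit form tends to be easier to follow once the loop body grows.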

What they actually return#

=~ returns whatever the regex op on the right would return:

  • m// — boolean (true on match, false on no match) in scalar context; the captured groups in list context.

  • s/// — the number of substitutions performed (boolean-true when ≥ 1), or a false value when nothing was substituted.

  • tr/// — the number of characters matched by the search list (that is, replaced or deleted).

!~ returns the boolean negation, regardless of the underlying op. It is mostly used with m//:

print "no digit"   if $str !~ /\d/;
my $n = ($str =~ s/foo/bar/g);               # $n: number of substitutions
print "$n changes" if $n;
my @hits           = $str =~ /(\w+)/g;       # list context: captures
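
The return values described above can be checked side by side. This is a sketch with made-up sample data and hypothetical variable names; the ternaries only normalise true/false to 1/0 for readability:

```perl
use strict;
use warnings;

my $str = "foo 12 foo 345";   # hypothetical sample input

# m// in scalar context: a true/false value.
my $matched = ($str =~ /\d+/) ? 1 : 0;

# m// in list context: the captured groups of the first match.
my ($first, $second) = $str =~ /(\d+)\D+(\d+)/;

# m//g in list context: the capture from every match.
my @numbers = $str =~ /(\d+)/g;

# s///g: the number of substitutions made (performed on a copy here).
my $copy = $str;
my $subs = ($copy =~ s/foo/bar/g);

# tr/// with an empty replacement list only counts; $str is left unchanged.
my $digits = ($str =~ tr/0-9//);

# !~: the boolean negation of the match.
my $no_upper = ($str !~ /[A-Z]/) ? 1 : 0;
```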

Three op partners#

=~ accepts three regex operations on its right:

  • m// — match. The m is optional when the delimiters are slashes: $s =~ /pattern/ and $s =~ m{pattern} both work.

  • s/// — substitution. Three pieces: pattern, replacement, flags. Returns the count of replacements.

  • tr/// (also spellable y///) — transliteration. Replaces characters one-for-one, mapping each character in the search list to the character at the same position in the replacement list. Returns the count of characters matched.

$line =~ /(\d+)/;             # capture the first run of digits in $1
$line =~ s/^\s+//;            # strip leading whitespace
$line =~ s/\s+/ /g;           # collapse each run of whitespace to a single space
$line =~ tr/A-Z/a-z/;         # ASCII lowercase

!~ only with m//#

!~ is meaningful only with the match operation. Substitution and transliteration return counts, so “no changes made” is a meaningful zero rather than a “didn’t match” boolean. Perl will let you write $s !~ s/.../.../, but you almost never want it: the substitution still runs, and the result, !(s///), is “true if zero substitutions”, which reads strangely. Stick to !~ /.../.
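
A short sketch of why the combination reads badly, next to a spelling that keeps the count visible. The variable names are hypothetical:

```perl
use strict;
use warnings;

my $s = "foo foo";

# !~ with s/// still performs the substitution, then negates the count.
my $unchanged = ($s !~ s/foo/bar/g) ? 1 : 0;   # 0, yet $s has been modified

# Clearer: keep the count from =~ and test it explicitly.
my $t = "foo foo";
my $count = ($t =~ s/foo/bar/g);
my $t_unchanged = $count ? 0 : 1;
```

The second form makes it obvious that a substitution is happening and how many replacements were made.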

Lvalue vs rvalue#

=~ itself does not assign. It only points the regex op on its right at the string on its left. The op itself may then mutate that string (s/// and tr/// do, m// does not), but the mutation comes from the op, not from =~:

my $s = "hello";
$s =~ s/l/L/g;          # mutates $s — now "heLLo"
my $n = $s =~ /(\w+)/;  # does NOT mutate $s; $n is the boolean result

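The string on the left of =~ must be modifiable for s/// and tr///. A string literal is rejected at compile time with “Can’t modify constant item in substitution (s///)”, and a read-only scalar such as $1 (a regex capture variable) fails at run time with “Modification of a read-only value attempted”:

"hello" =~ s/l/L/g;     # compile-time error: a literal cannot be modified
$1 =~ s/x/y/;           # FATAL at run time: capture variable is read-only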

Copy first if you need to modify a read-only source:

(my $copy = $1) =~ s/x/y/;     # idiom for "modify a copy of $1"
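
A sketch of the copy idiom in context, with hypothetical input and variable names. The last two lines assume Perl 5.14 or later, where the /r flag makes s/// return the modified string instead of changing its left-hand side; that flag is an addition here, not something the text above relies on:

```perl
use strict;
use warnings;

my $line = "id=x1x2";   # hypothetical input
my $copied;

if ($line =~ /=(\w+)/) {
    # The parenthesised assignment is an lvalue, so s/// edits $copied, not $1.
    ($copied = $1) =~ s/x/y/g;
}

# Assumes Perl 5.14+: /r leaves the source untouched and returns the result.
my $original = "x1x2";
my $returned = $original =~ s/x/y/gr;
```

Note that a successful s/// resets the capture variables, so $1 should be copied before any further regex op runs in the same scope.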

Precedence#

=~ and !~ sit at row 6 of the precedence table — quite tight, between unary and the multiplicative operators. This is why you can write $s =~ /foo/ && $t =~ /bar/ without parens around either match.
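
The consequences of that position can be sketched as follows (hypothetical variables; the ternaries only normalise results to 1/0). Operators that bind more loosely than =~, such as && and the concatenation dot, need no parentheses, while unary ! binds more tightly and therefore grabs the left operand first:

```perl
use strict;
use warnings;

my ($s, $t) = ("foobar", "bazbar");

# =~ binds tighter than &&: parsed as ($s =~ /foo/) && ($t =~ /bar/).
my $both = ($s =~ /foo/ && $t =~ /bar/) ? 1 : 0;

# =~ binds tighter than concatenation: parsed as "no" . ($t =~ /bar/).
my $joined = "no" . $t =~ /bar/;

# Unary ! binds tighter than =~: parsed as (!$u) =~ /foo/, which tests
# the empty string, not $u.
my $u = "bar";
my $wrong = (!$u =~ /foo/) ? 1 : 0;

# Use !~ (or explicit parentheses) to negate the match itself.
my $right = ($u !~ /foo/) ? 1 : 0;
```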

Tutorial cross-reference#

The boolean-logic tutorial covers regex-as-set-algebra in its applications chapter: alternation as union, lookarounds as intersection and complement.

The full regex language reference lives in the regex guide.

See also#