---
name: regex basics
---
# Basics

The smallest useful regexp is a plain string. `"Hello World" =~
/World/` asks: does the string on the left contain the pattern on the
right? It does, so the expression is true.

```perl
if ("Hello World" =~ /World/) {
    print "matched\n";
}
```

The two slashes delimit the pattern. The `=~` operator binds the
pattern to the string you want to test. Without a binding operator,
Perl applies the pattern to `$_` instead.

## The match operator

The long form is `m//`:

```perl
"Hello World" =~ m/World/;
"Hello World" =~ m!World!;    # alternate delimiters
"Hello World" =~ m{World};    # paired delimiters
```

With `m` you can pick almost any delimiter. That matters when the
pattern itself contains the default delimiter `/`. Compare:

```perl
"/usr/bin/perl" =~ /\/usr\/bin\/perl/;  # "leaning toothpick syndrome"
"/usr/bin/perl" =~ m!/usr/bin/perl!;    # clearer
```

Paired delimiters (`{}`, `()`, `[]`, `<>`) nest: a balanced pair of the
same brackets inside the pattern does not end it early, so it needs no
escaping. An unbalanced delimiter character must still be backslashed.

The `m` may be omitted only when the delimiters are slashes: `/pat/`.
With any other delimiter the leading `m` is required: `m{pat}`, not
`{pat}`.
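
As a small illustration of nesting, here is a sketch with made-up sample strings and variable names. In both matches a balanced pair of the delimiter characters appears inside the pattern without a backslash:

```perl
use strict;
use warnings;

# The inner {3} is a quantifier ("exactly three digits"). Because braces
# nest, it does not end the m{...} early and needs no backslash.
my $has_three_digits = "abc123" =~ m{\d{3}} ? 1 : 0;

# The inner <b> is literal text; the balanced angle brackets nest
# inside the m<...> delimiters.
my $has_tag = "<b>bold</b>" =~ m<<b>> ? 1 : 0;

print "three digits: $has_three_digits\n";
print "tag found: $has_tag\n";
```

Whether this reads better than `/<b>/` is a matter of taste; the point is only that balanced pairs are accepted as-is.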

## Binding: =~ and !~

`=~` asks "does it match?". `!~` asks "does it fail to match?".

```perl
my $s = "Hello World";

print "yes\n" if $s =~ /World/;   # prints "yes": the pattern matches
print "no\n"  if $s !~ /planet/;  # prints "no": the pattern does not match
```

`!~` is not a separate regexp construct — it is the negated binding.
It is equivalent to `not ($s =~ /pat/)`.
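
The equivalence can be checked directly. This sketch runs both forms over a few arbitrary sample strings (the strings and variable names are invented for the example) and records whether they ever disagree:

```perl
use strict;
use warnings;

my @samples = ("Hello World", "Hello planet", "");  # arbitrary test strings
my $agree = 1;

for my $s (@samples) {
    # Normalise both results to 1 or 0 so they can be compared.
    my $negated_binding = ($s !~ /World/)     ? 1 : 0;
    my $explicit_not    = (not $s =~ /World/) ? 1 : 0;
    $agree = 0 if $negated_binding != $explicit_not;
}

print $agree ? "both forms agree\n" : "forms disagree\n";
```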

## Matching against $_

If you omit the binding, the match is against `$_`:

```perl
for ("cat", "dog", "bird") {
    print "has an 'o'\n" if /o/;   # implicit: $_ =~ /o/
}
```

This is idiomatic in `while (<>)` loops, inside `grep` and `map`,
and inside `for` loops that set `$_`.
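
A short sketch of the `grep` and `map` cases, reusing the animal list from the loop above (the variable names are illustrative only):

```perl
use strict;
use warnings;

my @animals = ("cat", "dog", "bird");

# grep sets $_ to each element in turn, so the bare /o/ tests that element.
my @with_o = grep { /o/ } @animals;

# map also sets $_; here the match result becomes a yes/no label.
my @labels = map { my $mark = /o/ ? "yes" : "no"; "$_: $mark" } @animals;

print "@with_o\n";
print "$_\n" for @labels;
```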

## Case sensitivity and the default anchor

Matches are case-sensitive and unanchored:

```perl
"Hello" =~ /hello/;    # does not match — case differs
"Hello" =~ /ell/;      # matches — inside the string is fine
```

To match case-insensitively, add the `i` modifier after the closing
delimiter (`/hello/i`). To constrain the match to the start or end of
the string, use anchors. Both are covered in their own chapters.

When a pattern could match at several positions, Perl tries from the
left and takes the first one that works:

```perl
"That hat is red" =~ /hat/;   # matches 'hat' in 'That', not in 'hat'
```
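
One way to confirm which `hat` was chosen is to look at where the match began. The sketch below uses the built-in `@-` array (documented in `perlvar`, and not otherwise used in this chapter), whose first element is the offset at which the last successful match started:

```perl
use strict;
use warnings;

my $text  = "That hat is red";
my $start = -1;   # stays -1 if the pattern does not match

if ($text =~ /hat/) {
    # $-[0] holds the offset where the last successful match began.
    $start = $-[0];
}

print "match starts at offset $start\n";
```

Offsets count from zero, so an offset of 1 is the `h` in `That`; the standalone word `hat` would begin at offset 5.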

## Metacharacters

Most characters in a pattern match themselves. These do not:

    { } [ ] ( ) ^ $ . | * + ? - # \

Each has a special meaning covered later. To match a literal copy of
one, put a backslash in front:

```perl
"2+2=4" =~ /2+2/;    # fails — '+' is a quantifier, needs escaping
"2+2=4" =~ /2\+2/;   # matches

"end." =~ /end\./;   # matches a literal dot
"end." =~ /end./;    # also matches — but . matches any character,
                     # so this would also match "endx", "end ", etc.
```

The backslash itself is a metacharacter, so a literal backslash in a
pattern needs `\\`:

```perl
'C:\WIN32' =~ /C:\\WIN/;    # matches
```

Some metacharacters are special only in a particular context and match
themselves elsewhere. `}`, for example, only closes a `{…}` quantifier;
outside that context it is a literal `}`. This is convenient but easy
to misread; the experimental `use re 'strict'` (Perl 5.22 and later)
catches many such cases.

## Escape sequences

Non-printing characters use the same escapes as in double-quoted
strings:

| Sequence | Matches                            |
|----------|------------------------------------|
| `\t`     | tab                                |
| `\n`     | newline                            |
| `\r`     | carriage return                    |
| `\f`     | form feed                          |
| `\e`     | escape (`\x1B`)                    |
| `\0`     | NUL byte                           |
| `\xHH`   | character with hex value HH        |
| `\x{…}`  | code point given in hex            |
| `\o{…}`  | code point given in octal          |
| `\cX`    | control-X (e.g. `\cI` is tab)      |

```perl
"1000\t2000" =~ /0\t2/;      # matches
"a\x{263a}b" =~ /\x{263a}/;  # matches U+263A, WHITE SMILING FACE
```
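
A few more rows of the table in action. This is a sketch with invented sample strings; note that `\o{…}` assumes Perl 5.14 or later:

```perl
use strict;
use warnings;

# \x41 is "A" (hex 41) and \o{102} is "B" (octal 102 = decimal 66).
my $hex_and_octal = "AB" =~ /\x41\o{102}/ ? 1 : 0;

# \cI is control-I, which is the tab character.
my $control_char = "a\tb" =~ /a\cIb/ ? 1 : 0;

# \e is the escape character that starts ANSI terminal sequences;
# the literal [ must be backslashed because it is a metacharacter.
my $escape_char = "\e[0m" =~ /\e\[0m/ ? 1 : 0;

print "hex/octal: $hex_and_octal, control: $control_char, escape: $escape_char\n";
```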

## Variables in patterns

A pattern is (by default) interpolated like a double-quoted string,
so variables are substituted before matching:

```perl
my $word = "house";
"housecat" =~ /$word/;       # matches
"housecat" =~ /${word}cat/;  # matches — braces disambiguate
```

To match a literal `$` or `@`, escape it:

```perl
'price: $10' =~ /\$10/;      # matches a literal dollar sign
```

If a user-supplied string will be interpolated into a pattern and you
want its metacharacters treated literally, use
[`quotemeta`](../../p5/core/perlfunc/quotemeta) — or its in-pattern
equivalent `\Q…\E`:

```perl
my $input = "1+1";
"1+1=2" =~ /\Q$input\E/;     # matches the literal string
```

Without `\Q…\E` the `+` would be read as a quantifier.
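
The three variants can be compared side by side. In this sketch `$input` stands in for user-supplied text, and the other variable names are made up for the example:

```perl
use strict;
use warnings;

my $input = "1+1";    # stands in for text supplied by a user
my $text  = "1+1=2";

# Interpolated raw, the pattern is /1+1/: one or more "1" followed by "1".
my $raw_matches = $text =~ /$input/ ? 1 : 0;

# quotemeta backslashes every non-word character before interpolation.
my $escaped = quotemeta($input);
my $quotemeta_matches = $text =~ /$escaped/ ? 1 : 0;

# \Q...\E applies the same escaping inline.
my $inline_matches = $text =~ /\Q$input\E/ ? 1 : 0;

print "escaped pattern: $escaped\n";
print "raw: $raw_matches, quotemeta: $quotemeta_matches, inline: $inline_matches\n";
```

The raw pattern fails because `"1+1=2"` contains no two adjacent `1` characters; both escaped forms match the literal text.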

## Substitution at a glance

Replacing text uses the `s///` operator, which takes a pattern and a
replacement string:

```perl
my $x = "feed the cat";
$x =~ s/cat/dog/;            # $x is now "feed the dog"
```

Substitution is covered in depth in its own chapter; it is mentioned
here so you can combine it with the facts above. Almost everything
that applies to `m//` patterns applies inside `s///` patterns too.
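
As a rough sketch of that carry-over (the sample values are invented), alternate delimiters, metacharacter escaping, and the implicit `$_` all behave in `s///` as they do in a match:

```perl
use strict;
use warnings;

# Paired delimiters avoid escaping the slashes in a path.
my $path = "/usr/bin/perl";
$path =~ s{/usr/bin}{/usr/local/bin};

# Metacharacters in the pattern half still need a backslash.
my $sum = "2+2=4";
$sum =~ s/2\+2/4/;

# With no binding operator, s/// works on $_, here each array element.
my @words = ("cat", "dog");
s/o/0/ for @words;

print "$path\n$sum\n@words\n";
```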

## Where to go next

Literal matches get you surprisingly far, but most real regexps also
use character classes, anchors, or quantifiers. Character classes come
next: they let one position in the pattern accept any of several
characters.

## See also

- [`m`](../../p5/core/perlfunc/m) — full reference for the match
  operator.
- [`s`](../../p5/core/perlfunc/s) — full reference for substitution.
- [`quotemeta`](../../p5/core/perlfunc/quotemeta) — escape a string
  for safe pattern interpolation.
- [`perlre`](../../p5/core/perlre) — complete regexp syntax.