# Substitution

`s///` is the search-and-replace operator. Its general form is

```default
$target =~ s/pattern/replacement/modifiers;
```

It finds the first position where *pattern* matches in *target*, replaces that match with *replacement*, and returns the number of substitutions made. With `/g`, it does that for every non-overlapping match.

```perl
my $x = "Time to feed the cat!";
$x =~ s/cat/hacker/;
# $x is now "Time to feed the hacker!"
```

If the pattern does not match, the target is unchanged and `s///` returns a false value.

## Operating on `$_`

Like `m//`, `s///` defaults to `$_` when no `=~` is given:

```perl
for (@lines) {
    s/\s+$//;    # trim trailing whitespace from each line
}
```

## Counting substitutions

The return value of `s///` in scalar context is the number of replacements:

```perl
my $count = ($text =~ s/\btodo\b/done/gi);
print "cleared $count todos\n";
```

When nothing matches, the return value is false (an empty string that also counts as numeric zero), so the same result doubles as a boolean test.

## The replacement string

The replacement is a double-quoted string, so variables and escape sequences interpolate:

```perl
my $suffix = ".bak";
$file =~ s/$/$suffix/;    # append .bak to the name
$x =~ s/\t/    /g;        # each tab becomes four spaces
```

Inside the replacement, `$1`, `$2`, `$&`, `$+` and friends refer to what the *current* match just captured:

```perl
my $y = "'quoted words'";
$y =~ s/^'(.*)'$/$1/;    # strip surrounding single quotes
# $y is now "quoted words"
```

Named captures work the same way through `$+{name}`:

```perl
my $date = "2026-04-23";    # s/// needs a modifiable target, not a string literal
$date =~ s/(?<y>\d{4})-(?<m>\d{2})-(?<d>\d{2})/$+{d}\/$+{m}\/$+{y}/;
# $date is now "23/04/2026"
```

## Escaping in the replacement

The replacement is interpreted as a double-quoted string, so the double-quote rules apply to *it*, independently of the pattern.
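A minimal sketch of that independence, using a made-up version string (the variable names `$version` and `$spaced` are illustrative, not from the text above): the dot has to be escaped in the pattern to be literal, while dots in the replacement are literal already.

```perl
use strict;
use warnings;

my $version = "1.2.3";    # hypothetical sample input

# Pattern side: an unescaped '.' would match any character, so escape it.
# Replacement side: '.' has no special meaning in a double-quoted string.
(my $spaced = $version) =~ s/\./ . /g;

print "$spaced\n";        # prints: 1 . 2 . 3
```
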
Regexp metacharacters such as `*`, `+` and `?` have no special meaning there and are taken literally; only `$` and `@` trigger interpolation:

```perl
$x =~ s/(\w+)/[$1]/g;    # wrap every word in brackets
$x =~ s/\$/USD/g;        # replace literal '$' with 'USD'
```

Backslash escape sequences in the replacement follow the double-quote rules: `\t`, `\n`, `\x{…}` all work. The special case-modification escapes `\l`, `\u`, `\L`, `\U`, `\E`, `\Q` apply in the replacement too:

```perl
$x =~ s/(\w+)/\u$1/g;      # capitalise the first letter of each word
$x =~ s/(\w+)/\U$1\E/g;    # uppercase whole word
```

### Warning: `\1` versus `$1` in the replacement

A long-grandfathered idiom is to write `\1`, `\2`, … in the replacement to mean «the first capture», «the second capture»:

```perl
$pattern =~ s/(\W)/\\\1/g;    # works, but is a trap
```

Perl accepts `\1` through `\9` in the replacement of `s///` for historical compatibility with `sed`. **Use `$1` through `$9` instead.**

The reason is that the replacement is a double-quoted string, where `\1` *normally* means a control-A character. The substitution operator’s special-case rule kludges that double-quote interpretation to mean «first capture», but the edges are sharp.

Specifically: combining `\1` with `/e` loses the special case. Under `/e`, the replacement is *Perl code*, and in Perl code `\1` is not a capture: a bare `\1` is a reference to the constant `1`, and `"\1"` is control-A:

```perl
s/(\d+)/ \1 + 1 /eg;    # wrong: \1 is not the first capture here
```

A second sharp edge: `\1` followed by digit characters is ambiguous. `\1000` could mean «first capture, then literal `000`», or it could mean the octal escape `\100` (0x40, `@`) followed by a literal `0`, and you cannot disambiguate by writing `\{1}000`:

```perl
s/(\d+)/\1000/;      # ambiguous, best avoided
s/(\d+)/${1}000/;    # unambiguous
```

The rule for new code: **always use `$1` in the replacement, never `\1`**. The `\1` form is grandfathered, not recommended.

## `/g`: replace all

Without `/g`, `s///` replaces only the first match.
With `/g`, it replaces every non-overlapping match from left to right:

```perl
my $x = "I batted 4 for 4";
$x =~ s/4/four/;     # "I batted four for 4"

$x = "I batted 4 for 4";
$x =~ s/4/four/g;    # "I batted four for four"
```

`/g` in substitution does not behave like the scalar-context iterator of `m//g`. Here it just means «do this again and again until the pattern stops matching.»

### Zero-length matches and `s///g`

A pattern that can match the empty string presents a problem under `/g`: every position would match again forever. Perl breaks this loop with a specific rule: **after a zero-length match, the next attempt at the same position is forbidden to also be zero-length**. The engine takes the second-best match instead.

The cleanest demonstration:

```perl
$_ = 'bar';
s/\w??/<$&>/g;
# Result: <><b><><a><><r><>
```

The non-greedy `\w??` prefers the empty match. After producing one, the next iteration is forbidden from being zero-length too, so the engine takes the second-best match: a single `\w` character. Then another empty match, then another character, and so on, alternating until the string is consumed.

The pattern: zero-length matches alternate with one-character matches, and the substitution does not loop forever.

A related pitfall: `s/(\d{3})/$1,/g` does not insert thousands separators *correctly*. It works from the left, putting a comma after the first three digits, then after the next three, so the groups are aligned to the wrong end of the number. To insert thousands separators, match from the right or use `1 while`:

```perl
my $n = "1234567";
1 while $n =~ s/^(\d+)(\d{3})/$1,$2/;
# $n is now "1,234,567"
```

The `1 while` repeats until the substitution returns false (no more matches), which is exactly what we want for «every position, after settling».

## `/e`: evaluate the replacement

Under `/e`, the replacement is Perl code, not a string. The code runs for every match and its return value replaces the match.
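One way to see that the code really does run once per match is to give it a side effect. The sketch below is illustrative only: the counter `$n` and the sample string are assumptions, not part of the original text.

```perl
use strict;
use warnings;

my $n    = 0;                     # hypothetical counter, bumped once per match
my $list = "item, item, item";    # hypothetical sample input

# The replacement is the expression "item " . ++$n; it is evaluated
# afresh for each match and its value becomes the replacement text.
$list =~ s/item/"item " . ++$n/ge;

print "$list\n";                  # prints: item 1, item 2, item 3
```

Because `$n` lives outside the substitution, each evaluation sees the value left by the previous one. More often the code simply computes the new text from the captures:
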
```perl
my $x = "numbers: 1 2 3 4";
$x =~ s/(\d+)/$1 * 2/ge;
# "numbers: 2 4 6 8"
```

The match variables `$1`, `$2`, … are visible inside the code block. Practical uses:

```perl
# Uppercase the first letter of every sentence.
$text =~ s/(^|\.\s+)(\w)/$1 . uc $2/ge;

# Hex-encode non-ASCII bytes.
$bytes =~ s/([\x80-\xff])/sprintf "\\x%02x", ord $1/ge;

# Apply a lookup table.
my %subst = (red => 0xff0000, green => 0x00ff00, blue => 0x0000ff);
$css =~ s/\b(red|green|blue)\b/sprintf "#%06x", $subst{$1}/ge;
```

### `/ee`: double-eval

`/ee` evaluates *twice*: first as code, then the result is itself evaluated as Perl. The use case is dynamic variable lookup by name:

```perl
our $greeting = "hello";
my $s = "greeting";
$s =~ s/(\w+)/"\$$1"/ee;    # first: '$greeting'; second: 'hello'
# $s is now "hello"
```

The replacement is a Perl expression; the explicit double quotes make it produce the *string* `$greeting`. The first `e` evaluates that expression to the string `$greeting`. The second `e` treats that string as Perl, looking up the variable `$greeting` and yielding `hello`. The matched text is replaced with the value `hello`.

The use case is narrow and the security implications are substantial: any input that reaches an `/ee` substitution is effectively executed as Perl. Treat `/ee` like `eval`: never use it on user-supplied data.

## `/r`: non-destructive substitution

`/r` returns the new string instead of modifying the target.
The target is untouched:

```perl
my $orig = "I like dogs";
my $new  = $orig =~ s/dogs/cats/r;
# $orig is still "I like dogs"
# $new is "I like cats"
```

If the pattern does not match, `/r` returns the original string unchanged:

```perl
my $x = "I like dogs";
my $y = $x =~ s/elephants/cougars/r;
# $y eq "I like dogs"
```

The big win is chaining:

```perl
my $slug = $title
    =~ s/[^\w\s-]//gr    # drop punctuation
    =~ s/\s+/-/gr        # collapse spaces to dash
    =~ s/--+/-/gr;       # collapse runs of dashes
```

Each `s///r` returns a new string, which the next `s///r` receives.

## `\K` in substitution

`\K` (covered in detail in the [anchors and assertions](anchors-and-assertions.md) chapter) shines in substitution. The pattern matches a prefix that establishes context, then `\K` excludes the prefix from the replacement target:

```perl
# Without \K: the prefix has to be re-output.
$_ =~ s/(foo)bar/$1/g;

# With \K: the prefix is matched but not part of $&.
$_ =~ s/foo\Kbar//g;
```

Both replace `foobar` with `foo`. The `\K` form is cleaner: no capture is needed, and the substitution is shorter and usually faster.

`\K` works similarly in any substitution where a fixed prefix should match but not be replaced. It is the substitution-side twin of lookbehind.

## Substitution and `pos()`

While `s///g` runs, the match position advances past each match. After a zero-length match the engine forces progress so that the loop terminates. Once the substitution has finished, `pos($string)` is reset.

A subtle interaction: `s///g` will not re-scan a region that has just been matched. After each successful replacement the match position advances past the *matched* text; the length of the replacement does not matter. So:

```perl
my $x = "aaa";
$x =~ s/a/AA/g;
# $x is "AAAAAA": three a's, each replaced with AA
# It is NOT an infinite loop: the match position moves past each
# matched 'a', so the just-inserted 'A's are never re-scanned.
```

This is sometimes surprising for substitutions whose replacement contains characters that the pattern would also match.
The rule is «advance past the matched text, never re-scan it».

## Alternate delimiters

Both halves of `s///` can use matched brackets; the delimiters do not have to be the same:

```perl
s{pattern}{replacement}g;
s<pattern><replacement>g;
s[pattern][replacement]g;
s(pattern)[replacement]g;    # legal, but avoid mixing for clarity
```

Most punctuation characters also work as a single delimiter repeated three times:

```perl
s!/usr/local/!/opt/!g;
s#pattern#replacement#;
```

Useful when the pattern or replacement contains `/`.

## Single-quoted substitution

`s'pattern'replacement'` treats both halves as single-quoted: no variable interpolation, no backslash escapes apart from `\'` and `\\`:

```perl
s'@users'@admins'g;    # literal '@users' becomes literal '@admins'
```

Rarely needed; mentioned here for completeness.

## Combining with `/m`, `/s`, `/x`

Substitution modifiers compose with match modifiers, so you can get the full toolbox:

```perl
# Trim each line (\h is horizontal whitespace, so newlines survive).
$x =~ s/^\h+|\h+$//mg;

# Collapse runs of blank lines into one.
$x =~ s/\n{2,}/\n\n/g;

# Replace C-style comments across newlines.
$src =~ s{/\*.*?\*/}{}gs;    # /s lets . cross newlines
```

With `/m` the anchors work per line, with `/x` the pattern is readable, with `/s` it spans newlines, with `/g` it repeats.

## A real example

Turn a block of English into a slug: lowercase, hyphens instead of spaces, drop punctuation, collapse and trim:

```perl
sub slug {
    my ($s) = @_;
    return lc($s)
        =~ s/[^\w\s-]//gr    # remove punctuation
        =~ s/\s+/-/gr        # spaces to hyphens
        =~ s/-+/-/gr         # collapse hyphens
        =~ s/^-|-$//gr;      # trim leading/trailing hyphens
}

print slug("Hello, World! -- Regex Tutorial");
# prints: hello-world-regex-tutorial
```

## Friedl’s framing: «anything that isn’t required will always be considered successful»

A pattern’s optional pieces will always succeed if they can match nothing. This applies inside substitution as forcefully as in matching:

```perl
my $x = "no horizontal rule here";
$x =~ s/-*/<hr>/;
# Result: "<hr>no horizontal rule here"
```

The pattern `-*` matches zero or more dashes, and zero is always available at the start of the string. `<hr>` is inserted at position 0. The pattern has succeeded by the engine’s definition; it has failed by yours.

The fix is to require at least one dash:

```perl
$x =~ s/-+/<hr>/;    # no match; $x unchanged
```

The slogan: **always consider what will happen if there is no match**. Optional pieces are not free; they always succeed by matching nothing somewhere.

## Things that look like substitution but aren’t

`tr///` (also written `y///`) does character-by-character transliteration, not regexp substitution. It accepts two lists of characters; each occurrence of the nth character in the first list becomes the nth character in the second list. No regexp features apply:

```perl
my $x = "Hello";
$x =~ tr/A-Za-z/a-zA-Z/;                  # swap case
(my $hex = $bytes) =~ tr/\x00-\xff//d;    # delete all bytes (pointless)
```

`tr///` is faster than `s///` for character-level work because it does not use the regexp engine. See [`tr`](../../p5/core/perlfunc/tr.md) for full semantics.

## See also

- [`s`](../../p5/core/perlfunc/s.md): substitution operator reference.
- [`tr`](../../p5/core/perlfunc/tr.md): character transliteration; not substitution but often confused for it.
- [`quotemeta`](../../p5/core/perlfunc/quotemeta.md): escape a string for safe interpolation into a pattern.
- The [modifiers](modifiers.md) chapter: `/g`, `/e`, `/r`, `/ee`, and how they interact.
- The [anchors and assertions](anchors-and-assertions.md) chapter: `\K` for «match this prefix but don’t replace it».
- The [performance](performance.md) chapter: `s///g` zero-length match termination in detail.