# Progressive recipes

This chapter works from trivial substitutions up through reductions,
extraction, and structured output. Each section reuses the switches
introduced in the previous one, so a reader going straight through
builds up the idiom vocabulary without ever meeting a new flag
mid-recipe.

Readers coming from `awk`/`sed` should recognise the first sections;
the interesting material starts around
[Reductions across lines](reductions) and
[Context-windowed filters](context-windows).

Every recipe is copy-pasteable. When a sample input matters, it is
shown inline so the expected output is unambiguous.

(baseline)=
## The baseline: `-e` and `-E`

The shortest one-liner. No I/O loop, no field splitting: just a
program that runs once.

```bash
pperl -e 'print 2 ** 32, "\n"'      # 4294967296
pperl -E 'say 2 ** 32'              # same, with -E/say
pperl -E 'say for 1 .. 5'           # 1\n2\n3\n4\n5
```

The `,` in `print` separates arguments; the output field separator
`$,` (default empty) goes between them.

(filtering)=
## Filtering: `-n` + `print if`

`-n` wraps the program in a read loop over every line of every input
file (or STDIN). Nothing is printed unless the program says so.

```bash
# Sample input
printf 'gate\napple\nwhat\nkite\n' > words.txt

pperl -ne 'print if /at/' words.txt          # gate, what
pperl -ne 'print unless /e/' words.txt       # what
pperl -ne 'print if /^[aeiou]/' words.txt    # apple
```

The regex match is against `$_` by default; `/at/` is short for
`$_ =~ m/at/`.

### Multiple conditions

```bash
# Lines containing both "ERROR" and "timeout"
pperl -ne 'print if /ERROR/ && /timeout/' app.log

# Lines containing neither
pperl -ne 'print if !/ERROR/ && !/timeout/' app.log

# Events A, B, C appearing in order on one line
pperl -ne 'print if /A.*B.*C/' events.log
```

### By line number

```bash
pperl -ne 'print if $. == 1'                 # head -n 1
pperl -ne 'print if $. <= 10'                # head -n 10
pperl -ne 'print if 17 .. 30'                # lines 17-30 (range op)
pperl -ne 'print if /START/ ..
          /END/'                             # regex range, inclusive
pperl -ne 'print unless $. % 2'              # even lines
```

The range operator `..` in scalar context is a flip-flop: a constant
endpoint is compared against `$.`, while a pattern endpoint is matched
against `$_`. See [`perlop`](../../p5/core/perlop).

### By length

Use `-l` so the counted length excludes the trailing newline.

```bash
pperl -lne 'print if length >= 80' src.c       # long lines
pperl -lne 'print if length() < 5' words.txt   # short lines
```

Bare `length` with no argument applies to `$_`.

(substitution)=
## Substitution: `-p` + `s///`

`-p` is `-n` plus an implicit
[`print`](../../p5/core/perlfunc/print) after each iteration. Use it
when the output is the (possibly modified) input.

```bash
# Replace every "foo" with "bar"
pperl -pe 's/foo/bar/g' input.txt

# First occurrence per line only
pperl -pe 's/foo/bar/' input.txt

# Conditional: replace only on lines that also match BAZ
pperl -pe 's/foo/bar/g if /BAZ/' input.txt
```

### Case transformations

```bash
pperl -nle 'print uc'               # to uppercase
pperl -nle 'print lc'               # to lowercase
pperl -nle 'print ucfirst lc'       # Sentence case
pperl -ple 's/(\w+)/\u$1/g'         # Title Case Per Word
```

[`uc`](../../p5/core/perlfunc/uc), [`lc`](../../p5/core/perlfunc/lc),
and [`ucfirst`](../../p5/core/perlfunc/ucfirst) with no argument
operate on `$_`.

### Whitespace normalisation

```bash
pperl -ple 's/^\s+//'               # trim left
pperl -ple 's/\s+$//'               # trim right
pperl -ple 's/^\s+|\s+$//g'         # trim both
pperl -ple 's/\s+/ /g'              # collapse runs
pperl -ne 'print if /\S/'           # drop blank lines
```

### Line-ending conversion

```bash
pperl -pe 's/\r\n/\n/g' win.txt     # DOS → UNIX
pperl -pe 's/(?<!\r)\n/\r\n/g'      # UNIX → DOS
```

(reductions)=
## Reductions across lines

```bash
# Longest line
pperl -ne '$l = $_ if length > length $l; END { print $l }' file

# Shortest line
pperl -ne '$s = $_ if $.
           == 1 || length() < length $s; END { print $s }' file
```

Write `length()` with parentheses before `<`: after a bare `length`,
Perl expects a term and tries to read `<` as the start of a `<...>`
input operator.

### Unique lines (first sighting only)

```bash
pperl -ne 'print unless $seen{$_}++' file

# Or the List::Util one-liner (slurps):
pperl -MList::Util=uniq -e 'print uniq <>' file
```

The first form streams: memory scales with distinct lines, not total
lines. The second slurps everything into memory, then emits. Prefer
the first for large inputs.

### Duplicates only

```bash
# Print each duplicate line once, when first seen for the second time
pperl -ne 'print if ++$seen{$_} == 2' file
```

(context-windows)=
## Context-windowed filters

Printing the lines *around* a match, not just the match itself.

```bash
# Line immediately before each match (grep -B 1, with caveats)
pperl -ne '/pattern/ && $prev && print $prev; $prev = $_' file

# Line immediately after each match (grep -A 1)
pperl -ne 'print if $p; $p = /pattern/' file

# tail -n 10
pperl -ne 'push @a, $_; shift @a if @a > 10; END { print @a }' file
```

For rich `-A` / `-B` / `-C` behaviour, `grep` is the right tool.
One-liners fit when the window rule is unusual (e.g. "print the line
matching X, plus every line until the next blank line").

```bash
# Print from /SECTION/ to the next blank line
pperl -ne 'print if /SECTION/ .. /^$/' file
```

(records)=
## Record mode: `-0`, `-00`, `-0777`

Switch away from line-based reading when the natural record is not a
line.

### Slurp with `-0777`

```bash
# Single-shot regex across the whole file
pperl -0777 -ne 'print "match\n" if /BEGIN.*?END/s' text.txt

# Replace only the first occurrence across the file
pperl -0777 -i.bak -pe 's/TODO/DONE/' notes.txt
```

`-0777` sets `$/ = undef`. The entire file comes in as one `$_`. The
`/s` modifier makes `.` match newlines. See
[regex-recipes](multiline) for the full multi-line story.

### Paragraph mode with `-00`

`-00` sets `$/ = ""`. Each read returns one paragraph: everything up
to the next run of one or more blank lines.
```bash
# Paragraphs containing "TODO"
pperl -00 -ne 'print if /TODO/' notes.md

# Reverse paragraph order
pperl -00 -e 'print reverse <>' notes.md
```

### NUL-delimited records with `-0`

```bash
# Count files produced by find -print0
find . -type f -print0 | pperl -0 -ne '$n++; END { print "$n\n" }'

# Convert NUL-delimited stream to newline-delimited
find . -type f -print0 | pperl -0 -pe 's/\0/\n/'
```

(in-place)=
## In-place editing with `-i`

`-i` overwrites each input file with whatever the program prints
(implicitly under `-p`, explicitly under `-n`). `-i.bak` keeps the
original as a backup.

```bash
# Lowercase every occurrence of HOSTNAME in every .conf (keep backup)
pperl -i.bak -pe 's/HOSTNAME/\L$&/g' *.conf

# Delete every line matching DEPRECATED (no backup: only do this
# after you've validated the pattern on one file)
pperl -i -ne 'print unless /DEPRECATED/' file.txt
```

The backup files pile up. Clean them up with `rm *.bak` once
verified. See [gotchas](in-place-edit) for the classic in-place traps.

## Structured output

### CSV-like (lightweight: commas are not quoted)

```bash
# Rebuild a TSV as CSV (naive, assumes no commas in fields)
pperl -F'\t' -lane 'print join ",", @F' data.tsv

# Add a column of line numbers (split on tabs so fields containing
# spaces stay intact)
pperl -F'\t' -lane 'print join "\t", $., @F' data.tsv
```

### JSON (stdlib)

```bash
# Emit one JSON object per passwd entry (fields split on ":")
pperl -MJSON::PP -F: -lane '
  print JSON::PP->new->encode({
    user  => $F[0],
    uid   => 0+$F[2],
    shell => $F[-1]
  })
' /etc/passwd
```

Multi-line one-liners like this are where an [alias](json-passwd)
starts to pay for itself.

### Numbered input

```bash
pperl -pe '$_ = "$. $_"'                # 1 gate, 2 apple, ...
pperl -pe 's/^/sprintf "%5d ", $./e'    # padded width
```

The `/e` modifier on `s///` treats the replacement as Perl code. See
[regex-recipes](e-modifier).

## Find out more

- [regex-recipes](regex-recipes): the regex-dependent one-liners:
  captures, lookaround, named groups, `/e`, `/r`.
- [numeric](numeric): arithmetic, statistics, randoms, date math.
- [aliases](aliases): turn the recipes you use weekly into shell
  functions.
- [gotchas](gotchas): quoting, encoding, `-i` backups,
  record-separator surprises.
- [`perlop`](../../p5/core/perlop): the flip-flop operator, the
  diamond operator, `s///`, `tr///`, quoting forms.