Progressive recipes#

This chapter works from trivial substitutions up through reductions, extraction, and structured output. Each section reuses the switches introduced in the previous one, so a reader going straight through builds up the idiom vocabulary without ever meeting a new flag mid-recipe.

Readers coming from awk/sed should recognise the first sections; the interesting material starts around Reductions across lines and Context-windowed filters.

Every recipe is copy-pasteable. When a sample input matters, it is shown inline so the expected output is unambiguous.

The baseline: -e and -E#

The shortest one-liner. No I/O loop, no field splitting — just a program that runs once.

pperl -e 'print 2 ** 32, "\n"'                    # 4294967296
pperl -E 'say 2 ** 32'                            # same, with -E/say
pperl -E 'say for 1 .. 5'                         # 1\n2\n3\n4\n5

The , in print separates arguments; the output field separator $, (default empty) goes between them.

Filtering: -n + print if#

-n wraps the program in a read loop over every line of every input file (or STDIN). Nothing is printed unless the program says so.

# Sample input
$ printf 'gate\napple\nwhat\nkite\n' > words.txt

pperl -ne 'print if /at/' words.txt               # gate, what
pperl -ne 'print unless /e/' words.txt            # what
pperl -ne 'print if /^[aeiou]/' words.txt         # apple

The regex match is against $_ by default; /at/ is short for $_ =~ m/at/.
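To make the implicit loop concrete, the sketch below runs the first filter twice: once with -n and once with a hand-written read loop that approximates what -n wraps around the program. It uses stock perl on the assumption that pperl behaves the same way, and feeds the sample words through a pipe instead of words.txt.

```shell
# Assumption: stock perl stands in for pperl here.
words=$(printf 'gate\napple\nwhat\nkite\n')
# The -n form: the read loop and the match against $_ are both implicit.
short=$(printf '%s\n' "$words" | perl -ne 'print if /at/')
# Approximately the same program with the loop and the match target spelled out.
long=$(printf '%s\n' "$words" | perl -e 'while (<>) { print $_ if $_ =~ m/at/ }')
echo "$short"   # gate, what
```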

Multiple conditions#

# Lines containing both "ERROR" and "timeout"
pperl -ne 'print if /ERROR/ && /timeout/' app.log

# Lines containing neither
pperl -ne 'print if !/ERROR/ && !/timeout/' app.log

# Events A, B, C appearing in order on one line
pperl -ne 'print if /A.*B.*C/' events.log

By line number#

pperl -ne 'print if $. == 1'                      # head -n 1
pperl -ne 'print if $. <= 10'                     # head -n 10
pperl -ne 'print if 17 .. 30'                     # lines 17-30 (range op)
pperl -ne 'print if /START/ .. /END/'             # regex range, inclusive
pperl -ne 'print unless $. % 2'                   # even lines

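The range operator .. in scalar context is a flip-flop: it becomes true when the left operand is true and stays true until the right operand is true, including both boundary lines. A constant operand such as 17 is implicitly compared against $.; a pattern is matched against $_ as usual. See perlop.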

By length#

Use -l so the counted length excludes the trailing newline.

pperl -lne 'print if length >= 80' src.c          # long lines
pperl -lne 'print if length() < 5' words.txt      # short lines

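Bare length with no argument applies to $_. The parentheses in length() < 5 are required: written as length < 5, the < is parsed as the start of a <...> read or glob and the program fails to compile.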

Substitution: -p + s///#

-p is -n plus an implicit print after each iteration. Use it when the output is the (possibly modified) input.

# Replace every "foo" with "bar"
pperl -pe 's/foo/bar/g' input.txt

# First occurrence per line only
pperl -pe 's/foo/bar/' input.txt

# Conditional: replace only on lines that also match BAZ
pperl -pe 's/foo/bar/g if /BAZ/' input.txt
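The three substitutions above can be compared on a two-line sample. The sketch uses stock perl, assuming pperl handles -p and s/// the same way, and pipes a made-up input instead of reading input.txt.

```shell
# Assumption: stock perl stands in for pperl here.
sample=$(printf 'foo foo\nfoo BAZ foo\n')
# /g replaces every occurrence on every line.
all=$(printf '%s\n' "$sample" | perl -pe 's/foo/bar/g')
# Without /g only the first occurrence on each line changes.
first_only=$(printf '%s\n' "$sample" | perl -pe 's/foo/bar/')
# The statement modifier restricts the substitution to lines matching BAZ.
conditional=$(printf '%s\n' "$sample" | perl -pe 's/foo/bar/g if /BAZ/')
# -p is -n with the print made implicit.
with_n=$(printf '%s\n' "$sample" | perl -ne 's/foo/bar/g; print')
echo "$all"           # bar bar / bar BAZ bar
echo "$first_only"    # bar foo / bar BAZ foo
echo "$conditional"   # foo foo / bar BAZ bar
```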

Case transformations#

pperl -nle 'print uc'                              # to uppercase
pperl -nle 'print lc'                              # to lowercase
pperl -nle 'print ucfirst lc'                      # Sentence case
pperl -ple 's/(\w+)/\u$1/g'                        # Title Case Per Word

uc, lc, and ucfirst with no argument operate on $_.

Whitespace normalisation#

pperl -ple 's/^\s+//'                              # trim left
pperl -ple 's/\s+$//'                              # trim right
pperl -ple 's/^\s+|\s+$//g'                        # trim both
pperl -ple 's/\s+/ /g'                             # collapse runs
pperl -ne  'print if /\S/'                         # drop blank lines

Line-ending conversion#

pperl -pe 's/\r\n/\n/g' win.txt                    # DOS  → UNIX
pperl -pe 's/(?<!\r)\n/\r\n/g' unix.txt            # UNIX → DOS

The lookbehind (?<!\r) avoids producing \r\r\n when the input is already DOS-encoded. See regex-recipes for lookarounds in depth.

Field-oriented work: -a and -F#

-a auto-splits each line into @F. -F changes the delimiter.

$ cat table.txt
brown bread mat hair 42
blue cake mug shirt -7
yellow banana window shoes 3.14

# Second field
pperl -lane 'print $F[1]' table.txt
# bread, cake, banana

# Last field
pperl -lane 'print $F[-1]' table.txt
# 42, -7, 3.14

# Count fields per line
pperl -lane 'print scalar @F' table.txt
# 5, 5, 5

# Lines whose last field is negative
pperl -lane 'print if $F[-1] < 0' table.txt
# blue cake mug shirt -7

Custom delimiter#

# /etc/passwd: print username (field 1) and shell (last field)
pperl -F: -lane 'print "$F[0] $F[-1]"' /etc/passwd

# Tab-separated
pperl -F'\t' -lane 'print $F[2]' metrics.tsv

# Comma or semicolon, one or more
pperl -F'[,;]+' -lane 'print scalar @F' mixed.csv

Proper CSV (quoted fields with embedded commas) is not a job for -F. Use -MText::CSV — see numeric for the pattern.

Rejoining fields#

Interpolating @F inside double quotes uses $" (default: one space) as the separator.

# Reverse field order, space-separated
pperl -lane 'print "@{[reverse @F]}"' table.txt

# Reverse field order, colon-separated
pperl -F: -lane 'local $" = ":"; print "@{[reverse @F]}"' /etc/passwd

@{[ EXPR ]} is the array-ref-then-deref idiom that lets arbitrary expressions be interpolated into a double-quoted string.

Reductions across lines#

The move from per-line filters to cross-line computation.

Counts#

# wc -l
pperl -lne 'END { print $. }' big.txt

# grep -c pattern
pperl -lne '$n++ if /pattern/; END { print $n + 0 }' file

# Count blank lines
pperl -lne '$n++ if /^$/; END { print $n + 0 }' file

# Total fields across the file
pperl -lane '$t += @F; END { print $t }' table.txt

The $n + 0 idiom prints 0 rather than empty when the counter was never incremented.

Sums, minima, maxima#

Use List::Util via -M.

# Per line
pperl -MList::Util=sum -lane 'print sum @F' data.txt
pperl -MList::Util=min -lane 'print min @F' data.txt
pperl -MList::Util=max -lane 'print max @F' data.txt

# Across all lines
pperl -MList::Util=sum -lane '$s += sum @F; END { print $s }' data.txt
pperl -MList::Util=max -alne '$m = max($m // (), @F); END { print $m }' data.txt

$m // () returns an empty list when $m is undefined, so the first iteration does not pass undef to max. See numeric for the full arithmetic chapter.

Longest and shortest line#

pperl -ne '$l = $_ if length > length $l; END { print $l }' file
pperl -ne '$s = $_ if $. == 1 || length() < length $s; END { print $s }' file

Unique lines (first sighting only)#

pperl -ne 'print unless $seen{$_}++' file

# Or the List::Util one-liner (slurps):
pperl -MList::Util=uniq -e 'print uniq <>' file

The first form streams — memory scales with distinct lines, not total lines. The second slurps everything into memory, then emits. Prefer the first for large inputs.

Duplicates only#

# Print each duplicate line once, when first seen for the second time
pperl -ne 'print if ++$seen{$_} == 2' file

Context-windowed filters#

Printing the lines around a match, not just the match itself.

# Line immediately before each match (like grep -B 1, except that the
# matching line itself is not printed)
pperl -ne '/pattern/ && $prev && print $prev; $prev = $_' file

# Line immediately after each match (like grep -A 1, minus the match itself)
pperl -ne 'print if $p; $p = /pattern/' file

# tail -n 10
pperl -ne 'push @a, $_; shift @a if @a > 10; END { print @a }' file

For rich -A / -B / -C behaviour, grep is the right tool. One-liners fit when the window rule is unusual (e.g. “print the line matching X, plus every line until the next blank line”).

# Print from /SECTION/ to the next blank line
pperl -ne 'print if /SECTION/ .. /^$/' file

Record mode: -0, -00, -0777#

Switch away from line-based reading when the natural record is not a line.

Slurp with -0777#

# Single-shot regex across the whole file
pperl -0777 -ne 'print "match\n" if /BEGIN.*?END/s' text.txt

# Replace only the first occurrence across the file
pperl -0777 -i.bak -pe 's/TODO/DONE/' notes.txt

-0777 sets $/ = undef. The entire file comes in as one $_. The /s modifier makes . match newlines. See regex-recipes for the full multi-line story.

Paragraph mode with -00#

-00 sets $/ = "". Each read returns everything up to the next blank-line delimiter.

# Paragraphs containing "TODO"
pperl -00 -ne 'print if /TODO/' notes.md

# Reverse paragraph order
pperl -00 -e 'print reverse <>' notes.md

NUL-delimited records with -0#

# Count files produced by find -print0
find . -type f -print0 | pperl -0 -ne '$n++; END { print "$n\n" }'

# Convert NUL-delimited stream to newline-delimited
find . -type f -print0 | pperl -0 -pe 's/\0/\n/'

In-place editing with -i#

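-i redirects the program's output back into each input file, replacing its contents: with -p that is every (possibly modified) line, with -n only what the program prints explicitly. -i.bak keeps the original as a backup with the .bak suffix.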

# Lowercase every occurrence of HOSTNAME in every .conf (keep backup)
pperl -i.bak -pe 's/HOSTNAME/\L$&/g' *.conf

# Delete every line matching DEPRECATED (no backup — only do this
# after you've validated the pattern on one file)
pperl -i -ne 'print unless /DEPRECATED/' file.txt

The backup files pile up. Clean them up with rm *.bak once verified. See gotchas for the classic in-place traps.

Structured output#

CSV-like (lightweight — commas are not quoted)#

# Rebuild a TSV as CSV (naive, assumes no commas in fields)
pperl -F'\t' -lane 'print join ",", @F' data.tsv

# Add a column of line numbers (split on tabs so fields containing spaces survive)
pperl -F'\t' -lane 'print join "\t", $., @F' data.tsv

JSON (stdlib)#

# Emit one JSON object per line (JSON::PP escapes the values, so commas and quotes are safe)
pperl -MJSON::PP -F: -lane '
    print JSON::PP->new->encode({
        user => $F[0], uid => 0+$F[2], shell => $F[-1]
    })
' /etc/passwd

Multi-line one-liners like this are where an alias starts to pay for itself.

Numbered input#

pperl -pe '$_ = "$. $_"'                          # 1 gate, 2 apple, ...
pperl -pe 's/^/sprintf "%5d  ", $./e'             # padded width

The /e modifier on s/// treats the replacement as Perl code. See regex-recipes.

Find out more#

  • regex-recipes — the regex-dependent one-liners: captures, lookaround, named groups, /e, /r.

  • numeric — arithmetic, statistics, randoms, date math.

  • aliases — turn the recipes you use weekly into shell functions.

  • gotchas — quoting, encoding, -i backups, record-separator surprises.

  • perlop — the flip-flop operator, the diamond operator, s///, tr///, quoting forms.