Progressive recipes#
This chapter works from trivial substitutions up through reductions, extraction, and structured output. Each section reuses the switches introduced in the previous one, so a reader going straight through builds up the idiom vocabulary without ever meeting a new flag mid-recipe.
Readers coming from awk/sed should recognise the first sections; the interesting material starts around Reductions across lines and Context-windowed filters.
Every recipe is copy-pasteable. When a sample input matters, it is shown inline so the expected output is unambiguous.
The baseline: -e and -E#
The shortest one-liner. No I/O loop, no field splitting — just a program that runs once.
pperl -e 'print 2 ** 32, "\n"' # 4294967296
pperl -E 'say 2 ** 32' # same, with -E/say
pperl -E 'say for 1 .. 5' # 1\n2\n3\n4\n5
The , in print separates arguments; the output field separator
$, (default empty) goes between them.
Filtering: -n + print if#
-n wraps the program in a read loop over every line of every input
file (or STDIN). Nothing is printed unless the program says so.
# Sample input
$ printf 'gate\napple\nwhat\nkite\n' > words.txt
pperl -ne 'print if /at/' words.txt # gate, what
pperl -ne 'print unless /e/' words.txt # what
pperl -ne 'print if /^[aeiou]/' words.txt # apple
The regex match is against $_ by default; /at/ is short for
$_ =~ m/at/.
Multiple conditions#
# Lines containing both "ERROR" and "timeout"
pperl -ne 'print if /ERROR/ && /timeout/' app.log
# Lines containing neither
pperl -ne 'print if !/ERROR/ && !/timeout/' app.log
# Events A, B, C appearing in order on one line
pperl -ne 'print if /A.*B.*C/' events.log
By line number#
pperl -ne 'print if $. == 1' # head -n 1
pperl -ne 'print if $. <= 10' # head -n 10
pperl -ne 'print if 17 .. 30' # lines 17-30 (range op)
pperl -ne 'print if /START/ .. /END/' # regex range, inclusive
pperl -ne 'print unless $. % 2' # even lines
The range operator .. in scalar context is a flip-flop: a constant operand (such as a literal line number) is compared against $., while a pattern operand is matched against $_. See perlop.
By length#
Use -l so the counted length excludes the trailing newline.
pperl -lne 'print if length >= 80' src.c # long lines
pperl -lne 'print if length() < 5' words.txt # short lines
Bare length with no argument applies to $_.
Substitution: -p + s///#
-p is -n plus an implicit print
after each iteration. Use it when the output is the (possibly modified)
input.
# Replace every "foo" with "bar"
pperl -pe 's/foo/bar/g' input.txt
# First occurrence per line only
pperl -pe 's/foo/bar/' input.txt
# Conditional: replace only on lines that also match BAZ
pperl -pe 's/foo/bar/g if /BAZ/' input.txt
Case transformations#
pperl -nle 'print uc' # to uppercase
pperl -nle 'print lc' # to lowercase
pperl -nle 'print ucfirst lc' # Sentence case
pperl -ple 's/(\w+)/\u$1/g' # Title Case Per Word
Whitespace normalisation#
pperl -ple 's/^\s+//' # trim left
pperl -ple 's/\s+$//' # trim right
pperl -ple 's/^\s+|\s+$//g' # trim both
pperl -ple 's/\s+/ /g' # collapse runs
pperl -ne 'print if /\S/' # drop blank lines
Line-ending conversion#
pperl -pe 's/\r\n/\n/g' win.txt # DOS → UNIX
pperl -pe 's/(?<!\r)\n/\r\n/g' unix.txt # UNIX → DOS
The lookbehind (?<!\r) avoids producing \r\r\n when the input is
already DOS-encoded. See
regex-recipes for lookarounds in depth.
Field-oriented work: -a and -F#
-a auto-splits each line into @F. -F changes the delimiter.
$ cat table.txt
brown bread mat hair 42
blue cake mug shirt -7
yellow banana window shoes 3.14
# Second field
pperl -lane 'print $F[1]' table.txt
# bread, cake, banana
# Last field
pperl -lane 'print $F[-1]' table.txt
# 42, -7, 3.14
# Count fields per line
pperl -lane 'print scalar @F' table.txt
# 5, 5, 5
# Lines whose last field is negative
pperl -lane 'print if $F[-1] < 0' table.txt
# blue cake mug shirt -7
Custom delimiter#
# /etc/passwd: print username (field 1) and shell (last field)
pperl -F: -lane 'print "$F[0] $F[-1]"' /etc/passwd
# Tab-separated
pperl -F'\t' -lane 'print $F[2]' metrics.tsv
# Comma or semicolon, one or more
pperl -F'[,;]+' -lane 'print scalar @F' mixed.csv
Proper CSV (quoted fields with embedded commas) is not a job for -F.
Use -MText::CSV — see numeric for the pattern.
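A sketch of that pattern. Text::CSV is a CPAN module, not core, so install it first (e.g. cpanm Text::CSV); pperl is assumed to be plain perl:

```shell
# Assumption: pperl is an alias for perl; define it so this runs as-is.
pperl() { perl "$@"; }

printf '"a,b",c\nd,"e,f"\n' > quoted.csv
# First field, with quoted embedded commas handled correctly
pperl -MText::CSV -lne '
    BEGIN { $csv = Text::CSV->new({ binary => 1 }) }
    $csv->parse($_) and print +($csv->fields)[0]
' quoted.csv                                  # a,b then d
```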
Rejoining fields#
Interpolating @F inside double quotes uses $" (default: one space)
as the separator.
# Reverse field order, space-separated
pperl -lane 'print "@{[reverse @F]}"' table.txt
# Reverse field order, colon-separated
pperl -F: -lane 'local $" = ":"; print "@{[reverse @F]}"' /etc/passwd
@{[ EXPR ]} is the array-ref-then-deref idiom that lets arbitrary
expressions be interpolated into a double-quoted string.
Reductions across lines#
The move from per-line filters to cross-line computation.
Counts#
# wc -l
pperl -lne 'END { print $. }' big.txt
# grep -c pattern
pperl -lne '$n++ if /pattern/; END { print $n + 0 }' file
# Count blank lines
pperl -lne '$n++ if /^$/; END { print $n + 0 }' file
# Total fields across the file
pperl -lane '$t += @F; END { print $t }' table.txt
The $n + 0 idiom prints 0 rather than empty when the counter was
never incremented.
Sums, minima, maxima#
Use List::Util via -M.
# Per line
pperl -MList::Util=sum -lane 'print sum @F' data.txt
pperl -MList::Util=min -lane 'print min @F' data.txt
pperl -MList::Util=max -lane 'print max @F' data.txt
# Across all lines
pperl -MList::Util=sum -lane '$s += sum @F; END { print $s }' data.txt
pperl -MList::Util=max -lane '$m = max($m // (), @F); END { print $m }' data.txt
$m // () returns an empty list when $m is undefined, so the first
iteration does not pass undef to max. See
numeric for the full arithmetic chapter.
Longest and shortest line#
pperl -ne '$l = $_ if length > length $l; END { print $l }' file
pperl -ne '$s = $_ if $. == 1 || length < length $s; END { print $s }' file
Unique lines (first sighting only)#
pperl -ne 'print unless $seen{$_}++' file
# Or the List::Util one-liner (slurps):
pperl -MList::Util=uniq -e 'print uniq <>' file
The first form streams — memory scales with distinct lines, not total lines. The second slurps everything into memory, then emits. Prefer the first for large inputs.
Duplicates only#
# Print each duplicate line once, when first seen for the second time
pperl -ne 'print if ++$seen{$_} == 2' file
Context-windowed filters#
Printing the line around a match, not just the match itself.
# Line immediately before each match (grep -B 1, with caveats)
pperl -ne '/pattern/ && $prev && print $prev; $prev = $_' file
# Line immediately after each match (grep -A 1)
pperl -ne 'print if $p; $p = /pattern/' file
# tail -n 10
pperl -ne 'push @a, $_; shift @a if @a > 10; END { print @a }' file
For rich -A / -B / -C behaviour, grep is the right tool. One-liners fit when the window rule is unusual (e.g. “print the line matching X, plus every line until the next blank line”).
# Print from /SECTION/ to the next blank line
pperl -ne 'print if /SECTION/ .. /^$/' file
Record mode: -0, -00, -0777#
Switch away from line-based reading when the natural record is not a line.
Slurp with -0777#
# Single-shot regex across the whole file
pperl -0777 -ne 'print "match\n" if /BEGIN.*?END/s' text.txt
# Replace only the first occurrence across the file
pperl -0777 -i.bak -pe 's/TODO/DONE/' notes.txt
-0777 sets $/ = undef. The entire file comes in as one $_. The
/s modifier makes . match newlines. See
regex-recipes for the full multi-line story.
Paragraph mode with -00#
-00 sets $/ = "". Each read returns everything up to the next
blank-line delimiter.
# Paragraphs containing "TODO"
pperl -00 -ne 'print if /TODO/' notes.md
# Reverse paragraph order
pperl -00 -e 'print reverse <>' notes.md
NUL-delimited records with -0#
# Count files produced by find -print0
find . -type f -print0 | pperl -0 -ne '$n++; END { print "$n\n" }'
# Convert NUL-delimited stream to newline-delimited
find . -type f -print0 | pperl -0 -pe 's/\0/\n/'
In-place editing with -i#
-i overwrites each input file with what -p prints. -i.bak keeps
a backup.
# Lowercase every occurrence of HOSTNAME in every .conf (keep backup)
pperl -i.bak -pe 's/HOSTNAME/\L$&/g' *.conf
# Delete every line matching DEPRECATED (no backup — only do this
# after you've validated the pattern on one file)
pperl -i -ne 'print unless /DEPRECATED/' file.txt
The backup files pile up. Clean them up with rm *.bak once verified.
See gotchas for the classic in-place traps.
Structured output#
CSV-like (lightweight — commas are not quoted)#
# Rebuild a TSV as CSV (naive, assumes no commas in fields)
pperl -F'\t' -lane 'print join ",", @F' data.tsv
# Add a column of line numbers
pperl -lane 'print join "\t", $., @F' data.tsv
JSON (stdlib)#
# Emit one JSON object per line (no commas in values assumed)
pperl -MJSON::PP -F: -lane '
print JSON::PP->new->encode({
user => $F[0], uid => 0+$F[2], shell => $F[-1]
})
' /etc/passwd
Multi-line one-liners like this are where an alias starts to pay for itself.
Numbered input#
pperl -pe '$_ = "$. $_"' # 1 gate, 2 apple, ...
pperl -pe 's/^/sprintf "%5d ", $./e' # padded width
The /e modifier on s/// treats the replacement as Perl code. See
regex-recipes.
Find out more#
regex-recipes — the regex-dependent one-liners: captures, lookaround, named groups, /e, /r.
numeric — arithmetic, statistics, randoms, date math.
aliases — turn the recipes you use weekly into shell functions.
gotchas — quoting, encoding, -i backups, record-separator surprises.
perlop — the flip-flop operator, the diamond operator, s///, tr///, quoting forms.