String comparison operators#
The string-flavoured counterparts of the numeric comparison family. Same shape, same precedence row, but operating lexicographically (Unicode code-point order by default) instead of numerically.
| Operator | Question | Returns |
|---|---|---|
| lt | less than | true / false |
| le | less or equal | true / false |
| eq | equal | true / false |
| ge | greater or equal | true / false |
| gt | greater than | true / false |
| ne | not equal | true / false |
| cmp | three-way (sort) | -1 / 0 / +1 |
Both operands are coerced to strings before comparing.
$name eq "John" # exact string match
$kind ne "guest" # negated match
$word lt "m" # alphabetically before "m"
$a cmp $b # sort comparator
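A minimal sketch of what the string coercion means in practice; the variable names and values are illustrative only. The same pair of operands can be equal numerically and unequal as strings:

```perl
use strict;
use warnings;

my $n = 10;        # a number
my $s = "10.0";    # a string that looks like a number

# == coerces both sides to numbers: 10 and 10.0 are the same value.
my $num_equal = ($n == $s) ? "yes" : "no";

# eq coerces both sides to strings: "10" and "10.0" differ.
my $str_equal = ($n eq $s) ? "yes" : "no";

print "numeric equal: $num_equal\n";   # yes
print "string equal:  $str_equal\n";   # no
```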
Lexicographic order#
Strings compare character by character, code-point by code-point. The first differing character decides; if one string is a prefix of the other, the shorter one sorts first.
"abc" lt "abd" # TRUE -- 'c' < 'd' at position 2
"abc" lt "abcd" # TRUE -- the prefix sorts first
"ABC" lt "abc" # TRUE -- ASCII: uppercase < lowercase
"10" lt "9" # TRUE -- '1' < '9' at position 0 (lex, not numeric!)
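The last example is the canonical reason the string and numeric families are kept separate: lt compares digit strings as text. When you want digit strings sorted numerically, you must use <=>, or normalise the keys first (for example, by zero-padding them to a common width).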
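The difference can be seen by sorting the same digit strings both ways. This is a sketch with made-up sample data; the zero-padding line is one assumed way of preparing keys so that string order and numeric order agree for non-negative integers.

```perl
use strict;
use warnings;

my @ids = ("10", "9", "100", "1");   # hypothetical sample data

# String comparison: "1" < "10" < "100" < "9".
my @lex = sort { $a cmp $b } @ids;

# Numeric comparison: 1 < 9 < 10 < 100.
my @num = sort { $a <=> $b } @ids;

# Zero-padded keys sort the same under cmp as the numbers do under <=>.
my @padded = sort { $a cmp $b } map { sprintf "%03d", $_ } @ids;

print "lex:    @lex\n";      # 1 10 100 9
print "num:    @num\n";      # 1 9 10 100
print "padded: @padded\n";   # 001 009 010 100
```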
Unicode#
Code-point order is not the same as locale-aware "alphabetical" order:
"ä" gt "z" is TRUE under code-point order because U+00E4 is beyond U+007A.
Under German DIN 5007-1 ("dictionary") order, "ä" should sort with "a", long before "z".
For locale-correct collation, use the Unicode::Collate module (a module, not a perlfunc built-in) or use locale with a suitable locale set. The bare lt/gt/cmp give you ordered, stable, language-independent comparison, which is exactly the right thing for sort keys, hash bucketing, deterministic test output, and so on. It is the wrong thing for human-facing alphabetical listings in any non-English language.
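A short sketch contrasting the two orders on a three-word list. It assumes Unicode::Collate is installed and uses its default collation table; that default is the generic Unicode Collation Algorithm order, not DIN 5007-1 specifically, although it places "ä" next to "a" in the same way. The sample words are illustrative.

```perl
use strict;
use warnings;
use utf8;                     # string literals below are UTF-8 source text
use Unicode::Collate;

my @words = ("z", "ä", "a");  # hypothetical sample data

# Plain cmp: code-point order, so U+00E4 lands after U+007A.
my @by_codepoint = sort { $a cmp $b } @words;

# Default collator: "ä" is ordered alongside "a".
my $collator     = Unicode::Collate->new;
my @by_collation = $collator->sort(@words);

binmode STDOUT, ':encoding(UTF-8)';
print "code point: @by_codepoint\n";   # a z ä
print "collated:   @by_collation\n";   # a ä z
```

The collator object also offers a cmp method ($collator->cmp($x, $y)) for use inside a custom sort block.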
cmp for sorting#
cmp is the string-comparison spaceship. It returns -1, 0, or +1 and chains the same way <=> does:
my @sorted = sort { $a cmp $b } @names; # ascending lex order
my @cased = sort {
    lc($a) cmp lc($b) || $a cmp $b    # case-insensitive,
                                      # ties broken by case
} @names;
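To make the effect of the tie-breaker concrete, here is a runnable version of the two sorts above applied to a small, made-up list of names:

```perl
use strict;
use warnings;

my @names = ("bob", "Alice", "alice", "Bob");   # hypothetical sample data

# Plain cmp: all uppercase-initial names come before lowercase ones.
my @sorted = sort { $a cmp $b } @names;

# Case-insensitive first; plain cmp only decides between equal-when-lowercased names.
my @cased = sort { lc($a) cmp lc($b) || $a cmp $b } @names;

print "@sorted\n";   # Alice Bob alice bob
print "@cased\n";    # Alice alice Bob bob
```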
Mixing flavours: a worked bug#
The compound-key sort idiom from numeric comparison showed ||-chaining of <=> and cmp. The bug to avoid is using the wrong operator for the type of the key:
# version strings like "1.10", "1.2", "1.20", ...
sort @versions # ASCII order: "1.10","1.2","1.20"
sort { $a <=> $b } @versions # numeric mash: works only by accident
# ("1.10" is read as 1.1; "1.2" and "1.20" tie)
sort { sortkey($a) <=> sortkey($b) } @versions # parse first, then compare
The right answer for version strings is a parser like Sort::Versions or hand-rolled (\d+)-tokenisation; neither cmp nor <=> does it correctly on its own.
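A minimal sketch of the hand-rolled (\d+)-tokenisation approach. The helper name version_cmp is hypothetical, and treating a missing field as 0 (so "1.2" equals "1.2.0") is an assumption of this sketch, not a rule taken from Sort::Versions.

```perl
use strict;
use warnings;

# Hypothetical helper: compare two version strings field by field, numerically.
sub version_cmp {
    my ($left, $right) = @_;
    my @l = $left  =~ /(\d+)/g;    # "1.10" -> (1, 10)
    my @r = $right =~ /(\d+)/g;
    while (@l || @r) {
        my $x = shift(@l) // 0;    # assumption: missing fields count as 0
        my $y = shift(@r) // 0;
        my $c = $x <=> $y;
        return $c if $c;           # first differing field decides
    }
    return 0;                      # all fields equal
}

my @versions = ("1.10", "1.2", "1.20", "1.9");   # sample data
my @sorted   = sort { version_cmp($a, $b) } @versions;

print "@sorted\n";   # 1.2 1.9 1.10 1.20
```

The sketch only handles purely numeric fields; suffixes such as "rc1" or "beta" would need extra rules, which is a reason to prefer an existing parser.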
Precedence#
String comparison shares row 11 of the precedence table with numeric comparison. They are non-associative — the chaining caveat from numeric comparison applies here too:
"a" lt "b" lt "c" # parses as ("a" lt "b") lt "c"
# = 1 lt "c"
# = TRUE (because "1" < "c" lexically)
# — accidentally right for the wrong reason.
Write the conjunction explicitly with &&.
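A sketch of the explicit form, with illustrative variable names and a hypothetical helper called between. Because lt and le bind more tightly than &&, no extra parentheses are needed around each comparison:

```perl
use strict;
use warnings;

my ($low, $mid, $high) = ("a", "b", "c");   # hypothetical sample data

# Each comparison is evaluated on its own, then joined with &&.
my $ordered = ($low lt $mid && $mid lt $high) ? 1 : 0;

# Hypothetical helper: is $x within [$min, $max] in string order?
sub between {
    my ($x, $min, $max) = @_;
    return ($min le $x && $x le $max) ? 1 : 0;
}

print "ordered: $ordered\n";                       # ordered: 1
print "between: ", between("m", "a", "z"), "\n";   # between: 1
```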
See also#
Numeric comparison: the parallel family.
sort, reverse, lc, uc, fc: perlfunc tools that pair with string comparison.
Unicode in Perl: the locale / code-point-order distinction in depth.