fc#
Return the Unicode casefolded form of a string for case-insensitive comparison.
fc is the string operation that powers case-insensitive string
equality. It produces a form of the string where case distinctions
have been erased, so two strings are considered case-insensitively
equal when — and only when — their fc results are byte-for-byte
identical. It is also the function behind the \F escape in
double-quoted strings.
Synopsis#
use feature 'fc'; # or: use v5.16;
fc EXPR
fc # operates on $_
"\F...\E" # same casefold, inside a double-quoted string
fc is gated behind the fc feature. Enable it explicitly with
use feature 'fc', pull it in via a version bundle (use v5.16 or
newer, use feature ':5.16'), or call it fully qualified as
CORE::fc without any pragma.
What you get back#
A string containing the casefolded form of EXPR. The result is a
fresh string; the argument is never modified. Length may change —
casefolding "\x{1E9E}" (LATIN CAPITAL LETTER SHARP S) normally
expands to "ss", two characters from one.
Treat the return value as opaque: it is a key suitable for equality
comparison, not for display. fc("Hello") is "hello" for ASCII,
but in the general case the output is not something you’d show to a
user.
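A minimal sketch of the two points above: the result can be longer than the input, and the argument itself is left alone. The variable names are illustrative only.

```perl
use strict;
use warnings;
use feature 'fc';

# U+1E9E is written as an escape so the snippet does not depend on
# the encoding of the source file.
my $original = "\x{1E9E}";            # LATIN CAPITAL LETTER SHARP S
my $folded   = fc $original;          # full casefold of one character

my $len_before = length $original;    # length of the untouched argument
my $len_after  = length $folded;      # length of the folded key
my $unchanged  = $original eq "\x{1E9E}" ? 1 : 0;   # argument not modified
```

Here `$len_before` is 1 and `$len_after` is 2, which is why the folded value should only ever be compared, never substituted for the original text.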
Why not lc or uc?#
Lowercasing and uppercasing are not reliable for case-insensitive comparison. Both of these are wrong:
lc($a) eq lc($b) # Wrong
uc($a) eq uc($b) # Also wrong
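They fail on characters whose lower/upper mappings are not symmetric,
most famously the German sharp S. lc("ß") stays "ß" while lc("SS") is
"ss", so lc never equates "straße" with "STRASSE". uc goes wrong
elsewhere: it maps both "i" and dotless "ı" (U+0131) to "I", equating
two letters that are not case variants of each other. Casefolding
sidesteps this by mapping both sides into a dedicated equality form: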
fc($a) eq fc($b) # Right
The regex-based equivalent that was correct before fc existed:
$a =~ /^\Q$b\E\z/i
fc is the direct, non-regex way to get the same answer.
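The difference can be checked directly. The sketch below assumes a UTF-8 encoded source file (hence use utf8) and uses two sample pairs chosen for illustration: one where lc gives a false negative and one where uc gives a false positive.

```perl
use strict;
use warnings;
use utf8;               # the source contains literal non-ASCII characters
use feature 'fc';

# lc gives a false negative: "STRASSE" lowercases to "strasse",
# while "straße" is already lowercase and stays as it is.
my $lc_matches = lc("straße") eq lc("STRASSE") ? 1 : 0;
my $fc_matches = fc("straße") eq fc("STRASSE") ? 1 : 0;

# uc gives a false positive: dotless i (U+0131) and plain "i"
# both uppercase to "I", but they do not share a casefold.
my $uc_conflates = uc("\x{131}") eq uc("i") ? 1 : 0;
my $fc_conflates = fc("\x{131}") eq fc("i") ? 1 : 0;
```

With these inputs `$lc_matches` is 0 and `$fc_matches` is 1, while `$uc_conflates` is 1 and `$fc_conflates` is 0.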
Global state it touches#
- $_: used as the argument when EXPR is omitted.
- use locale: inside a use locale scope, casefolding of characters crossing the 255/256 boundary is disabled (see Edge cases below for the U+1E9E rule).
- use feature 'unicode_strings': affects fc the same way it affects lc, forcing full Unicode semantics on byte strings that would otherwise be treated under legacy 8-bit rules.
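The unicode_strings effect is easiest to see on a single non-ASCII byte. This is a small sketch, assuming no locale pragma is in scope; CORE::fc is used so that the only feature being toggled is unicode_strings.

```perl
use strict;
use warnings;

my $byte = "\xC9";    # É as one native byte, not UTF-8 flagged

my ($legacy, $unicode);
{
    no feature 'unicode_strings';    # legacy rules: only ASCII is folded
    $legacy = CORE::fc($byte);
}
{
    use feature 'unicode_strings';   # Unicode rules apply to byte strings too
    $unicode = CORE::fc($byte);
}
```

Under the legacy rules the byte comes back unchanged ("\xC9"); with unicode_strings it folds to "\xE9".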
Examples#
Case-insensitive equality, the canonical use:
use utf8; # the source contains literal non-ASCII characters
use feature 'fc';
fc("Hello") eq fc("HELLO"); # true
fc("café") eq fc("CAFÉ"); # true
The German sharp S, the textbook case where lc fails but fc succeeds:
use utf8; # the source contains literal non-ASCII characters
use feature 'fc';
my $a = "straße";
my $b = "STRASSE";
fc($a) eq fc($b); # true: both fold to "strasse"
lc($a) eq lc($b); # false: "straße" ne "strasse"
Using fc as a hash key for case-insensitive lookup:
use feature 'fc';
my %seen;
for my $word (@words) {
    $seen{ fc $word }++; # groups "Foo", "FOO", "foo"
}
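Because the folded string is a comparison key rather than display text, it is often useful to remember one original spelling per key. The following is a sketch with hypothetical sample data and variable names:

```perl
use strict;
use warnings;
use feature 'fc';

my @words = qw(Foo FOO foo Bar bar);   # hypothetical sample data

my (%count, %first_seen);
for my $word (@words) {
    my $key = fc $word;                # comparison key only
    $count{$key}++;
    $first_seen{$key} //= $word;       # keep the first spelling for display
}
```

After the loop, `%count` has the keys "foo" (3) and "bar" (2), and `%first_seen` maps them to "Foo" and "Bar".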
No argument — operates on $_:
use feature 'fc';
for (@lines) {
    next unless fc eq "quit"; # matches "QUIT", "Quit", ...
    last;
}
Inside a double-quoted string via \F (same operation, inline):
use feature 'fc';
my $name = "Alice";
my $key = "\F$name\E"; # "alice" — same as fc($name)
Calling without the feature pragma, fully qualified:
my $folded = CORE::fc($input); # works in any scope
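A case-insensitive substring search can be sketched by folding both operands before calling index. The helper name index_fc is hypothetical, and the snippet assumes a UTF-8 encoded source file.

```perl
use strict;
use warnings;
use utf8;               # the source contains literal non-ASCII characters
use feature 'fc';

# Hypothetical helper: position of $needle in $haystack ignoring case,
# or -1 if absent. The position refers to the folded haystack, which
# can differ from the original when folding changes the length.
sub index_fc {
    my ($haystack, $needle) = @_;
    return index(fc($haystack), fc($needle));
}

my $pos     = index_fc("Hauptstraße 5", "STRASSE");
my $missing = index_fc("Hello", "xyz");
```

Here `$pos` is 5 and `$missing` is -1. Since "ß" folds to two characters, offsets past that point no longer line up with the original string.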
Edge cases#
- No argument: fc with no argument folds $_.
- undef argument: stringifies to the empty string, which folds to the empty string. Emits an uninitialized warning under use warnings.
- Feature not enabled: without use feature 'fc' (or a use v5.16+ bundle) the parser does not treat fc as a built-in. fc(EXPR) is then taken as a call to a user-defined sub named fc and normally dies with an undefined-subroutine error, while the parenthesis-less forms usually fail to compile. CORE::fc(EXPR) always works.
- Length can change: full casefolds may expand one character to several. fc("\x{1E9E}") is "ss". Never rely on length(fc $s) == length($s).
- Not a round-trip: fc is one-way. There is no "uncasefold" operation; the original case is gone.
- U+1E9E under use locale: fc of LATIN CAPITAL LETTER SHARP S (U+1E9E) normally folds to "ss". Under use locale that mapping is suppressed because it crosses the 255/256 code point boundary, which locale rules do not handle cleanly. Instead fc returns "\x{17F}\x{17F}" (two LATIN SMALL LETTER LONG S). Since each long s itself folds to "s", the pair is fold-equivalent to a single U+1E9E, although the returned string is not identical to "ss".
- Turkic and "simple" folds are not provided: Perl implements only the full, non-Turkic form of casefolding. For the simple form or the Turkic variant, use casefold() from Unicode::UCD or the CPAN module Unicode::Casing.
- Not the same as NFKC or NFC: fc erases case, not normalization differences. fc("\x{FF21}") (FULLWIDTH LATIN CAPITAL LETTER A) is "\x{FF41}" (the fullwidth small a), not "a", and a precomposed "é" does not fold to the same string as "e" followed by a combining acute accent. Combine with Unicode::Normalize if you need both.
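One way to combine casefolding with normalization is sketched below. The helper name fold_key is hypothetical; the NFD, fold, NFD order follows the usual Unicode recipe for canonical caseless matching, and NFKC is shown separately for compatibility differences.

```perl
use strict;
use warnings;
use feature qw(fc unicode_strings);
use Unicode::Normalize qw(NFD NFKC);

# Hypothetical helper: a comparison key that ignores both case and
# canonical (composed vs. decomposed) differences.
sub fold_key {
    my ($s) = @_;
    return NFD(fc(NFD($s)));
}

my $composed   = "\x{C9}cole";      # "École", É as one code point
my $decomposed = "e\x{301}cole";    # "école", e + COMBINING ACUTE ACCENT

my $fc_only  = fc($composed) eq fc($decomposed)             ? 1 : 0;
my $with_nfd = fold_key($composed) eq fold_key($decomposed) ? 1 : 0;

# Compatibility differences need a compatibility form such as NFKC.
my $fullwidth = fc(NFKC("\x{FF21}")) eq fc("a") ? 1 : 0;
```

With these inputs `$fc_only` is 0, while `$with_nfd` and `$fullwidth` are both 1.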
Differences from upstream#
Fully compatible with upstream Perl 5.42.
See also#
- lc: lowercases a string; use for display, not for case-insensitive comparison
- lcfirst: lowercases only the first character
- ucfirst: uppercases only the first character
- index: substring search; pair with fc on both operands for a case-insensitive variant
- $_: the default argument when EXPR is omitted