lc#

Return a lowercased copy of a string.

lc takes a string, converts every cased character to its lowercase equivalent, and returns the result. The original is never modified. If the argument is omitted, lc operates on $_. The exact set of characters that change depends on the string’s encoding and the pragmas in scope at the call site - ASCII-only by default, full Unicode under use feature 'unicode_strings' or when the input is already a character string with the UTF-8 flag.

Synopsis#

lc EXPR
lc

What you get back#

A new string. The return value is always a fresh scalar - modifying it does not affect the argument, and modifying the argument afterward does not affect the returned string. Length in characters is preserved for almost every input, but it is not guaranteed: under Unicode rules LATIN CAPITAL LETTER I WITH DOT ABOVE (U+0130) lowercases to the two-character sequence U+0069 U+0307. Case folding with fc changes length more often (ß folds to ss), so do not assume either function preserves length.

my $str = lc("Perl is GREAT");   # "perl is great"
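
The independence of argument and result can be checked with a short sketch. The variable names here are illustrative only:

```perl
use strict;
use warnings;

my $original = "Perl is GREAT";
my $lowered  = lc $original;          # fresh scalar: "perl is great"

$lowered .= "!";                      # edit the copy ...
my $after_copy_edit = $original;      # ... the original is still "Perl is GREAT"

$original = "REPLACED";               # edit the original ...
my $after_orig_edit = $lowered;       # ... the copy is still "perl is great!"
```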

Global state it touches#

Reads $_ when called without an argument.
Observes use locale for LC_CTYPE - with locale in effect the current locale’s lowercasing table applies to code points below 256.
Observes use bytes - under use bytes only A-Z change, to a-z.
Observes use feature 'unicode_strings' - forces Unicode rules regardless of the UTF-8 flag on the input.

Which casing rules apply#

Perl checks five conditions in priority order. The first one that holds decides which ruleset is used:

use bytes in effect - ASCII rules. Only A-Z map to a-z; every other byte is left alone, including bytes in the 128-255 range that would otherwise be Latin-1 letters.
use locale for LC_CTYPE in effect - the current locale’s tables apply to code points below 256; code points 256 and above (only reachable when the string already carries the UTF-8 flag) use Unicode rules. From Perl 5.20 onward, a UTF-8 locale uses full Unicode rules throughout.
The argument has the UTF-8 flag set - Unicode rules apply to every character.
use feature 'unicode_strings' or use locale ':not_characters' in effect - Unicode rules apply to every character, regardless of the UTF-8 flag.
Otherwise - ASCII rules. Characters outside A-Z are returned unchanged, including Latin-1 uppercase letters like À-Þ.
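
The priority order can be observed by lowercasing the same character under different conditions. This is a minimal sketch that skips the locale case (its result depends on the host's locale); the variable names are illustrative:

```perl
use strict;
use warnings;

my $native = "\xC4";             # Ä as one native 8-bit character, UTF-8 flag off

# No pragma, no UTF-8 flag: the final "otherwise" case, ASCII rules.
my $by_default = lc $native;     # stays "\xC4"

# UTF-8 flag set: Unicode rules, even with no feature enabled.
my $flagged = $native;
utf8::upgrade($flagged);         # same character, now stored with the flag on
my $by_flag = lc $flagged;       # "\xE4"

my ($by_feature, $by_bytes);
{
    use feature 'unicode_strings';
    $by_feature = lc $native;    # Unicode rules without the flag: "\xE4"

    {
        use bytes;               # checked first, so it overrides the feature
        $by_bytes = lc $native;  # ASCII rules again: stays "\xC4"
    }
}
```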

The upshot: if you want predictable Unicode behaviour on every string regardless of how it was constructed, enable use feature 'unicode_strings' (or use v5.12 and above, which turns it on for you).
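
Because the pragma is lexically scoped, one way to get predictable results is to wrap lc in a small helper compiled under the feature. The helper name lc_unicode below is hypothetical, not a Perl builtin, and the sketch assumes no locale or bytes pragma is in scope:

```perl
use strict;
use warnings;

# Hypothetical helper, not part of Perl: the feature is lexically scoped
# to this block, so the lc call inside always runs under Unicode rules.
{
    use feature 'unicode_strings';
    sub lc_unicode { return lc $_[0] }
}

my $word = "\xC9COLE";                  # "ÉCOLE", É stored as a native 8-bit char

my $ascii_rules   = lc $word;           # no feature here: "\xC9cole"
my $unicode_rules = lc_unicode($word);  # "\xE9cole"
```

Callers get Unicode lowercasing from lc_unicode no matter which pragmas are active in their own file, since only the scope where lc is compiled matters.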

Examples#

Basic ASCII lowercasing:

my $s = lc("Hello, World!");        # "hello, world!"

Default to $_:

for ("FOO", "Bar", "BAZ") {
    print lc, "\n";                 # foo / bar / baz
}

Unicode characters only lowercase when the rules allow it:

use utf8;                           # the literal below is UTF-8 source text
use feature 'unicode_strings';
my $s = lc("ÄÖÜ");                  # "äöü"

Without unicode_strings and without the UTF-8 flag on the input, non-ASCII characters pass through unchanged:

my $bytes = "\xC4\xD6\xDC";         # Ä Ö Ü as Latin-1 bytes
my $lower = lc $bytes;              # unchanged: "\xC4\xD6\xDC"
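
If such bytes are known to be encoded text, decoding them first turns them into a character string that lc handles under Unicode rules. A sketch using the core Encode module, assuming the input really is Latin-1:

```perl
use strict;
use warnings;
use Encode qw(decode);

my $bytes = "\xC4\xD6\xDC";                 # Ä Ö Ü as Latin-1 bytes
my $chars = decode('ISO-8859-1', $bytes);   # decoded character string
my $lower = lc $chars;                      # Unicode rules: "\xE4\xF6\xFC"
```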

lc is what backs the \L...\E escape inside double-quoted strings:

my $str = "Perl is \LGREAT\E";      # "Perl is great"
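
The escape is most useful around interpolated variables. This sketch (variable names illustrative) shows that it produces the same result as an explicit call:

```perl
use strict;
use warnings;

my $shout = "GREAT";

my $escaped = "Perl is \L$shout\E!";          # \L ... \E lowercases the span
my $called  = "Perl is " . lc($shout) . "!";  # same result via an explicit call
```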

Case-insensitive compare by lowercasing both sides:

sub ieq { lc($_[0]) eq lc($_[1]) }
ieq("Perl", "PERL");                # true

For correct case-insensitive comparison across Unicode, prefer fc (the Unicode folding function) over lc - see See also.
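
A sketch of where the two functions diverge, using the German sharp s. It assumes Perl 5.16 or later, where the fc feature is available; the variable names are illustrative:

```perl
use strict;
use warnings;
use feature qw(fc unicode_strings);    # fc needs Perl 5.16 or later

my $german_a = "Stra\x{DF}e";          # "Straße"
my $german_b = "STRASSE";

# lc keeps the sharp s, so the lowercased strings differ.
my $same_by_lc = lc($german_a) eq lc($german_b) ? 1 : 0;   # 0

# fc folds the sharp s to "ss", so both sides become "strasse".
my $same_by_fc = fc($german_a) eq fc($german_b) ? 1 : 0;   # 1
```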

Edge cases#

undef argument raises an uninitialized warning under use warnings and returns the empty string.
Empty string returns the empty string.
Numbers are stringified first: lc(42) returns "42".
LATIN CAPITAL LETTER SHARP S (U+1E9E) lowercases to U+00DF (ß) under Unicode rules, which gives a result below 256 from an input above 255. Under use locale on a non-UTF-8 locale, Perl cannot know what code point 0xDF means in that locale, so it returns the original character unchanged for any mapping that would cross the 255/256 boundary - and from Perl 5.22 onward it also raises a locale warning.
Locale and the UTF-8 flag interact: under use locale on a non-UTF-8 locale, characters below 256 follow the locale while characters 256 and above follow Unicode. A string containing both ranges will get two different casing tables applied to it.
Tied variables have their FETCH called once. lc does not modify the tied value; it returns a plain (non-tied) scalar.
lc is a named unary operator, not a list operator. lc $a, $b parses as (lc $a), $b - only $a is lowercased. In list context the result is the pair (lc($a), $b); in scalar context the comma operator discards the lowercased value and yields $b. Call lc once per string, or use map, when you mean to lowercase several strings:
```
my @lower = map { lc } @words;
```
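
Several of the edge cases above can be checked together. This is a sketch, with the warning handler installed only to capture the undef warning rather than print it:

```perl
use strict;
use warnings;

# Capture warnings instead of printing them, to observe the undef case.
my @warned;
my $from_undef;
{
    local $SIG{__WARN__} = sub { push @warned, $_[0] };
    my $missing;                       # undef
    $from_undef = lc $missing;         # "" plus an "uninitialized" warning
}

my $from_number = lc 42;               # "42"
my $sharp_s     = lc "\x{1E9E}";       # "\x{DF}"

my ($x, $y) = ("FOO", "BAR");
my @pair = (lc $x, $y);                # ("foo", "BAR"): only $x is lowercased
my @both = map { lc } $x, $y;          # ("foo", "bar")
```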

Differences from upstream#

Fully compatible with upstream Perl 5.44.
