ucfirst#
Return a copy of a string with its first character titlecased.
ucfirst takes a string, converts its first character to the
Unicode titlecase form, and returns the result. Every other
character is left untouched. The original is never modified. If the
argument is omitted, ucfirst operates on $_. Titlecase usually
coincides with uppercase — "a" becomes "A" — but differs for a
small number of digraphs in scripts that distinguish the two (see
Edge cases). The same pragma, locale, and UTF-8-flag rules that
govern lc and uc govern ucfirst.
Synopsis#
ucfirst EXPR
ucfirst
What you get back#
A new string that normally has the same length in characters as the input, because only the first character is altered. The exception is a first character whose titlecase form expands to more than one character (see Edge cases). The return value is always a fresh scalar; modifying it does not affect the argument, and modifying the argument afterward does not affect the returned string. The byte length may change by a few bytes if the first character's titlecase form has a different UTF-8 encoding from its original form.
my $str = ucfirst("hello world!"); # "Hello world!"
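The two properties described above, an independent result and a byte length that can differ from the input, can be checked with a short sketch. It assumes only the core Encode module; the variable names are illustrative.

```perl
use strict;
use warnings;
use Encode qw(encode_utf8);

my $orig = "hello world!";
my $copy = ucfirst $orig;
$copy .= "?";            # changing the result...
print "$orig\n";         # ...leaves the argument alone: hello world!
print "$copy\n";         # Hello world!?

# U+0131 (dotless i) takes two bytes in UTF-8; its titlecase form, plain
# "I", takes one. The character count stays the same, the byte count shrinks.
my $before = "\x{0131}x";
my $after  = ucfirst $before;
printf "chars: %d -> %d\n", length($before), length($after);
printf "bytes: %d -> %d\n",
    length(encode_utf8($before)), length(encode_utf8($after));
```

On a standard Perl this should print character counts of 2 and 2, and byte counts of 3 and 2.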
Global state it touches#
- Reads $_ when called without an argument.
- Observes use locale for LC_CTYPE: with locale in effect, the current locale's uppercasing table applies to a first character whose code point is below 256.
- Observes use bytes: under use bytes only a leading a-z changes, to A-Z.
- Observes use feature 'unicode_strings': forces Unicode rules regardless of the UTF-8 flag on the input.
Which casing rules apply#
ucfirst picks one of five rulesets for the first character, in
priority order. The first one whose condition holds wins. Subsequent
characters are always passed through unchanged, regardless of which
ruleset applied to the first.
1. use bytes in effect: ASCII rules. A leading a-z maps to A-Z; every other leading byte is left alone, including bytes in the 128-255 range that would otherwise be Latin-1 letters.
2. use locale for LC_CTYPE in effect: the current locale's tables apply to a first character below code point 256; a first character at 256 or above (only reachable when the string already carries the UTF-8 flag) uses Unicode rules. From Perl 5.20 onward, a UTF-8 locale uses full Unicode rules throughout.
3. The argument has the UTF-8 flag set: Unicode titlecase rules apply to the first character.
4. use feature 'unicode_strings' or use locale ':not_characters' in effect: Unicode titlecase rules apply to the first character, regardless of the UTF-8 flag.
5. Otherwise: ASCII rules. A leading character outside a-z is returned unchanged, including Latin-1 lowercase letters like à-þ.
The upshot: if you want predictable Unicode behaviour on every
string regardless of how it was constructed, enable
use feature 'unicode_strings' (or use v5.12 and above, which
turns it on for you).
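The effect of the UTF-8 flag and of unicode_strings on the same characters can be seen in a minimal sketch. It assumes no locale or bytes pragma is active and uses utf8::upgrade only to switch the internal representation; the variable names are illustrative.

```perl
use strict;
use warnings;

# U+00E0 (a with grave) stored as one native byte; the UTF-8 flag is off.
my $native = "\xE0 la carte";

# The same characters, upgraded so that the UTF-8 flag is on.
my $upgraded = $native;
utf8::upgrade($upgraded);

# No pragma in scope: the flag alone decides between ASCII and Unicode rules.
my $plain_native   = ucfirst $native;
my $plain_upgraded = ucfirst $upgraded;

# With unicode_strings in scope, the flag no longer matters.
my $feature_native;
{
    use feature 'unicode_strings';
    $feature_native = ucfirst $native;
}

printf "flag off, no pragma:       U+%04X\n", ord $plain_native;    # U+00E0
printf "flag on,  no pragma:       U+%04X\n", ord $plain_upgraded;  # U+00C0
printf "flag off, unicode_strings: U+%04X\n", ord $feature_native;  # U+00C0
```

The first line shows the fall-through to ASCII rules, which is the case unicode_strings is meant to remove.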
Titlecase vs uppercase#
For the overwhelming majority of characters, titlecase and uppercase
are the same code point, so ucfirst("abc") and uc(substr("abc", 0, 1)) . substr("abc", 1) produce identical results. They differ only for
a handful of Unicode characters whose uppercase form reads as
two capital letters but whose titlecase form reads as a
capital followed by a lowercase letter. The standard example is the
Latin digraph ǳ (U+01F3), which is a single code point in all three cases:
- uppercase is Ǳ (U+01F1), "DZ" rendered as one capital glyph.
- titlecase is ǲ (U+01F2), "Dz": capital D followed by small z.
ucfirst returns titlecase, which is the correct form when the
character stands at the start of a capitalised word. uc on the same
first character would return the two-capital form, which reads as if
the whole word were in caps.
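The difference is easiest to see by comparing code points. The following is a small sketch with illustrative variable names; it prints code points rather than the characters themselves to avoid any output-encoding setup.

```perl
use strict;
use warnings;

my $word = "\x{01F3}en";    # U+01F3 (small dz digraph) followed by "en"

# Titlecase the first character, as ucfirst does.
my $title = ucfirst $word;

# Uppercase the first character by hand, for comparison.
my $upper = uc(substr($word, 0, 1)) . substr($word, 1);

printf "ucfirst gives U+%04X\n", ord $title;   # U+01F2, capital D with small z
printf "uc gives      U+%04X\n", ord $upper;   # U+01F1, capital DZ
```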
Examples#
Basic ASCII first-character uppercasing:
my $s = ucfirst("hello, world!"); # "Hello, world!"
Default to $_:
for ("foo", "bar", "baz") {
print ucfirst, "\n"; # Foo / Bar / Baz
}
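Unicode characters only titlecase when the rules allow it. The source file must be saved as UTF-8 and use utf8 must be in effect, so that the literal ä is read as one character rather than as two bytes:
use utf8;
use feature 'unicode_strings';
my $s = ucfirst("äpfel"); # "Äpfel"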
Without unicode_strings and without the UTF-8 flag on the input, a
non-ASCII first character passes through unchanged:
my $bytes = "\xE4pfel"; # ä as a single Latin-1 byte
my $cap = ucfirst $bytes; # unchanged: "\xE4pfel"
ucfirst is what backs the \u escape inside double-quoted strings:
my $str = "\uperl\E is great"; # "Perl is great"
Capitalise every word — combine with split,
map, and join:
my $title = join " ", map { ucfirst lc $_ } split / /, "HELLO world";
# "Hello World"
Titlecase a digraph that distinguishes titlecase from uppercase:
use feature 'unicode_strings';
my $t = ucfirst("\x{01F3}xyz"); # "\x{01F2}xyz" — Dz, not DZ
Edge cases#
- An undef argument raises an uninitialized warning under use warnings and returns the empty string.
- The empty string returns the empty string; there is no first character to change.
- A single-character string behaves exactly like uc would for that one character under the same ruleset, except in the digraph cases described above where titlecase and uppercase differ.
- Numbers are stringified first: ucfirst(42) returns "42". The leading "4" is not a cased character, so nothing changes.
- Leading whitespace or punctuation is not skipped. ucfirst(" hello") returns " hello": the first character is the space, which has no titlecase form. If you need to capitalise the first letter, strip leading non-letters first or use a regex (s/\b(\w)/\u$1/).
- LATIN SMALL LETTER SHARP S (U+00DF, ß) titlecases to "Ss" under full Unicode rules, a length change that breaks the usual "same length in characters" expectation. Under use locale on a non-UTF-8 locale, Perl leaves the character unchanged rather than guessing, and from Perl 5.22 onward this raises a locale warning.
- Locale and the UTF-8 flag interact: under use locale on a non-UTF-8 locale, a first character below 256 follows the locale while a first character 256 or above follows Unicode. In practice ucfirst only ever touches one character, so the interaction is subtler than for lc/uc, but it is still observable in strings that begin with a character at or above code point 256.
- Tied variables have their FETCH called once. ucfirst does not modify the tied value; it returns a plain (non-tied) scalar.
- ucfirst is a unary named operator, not a list operator. ucfirst $a, $b parses as (ucfirst $a), $b, so only $a is transformed; in scalar context the comma operator then discards that result. Use map when you mean to titlecase multiple strings: my @capped = map { ucfirst } @words;
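Several of these edge cases can be reproduced in a few lines. This is a sketch under the assumption that unicode_strings is enabled and no locale pragma is active; ucfirst_letter is a hypothetical helper written for this example, not part of Perl.

```perl
use strict;
use warnings;
use feature 'unicode_strings';

# Hypothetical helper: titlecase the first letter, skipping any leading
# non-letters. \p{L} matches letters only, so digits are skipped as well.
sub ucfirst_letter {
    my ($text) = @_;
    (my $out = $text) =~ s/(\p{L})/\u$1/;
    return $out;
}

my $spaced  = ucfirst "  hello";           # first character is a space
my $skipped = ucfirst_letter("  hello");   # first letter is titlecased

# U+00DF (sharp s) expands to "Ss", so the result is one character longer.
my $sharp = ucfirst "\xDFeta";

# Named unary operator: only the first operand is passed to ucfirst.
my @pair = (ucfirst "foo", "bar");
my @both = map { ucfirst $_ } "foo", "bar";

print "[$spaced] [$skipped]\n";                  # [  hello] [  Hello]
printf "%s has %d characters\n", $sharp, length $sharp;   # Sseta has 5 characters
print "@pair / @both\n";                         # Foo bar / Foo Bar
```

The helper uses \p{L} rather than \w so that a leading digit or underscore is not mistaken for the first letter; whether that is the right choice depends on the input.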
Differences from upstream#
Fully compatible with upstream Perl 5.42.
See also#
- lcfirst: the lowercasing counterpart; turns the first character into its lowercase form
- uc: uppercase every character in the string; use when you want the whole string capitalised, not just the first letter
- fc: Unicode casefold, the correct choice for case-insensitive comparison when input may contain non-ASCII
- $_: the default subject ucfirst reads when called with no argument
- use locale: controls whether locale tables or ASCII/Unicode rules apply to the first character when it is below code point 256
- use feature 'unicode_strings': forces full Unicode titlecasing regardless of the UTF-8 flag on the input