---
name: uc
signature: 'uc EXPR'
signatures:
  - 'uc EXPR'
  - 'uc'
since: 5.0
status: documented
categories: ["SCALARs and strings"]
---

```{index} single: uc; Perl built-in
```

*[SCALARs and strings](../perlfunc-by-category)*

# uc

Return an uppercased copy of a string.

`uc` walks `EXPR` character by character and returns a new string in
which every cased character has been replaced by its uppercase
counterpart. Characters that have no uppercase mapping — digits,
punctuation, already-uppercase letters, symbols — are passed through
unchanged. If `EXPR` is omitted, `uc` operates on [`$_`](../perlvar). The input is
never modified; `uc` always returns a fresh string.

`uc` is the internal function implementing the `\U` escape in
double-quoted strings, so `"\Ufoo\E"` and `uc("foo")` produce the
same result.

## Synopsis

```perl
uc EXPR
uc
```

## What you get back

A new scalar string containing the uppercased version of `EXPR`. The
length in characters is unchanged for most input; a few characters
map to a longer sequence under Unicode rules (see *Edge cases*), so
the byte length and character length of the result can both grow.

```perl
my $s = uc("Perl is GREAT");   # "PERL IS GREAT"
```
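
As a small sketch of the copy semantics (the variable names are illustrative only), the following shows that the argument is left untouched and that assigning the result back is how an in-place update is spelled:

```perl
use strict;
use warnings;

my $original = "Perl 5, since 1994!";
my $upper    = uc $original;      # new string; $original is not touched

print "$original\n";              # Perl 5, since 1994!
print "$upper\n";                 # PERL 5, SINCE 1994!

# To uppercase "in place", assign the result back to the same variable.
my $label = "draft";
$label = uc $label;
print "$label\n";                 # DRAFT
```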

## Global state it touches

- [`$_`](../perlvar) — used as the implicit argument when `EXPR` is omitted.
- `LC_CTYPE` locale — consulted when `use locale` is active. `uc`
  otherwise ignores the environment.
- `use bytes` / `use feature 'unicode_strings'` / `use locale` —
  the lexical pragmas in scope at the **call site** select which of
  the casing rule sets below is applied.
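
To illustrate the call-site point, here is a minimal sketch with two hypothetical helpers, `upper_unicode` and `upper_default`, that differ only in the pragma under which their `uc` was compiled. The pragmas in effect where the helpers are called from play no part; the byte string is written with `\xE9` so the example does not depend on source encoding.

```perl
use strict;
use warnings;

# Hypothetical helper: this uc is compiled with unicode_strings on.
sub upper_unicode {
    use feature 'unicode_strings';
    return uc $_[0];
}

# Hypothetical helper: this uc is compiled with unicode_strings off.
sub upper_default {
    no feature 'unicode_strings';
    return uc $_[0];
}

my $byte_string = "caf\xE9";      # no UTF-8 flag; \xE9 is é in Latin-1

my $with    = upper_unicode($byte_string);   # é becomes É (\xC9)
my $without = upper_default($byte_string);   # é is left alone (\xE9)

# Print the code points in hex to avoid terminal-encoding questions.
print join(" ", map { sprintf("%02X", ord($_)) } split //, $with), "\n";     # 43 41 46 C9
print join(" ", map { sprintf("%02X", ord($_)) } split //, $without), "\n";  # 43 41 46 E9
```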

## Which rules apply

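`uc` picks one of three rule sets (ASCII, locale, or Unicode) based on
the pragmas in effect and the internal representation of the string.
The cases below are checked in order, and the first one that applies
wins. The selection logic matches [`lc`](lc) exactly; only the
direction of the mapping differs.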

- **`use bytes` in effect** — ASCII rules. Only `a-z` change, to
  `A-Z` respectively. Every byte outside `0x61..0x7A` is passed
  through unchanged, regardless of what it would mean as Latin-1 or
  UTF-8. Use this only when you have deliberately opted out of
  Unicode handling.
- **`use locale` for `LC_CTYPE` in effect** — the current locale
  governs code points below 256; Unicode rules govern the rest (the
  latter only reachable if the string has the UTF-8 flag set). From
  v5.20 onward, a UTF-8 locale gives full Unicode rules for the
  whole string. Under non-UTF-8 locales, case changes that cross
  the 255/256 boundary are not well-defined and, since v5.22, trigger
  a locale warning. See `perllocale`.
- **String has the UTF-8 flag set** — Unicode rules.
- **`use feature 'unicode_strings'` or `use locale ':not_characters'`
  in effect** — Unicode rules, even for byte strings.
- **Otherwise** — ASCII rules. Anything outside `a-z` is passed
  through unchanged. This is the historic default for strings that
  have neither the UTF-8 flag nor a lexical pragma opting into
  Unicode.

The practical takeaway: if you want predictable Unicode uppercasing
of arbitrary input, put `use feature 'unicode_strings';` (or a
modern `use v5.12;` or higher) at the top of the file and stop
worrying about which representation the string happens to have.
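
The dependence on internal representation can be seen in a short sketch. It assumes no `use locale` or `use bytes` is in scope, and uses the core `utf8::upgrade` function to produce a second copy of the same character with the UTF-8 flag on:

```perl
use strict;
use warnings;

my $bytes = "\xE9";               # é as a native byte string, UTF-8 flag off
my $chars = "\xE9";
utf8::upgrade($chars);            # same character, UTF-8 flag on

my ($legacy_bytes, $legacy_chars);
{
    no feature 'unicode_strings';
    $legacy_bytes = uc $bytes;    # ASCII rules: stays \xE9
    $legacy_chars = uc $chars;    # Unicode rules: becomes \xC9
}

my ($modern_bytes, $modern_chars);
{
    use feature 'unicode_strings';
    $modern_bytes = uc $bytes;    # Unicode rules: \xC9
    $modern_chars = uc $chars;    # Unicode rules: \xC9
}

printf "legacy: %02X %02X\n", ord($legacy_bytes), ord($legacy_chars);   # legacy: E9 C9
printf "modern: %02X %02X\n", ord($modern_bytes), ord($modern_chars);   # modern: C9 C9
```

Without the feature, two strings that compare equal uppercase differently; with it, the representation stops mattering.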

## Examples

Basic ASCII:

```perl
my $s = uc("hello");           # "HELLO"
```

Omitted argument operates on [`$_`](../perlvar):

```perl
for ("alpha", "beta") {
    print uc, "\n";            # "ALPHA\n", "BETA\n"
}
```

`\U` in a double-quoted string is the same operation:

```perl
my $s = "Perl is \Ugreat\E";   # "Perl is GREAT"
```

Unicode uppercasing under `unicode_strings`:

```perl
use utf8;                      # the literal below is non-ASCII source text
use feature 'unicode_strings';
my $s = uc("straße");          # "STRASSE"
```

Unicode's full uppercase mapping for the German sharp s (U+00DF) is
the two-character sequence `SS`, not a single character. The returned
string is therefore one character longer than the input.
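
The growth in length can be checked directly. This sketch uses `\x{...}` escapes instead of literal characters so it does not depend on `use utf8`, and prints code points rather than the characters themselves; the sample list is chosen for illustration only.

```perl
use strict;
use warnings;
use feature 'unicode_strings';

# ß, Greek iota with dialytika and tonos, and é (a one-to-one mapping).
my @samples = ("\x{DF}", "\x{390}", "\x{E9}");
my @lengths;

for my $char (@samples) {
    my $upper = uc $char;
    push @lengths, length $upper;
    printf "U+%04X -> %s (length %d)\n",
        ord($char),
        join(" ", map { sprintf("U+%04X", ord($_)) } split //, $upper),
        length($upper);
}
# U+00DF -> U+0053 U+0053 (length 2)
# U+0390 -> U+0399 U+0308 U+0301 (length 3)
# U+00E9 -> U+00C9 (length 1)
```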

Greek lowercase letters uppercase into capitals, with unchanged
characters passed through:

```perl
use utf8;                       # the literal below is non-ASCII source text
use feature 'unicode_strings';
my $s = uc("Καλημέρα, world!"); # "ΚΑΛΗΜΈΡΑ, WORLD!"
```

Byte mode confines the operation to ASCII, which is occasionally
what you want for protocol tokens:

```perl
use bytes;
my $s = uc("héllo");           # "HéLLO" — only the ASCII letters change
```

## Edge cases

- **[`undef`](undef)**: `uc(undef)` returns `""` and triggers an
  `uninitialized` warning under `use warnings`. Guard inputs that may
  be undefined if that warning matters.
- **Empty string**: `uc("")` returns `""`. No warning.
- **One-to-many mappings**: A small number of characters expand to
  multiple characters when uppercased. The best-known is U+00DF
  (`ß`) → `SS`; the Greek U+0390 and U+03B0 each expand to three code
  points. The result string is longer than the input in those cases.
  Do not assume `length(uc($s)) == length($s)`.
- **Titlecase vs uppercase**: `uc` does **not** titlecase the first
  letter. Titlecase is a distinct Unicode category; [`ucfirst`](ucfirst) applies
  it to the first character and leaves the rest alone.
- **Non-cased characters**: Digits, punctuation, whitespace, symbols,
  and CJK ideographs are returned unchanged. `uc("42!")` is `"42!"`.
- **In-place update is not a thing**: `uc` returns a new value; it
  never modifies its argument. To uppercase in place, assign back:
  `` $s = uc $s ``.
- **Context**: `uc` is always scalar. Calling it in list context
  still produces a single string.
- **Byte-string / Unicode-flag surprises**: A string built from a
  [`read`](read) without a `:utf8` layer has no UTF-8 flag, so `uc` without
  `use feature 'unicode_strings'` will apply **ASCII rules** even if
  the bytes are valid UTF-8. Enabling the pragma alone does not help
  here: it treats each byte as a Latin-1 code point rather than
  decoding multi-byte sequences. Decode the bytes first to get the
  casing they look like they should get.
- **`use locale` and non-UTF-8 locales (v5.22+)**: Case changes that
  would cross the 255/256 boundary emit a
  `Can't do uc("…") on non-UTF-8 locale` warning and return the
  input character unchanged.
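
The byte-string case above can be sketched as follows. The example assumes the core `Encode` module for decoding, and the `$raw` value stands in for what an undecoded filehandle would deliver; it is written with byte escapes so the source encoding does not matter.

```perl
use strict;
use warnings;
use feature 'unicode_strings';
use Encode qw(decode);

# UTF-8 encoded bytes of "straße", as a raw filehandle would deliver them.
my $raw = "stra\xC3\x9Fe";

# Undecoded: each byte is treated as a Latin-1 code point, so the two
# bytes that encode ß pass through and no "SS" appears.
my $upper_raw = uc $raw;

# Decoded: the string now holds six characters, including U+00DF.
my $text       = decode('UTF-8', $raw);
my $upper_text = uc $text;

print length($text), "\n";        # 6
print "$upper_text\n";            # STRASSE
```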

## Differences from upstream

Fully compatible with upstream Perl 5.42.

## See also

- [`lc`](lc) — the inverse operation; lowercases every cased
  character using the same rule selection
- [`ucfirst`](ucfirst) — uppercases (titlecases) only the first
  character of the string; combine it as `ucfirst lc $word` when
  titlecasing a whole word
- [`lcfirst`](lcfirst) — lowercase only the first character
- [`fc`](fc) — Unicode casefolding for case-insensitive comparison;
  use this, not `uc` or [`lc`](lc), when comparing strings for equivalence
- [`sprintf`](sprintf) — the `%s` conversion does not change case;
  wrap the argument in `uc` when a format needs an uppercased field
- [`tr///`](../perlop) — the `tr/a-z/A-Z/` idiom uppercases only
  ASCII and can be faster than `uc` for guaranteed-ASCII input