---
name: ord
signature: 'ord EXPR'
since: 5.0
status: documented
categories: ["SCALARs and strings"]
---

```{index}
single: ord; Perl built-in
```

*[SCALARs and strings](../perlfunc-by-category)*

# ord

Return the Unicode code point of the first character of a string.

`ord` takes a string, looks at its first character, and returns that character's numeric code point as an integer. Note *character*, not byte: if the string is a wide-character string containing the letter `é` (U+00E9), `ord` returns `233`, regardless of how many bytes `é` occupies in memory. For the empty string, `ord` returns `0`.

If `EXPR` is omitted, `ord` operates on [`$_`](../perlvar). For the reverse direction (turning a code point back into a character), see [`chr`](chr).

## Synopsis

```perl
ord EXPR
ord
ord($str)
```

## What you get back

A non-negative integer. For an ASCII input it is in the range `0..127`; for a Latin-1 input `0..255`; for a general Unicode string anything up to `0x10FFFF`. `ord` never returns a negative number and never returns a non-integer.

Only the first character of the argument is consulted; trailing characters are ignored. To walk every code point of a string, combine with `split //` or `unpack "U*"`.

## Global state it touches

With no argument, `ord` reads [`$_`](../perlvar). It neither writes nor reads any other special variable. The interpretation of the input as characters (not bytes) depends on whether the scalar is internally flagged as UTF-8; see *Edge cases* below.

## Examples

Basic ASCII lookup:

```perl
print ord("A");     # 65
print ord("0");     # 48
print ord("\n");    # 10
```

Empty string returns zero, matching the "no first character" contract:

```perl
print ord("");      # 0
```

Default-argument form inside a `while` loop over [`$_`](../perlvar):

```perl
while (<STDIN>) {
    last if ord == 4;    # stop on EOT (Ctrl-D) as first char
    print;
}
```

Decode a wide character.
`ord` sees the character, not the bytes:

```perl
use utf8;
print ord("é");     # 233    (U+00E9)
print ord("€");     # 8364   (U+20AC)
print ord("😀");    # 128512 (U+1F600)
```

Walk every code point in a string, which is useful for debugging encoding surprises:

```perl
use utf8;
my $s = "héllo";
print join(",", map { ord } split //, $s), "\n";
# 104,233,108,108,111
```

Pair with [`chr`](chr) for a no-op round-trip on any single character:

```perl
my $c = "Z";
print chr(ord($c)) eq $c ? "same\n" : "changed\n";    # same
```

## Edge cases

- **Character vs byte semantics**: `ord` returns the code point of the first *character*, which for a UTF-8-flagged scalar is not the same as the first *byte*. `ord("\x{100}")` returns `256`, even though that character occupies two bytes in memory. To inspect the first byte instead, force byte semantics:

  ```perl
  use bytes;
  print ord("\x{100}");     # 196  (first UTF-8 byte, 0xC4)
  no bytes;
  print ord("\x{100}");     # 256
  ```

  Reach for `use bytes` only when you genuinely need byte-level inspection; it is a local, surgical pragma.

- **Non-UTF-8 scalar containing high bytes**: if the scalar is a plain byte string (no UTF-8 flag) and its first byte is `0xC4`, `ord` returns `196`, not `256`. The byte is interpreted as Latin-1. Whether a given scalar is UTF-8-flagged depends on how it was built: `use utf8` in source, `decode_utf8`, a `:utf8` I/O layer, [`chr`](chr) of a value above `255`, and so on.
- **[`undef`](undef)**: `ord(undef)` returns `0` and, under `use warnings`, emits `Use of uninitialized value in ord`.
- **Numeric argument**: `ord(65)` stringifies `65` to `"65"` first, then takes the first character, so it returns `ord("6") == 54`, not `65`. This surprises people; if you have an integer and want to round-trip through a character, use [`chr`](chr) first or skip `ord` entirely.
- **Multi-character argument**: only the first character matters. `ord("ABC")` returns `65`; the `B` and `C` are never consulted.
- **Surrogate and non-character code points**: `ord` will happily return `0xD800..0xDFFF` or `0xFFFE`/`0xFFFF` if those appear in the input. Perl does not refuse to store them; validation, if needed, is the caller's job.
- **Maximum value**: on a standard build, `ord` can return up to `0x7FFFFFFF` (31-bit) for a string produced by [`chr`](chr) of a value in that range. Real-world Unicode text tops out at `0x10FFFF`.
- **Precedence**: `ord` is a named unary operator. The unary form `ord $x` binds tighter than `,` but looser than arithmetic operators such as `+`, so `ord $x + 1` parses as `ord($x + 1)`, not `ord($x) + 1`. Parenthesise when in doubt.

## Differences from upstream

Fully compatible with upstream Perl 5.42.

## See also

- [`chr`](chr): the inverse operation; turns a code point back into a one-character string
- [`sprintf`](sprintf): use `%c` to format an integer as its character, or `%x` / `%04x` to render an `ord` result in hex
- [`unpack`](unpack): `unpack "U*"` for every code point of a string; `unpack "C*"` for every byte
- [`lc`](lc) / [`uc`](uc): case-fold a character before taking its code point when you want case-insensitive comparisons
- [`$_`](../perlvar): the default argument when `ord` is called without one
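
As a closing sketch of how `ord` combines with the `sprintf` and `unpack` entries listed above, the snippet below renders each code point of a string in `U+XXXX` form. The helper name `code_point_report` is hypothetical (it is not a Perl built-in), and the sample string is written with `\x{...}` escapes so the source stays pure ASCII and needs no `use utf8`.

```perl
use strict;
use warnings;

# Hypothetical helper (not a Perl built-in): render every code point of a
# string as U+XXXX by calling ord on each character produced by split //.
sub code_point_report {
    my ($str) = @_;
    return join " ", map { sprintf "U+%04X", ord } split //, $str;
}

# "h", e-acute, "llo", space, euro sign, written with escapes.
my $s = "h\x{E9}llo \x{20AC}";

print code_point_report($s), "\n";
# U+0068 U+00E9 U+006C U+006C U+006F U+0020 U+20AC

# unpack "U*" yields the same code points as plain integers.
print join(",", unpack "U*", $s), "\n";
# 104,233,108,108,111,32,8364

# A numeric argument is stringified first, so the digits "6" and "5"
# are what ord sees, not the number 65.
print code_point_report(65), "\n";
# U+0036 U+0035
```

The helper returns an ASCII-only string, so printing it does not need an output encoding layer even when the input contains wide characters.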