Numbers and strings

A scalar can hold a number, a string, or both at once. Perl silently converts between the two whenever an operator demands one shape — + wants numbers, . wants strings, comparison operators have separate numeric and string forms. Understanding the conversion rules and where they bite is essential to writing predictable Perl.

Numeric literals

Integer literals come in four bases (decimal, hexadecimal, binary, and octal):

0                    # zero
12345                # decimal
4_294_967_296        # underscores for legibility (decimal 4.29 billion)
0xff                 # hex      — 255
0Xdead_beef          # hex with underscores
0b01_1011            # binary   — 27
0377                 # octal    — 255 (leading 0, NOT followed by [bBxX])
0o12_345             # octal, alternative spelling (Perl 5.34+)

Float literals require either a dot or an exponent:

3.14
0.5                  # leading zero ok
.5                   # leading dot ok
1e3                  # 1000
1.5E-10              # exponent — uppercase or lowercase 'e'

Underscores are accepted between digits anywhere — 3.14_15_92, 1_000_000. Multiple consecutive underscores warn: 23__500 warns, 23_500 does not.

Hexadecimal, binary, and octal floats (Perl 5.22+) need the p (power-of-2) exponent:

0x1.999ap-4          # hex float
0b101.01p-1          # binary float
0o7.65p2             # octal float

The p exponent is required for non-decimal floats. Plain 0x1F.5 is a parse error.
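
A quick way to check how Perl reads a literal is to print the compiled value next to its source text. The sketch below is illustrative only: the `@literals` table is a made-up name, and the hex float entry assumes Perl 5.22 or later.

```perl
use strict;
use warnings;

# Each entry pairs the source text of a literal with the value Perl compiled it to.
my @literals = (
    [ '4_294_967_296', 4_294_967_296 ],
    [ '0xff',          0xff          ],
    [ '0b01_1011',     0b01_1011     ],
    [ '0377',          0377          ],
    [ '1e3',           1e3           ],
    [ '0x1.8p1',       0x1.8p1       ],   # hex float: 1.5 * 2**1
);

# Prints 4294967296, 255, 27, 255, 1000 and 3 in the right-hand column.
printf "%-14s => %s\n", @$_ for @literals;
```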

String → number

When Perl converts a string to a number, it skips leading whitespace, consumes the longest prefix that looks like a decimal number, and ignores the rest:

"42"        + 0      # 42
"42abc"     + 0      # 42       — leading digits parsed
"abc42"     + 0      # 0        — no leading digits
" \t  42"   + 0      # 42       — leading whitespace ok
"4_2"       + 0      # 4        — underscores in literals only, NOT strings
"0xFF"      + 0      # 0        — hex prefix NOT auto-recognised
"3.14e2"    + 0      # 314      — scientific notation parsed
"Inf"       + 0      # Inf      — case-insensitive infinity
"NaN"       + 0      # NaN

The mismatch between numeric literals (which honour _, 0x, 0b, 0) and numeric strings (which honour none of those) is the single most-tripped-over conversion rule. To convert a string that’s in non-decimal form, use hex or oct:

my $byte  = hex "FF";          # 255
my $same  = hex "0xFF";        # 255: hex() accepts an optional 0x prefix
my $perms = oct "0755";        # 493
my $mode  = oct "0o755";       # 493: explicit octal prefix (Perl 5.34+)
my $mask  = oct "0xFF";        # 255: oct() also recognises 0x and 0b prefixes

Under use warnings, arithmetic on a string that is not entirely numeric emits Argument "abc" isn't numeric in addition (+). This covers strings with no numeric prefix ("abc") as well as strings with trailing junk ("42abc"). The result is unchanged (0 for "abc", 42 for "42abc"); the warning is the signal. Tolerant code that accepts user input usually filters with Scalar::Util::looks_like_number or a regex first.
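
One way to combine these rules is a small helper that decides up front which base a string is in. This is a minimal sketch, not a standard API: `parse_number` is a hypothetical name, and the set of accepted prefixes is an assumption chosen for illustration.

```perl
use strict;
use warnings;
use Scalar::Util qw(looks_like_number);

# Hypothetical helper: convert user-supplied text to a number.
# Returns undef (in scalar context) when the text is not a recognised number.
sub parse_number {
    my ($text) = @_;
    return unless defined $text;
    $text =~ s/^\s+|\s+$//g;                               # trim surrounding whitespace
    return hex($text) if $text =~ /^0x[0-9a-fA-F]+$/;      # hexadecimal
    return oct($text) if $text =~ /^0b[01]+$/;             # binary, via oct()
    return oct($1)    if $text =~ /^0o([0-7]+)$/;          # octal; prefix stripped for older Perls
    return $text + 0  if looks_like_number($text);         # plain decimal
    return;
}

for my $input ("0xFF", "0b101", "0o755", "0755", " 42 ", "42abc") {
    my $value = parse_number($input);
    printf "%-8s => %s\n", "'$input'", defined $value ? $value : 'rejected';
}
```

Note that "0755" comes back as 755, not 493: without an explicit prefix the helper treats the string as decimal, mirroring Perl's own string conversion.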

Number → string

Perl converts a number to a string using the same logic as sprintf("%.15g", $n) for floats and the plain decimal representation for integers:

my $i = 42;             "$i"   # "42"
my $f = 3.14;           "$f"   # "3.14"
my $z = 0.0;            "$z"   # "0"        — not "0.0"
my $big = 1e10;         "$big" # "10000000000"  ($b is reserved for sort)
my $s = 1e20;           "$s"   # "1e+20"    — switches to scientific past ~15 digits
my $n = "Inf" + 0;      "$n"   # "Inf"

The default precision of 15 significant digits hides most representation noise, but it is not enough to roundtrip every double; that takes 17 digits ("%.17g"). For exact control, format explicitly with sprintf:

sprintf "%.2f", 3.14159           # "3.14"
sprintf "%05d", 42                # "00042"
sprintf "%e", 1234567             # "1.234567e+06"
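
The difference between the default form and a full-precision form shows up when a string is converted back to a number. The following sketch assumes a Perl built with ordinary IEEE 754 doubles (not long doubles or quadmath); the variable names are illustrative.

```perl
use strict;
use warnings;

my $sum = 0.1 + 0.2;

my $default = "$sum";                  # default stringification, 15 significant digits
my $full    = sprintf "%.17g", $sum;   # 17 digits identify a double uniquely

print "default: $default\n";           # default: 0.3
print "full:    $full\n";              # full:    0.30000000000000004

# Converting each string back to a number shows which one preserved the value.
print "default form roundtrips\n"  if $default == $sum;
print "17-digit form roundtrips\n" if $full    == $sum;
```

Only the second message is printed: the 15-digit form reads back as 0.3, which is a different double from 0.1 + 0.2.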

Concatenation forces strings; arithmetic forces numbers

"42" + "3"           # 45    — both coerced to numbers
42 . 3               # "423" — both coerced to strings
"3" x "5"            # "33333" — left operand string, right operand number

The . operator is the explicit-stringify operator; + is the explicit-numerify operator. A scalar that holds a number you want as a string can be concatenated with "":

my $s = "" . $n;     # idiomatic "stringify"
my $s = "$n";        # same; usually clearer

To use a scalar that holds a string as a number, add 0:

my $n = $s + 0;      # idiomatic "numify"

Numeric vs string comparison

Perl has two parallel families of comparison operators, one numeric and one string, with different semantics:

Numeric	String	Meaning
`==`	`eq`	equal
`!=`	`ne`	not equal
`<`	`lt`	less
`<=`	`le`	less or equal
`>`	`gt`	greater
`>=`	`ge`	greater or equal
`<=>`	`cmp`	three-way compare

The two families give different answers for numeric-looking strings whose digit counts differ:

"10" <  "9"          # false: 10 is not less than 9 numerically
"10" lt "9"          # true:  "1" sorts before "9" character by character

Sort numbers numerically:

my @sorted = sort { $a <=> $b } @numbers;       # numeric
my @sorted = sort                @strings;       # default — string
my @sorted = sort { $a cmp $b }  @strings;       # explicit string

Picking the wrong operator is the source of «my sort is wrong» bugs. See numeric comparison and string comparison.
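
The effect is easiest to see by sorting the same list both ways. A short sketch with made-up sample data (`@ids` is an illustrative name):

```perl
use strict;
use warnings;

my @ids = ("10", "9", "100", "1");

my @as_strings = sort @ids;                  # default sort compares with cmp
my @as_numbers = sort { $a <=> $b } @ids;    # explicit numeric comparison

print "string order:  @as_strings\n";        # string order:  1 10 100 9
print "numeric order: @as_numbers\n";        # numeric order: 1 9 10 100
```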

Floating-point: precision, `Inf`, `NaN`

Most Perls use IEEE 754 double-precision floats — about 15–17 decimal digits of precision. The classic surprise:

0.1 + 0.2 == 0.3      # FALSE — neither side is exactly representable
                      # 0.1 + 0.2 = 0.30000000000000004

For money or anything that must be exact, work in integers (cents instead of dollars) or use a bignum module:

use bignum;            # auto-promotes to arbitrary precision
0.1 + 0.2 == 0.3      # TRUE under bignum

use Math::BigInt;
my $big = Math::BigInt->new("100000000000000000000") + 1;
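
The integer approach can be sketched as follows: amounts are stored as whole cents and only turned into a decimal string for display. The variable names are illustrative, and the formatting line assumes non-negative totals.

```perl
use strict;
use warnings;

# Prices kept as integer cents (0.10 and 0.20), so addition is exact.
my @prices_cents = (10, 20);

my $total_cents = 0;
$total_cents += $_ for @prices_cents;

print "exact match\n" if $total_cents == 30;

# Convert to dollars and cents only at the display boundary.
my $display = sprintf "%d.%02d", int($total_cents / 100), $total_cents % 100;
print "total: $display\n";                   # total: 0.30
```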

Inf and NaN are real values:

my $inf = 9 ** 9 ** 9;       # Inf — overflow
my $nan = $inf - $inf;       # NaN — undefined arithmetic
my $nan = "NaN" + 0;         # NaN — accepted from string

$nan == $nan                  # FALSE — NaN never equals anything
$nan != $nan                  # TRUE  — including itself

Inf and NaN are case-insensitive on input and have several spellings (Infinity, 1.#INF, …); on output Perl normalises to the short forms.

Note that Perl 5 raises fatal errors for 1/0 (Illegal division by zero) and sqrt(-1) (Can't take sqrt of -1) rather than returning Inf or NaN. These are exceptions, which eval can catch, not arithmetic results.
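
Both behaviours can be handled in a few lines. In this sketch `is_nan` is a hypothetical helper built on the self-inequality property, and the division is wrapped in eval so the fatal error becomes a value that can be tested.

```perl
use strict;
use warnings;

# NaN is the only value that compares unequal to itself.
sub is_nan { my ($x) = @_; return $x != $x }

my $inf = 9 ** 9 ** 9;
my $nan = $inf - $inf;

print "nan detected\n"   if     is_nan($nan);
print "inf is not nan\n" unless is_nan($inf);

# Division by zero dies instead of yielding Inf or NaN, so trap it with eval.
my $zero     = 0;
my $quotient = eval { 1 / $zero };
my $error    = $@;
print "caught: $error" unless defined $quotient;
```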

Locale-affected number parsing

Under use locale plus an active POSIX locale, the decimal point is whatever the locale says. This affects parsing and stringifying of floats:

use locale;
use POSIX qw(setlocale LC_NUMERIC);
setlocale LC_NUMERIC, 'de_DE.UTF-8';   # German uses ',' as decimal

my $x = 3.14;
print "$x\n";                            # "3,14"  — locale formatting
"3,14" + 0                                # 3.14    — locale parsing

This is rarely what you want for data interchange. Most code that reads or writes structured data should explicitly not be in the use locale scope, so that . is always the decimal point. Use locale formatting only at the boundary where output is for a human reader.

«Looks like a number»

The rules for whether a string is «numeric enough» are not simple to inline; use Scalar::Util::looks_like_number:

use Scalar::Util qw(looks_like_number);

looks_like_number("42")        # true
looks_like_number("3.14e10")   # true
looks_like_number("Inf")       # true
looks_like_number("NaN")       # true
looks_like_number("0xFF")      # FALSE — hex strings are not "numeric"
looks_like_number("")          # FALSE — empty string
looks_like_number(undef)       # FALSE
looks_like_number("3.14abc")   # FALSE — trailing junk

This is the validator function — strict, no leading garbage, no trailing garbage, no hex/oct prefixes. It matches the conditions under which Perl will use a number’s cached numeric form rather than re-parsing on every arithmetic op.
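
A typical use is to partition raw input into accepted numbers and rejected fields before doing any arithmetic. A minimal sketch with made-up sample data:

```perl
use strict;
use warnings;
use Scalar::Util qw(looks_like_number);

my @fields = ("42", "3.14e10", "0xFF", "", "3.14abc");

my (@numbers, @rejected);
for my $field (@fields) {
    if (looks_like_number($field)) {
        push @numbers, $field + 0;      # safe to numify: no warning will fire
    }
    else {
        push @rejected, $field;         # keep the raw text for error reporting
    }
}

print "numbers:  @numbers\n";                       # numbers:  42 31400000000
print "rejected: ", scalar(@rejected), " field(s)\n";   # rejected: 3 field(s)
```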