Applications#

The previous chapters were about boolean logic in the abstract. This one is about where it shows up most often in working Perl: bitwise arithmetic, regular expressions, and the short-circuit idioms that take the place of explicit conditionals.

Bitwise: logic on integer bits#

The bitwise operators apply boolean operations to each pair of bits in two integers in parallel. A 32-bit integer is, from the operators’ point of view, thirty-two parallel one-bit values.

Op    Reads as      Per-bit rule
&     bitwise AND   each output bit = a ∧ b
|     bitwise OR    each output bit = a ∨ b
^     bitwise XOR   each output bit = a ⊕ b
~     bitwise NOT   each output bit = ¬a
<<    left shift    shift bits left, zero-fill from right
>>    right shift   shift bits right

The same boolean operators (∧, ∨, ⊕, ¬) appear here in purely numeric form. That is not a coincidence; it is the definition.
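As a minimal sketch of the per-bit rules, the snippet below applies each operator to two 4-bit values chosen so that every combination of input bits occurs once. The variables `$p` and `$q` and the helper `bits4` are names invented for this illustration.

```perl
use strict;
use warnings;

# Two 4-bit operands chosen so every (p, q) bit pair occurs once:
# (1,1), (1,0), (0,1), (0,0), reading left to right.
my $p = 0b1100;
my $q = 0b1010;

# Hypothetical helper: format the low four bits of a value.
sub bits4 { return sprintf '%04b', $_[0] & 0b1111 }

print 'p & q  = ', bits4($p & $q),  "\n";   # 1000
print 'p | q  = ', bits4($p | $q),  "\n";   # 1110
print 'p ^ q  = ', bits4($p ^ $q),  "\n";   # 0110
print '  ~p   = ', bits4(~$p),      "\n";   # 0011
print 'p << 1 = ', bits4($p << 1),  "\n";   # 1000 (the top bit falls outside the 4-bit mask)
print 'p >> 1 = ', bits4($p >> 1),  "\n";   # 0110
```

Reading each output column against the two input columns reproduces the truth table for that operator.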

Setting, clearing, toggling, testing a flag#

The four basic flag operations on a single bit:

use constant FLAG_VERBOSE  => 0x01;
use constant FLAG_DRY_RUN  => 0x02;
use constant FLAG_RECURSE  => 0x04;
use constant FLAG_FORCE    => 0x08;

my $flags = 0;

$flags |=  FLAG_VERBOSE;       # SET    -- OR with the bit
$flags |=  FLAG_DRY_RUN;       # SET another
$flags &= ~FLAG_DRY_RUN;       # CLEAR  -- AND with the inverted bit
$flags ^=  FLAG_VERBOSE;       # TOGGLE -- XOR with the bit
my $on = $flags & FLAG_RECURSE; # TEST   -- AND, then test truthiness

Each of these is a one-bit application of a boolean operator. Set is bit ∨ flag, clear is bit ∧ ¬flag, toggle is bit ⊕ flag, test is bit ∧ flag.
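To make the effect of each step visible, here is a small trace of the same four operations. It is a sketch only: the `show` helper and the `@trace` array are hypothetical names added for the illustration, and the constants are declared in the list form of `use constant` for brevity.

```perl
use strict;
use warnings;

use constant {
    FLAG_VERBOSE => 0x01,
    FLAG_DRY_RUN => 0x02,
    FLAG_RECURSE => 0x04,
    FLAG_FORCE   => 0x08,
};

# Hypothetical helper: render the four flag bits, FORCE on the left.
sub show { return sprintf '%04b', $_[0] }

my $flags = 0;
my @trace;

$flags |=  FLAG_VERBOSE;  push @trace, show($flags);   # 0001  set
$flags |=  FLAG_DRY_RUN;  push @trace, show($flags);   # 0011  set
$flags &= ~FLAG_DRY_RUN;  push @trace, show($flags);   # 0001  clear
$flags ^=  FLAG_VERBOSE;  push @trace, show($flags);   # 0000  toggle off
$flags ^=  FLAG_VERBOSE;  push @trace, show($flags);   # 0001  toggle back on

my $recurse = ($flags & FLAG_RECURSE) ? 'on' : 'off';  # test: off
my $verbose = ($flags & FLAG_VERBOSE) ? 'on' : 'off';  # test: on

print join(' -> ', @trace), "\n";
print "recurse=$recurse verbose=$verbose\n";
```

Toggling twice returns the bit to where it started, which is the XOR self-inverse property at work.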

XOR-swap: swapping without a temporary#

XOR has two properties that combine into a memorable trick: a ⊕ a = 0, and a ⊕ b ⊕ b = a. Apply them in sequence and you can swap two integers without a third variable:

my ($a, $b) = (0xFEED, 0xBEEF);

$a ^= $b;     # a := a ⊕ b
$b ^= $a;     # b := b ⊕ (a ⊕ b) = a
$a ^= $b;     # a := (a ⊕ b) ⊕ a  = b

print "a=$a b=$b\n";   # a=48879 b=65261  (0xBEEF, 0xFEED)

The trick is not actually useful in Perl — ($a, $b) = ($b, $a) is faster, clearer, and works on any scalar including strings and references. But it shows up in two places that matter:

  • Embedded code without a free register. A microcontroller with three values to juggle in two registers reaches for this.

  • Interview folklore. Knowing it exists and why it works (XOR is its own inverse) is worth the thirty seconds it takes to read.

The reason it works is exactly the boolean identity from the truth-table chapter: x ⊕ y ⊕ y = x. Each of the three lines above is one application of that identity.
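One way to convince yourself is to check the identity exhaustively on small integers. The sketch below does that for every pair of 4-bit values and also confirms that the three-step swap agrees with a plain exchange. The `xor_swap` helper is a hypothetical name used only here.

```perl
use strict;
use warnings;

# Hypothetical helper: XOR-swap two integers and return them in swapped order.
sub xor_swap {
    my ($x, $y) = @_;
    $x ^= $y;   # x = x0 ^ y0
    $y ^= $x;   # y = y0 ^ (x0 ^ y0) = x0
    $x ^= $y;   # x = (x0 ^ y0) ^ x0 = y0
    return ($x, $y);
}

my $failures = 0;
for my $x (0 .. 15) {
    for my $y (0 .. 15) {
        # The identity itself: x ^ y ^ y == x
        $failures++ unless (($x ^ $y) ^ $y) == $x;

        # The swap built from it
        my ($sx, $sy) = xor_swap($x, $y);
        $failures++ unless $sx == $y && $sy == $x;
    }
}
print "failures: $failures\n";   # failures: 0
```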

Common bit tricks#

A handful of patterns you will see in performance-sensitive code:

$x &  ($x - 1)         # $x with its lowest set bit cleared
($x & ($x - 1)) == 0   # true when $x is a power of two (also true for 0, so guard with $x != 0)
$x | -$x               # signed: zero iff $x was zero, non-zero otherwise
($x >> 31) & 1         # the sign bit, on a 32-bit signed integer
1 << $n                # the integer with only bit $n set
$x & (1 << $n)         # is bit $n set in $x?

Each of these is an exercise in tracking what the bits do — pure boolean reasoning applied 32 (or 64) times in parallel.
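A few of these patterns wrapped as small functions, as a sketch for experimenting. The function names are hypothetical, and the power-of-two test carries an explicit non-zero guard because the bare expression is also true for 0.

```perl
use strict;
use warnings;

# Hypothetical helper names; each wraps one pattern from the list above.
sub clear_lowest_set_bit { my ($x) = @_;     return $x & ($x - 1) }
sub is_power_of_two      { my ($x) = @_;     return $x != 0 && ($x & ($x - 1)) == 0 }
sub bit_is_set           { my ($x, $n) = @_; return ($x & (1 << $n)) != 0 }

printf "%04b\n", clear_lowest_set_bit(0b1100);            # 1000
printf "%s\n", is_power_of_two(64) ? 'yes' : 'no';        # yes
printf "%s\n", is_power_of_two(96) ? 'yes' : 'no';        # no
printf "%s\n", is_power_of_two(0)  ? 'yes' : 'no';        # no (the != 0 guard)
printf "%s\n", bit_is_set(0b0101, 2) ? 'set' : 'clear';   # set
printf "%s\n", bit_is_set(0b0101, 1) ? 'set' : 'clear';   # clear
```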

Regular expressions: logic on sets of strings#

A regex matches a set of strings. The empty set, the singleton {"foo"}, the infinite set “anything that begins with a digit” — all are sets, and the regex denotes one of them.

Once you see a regex as a set, the boolean operations have a direct set-theoretic meaning:

Boolean operation     On sets        In regex syntax
OR (∨)                union          alternation: foo|bar
AND (∧)               intersection   lookahead pair: (?=foo)(?=bar)
NOT (¬)               complement     negative lookahead: (?!foo)
AND-NOT (a ∧ ¬b)      difference     (?=A)(?!B)A

Two specifics worth pulling out.

Alternation is OR over sets#

$s =~ /yes|no|maybe/;

Matches the union of three singleton sets. There is nothing more to it; | in regex syntax is precisely the boolean ∨.

Inside a character class the meaning is the same:

$s =~ /[abc]/;        # union of {"a"}, {"b"}, {"c"}
$s =~ /[^abc]/;       # complement: anything NOT in {"a","b","c"}

[^...] is the regex syntax for ¬ applied to a character set.
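The set reading can be checked by filtering a list, as in the sketch below. Note one assumption: the patterns here are anchored with `^` and `$` so that each one tests membership of the whole string, whereas the unanchored forms above match any string that merely contains a member. The word lists are made up for the example.

```perl
use strict;
use warnings;

my @words = qw(yes no maybe never);

# Anchored alternation: the word must be exactly one member of the union.
my @in_union = grep { /^(?:yes|no|maybe)$/ } @words;

# A character class and its complement split single characters into two groups.
my @chars   = qw(a b c d);
my @in_set  = grep { /^[abc]$/  } @chars;
my @outside = grep { /^[^abc]$/ } @chars;

print "@in_union\n";   # yes no maybe
print "@in_set\n";     # a b c
print "@outside\n";    # d
```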

Lookarounds are AND and NOT#

A regex without lookaround consumes characters as it matches; the two branches of an alternation A|B cannot both match the same span (only one branch wins). To express both A and B at the same position, you need lookaround: a zero-width assertion that demands a property without consuming.

# matches only if the whole string is BOTH all digits AND 1 to 4 chars long
$s =~ /^(?=\d+$)(?=.{1,4}$)/;
#       \______/\_________/
#          A         B          intersection: A ∧ B

(?=...) is positive lookahead; (?!...) is negative lookahead (the boolean NOT). Combining them gives you the full set algebra on regex predicates:

# "starts with a digit but is not the literal '0'":
# (digit) ∧ ¬(literal "0")
$s =~ /^(?=\d)(?!0$)/;
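Running both patterns over a handful of inputs shows the intersection and the difference behaving as sets. This is a sketch; the variable names and sample strings are invented for the illustration.

```perl
use strict;
use warnings;

# Intersection: all digits AND at most four characters.
my $short_number = qr/^(?=\d+$)(?=.{1,4}$)/;

# Difference: starts with a digit AND is not the literal "0".
my $digit_not_zero = qr/^(?=\d)(?!0$)/;

my @samples = ('42', '12345', '12ab', '0', '007');

for my $s (@samples) {
    printf "%-6s short_number=%-3s digit_not_zero=%s\n",
        $s,
        ($s =~ $short_number   ? 'yes' : 'no'),
        ($s =~ $digit_not_zero ? 'yes' : 'no');
}
```

Of these samples, `12345` and `12ab` fail the first pattern (too long, and not all digits, respectively), and only `0` fails the second.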

Why the framing helps#

Once you read regex this way, the readability of complicated patterns improves dramatically. A regex with two lookaheads is not “two regexes glued together” — it is the intersection of two sets. A negative lookahead followed by a positive match is not “first reject, then match” — it is set difference. The boolean algebra you already know is the algebra of regexes; only the notation changes.

Control flow: short-circuit logic as conditional execution#

The chapter on operators introduced the operand-return rule for &&, ||, and //. That rule, plus precedence, generates a small family of idioms that replace explicit if/else for short conditional logic:

# defaulting
my $port = $cfg{port} // 8080;

# guard with side-effect
open my $fh, '<', $path  or die "open $path: $!";

# lazy initialisation (set if currently false)
$cache{$key} ||= compute($key);

# lazy initialisation (set if currently undef — a real `0` is kept)
$cfg{retries} //= 3;

# guard chain: do step 2 only if step 1 succeeded
my $rc = step_one() && step_two();

# logical selection inside an expression
my $label = $count == 1 ? "1 item" : "$count items";

Each of these is a boolean expression chosen for its side-effects — the short-circuit decides whether the right operand even runs. open ... or die works exactly because or does not evaluate its right side when the left is true.
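The skipping is easy to observe by recording which operands actually run. In this sketch the three `step_*` subs and the `@calls` log are hypothetical stand-ins for real work.

```perl
use strict;
use warnings;

my @calls;   # records which hypothetical steps actually ran

sub step_ok   { push @calls, 'ok';   return 1 }
sub step_fail { push @calls, 'fail'; return 0 }
sub step_next { push @calls, 'next'; return 'done' }

my $r1 = step_ok()   && step_next();   # left is true: right runs, $r1 is 'done'
my $r2 = step_fail() && step_next();   # left is false: right is skipped, $r2 is 0
my $r3 = step_ok()   || step_next();   # left is true: right is skipped, $r3 is 1

print "calls: @calls\n";          # calls: ok next fail ok
print "r1=$r1 r2=$r2 r3=$r3\n";   # r1=done r2=0 r3=1
```

`step_next` appears only once in the log even though it is written three times, and each result is the last operand evaluated rather than a bare true or false.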

Two pieces of advice.

Use // and //= (available since Perl 5.10) when 0 or "" is a valid value. Defaulting with || silently replaces a configured zero or empty string, which is one of the classic ways to lose data.
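A side-by-side sketch of the difference, using a made-up `%cfg` hash in which the user has explicitly asked for zero retries:

```perl
use strict;
use warnings;

# Hypothetical configuration: zero retries and an empty name are deliberate.
my %cfg = (retries => 0, name => '');

my $retries_or  = $cfg{retries} || 3;             # 3: the configured 0 is lost
my $retries_dor = $cfg{retries} // 3;             # 0: defined, so it is kept
my $name_dor    = $cfg{name}    // 'anonymous';   # '': defined, so it is kept
my $port_dor    = $cfg{port}    // 8080;          # 8080: key is missing, value is undef

print "|| gives $retries_or, // gives $retries_dor\n";   # || gives 3, // gives 0
print "name='$name_dor' port=$port_dor\n";               # name='' port=8080
```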

Reach for ?: when you would otherwise build an if/else/scalar-assign triplet. The ternary keeps the expression as an expression and the assignment in one line:

# verbose
my $kind;
if ($n == 1) { $kind = 'singular' }
else         { $kind = 'plural'   }

# idiomatic
my $kind = $n == 1 ? 'singular' : 'plural';

Don’t nest ?: more than two levels deep. Past two, the chained form is harder to read than the if/elsif/else. The ternary is for choosing; sequential decisions belong in a block.
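For reference, here is roughly what the two-level limit looks like in practice, next to the equivalent block form. Both subs are hypothetical and exist only to compare the two shapes.

```perl
use strict;
use warnings;

# Chained ternary: still a single choice among three values.
sub describe {
    my ($n) = @_;
    return $n == 0 ? 'no items'
         : $n == 1 ? '1 item'
         :           "$n items";
}

# The same decision as a block, for comparison.
sub describe_block {
    my ($n) = @_;
    if    ($n == 0) { return 'no items' }
    elsif ($n == 1) { return '1 item' }
    else            { return "$n items" }
}

print "$_\n" for map { describe($_) } 0, 1, 5;   # no items, 1 item, 5 items (one per line)
```

Laid out one condition per line, the chained form reads like a lookup table. Adding a third or fourth condition is usually the point at which the block version becomes easier to follow.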

What you should remember from this chapter#

  • Bitwise &, |, ^, ~ are the boolean operators applied bit-by-bit in parallel; flag set/clear/toggle/test are the four one-bit applications.

  • A regex denotes a set of strings; | is union, [^...] is complement, (?=) and (?!) make intersection and difference.

  • &&, ||, //, ?: give you most of conditional control flow without writing if. Use // when 0 is real; use ?: for short selection inside an expression.