---
name: logic in bits, regex, and control flow
---

# Applications

The previous chapters were about boolean logic in the abstract. This one is about where it shows up most often in working Perl: bitwise arithmetic, regular expressions, and the short-circuit idioms that take the place of explicit conditionals.

## Bitwise: logic on integer bits

The bitwise operators apply a boolean operation to every bit position of their integer operands in parallel. A 32-bit integer is, from the operators' point of view, thirty-two parallel one-bit values.

| Op    | Reads as    | Per-bit rule                                                    |
|-------|-------------|-----------------------------------------------------------------|
| `&`   | bitwise AND | each output bit = `a∧b`                                         |
| `\|`  | bitwise OR  | each output bit = `a∨b`                                         |
| `^`   | bitwise XOR | each output bit = `a⊕b`                                         |
| `~`   | bitwise NOT | each output bit = `¬a`                                          |
| `<<`  | left shift  | shift bits left, zero-fill from right                           |
| `>>`  | right shift | shift bits right, zero-fill from left (without `use integer`)   |

The same boolean operators (`∧`, `∨`, `⊕`, `¬`) appear here in purely numeric form. That is not a coincidence; it is the definition.

### Setting, clearing, toggling, testing a flag

The four basic flag operations on a single bit:

```perl
use constant FLAG_VERBOSE => 0x01;
use constant FLAG_DRY_RUN => 0x02;
use constant FLAG_RECURSE => 0x04;
use constant FLAG_FORCE   => 0x08;

my $flags = 0;
$flags |= FLAG_VERBOSE;          # SET    -- OR with the bit
$flags |= FLAG_DRY_RUN;          # SET another
$flags &= ~FLAG_DRY_RUN;         # CLEAR  -- AND with the inverted bit
$flags ^= FLAG_VERBOSE;          # TOGGLE -- XOR with the bit
my $on = $flags & FLAG_RECURSE;  # TEST   -- AND, then test truthiness
```

Each of these is a one-bit application of a boolean operator. Set is `bit ∨ flag`, clear is `bit ∧ ¬flag`, toggle is `bit ⊕ flag`, test is `bit ∧ flag`.

### XOR-swap: swapping without a temporary

XOR has two properties that combine into a memorable trick: `a ⊕ a = 0`, and `a ⊕ b ⊕ b = a`.
Apply them in sequence and you can swap two integers without a third variable:

```perl
my ($a, $b) = (0xFEED, 0xBEEF);
$a ^= $b;    # a := a ⊕ b
$b ^= $a;    # b := b ⊕ (a ⊕ b) = a
$a ^= $b;    # a := (a ⊕ b) ⊕ a = b
print "a=$a b=$b\n";    # a=48879 b=65261 (0xBEEF, 0xFEED)
```

The trick is not actually useful in Perl: `($a, $b) = ($b, $a)` is faster, clearer, and works on any scalar including strings and references. But it shows up in two places that matter:

- **Embedded code without a free register.** A microcontroller with three values to juggle in two registers reaches for this.
- **Interview folklore.** Knowing it exists and *why* it works (XOR is its own inverse) is worth the thirty seconds it takes to read.

The reason it works is exactly the boolean identity from the truth-table chapter: `x ⊕ y ⊕ y = x`. Each of the three lines above is one application of that identity.

### Common bit tricks

A handful of patterns you will see in performance-sensitive code:

```perl
$x & ($x - 1)           # $x with its lowest set bit cleared
($x & ($x - 1)) == 0    # true when $x is a power of two (also true for zero, so check $x != 0 separately)
$x | -$x                # signed: zero iff $x was zero, non-zero otherwise
($x >> 31) & 1          # the sign bit, on a 32-bit signed integer
1 << $n                 # the integer with only bit $n set
$x & (1 << $n)          # is bit $n set in $x?
```

Each of these is an exercise in tracking what the bits do: pure boolean reasoning applied 32 (or 64) times in parallel.

## Regular expressions: logic on sets of strings

A regex matches a *set* of strings. The empty set, the singleton `{"foo"}`, the infinite set "anything that begins with a digit": all are sets, and the regex denotes one of them.
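One way to make the set reading concrete is to treat a pattern as a membership test and filter a small universe of candidate strings through it. The sketch below is illustrative only: the `members` helper and the sample strings are hypothetical names introduced here, not part of any library.

```perl
use strict;
use warnings;

# Hypothetical helper: the subset of the given strings that the pattern accepts.
sub members {
    my ($re, @universe) = @_;
    return grep { $_ =~ $re } @universe;
}

# A made-up universe of candidate strings.
my @universe = ('foo', 'food', '7up', '42', 'bar', '');

my @exactly_foo = members(qr/\Afoo\z/, @universe);   # the singleton {"foo"}
my @digit_first = members(qr/\A\d/,    @universe);   # "begins with a digit"
my @nothing     = members(qr/(?!)/,    @universe);   # empty set: (?!) can never match

print "exactly foo: @exactly_foo\n";
print "digit first: @digit_first\n";
print "nothing:     ", scalar(@nothing), " members\n";
```

The anchors `\A` and `\z` matter here: without them a pattern accepts every string that merely *contains* a match, which is a larger set.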
Once you see a regex as a set, the boolean operations have a direct set-theoretic meaning (`A` and `B` below stand for arbitrary subpatterns):

| Boolean operation | On sets      | In regex syntax                  |
|-------------------|--------------|----------------------------------|
| OR (∨)            | union        | alternation: `A\|B`              |
| AND (∧)           | intersection | lookahead pair: `(?=A)(?=B)`     |
| NOT (¬)           | complement   | negative lookahead: `(?!A)`      |
| AND-NOT (a ∧ ¬b)  | difference   | `(?!B)A`                         |

Two specifics worth pulling out.

### Alternation is OR over sets

```perl
$s =~ /yes|no|maybe/;
```

The alternation denotes the union of three singleton sets (unanchored, the match succeeds on any string that contains one of them). There is nothing more to it; `|` in regex syntax is precisely the boolean ∨. Inside a character class the meaning is the same:

```perl
$s =~ /[abc]/;     # union of {"a"}, {"b"}, {"c"}
$s =~ /[^abc]/;    # complement: any character NOT in {"a","b","c"}
```

`[^...]` is the regex syntax for `¬` applied to a character set.

### Lookarounds are AND and NOT

A regex without lookaround consumes characters as it matches; the two branches of an alternation `A|B` cannot both match the same span (only one branch wins). To express *both* `A` *and* `B` at the same position, you need lookaround: a zero-width assertion that demands a property without consuming.

```perl
# matches only if the rest of the string is BOTH digits AND ≤ 4 chars
$s =~ /^(?=\d+$)(?=.{1,4}$)/;
#       \______/\_________/
#          A         B         intersection: A ∧ B
```

`(?=...)` is positive lookahead; `(?!...)` is negative lookahead (the boolean NOT). Combining them gives you the full set algebra on regex predicates:

```perl
# "starts with a digit but is not the literal '0'":
#     (digit) ∧ ¬(literal "0")
$s =~ /^(?=\d)(?!0$)/;
```

### Why the framing helps

Once you read regex this way, complicated patterns become much easier to read. A regex with two lookaheads is not "two regexes glued together"; it is the intersection of two sets. A negative lookahead followed by a positive match is not "first reject, then match"; it is set difference.
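The set reading can be checked directly by comparing a single lookahead-based pattern against the same condition written as two separate tests. This is a minimal sketch, assuming whole-string anchoring with `\A` and `\z`; the helper names `in_intersection` and `in_difference` and the sample strings are made up for illustration.

```perl
use strict;
use warnings;

# Set A: strings made only of digits.  Set B: strings of 1 to 4 characters.
my $A = qr/\A\d+\z/;
my $B = qr/\A.{1,4}\z/;

# Hypothetical helper: A ∧ B as ONE pattern, two lookaheads at the same position.
sub in_intersection {
    my ($s) = @_;
    return $s =~ /\A(?=\d+\z)(?=.{1,4}\z)/ ? 1 : 0;
}

# Hypothetical helper: A ∧ ¬B as ONE pattern, negative lookahead then the match.
sub in_difference {
    my ($s) = @_;
    return $s =~ /\A(?!.{1,4}\z)\d+\z/ ? 1 : 0;
}

for my $s ('7', '1234', '12345', 'abc', '12a') {
    # The same conditions spelled out as two separate matches.
    my $both = ($s =~ $A && $s =~ $B) ? 1 : 0;
    my $diff = ($s =~ $A && $s !~ $B) ? 1 : 0;
    printf "%-6s  A and B: %d (combined %d)   A not B: %d (combined %d)\n",
        $s, $both, in_intersection($s), $diff, in_difference($s);
}
```

For every sample string the combined pattern and the pair of separate tests should agree, which is the point of reading lookaheads as intersection and difference.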
The boolean algebra you already know is the algebra of regexes; only the notation changes.

## Control flow: short-circuit logic as conditional execution

The chapter on operators introduced the operand-return rule for `&&`, `||`, and `//`. That rule, plus precedence, generates a small family of idioms that replace explicit `if`/`else` for short conditional logic:

```perl
# defaulting
my $port = $cfg{port} // 8080;

# guard with side-effect
open my $fh, '<', $path or die "open $path: $!";

# lazy initialisation (set if currently false)
$cache{$key} ||= compute($key);

# lazy initialisation (set if currently undef -- a real `0` is kept)
$cfg{retries} //= 3;

# guard chain: do step 2 only if step 1 succeeded
my $rc = step_one() && step_two();

# logical selection inside an expression
my $label = $count == 1 ? "1 item" : "$count items";
```

Each of these is a boolean expression chosen for its control-flow effect: the short-circuit decides whether the right operand even runs. `open ... or die` works exactly because `or` does not evaluate its right side when the left is true.

Two pieces of advice.

**Use `//` and `//=` when `0` or `""` is a valid value.** Both operators are available from Perl 5.10 onward, and this one habit prevents a whole class of bugs: `||` defaulting on a configured zero is one of the classic ways to lose data.

**Reach for `?:` when you would otherwise build an `if`/`else`/scalar-assign triplet.** The ternary keeps the expression as an expression and the assignment in one line:

```perl
# verbose
my $kind;
if ($n == 1) { $kind = 'singular' }
else         { $kind = 'plural' }

# idiomatic
my $kind = $n == 1 ? 'singular' : 'plural';
```

Don't nest `?:` more than two levels deep. Past two, the chained form is harder to read than the `if`/`elsif`/`else`. The ternary is for *choosing*; sequential decisions belong in a block.
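The difference between `||` and `//` defaulting is easiest to see with a configured zero. The following is a small sketch, not a recommended API: the config hashes and the helper names `retries_or` and `retries_defined` are hypothetical, and `//` needs Perl 5.10 or later.

```perl
use strict;
use warnings;

# Hypothetical defaulting helpers, one per operator.
sub retries_or      { my ($cfg) = @_; return $cfg->{retries} || 3 }   # falls back on ANY false value
sub retries_defined { my ($cfg) = @_; return $cfg->{retries} // 3 }   # falls back only on undef

# Three made-up configurations: explicit zero, explicit five, nothing set.
for my $cfg ({ retries => 0 }, { retries => 5 }, {}) {
    my $shown = defined $cfg->{retries} ? $cfg->{retries} : 'undef';
    printf "configured=%-5s  ||: %d   //: %d\n",
        $shown, retries_or($cfg), retries_defined($cfg);
}
```

Only the first configuration separates the two operators: with `retries => 0`, the `||` form silently replaces the configured zero with the default, while the `//` form keeps it.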
## What you should remember from this chapter

- Bitwise `&`, `|`, `^`, `~` are the boolean operators applied bit-by-bit in parallel; flag set/clear/toggle/test are the four one-bit applications.
- A regex denotes a set of strings; `|` is union, `[^...]` is complement, `(?=)` and `(?!)` make intersection and difference.
- `&&`, `||`, `//`, `?:` give you most of conditional control flow without writing `if`. Use `//` when `0` is real; use `?:` for short selection inside an expression.