
study

A no-op retained for source compatibility with older Perl code.

study accepts a scalar argument (or operates on $_ when called with none) and does nothing with it. The call is valid, the argument is evaluated, and control returns immediately. No index is built, no regex engine state is altered, no later match runs any faster or slower because of it.

The name survives because decades of Perl code — books, FAQs, CPAN distributions, in-house scripts — sprinkle study in front of long strings about to be matched against many patterns. Those programs still parse and still run. They just no longer get the optimisation the keyword once promised.

Synopsis

study SCALAR
study

What you get back

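1 for ordinary input. In upstream Perl the op returns false instead when the string is empty or carries the UTF-8 flag, the cases the old optimisation used to skip. The return value is not documented as useful and no idiomatic code inspects it. Treat study as returning nothing worth capturing.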

History

Through Perl 5.14 study built an inverted index of the bytes in SCALAR, then biased the regex engine to anchor each subsequent match on the rarest character in the pattern (rarity estimated from a static frequency table baked into the interpreter). The intended workload was “one long string, many m// against it”: the index amortised across the matches.

The optimisation was removed in Perl 5.16. It interacted badly with Unicode, /i case folding, and the post-5.10 regex engine, and the maintenance cost had long outweighed the narrow workload it helped. The keyword was kept as a no-op so existing source kept compiling.

Examples

Legacy idiom — still parses, no longer has any effect:

study $big_text;
while ($big_text =~ /$pattern/g) {
    # ...
}

No argument — applied to $_, also a no-op:

for (@lines) {
    study;
    print if /error/;
}

Replacement guidance: if the original code relied on study for speed, the modern answer is to compile the pattern once with qr and reuse it. More generally, restructure so that the one-time setup cost is paid on the pattern side (compilation) rather than on the string side (indexing):

my $re = qr/$pattern/;
for my $line (@lines) {
    print $line if $line =~ $re;
}
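The workload study once targeted, one long string matched against many patterns, can be handled the same way. The following is a minimal sketch, not a definitive recipe: the sample text, the %source pattern table, and the names %compiled and %count are all hypothetical and chosen only for illustration.

```perl
use strict;
use warnings;

# Hypothetical input: one long string that many patterns will be run against.
my $big_text = join "\n",
    "2024-01-01 INFO service started",
    "2024-01-01 ERROR disk full on /var",
    "2024-01-02 WARN retrying upload",
    "2024-01-02 ERROR timeout talking to db";

# Hypothetical pattern sources, kept as plain strings until compiled.
my %source = (
    errors   => 'ERROR\s+(\S+)',
    warnings => 'WARN\s+(\S+)',
    dates    => '^\d{4}-\d{2}-\d{2}',
);

# Compile each pattern exactly once, outside the matching loop.
# The /m flag lets ^ match at the start of every line in $big_text.
my %compiled;
$compiled{$_} = qr/$source{$_}/m for keys %source;

# Reuse the compiled patterns against the same string and count the matches.
my %count;
for my $name (sort keys %compiled) {
    my $re = $compiled{$name};
    $count{$name} = () = $big_text =~ /$re/g;
}

printf "%-8s %d\n", $_, $count{$_} for sort keys %count;
```

No study call appears anywhere: the one-time cost is the qr compilation, and the string is left untouched.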

Edge cases

  • No effect on regex performance. Removing a study call whose argument has no side effects changes nothing observable; any timing difference is within measurement noise.

  • Argument is still evaluated. study some_expensive_call() runs some_expensive_call() for its side effects and return value, then discards the result. This is almost never what the author intended.

  • No warning on use. study does not warn under use warnings and is not deprecated, so the interpreter will never flag stale occurrences; finding them takes a linter or a plain text search of the codebase.

  • Not a pragma, not a hint. study is a regular built-in expression. It does not persist across statements and there is no “studied” state attached to the scalar.

  • Tainting and magic are untouched. study $tainted does not launder taint, does not fire FETCH on a tied scalar more than a plain read would, and does not mark the scalar in any way.
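Two of the points above, that the argument is still evaluated and that the scalar itself is left alone, can be checked with a short script. This is a sketch under stated assumptions: some_expensive_call here is a hypothetical stand-in that only bumps a counter, and $calls, $text, $before and $matched are illustrative names.

```perl
use strict;
use warnings;

my $calls = 0;

# Hypothetical stand-in for an expensive function with a visible side effect.
sub some_expensive_call {
    $calls++;
    return "payload " x 3;
}

my $text   = "error: disk full";
my $before = $text;

study some_expensive_call();   # the call runs; its result is discarded
study $text;                   # nothing is attached to or changed in $text

# Matching behaves exactly as it would without the study calls.
my $matched = $text =~ /disk/ ? 1 : 0;

print "calls=$calls unchanged=", ($text eq $before ? 1 : 0),
      " matched=$matched\n";
```

The counter should end at 1, which is the practical reason to review each study call before deleting it: dropping study some_expensive_call() also drops the call.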

Differences from upstream

Fully compatible with upstream Perl 5.42. study is a no-op in both.

See also

  • qr — compile a pattern once and reuse the compiled form; this is the modern answer to “make many matches against the same text faster”

  • m — the match operator whose performance study used to influence

  • index — fixed-string search, faster than a regex when the pattern is a literal substring

  • pos — where the last /g match left off, relevant to the “one long string, many matches” workload study once targeted