Filter::Util::Call#

📦 std

Framework for writing Perl source filters — modules that rewrite program text on the fly as the tokenizer reads it.

A source filter is a normal Perl module that registers a callback with the compiler. Each time the tokenizer needs more source, it calls back into the filter, which pulls raw text from the next filter in the chain (or the file), mangles it, and hands the result back. The compiler never sees the original bytes — it sees whatever the filter produces. This is the mechanism behind Switch, Filter::Simple, PDL::NiceSlice, Smart::Comments, and similar compile-time rewriters.

The filter lifecycle#

A filter goes through three phases. Each phase corresponds to one of the primary functions exported by this module:

  • Install: the filter’s import is called when Perl processes use MyFilter. import calls filter_add to register the filter with the compiler. From the line after the use statement onward, the compiler routes every read of that source file through the filter.

  • Read: each time the compiler wants more source, it invokes the filter’s filter method (or anonymous sub). The filter calls filter_read (line mode, or block mode when given a size) or filter_read_exact (block mode, exact size) to pull raw text from the chain below, transforms it in $_, and returns a status: positive for “more data”, 0 for EOF, negative for error.

  • Remove: when the filter decides it is done (for example on seeing an end-marker in the source), it calls filter_del. The compiler then reads directly from whatever filter, or file, sat below it in the chain.
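The three phases can be sketched in a single file. This is a minimal illustration, not a reference implementation: the package name Demo::Lifecycle, the FORTY_TWO token, and the "# END-FILTER" marker are all made up for the example, and the package is defined inline (with %INC set by hand) only so that use can find it without a separate .pm file.

```perl
use strict;
use warnings;

BEGIN {
    package Demo::Lifecycle;    # hypothetical name, defined inline for the sketch
    use Filter::Util::Call;

    # Install: called when Perl compiles "use Demo::Lifecycle".
    sub import {
        my ($class) = @_;
        filter_add(bless {}, $class);
    }

    # Read: called each time the tokenizer wants more source.
    sub filter {
        my ($self) = @_;
        my $status = filter_read();         # appends the next line to $_
        return $status if $status <= 0;     # pass EOF or an error straight through

        if (/^# END-FILTER\b/) {
            filter_del();                   # Remove: later lines bypass this filter
        }
        else {
            s/\bFORTY_TWO\b/42/g;           # the rewrite itself
        }
        return $status;
    }

    $INC{'Demo/Lifecycle.pm'} = __FILE__;   # mark the inline package as loaded
}

use Demo::Lifecycle;
my $filtered = FORTY_TWO + 0;
# END-FILTER
my $unfiltered = 'FORTY_TWO';

print "$filtered $unfiltered\n";    # prints "42 FORTY_TWO"
```

The line before the marker is rewritten before the compiler sees it; the line after the marker is read unfiltered, so the string literal keeps its original text.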

The two filter shapes#

A filter is either a method filter (a blessed object whose class provides a filter method) or a closure filter (an anonymous sub passed directly to filter_add). The two shapes carry context differently — method filters stash state on the object, closure filters close over lexicals — but the read/return contract is identical in both cases.
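For comparison with the method-filter skeleton, here is a rough sketch of the closure shape. The package Demo::Swap, its import arguments, and the "# END-SWAP" marker are hypothetical; the point is only that the state ($from and $to) lives in lexicals captured by the anonymous sub rather than on an object.

```perl
use strict;
use warnings;

BEGIN {
    package Demo::Swap;         # hypothetical name, defined inline for the sketch
    use Filter::Util::Call;

    sub import {
        my ($class, $from, $to) = @_;       # state captured by the closure below

        filter_add(sub {
            my $status = filter_read();     # same read/return contract as a method filter
            return $status if $status <= 0;

            if (/^# END-SWAP\b/) { filter_del() }           # stop filtering here
            else                 { s/\Q$from\E/$to/g }      # literal text replacement
            return $status;
        });
    }

    $INC{'Demo/Swap.pm'} = __FILE__;        # mark the inline package as loaded
}

use Demo::Swap GREETING => '"hello"';
my $greeting = GREETING;
# END-SWAP

print "$greeting\n";    # prints "hello"
```

Nothing is blessed and no filter method exists; filter_add receives the code ref directly and calls it for every read.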

Shared conventions#

  • $_ is the filter’s scratch buffer. It is empty each time the filter method (or anonymous sub) is called; filter_read appends to it; whatever $_ holds on return is what the compiler sees.

  • The status code is three-valued: > 0 success, = 0 EOF, < 0 error. Check it after every read.

  • Filters only run at compile time. Calling filter_add when no source is being compiled (for example at ordinary runtime, outside a use-triggered import) does nothing useful, because there is no active parser to register with.
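The $_ and status conventions above can be observed with a pass-through filter that reads two lines per call. This is an illustrative sketch only: Demo::Pairs and its two flag variables are invented for the example, and the filter deliberately leaves the source text unchanged.

```perl
use strict;
use warnings;

BEGIN {
    package Demo::Pairs;        # hypothetical name, defined inline for the sketch
    use Filter::Util::Call;

    our $empty_on_entry = 1;    # stays true while $_ is empty at the start of every call
    our $appended       = 1;    # stays true while a second read extends $_

    sub import {
        filter_add(sub {
            $empty_on_entry = 0 if length $_;

            my $status = filter_read();         # first line lands in $_
            return $status if $status <= 0;     # EOF or error, nothing to hand back
            my $first = $_;

            $status = filter_read();            # second line is appended, not assigned
            return $status if $status < 0;      # propagate an error
            if ($status > 0) {
                $appended = 0
                    unless index($_, $first) == 0 && length($_) > length($first);
            }
            return 1;                           # $_ holds data, so report success
        });
    }

    $INC{'Demo/Pairs.pm'} = __FILE__;           # mark the inline package as loaded
}

use Demo::Pairs;
my $answer = 6 * 7;
print "$answer\n";    # prints "42"
```

Returning 1 rather than the raw status of the second read covers the case where that read hit EOF while $_ still holds the first line; filter_read_exact handles the same case the same way.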

Skeleton#

package MyFilter;
use Filter::Util::Call;

sub import {
    filter_add(bless []);         # method filter
}

sub filter {
    my ($self) = @_;
    my $status = filter_read();
    s/old/new/g if $status > 0;   # rewrite $_
    $status;
}
1;

Functions#

Filter lifecycle#

filter_del#

Remove the calling filter from the active chain. Subsequent reads bypass it and hit the next filter down (or the source file).

filter_add#

Install a source filter on the source currently being compiled. Called from the filter module’s import, with either an object (method filter) or a code ref (closure filter).

Input handling#

filter_read#

Pull the next line of source — or up to a given number of bytes — from the filter chain into $_.
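As a hedged illustration of block mode, the sketch below passes the source through untouched while tallying how much each read returned. Demo::Meter, its block option, and the two counters are hypothetical names chosen for the example.

```perl
use strict;
use warnings;

BEGIN {
    package Demo::Meter;        # hypothetical name, defined inline for the sketch
    use Filter::Util::Call;

    our ($bytes, $largest) = (0, 0);

    sub import {
        my ($class, %opt) = @_;
        my $block = $opt{block};

        filter_add(sub {
            my $status = filter_read($block);   # block mode: at most $block bytes
            return $status if $status <= 0;     # EOF or error

            $bytes  += length($_);              # running total of source consumed
            $largest = length($_) if length($_) > $largest;
            return $status;                     # $_ is handed on unchanged
        });
    }

    $INC{'Demo/Meter.pm'} = __FILE__;           # mark the inline package as loaded
}

use Demo::Meter block => 64;
my $total = 1 + 2;

print "largest block: $Demo::Meter::largest bytes\n";
```

Blocks ignore line boundaries, so a token can be split across two reads. Block mode therefore suits byte-oriented work (decoding, decryption, measuring) better than pattern-based rewrites, which are easier in line mode.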

filter_read_exact#

Pull exactly $size bytes from the filter chain into $_, looping over short reads until the request is satisfied or the stream ends.
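One plausible use is peeling a fixed-size header off the source before normal parsing resumes. The sketch below assumes a made-up 16-byte header line ("%%BINARY-HEADER" plus a newline, with LF line endings) and a hypothetical package Demo::Header; neither comes from the module itself.

```perl
use strict;
use warnings;

BEGIN {
    package Demo::Header;       # hypothetical name, defined inline for the sketch
    use Filter::Util::Call;

    our $seen = '';

    sub import {
        my ($class, %opt) = @_;
        my $size = $opt{size};

        filter_add(sub {
            my $status = filter_read_exact($size);  # retries short reads for us
            return $status if $status <= 0;         # EOF or error

            $seen = $_;         # keep the raw header for later inspection
            $_    = "\n";       # give the compiler a blank line in its place
            filter_del();       # one-shot: the rest of the file is read unfiltered
            return $status;
        });
    }

    $INC{'Demo/Header.pm'} = __FILE__;      # mark the inline package as loaded
}

use Demo::Header size => 16;
%%BINARY-HEADER
my $after = 'plain Perl again';

print length($Demo::Header::seen), "\n";    # prints "16"
```

The header line is not valid Perl, but the compiler never sees it: the filter swallows exactly those bytes, substitutes a newline so line numbers stay aligned, and then removes itself.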

Internal dispatch#

real_import#

Low-level installer that wires a pre-blessed object or code ref into the tokenizer. Called by filter_add after it has determined the filter shape; not intended for direct user calls.