Program input: @ARGV, %ENV, @INC
Three globals carry input into the program: command-line arguments, environment variables, and the module search path. They share enough context that decisions about one usually inform the others — @ARGV interacts with $0, %ENV’s Perl-relevant keys influence @INC, and @INC’s ordering question is mirrored by similar concerns for @ARGV consumers and shell PATH.
| Variable | Holds |
|---|---|
| @ARGV | Command-line arguments (excluding script name) |
| $0 | Script name (paired with @ARGV) |
| %ENV | Process environment |
| @INC | Module search path (used by use and require) |
| %INC | Already-loaded modules |
| $INC | Current @INC hook index |
@ARGV and its relationship to $0
After Perl starts, the script name is in $0 and the script’s arguments are in @ARGV:
$ pperl myscript.pl -v --output /tmp/x foo bar
print "$0\n"; # myscript.pl
print "@ARGV\n"; # -v --output /tmp/x foo bar
print scalar(@ARGV), " args\n"; # 5 args
$ARGV[0] is the first script argument, not the script name itself. This differs from C’s argv (where argv[0] is the program name); Perl exposes the program name separately via $0.
Standard argument handling
Hand-written arg parsing is fine for trivial scripts:
my %opt;
while (@ARGV && $ARGV[0] =~ /^-/) {
    my $flag = shift @ARGV;
    last if $flag eq '--';
    if    ($flag eq '-v')       { $opt{verbose} = 1 }
    elsif ($flag eq '--output') { $opt{output}  = shift @ARGV }
    else                        { die "unknown flag: $flag\n" }
}
my @files = @ARGV;
For anything beyond toy scripts, the Getopt::Long module handles the standard patterns (long options, abbreviation, mandatory arguments, repeated flags, the -- separator, and automatic --help when the auto_help configuration option is enabled):
use Getopt::Long;
my %opt;
GetOptions(
    'verbose|v'  => \$opt{verbose},
    'output|o=s' => \$opt{output},
    'help|h'     => \$opt{help},
) or die "bad options; try --help\n";
my @files = @ARGV; # what's left after option processing
GetOptions mutates @ARGV in place — when it returns, @ARGV contains only the non-option arguments.
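To watch that in-place consumption without touching the real command line, the sketch below uses GetOptionsFromArray, the Getopt::Long variant that parses an array you pass in. The argument list is a made-up stand-in for @ARGV.

```perl
use strict;
use warnings;
use Getopt::Long qw(GetOptionsFromArray);

# Hypothetical argument list standing in for @ARGV.
my @args = ('-v', '--output', '/tmp/x', 'foo', 'bar');

my %opt;
GetOptionsFromArray(
    \@args,
    'verbose|v'  => \$opt{verbose},
    'output|o=s' => \$opt{output},
) or die "bad options\n";

print "verbose: $opt{verbose}\n";   # verbose: 1
print "output:  $opt{output}\n";    # output:  /tmp/x
print "files:   @args\n";           # files:   foo bar
```

The options and their values are removed from the array; only the positional arguments remain, which is what GetOptions does to @ARGV.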
@ARGV and the diamond operator
The <> (diamond) operator iterates lines from the files named in @ARGV, falling back to STDIN if @ARGV is empty:
while (<>) {
    # one line from one of the @ARGV files (or STDIN)
    chomp;
    process($_);
}
This is the default contract of every awk-style filter program. <> opens each file in turn, sets $ARGV to its name during reading, and closes it at EOF.
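As a rough illustration of the file-by-file behaviour, the sketch below writes two throwaway files (their contents are invented for the example), points @ARGV at them, and records the per-file line number. Closing ARGV at each end-of-file is what resets $. between files.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Two temporary input files with made-up contents, so the sketch is self-contained.
my @names;
for my $text ("a\nb\n", "c\n") {
    my ($fh, $name) = tempfile(UNLINK => 1);
    print {$fh} $text;
    close $fh or die "close $name: $!";
    push @names, $name;
}

@ARGV = @names;                  # <> will read these files in order

my @seen;
while (my $line = <>) {
    chomp $line;
    push @seen, "$.:$line";      # $. is the line number within the current file
    close ARGV if eof;           # end of this file: reset $. before the next one
}

print "$_\n" for @seen;          # 1:a, 2:b, 1:c (one per line)
```

Without the close ARGV line, $. keeps counting across files and the last entry would be 3:c.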
The -n and -p switches wrap your code in exactly this while (<>) loop; -i (in-place edit), -a (autosplit), and -l (line-ending handling) modify how that loop behaves.
Modifying @ARGV before <>
You can pre-process @ARGV to alter what <> reads:
# Add an extra file to the front of the argv list:
unshift @ARGV, 'header.txt';
# Process compressed files transparently by rewriting argv:
@ARGV = map { /\.gz$/ ? "gunzip -c $_ |" : $_ } @ARGV;
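while (<>) {
    process($_);
}
The cmd | form turns the argument into a pipe-from-command open; this is a long-standing Perl idiom for transparent decompression in awk-style scripts. It works because <> uses two-argument open, which also means each file name is interpolated into a shell command unquoted, so use it only with trusted names. The <<>> operator (Perl 5.22 and later) treats every argument as a literal file name and therefore disables this trick.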
%ENV: the process environment
%ENV is the hash of environment variables inherited from the parent process (typically the shell). Reading is straightforward:
my $home = $ENV{HOME};
my $term = $ENV{TERM} // 'dumb';
my $path = $ENV{PATH};
Writing changes the environment seen by child processes that this Perl script subsequently spawns:
$ENV{LC_ALL} = 'C'; # POSIX locale for child commands
$ENV{TZ} = 'UTC';
$ENV{LANG} = 'C';
delete $ENV{LC_NUMERIC}; # remove a key entirely
system('date'); # child sees the modified ENV
The change does not propagate back to the parent: the shell that started the script keeps its original environment. If you need the parent shell to pick up new values, write them to a file the shell can source (the dotenv pattern), or print shell assignments and have the shell eval your output.
Stringification of values
As of Perl 5.18, both keys and values stored in %ENV are stringified at assignment time. Storing a reference no longer round-trips, even within the same process:
my $arr = [1, 2, 3];
$ENV{DATA} = $arr;
print $ENV{DATA}; # "ARRAY(0x...)" — stringified
Environment variables are a string-typed interface to the operating system; preserving structure across a fork/exec boundary would not work anyway. Marshal anything non-trivial as JSON or similar before storing.
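A minimal sketch of that marshalling step, assuming JSON is the chosen format and using the core JSON::PP module; DEMO_DATA is a hypothetical variable name.

```perl
use strict;
use warnings;
use JSON::PP qw(encode_json decode_json);

my $data = [1, 2, 3];

# Store the structure as a JSON string (DEMO_DATA is a made-up name).
$ENV{DEMO_DATA} = encode_json($data);

# A consumer, in this process or a child, decodes it back into a structure.
my $copy = decode_json($ENV{DEMO_DATA});

print "$ENV{DEMO_DATA}\n";               # [1,2,3]
print scalar(@$copy), " elements\n";     # 3 elements
```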
Perl-specific environment variables
A small set of environment variables are honoured by Perl itself, not by user scripts:
| Variable | Effect |
|---|---|
| PERL5LIB | Colon-separated paths prepended to @INC |
| PERLLIB | Same, older variable; used only if PERL5LIB is unset |
| PERL5OPT | Switches applied as if on the command line (subset only) |
| PERLDB_OPTS | Options for the debugger |
| PERL_UNICODE | Default UTF-8 layer settings (mirrors -C) |
| PERL_USE_UNSAFE_INC | Re-add . to @INC |
| PERL_HASH_SEED | Per-run hash randomisation seed |
| Read by |
| Searched by |
In taint mode (-T), PERL5LIB, PERL5OPT, and PERLLIB are ignored to prevent privilege escalation. See the command-line switches guide for the full security model.
delete $ENV{KEY} versus $ENV{KEY} = undef
The two are not equivalent:
delete $ENV{KEY}; # KEY is not in the environment
$ENV{KEY} = undef; # KEY is in the environment, value ""
A child process started after delete will not see KEY at all: its getenv("KEY") returns NULL. After the undef assignment, the child sees KEY="" (empty string). Code that distinguishes “unset” from “set to empty” cares about this difference.
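The difference can be observed by asking a child process what it sees. The sketch below spawns a second perl (via $^X and a list-form pipe open, so no shell is involved) and reports its view of DEMO_KEY, a hypothetical variable name.

```perl
use strict;
use warnings;

# Run a child perl and report whether DEMO_KEY exists in its environment.
sub child_view {
    open(my $child, '-|', $^X, '-e',
        'print exists $ENV{DEMO_KEY} ? "set:[$ENV{DEMO_KEY}]" : "unset"')
        or die "cannot spawn $^X: $!";
    my $answer = <$child>;
    close $child;
    return $answer;
}

$ENV{DEMO_KEY} = 'value';
my $with_value = child_view();      # set:[value]

$ENV{DEMO_KEY} = undef;
my $after_undef = child_view();     # set:[]

delete $ENV{DEMO_KEY};
my $after_delete = child_view();    # unset

print "$_\n" for $with_value, $after_undef, $after_delete;
```

The same helper also shows the general rule for writes to %ENV: each child sees the environment as it stood at the moment it was spawned.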
@INC: the module search path
When use and require look up a module, they walk @INC left to right and stop at the first match:
print "$_\n" for @INC;
# /usr/local/lib/perl5/site_perl/5.42.0/x86_64-linux
# /usr/local/lib/perl5/site_perl/5.42.0
# /usr/local/lib/perl5/5.42.0/x86_64-linux
# /usr/local/lib/perl5/5.42.0
# (no trailing "." since Perl 5.26)
@INC is initialised at startup from a combination of:
- Compiled-in defaults (the build-time installprivlib, installsitelib, …).
- -I /path switches on the command line, prepended in order.
- The PERL5LIB (or PERLLIB) environment variable, also prepended.
After startup it is just an ordinary array. Order matters: a match in an early element wins.
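To make the first-match rule concrete, the sketch below builds two temporary library directories, each with its own copy of a hypothetical module My::Demo::Shadow, and checks which copy require picks up.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);
use File::Path qw(make_path);

# Two throwaway library directories, each holding a different copy of
# the made-up module My::Demo::Shadow.
my @dirs = map { tempdir(CLEANUP => 1) } 1 .. 2;
for my $i (0, 1) {
    make_path("$dirs[$i]/My/Demo");
    open(my $fh, '>', "$dirs[$i]/My/Demo/Shadow.pm") or die "write: $!";
    print {$fh} "package My::Demo::Shadow; sub origin { 'dir$i' } 1;\n";
    close $fh or die "close: $!";
}

unshift @INC, @dirs;          # $dirs[0] now precedes $dirs[1] in @INC
require My::Demo::Shadow;     # runtime require, so the runtime unshift is visible

print My::Demo::Shadow::origin(), "\n";    # dir0: the earlier directory won
print "$INC{'My/Demo/Shadow.pm'}\n";       # path of the copy under $dirs[0]
```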
use lib versus unshift @INC versus push @INC
Three ways to add a path. They are not equivalent:
use lib '/my/lib'; # unshift at compile-time
unshift @INC, '/my/lib'; # unshift at runtime
push @INC, '/my/lib'; # push at runtime
The differences:
- use lib runs at compile time, before any use statement that follows it in the file. This is what you almost always want when you’re saying “this script needs a custom library directory”:

      use lib '/opt/myapp/lib';
      use MyApp::Module;   # found because lib was unshifted first

  Internally, use lib '/path' is essentially BEGIN { unshift @INC, '/path' }, so the new path is searched first: a copy of a module in /opt/myapp/lib shadows one in the system @INC.
- unshift @INC, '/path' at the top level has runtime semantics. It works for require calls that happen after the unshift, but a use statement on a later line of the same file will not see it (because use is itself compile-time):

      unshift @INC, '/my/lib';
      use Some::Module;    # NOT found in /my/lib: use ran before the unshift

  This is a frequent gotcha. BEGIN { unshift @INC, '/my/lib' } fixes it; use lib is the cleaner spelling of that.
- push @INC, '/path' appends, so the new path is searched last. This is the right choice when you want your library to be a fallback, used only if no other location has the module: for example, bundling a copy of a CPAN module that the user might also have installed on the system.

The rule of thumb: reach for use lib whenever you want your code’s module path to take effect at compile time, which is almost every case.
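A common way to apply this is to resolve the library directory relative to the script rather than hard-coding it. The sketch below uses the core FindBin module and assumes, purely for illustration, that the modules live in a lib/ directory next to the script.

```perl
use strict;
use warnings;
use FindBin qw($Bin);     # $Bin: directory containing the running script
use lib "$Bin/lib";       # assumed layout: modules in lib/ beside the script

# Both use statements above ran at compile time, so the path is in place
# before any later use statement looks for a module.
my $present = grep { $_ eq "$Bin/lib" } @INC;
print "$Bin/lib is ", ($present ? "in" : "not in"), " \@INC\n";
```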
@INC hooks
@INC may contain not only paths but also hooks: code references, array references whose first element is a code reference, and blessed objects with an INC method. When require encounters one, it calls it with the hook itself and the requested module path (for example Foo/Bar.pm). The hook returns an empty list to decline, or a list that can include a reference to a scalar of source text to prepend, an open filehandle to read the source from, and a code reference that generates or filters the source line by line. Tools such as PAR and App::FatPacker use this mechanism to load modules from somewhere other than plain directories.
The full hook protocol is documented under require. Most users never write one; you read about them when debugging “why does this module not load from where I expect” problems.
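For orientation only, here is a minimal code-reference hook that serves a module from memory by returning a filehandle. The module name My::Demo::Virtual and its source are invented for the example; real hooks usually do more error handling.

```perl
use strict;
use warnings;

# Hypothetical in-memory module source, keyed by the path require asks for.
my %virtual = (
    'My/Demo/Virtual.pm' =>
        "package My::Demo::Virtual; sub hello { 'hi from a hook' } 1;\n",
);

# A code-ref hook: called with itself and the requested path.
push @INC, sub {
    my ($self, $path) = @_;
    my $source = $virtual{$path};
    return unless defined $source;      # decline: require moves on
    open(my $fh, '<', \$source) or die "in-memory open failed: $!";
    return $fh;                         # require reads the module from this handle
};

require My::Demo::Virtual;
print My::Demo::Virtual::hello(), "\n";   # hi from a hook
```

Because the hook is pushed rather than unshifted, the ordinary directories are searched first and the hook only acts as a fallback.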
%INC: the loaded-modules cache
After a module is successfully loaded, %INC records it. The key is the path Perl was asked to find (e.g. Foo/Bar.pm); the value is the path of the file that satisfied the request (the matching @INC entry joined with the key, so it is absolute whenever that entry is):
use Data::Dumper;
print "Data::Dumper loaded from $INC{'Data/Dumper.pm'}\n";
# /usr/local/lib/perl5/site_perl/5.42.0/Data/Dumper.pm
# Show every loaded module:
for my $key (sort keys %INC) {
    print "$key → $INC{$key}\n";
}
require checks %INC before walking @INC — a second require Foo; is a no-op because Foo.pm is already in the cache. To force a re-require:
delete $INC{'Foo.pm'};
require Foo; # actually re-runs Foo.pm
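This is the standard idiom for testing reload behaviour. Be aware that re-running a module file does not un-define what the first run created: packages, subs, and globals stay around, and redefining an existing sub emits a “Subroutine ... redefined” warning when warnings are enabled.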
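The caching and the forced reload can be observed with a module that counts how often its file body runs. DemoReload is a hypothetical module written to a temporary directory for the purpose of the sketch.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# A throwaway module whose file body bumps a counter each time it runs.
my $dir = tempdir(CLEANUP => 1);
open(my $fh, '>', "$dir/DemoReload.pm") or die "write: $!";
print {$fh} "package DemoReload; \$DemoReload::runs++; 1;\n";
close $fh or die "close: $!";
unshift @INC, $dir;

require DemoReload;                        # runs the file
require DemoReload;                        # no-op: already in %INC
my $after_second = $DemoReload::runs;      # 1

delete $INC{'DemoReload.pm'};
require DemoReload;                        # runs the file again
my $after_reload = $DemoReload::runs;      # 2

print "$after_second $after_reload\n";     # 1 2
```

The counter survives the reload, which is the point made above: the second run adds to what the first one created instead of starting from a clean slate.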
$INC: the index inside an @INC hook
Available since Perl 5.38 (introduced in development release 5.37.7). When an @INC hook is being called, $INC is set to the index of the hook in @INC. After the hook returns, the search continues at index $INC + 1, so a hook can rewrite @INC and, by assigning to $INC, steer the search to a specific position afterwards. This is an advanced feature; everyday code never touches it.
See also
- use lib: the canonical compile-time spelling for prepending to @INC.
- Getopt::Long: the standard parser for @ARGV.
- readline: the function form of the diamond <> operator that drives @ARGV → STDIN filtering.
- -i, -n, -p, -a: command-line switches built on @ARGV.
- Process variables: $0 is the script name; $$ is the PID; pair these with @ARGV for self-restart and logging.
- Command-line one-liners: the tutorial showing every @ARGV/<>/switch combination in context.