Data Types

Perl has three fundamental data types: scalars, arrays, and hashes. PetaPerl implements these with identical semantics to Perl 5.

Scalars

Scalars are Perl’s basic data type, holding a single value. The variable name begins with $.

my $number = 42;
my $string = "hello";
my $float = 3.14;
my $ref = \@array;

Scalar Types

Internally, PetaPerl represents scalars as an enum with these variants:

| Type | Description | Example |
|------|-------------|---------|
| Undef | Undefined value | undef |
| Iv | Integer (i64) | 42, -17 |
| Uv | Unsigned integer (u64) | Large positive numbers |
| Nv | Float (f64) | 3.14, 1.5e10 |
| Pv | Immutable string (Arc<str>) | Constants, literals |
| PvBuf | Mutable string (Arc<String>) | .= targets, $_ |
| Rv | Reference | \$x, \@arr, \%hash |
| Av | Array | Created by [] |
| Hv | Hash | Created by {} |

Dynamic typing: A scalar can change type during execution.

my $x = 42;          # Integer
$x = "hello";        # Now a string
$x = 3.14;           # Now a float

Scalar Context

Operations that expect a single value force scalar context:

my $count = @array;          # Array length
my $last = (1, 2, 3);        # Last element (3)
if (@array) { ... }          # True if non-empty

Special Scalars

| Value | Description |
|-------|-------------|
| undef | Undefined value |
| 0 | Numeric zero, string "0" |
| "" | Empty string |

Truth values: These are false in boolean context: undef, 0, "0", "". Everything else is true, including strings such as "00", "0.0", and " ".

if ($value) { ... }          # False if undef, 0, "0", or ""
if (defined $value) { ... }  # False only if undef
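
The truth rule can be checked directly. The following is a minimal sketch; the helper name truthiness is made up for illustration and is not part of Perl or PetaPerl.

```perl
use strict;
use warnings;

# Hypothetical helper: report how Perl treats a value in boolean context.
sub truthiness {
    my ($value) = @_;
    return $value ? "true" : "false";
}

my @results = map { truthiness($_) } (undef, 0, "0", "", "00", "0.0", " ", "a");
print join(", ", @results), "\n";
# prints: false, false, false, false, true, true, true, true
```

Only the first four values are false. The strings "00" and "0.0" are true because the string test compares against exactly "0" and "".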

Arrays

Arrays are ordered lists of scalars. Variable names begin with @.

my @numbers = (1, 2, 3, 4, 5);
my @words = qw(foo bar baz);
my @empty = ();

Array Access

my $first = $numbers[0];     # First element (index 0)
my $last = $numbers[-1];     # Last element
$numbers[5] = 6;             # Set element

Note the sigil change: Use $ to access a single element because you’re getting a scalar.

Array Length

my $length = @array;         # Scalar context
my $length = scalar @array;  # Explicit scalar context
my $max_index = $#array;     # Highest index (length - 1)

Array Operations

push @arr, $value;           # Append
my $value = pop @arr;        # Remove last
my $value = shift @arr;      # Remove first
unshift @arr, $value;        # Prepend

splice @arr, $offset, $len, @replacement;  # General removal/insertion
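
As a worked illustration of the operations above, this sketch traces one small array through each call. The variable names are arbitrary.

```perl
use strict;
use warnings;

my @queue = (1, 2, 3);

push    @queue, 4;            # (1, 2, 3, 4)
unshift @queue, 0;            # (0, 1, 2, 3, 4)
my $last  = pop   @queue;     # $last = 4,  @queue = (0, 1, 2, 3)
my $first = shift @queue;     # $first = 0, @queue = (1, 2, 3)

# splice returns the removed elements and puts the replacement in their place.
my @removed = splice @queue, 1, 1, qw(a b);   # @removed = (2)

print "@queue\n";             # prints: 1 a b 3
```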

Array Slices

Extract multiple elements at once:

my @subset = @arr[0, 2, 4];  # Elements 0, 2, 4
my @range = @arr[0..5];      # Elements 0 through 5
@arr[1, 3] = (10, 20);       # Assign to multiple elements

List Context

Operations that expect multiple values force list context:

my @copy = @original;        # Array copy
my ($x, $y, $z) = (1, 2, 3); # List assignment ($a and $b are best left to sort)
my @results = function();    # Function returns list

Array References

Create references to arrays:

my $aref = \@array;          # Reference to existing array
my $aref = [1, 2, 3];        # Anonymous array reference

Access via reference:

my $elem = $aref->[0];       # First element
my @copy = @$aref;           # Dereference to array
push @$aref, $value;         # Push via reference
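
One common reason to use array references is to pass several arrays to a subroutine without flattening them into one list. A minimal sketch, with the hypothetical helper add_pairs:

```perl
use strict;
use warnings;

# Hypothetical helper: takes two array references and returns a new reference
# holding the element-wise sums. Passing @x and @y directly would flatten
# them into a single argument list.
sub add_pairs {
    my ($left, $right) = @_;
    return [ map { $left->[$_] + $right->[$_] } 0 .. $#$left ];
}

my @x = (1, 2, 3);
my @y = (10, 20, 30);
my $sums = add_pairs(\@x, \@y);

print "@$sums\n";             # prints: 11 22 33
```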

Hashes

Hashes are unordered key-value pairs. Variable names begin with %.

my %user = (
    name => "John",
    age => 30,
    email => 'john@example.com',  # single quotes: "@example" would interpolate an array
);

Hash Access

my $name = $user{name};      # Get value
$user{city} = "NYC";         # Set value

Sigil change: Use $ for single element access (getting a scalar).

Hash Operations

my @keys = keys %hash;       # All keys
my @values = values %hash;   # All values
while (my ($k, $v) = each %hash) { ... }  # Iterate

if (exists $hash{key}) { ... }  # Check existence
my $val = delete $hash{key};    # Remove and return
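
The hash operations above combine naturally in a counting loop. This is an illustrative sketch with made-up data:

```perl
use strict;
use warnings;

my @words = qw(apple pear apple fig pear apple);

# Count occurrences; a missing key starts as undef, which ++ treats as 0.
my %count;
$count{$_}++ for @words;

# keys returns entries in no particular order, so sort for stable output.
my @report = map { "$_=$count{$_}" } sort keys %count;
print "@report\n";            # prints: apple=3 fig=1 pear=2

my $gone    = delete $count{fig};                    # returns the removed value, 1
my $has_fig = exists $count{fig} ? "yes" : "no";     # "no"
```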

Hash Slices

Extract multiple values at once:

my @vals = @hash{qw(name age)};         # Values for keys
@hash{qw(x y)} = (10, 20);              # Assign multiple

Note: Hash slices use @ sigil because they return a list.
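
Slices are not limited to reading and assigning. The sketch below, using made-up data, also shows a slice as a delete target and a slice taken through a reference:

```perl
use strict;
use warnings;

my %user = (name => "John", age => 30, email => 'john@example.com');

# Read several values in one expression.
my ($name, $age) = @user{qw(name age)};

# Slices also work as delete targets and through references.
delete @user{qw(age email)};
my $href   = \%user;
my @picked = @{$href}{qw(name)};

my @left = sort keys %user;
print "$name is $age; remaining keys: @left\n";
# prints: John is 30; remaining keys: name
```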

Hash References

Create references to hashes:

my $href = \%hash;           # Reference to existing hash
my $href = { key => "val" }; # Anonymous hash reference

Access via reference:

my $val = $href->{key};      # Get value
$href->{new} = "value";      # Set value
my @keys = keys %$href;      # Dereference to hash

References

References are scalars that point to other data.

Creating References

my $scalar_ref = \$scalar;
my $array_ref = \@array;
my $hash_ref = \%hash;
my $code_ref = \&sub;
my $anon_array = [1, 2, 3];
my $anon_hash = { a => 1, b => 2 };
my $anon_sub = sub { ... };

Dereferencing

my $value = $$scalar_ref;    # Scalar dereference
my @array = @$array_ref;     # Array dereference
my %hash = %$hash_ref;       # Hash dereference
my $result = &$code_ref();   # Code dereference

Arrow notation (preferred for clarity):

my $elem = $array_ref->[0];
my $val = $hash_ref->{key};
my $result = $code_ref->(@args);

Reference Types

Check reference type with ref:

my $type = ref $ref;
# Returns 'SCALAR', 'ARRAY', 'HASH', 'CODE', 'REF', 'GLOB', or 'Regexp' for plain
# references, the class name for blessed references, and '' for non-references
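
A typical use of ref is to branch on the kind of data a subroutine receives. A minimal sketch, with the hypothetical helper describe:

```perl
use strict;
use warnings;

# Hypothetical helper: summarize a value based on what ref() reports.
sub describe {
    my ($thing) = @_;
    my $type = ref $thing;
    return "plain scalar"                                 if $type eq '';
    return "array of " . scalar(@$thing)                  if $type eq 'ARRAY';
    return "hash with " . scalar(keys %$thing) . " keys"  if $type eq 'HASH';
    return "code"                                         if $type eq 'CODE';
    return "other ($type)";
}

my @described = map { describe($_) } (42, [1, 2, 3], { a => 1 }, sub { 1 }, \"text");
print join("; ", @described), "\n";
# prints: plain scalar; array of 3; hash with 1 keys; code; other (SCALAR)
```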

Nested Data Structures

References enable complex data structures:

my $data = {
    users => [
        { name => "John", age => 30 },
        { name => "Jane", age => 25 },
    ],
    config => {
        debug => 1,
        timeout => 30,
    },
};

my $name = $data->{users}->[0]->{name};  # "John"
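
Working with such a structure usually means dereferencing an inner array or hash. The sketch below repeats the structure so that it runs on its own; the added record is made up.

```perl
use strict;
use warnings;

my $data = {
    users => [
        { name => "John", age => 30 },
        { name => "Jane", age => 25 },
    ],
    config => { debug => 1, timeout => 30 },
};

# Arrows between adjacent subscripts are optional.
my $second = $data->{users}[1]{name};             # "Jane"

# Dereference the inner array to loop over it.
my @names = map { $_->{name} } @{ $data->{users} };

# Append a record through the reference.
push @{ $data->{users} }, { name => "Joe", age => 41 };
my $total = scalar @{ $data->{users} };           # 3

print "$second; @names; $total users\n";
# prints: Jane; John Jane; 3 users
```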

Autovivification

Perl automatically creates intermediate references:

my %hash;
$hash{a}{b}{c} = 1;          # Creates nested hashes automatically
my $val = $hash{x}[0];       # Creates array ref at $hash{x}
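
Autovivification also happens when a nested key is only tested, which can leave unexpected entries behind. A small sketch of the side effect and one way to avoid it:

```perl
use strict;
use warnings;

my %tree;

# Merely testing a nested key autovivifies the intermediate level.
my $found       = exists $tree{a}{b} ? 1 : 0;     # 0, but $tree{a} now exists
my $side_effect = exists $tree{a}    ? 1 : 0;     # 1

# Checking each level in turn avoids the side effect.
my %clean;
my $safe      = (exists $clean{a} && exists $clean{a}{b}) ? 1 : 0;   # 0
my $untouched = exists $clean{a} ? 1 : 0;                            # 0

print "$found $side_effect $safe $untouched\n";   # prints: 0 1 0 0
```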

Type Conversions

Perl performs automatic type conversions based on context.

String to Number

my $x = "42";
my $y = $x + 10;             # 52 (string → number)

Number to String

my $x = 42;
my $s = "Value: $x";         # "Value: 42" (number → string)

String Concatenation

my $result = 10 . 20;        # "1020" (both → string)
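
The operator, not the operand, decides which conversion applies. The following sketch contrasts numeric and string operators on the same values:

```perl
use strict;
use warnings;

my $sum    = "10" + "20";     # 30: + forces numeric context
my $joined = "10" . "20";     # "1020": . forces string context
my $equal  = ("1.0" == 1)   ? "yes" : "no";   # numeric comparison: yes
my $same   = ("1.0" eq "1") ? "yes" : "no";   # string comparison: no

# A string with a leading number converts to that number; the rest is ignored.
# Under "use warnings" this emits an "isn't numeric" warning, silenced here.
my $partial = do { no warnings 'numeric'; "3 apples" * 2 };   # 6

print "$sum $joined $equal $same $partial\n";     # prints: 30 1020 yes no 6
```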

Boolean Context

if ("0")    { ... }          # False
if ("00")   { ... }          # True
if (0)      { ... }          # False
if (0.0)    { ... }          # False
if ("")     { ... }          # False
if (undef)  { ... }          # False

Typeglobs

A typeglob is a symbol table entry that holds every variable type sharing one name: scalar, array, hash, subroutine, and filehandle.

*name = \$scalar;            # Alias glob to scalar
*name = \&sub;               # Alias glob to subroutine

Primarily used for symbol table manipulation and importing.
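
A minimal sketch of glob aliasing with package variables follows. The names counter, alias, greet, and hi are arbitrary, and the behavior shown is that of Perl 5.

```perl
use strict;
use warnings;

our $counter = 1;
our $alias;                   # declare the package variable the glob will expose

sub greet { return "hello, $_[0]" }

# Assigning a reference to a glob replaces only the matching slot.
*alias = \$counter;           # $alias and $counter are now the same variable
*hi    = \&greet;             # hi() now calls greet()

$alias++;                     # increments $counter too
my $greeting = hi("world");

print "$counter $greeting\n"; # prints: 2 hello, world
```

Exporting a function from a module works the same way: the exporter assigns a code reference to a glob in the importing package.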

PetaPerl-Specific Implementation

Memory Representation

PetaPerl uses efficient internal representations:

Scalars: Rust enum with variants for each type. Two string representations optimize for different access patterns.

pub enum Sv {
    Undef,
    Iv(i64),                 // Integer
    Uv(u64),                 // Unsigned
    Nv(f64),                 // Float
    Pv(Arc<str>, u32),       // Immutable string + virtual length (O(1) chomp)
    PvBuf(Arc<String>),      // Mutable string (COW via Arc::make_mut)
    Rv(RvInner),             // Reference
    Av(Av),                  // Array
    Hv(Hv),                  // Hash
    // ... additional types
}

Dual string representation: Pv is used for constants and literals. Its u32 virtual length enables O(1) chomp without modifying the shared string. PvBuf is used for mutable strings and is the default for new_string(); Arc::make_mut() gives it copy-on-write semantics and allocates nothing when the string is unshared.

Arrays: Dynamic vectors with efficient push/pop operations.

Hashes: Optimized hash tables with fast key lookup.

Shared Ownership

String scalars are reference-counted with Arc (Arc<str> for Pv, Arc<String> for PvBuf), which provides atomic reference counting:

  • Cheap cloning (just increment counter)
  • Thread-safe sharing for parallel execution
  • Common strings can share storage

Aliasing Support

PetaPerl implements @_ aliasing correctly:

  • Arguments are aliases to caller’s variables
  • Modifications write through to original
  • Uses SvCell for mutable indirection

sub modify {
    $_[0] = "changed";       # Modifies caller's variable
}

my $x = "original";
modify($x);
print $x;                    # "changed"
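
The difference between writing through the alias and working on a copy is easiest to see side by side. A sketch with two hypothetical helpers:

```perl
use strict;
use warnings;

# Writes through the alias: the caller's variable changes.
sub trim_in_place {
    $_[0] =~ s/^\s+|\s+$//g;
    return;
}

# Copies the argument first: the caller's variable is untouched.
sub trimmed_copy {
    my ($text) = @_;
    $text =~ s/^\s+|\s+$//g;
    return $text;
}

my $kept    = "  padded  ";
my $copy    = trimmed_copy($kept);    # $kept still has its spaces
my $changed = "  padded  ";
trim_in_place($changed);              # $changed is now "padded"

print "[$kept] [$copy] [$changed]\n";
# prints: [  padded  ] [padded] [padded]
```

Copying with my (...) = @_ is the usual choice; modifying $_[0] directly is best reserved for subs whose purpose is to change their argument.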

Performance Characteristics

| Operation | Complexity | Notes |
|-----------|------------|-------|
| Scalar assignment | O(1) | String uses Arc (no copy) |
| Array push/pop | O(1) amortized | Dynamic growth |
| Array shift/unshift | O(n) | Must move elements |
| Hash access | O(1) average | Optimized hash function |
| Hash insert | O(1) average | With growth |

Parallel Execution

PetaPerl’s parallelization model:

  • Each thread gets its own lexical pad (no shared mutation)
  • Array and hash operations are thread-safe
  • Arc<str> strings share across threads safely
  • Loop-level parallelism doesn’t require global locks

Context Sensitivity

Perl’s context system determines how expressions evaluate.

Scalar Context

Forces single-value evaluation:

my $count = @array;          # Length
my $last = (1, 2, 3);        # Last element
my $also_last = (1, 2, 3);   # 3: the comma operator yields its last operand; nothing is concatenated

List Context

Forces multiple-value evaluation:

my @copy = @array;           # All elements
my @results = func();        # All return values
my ($first, $second) = (1, 2, 3);  # First two elements; the 3 is discarded

Void Context

Result is discarded:

func();                      # Return value ignored
print "hello";               # No assignment

Context Propagation

my @arr = (1, 2, 3);

# Scalar context
my $x = keys @arr;           # keys in scalar context → count (3)

# List context
my @k = keys @arr;           # keys in list context → all indices (0, 1, 2)

# Function argument context
func(@arr);                  # List context
func(scalar @arr);           # Scalar context (explicit)
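
A subroutine can inspect the context it was called in with the built-in wantarray. The sub name context below is made up for this sketch:

```perl
use strict;
use warnings;

# Hypothetical sub that reports the context it was called in.
sub context {
    return wantarray          ? "list"
         : defined(wantarray) ? "scalar"
         :                      "void";
}

my @in_list   = context();            # list assignment gives list context
my $in_scalar = context();            # scalar assignment gives scalar context
my ($first)   = context();            # parentheses on the left give list context
my $forced    = scalar(context());    # explicit scalar context

print "$in_list[0] $in_scalar $first $forced\n";
# prints: list scalar list scalar
```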

Constants

Constants are immutable values.

Literal Constants

42                           # Integer
3.14                         # Float
"string"                     # String
qw(a b c)                    # List of strings

Named Constants

use constant PI => 3.14159;
use constant MAX => 100;
use constant {
    RED   => 0xFF0000,
    GREEN => 0x00FF00,
    BLUE  => 0x0000FF,
};
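
Named constants behave like subroutines without sigils, which affects how they are used. A short sketch; the COLORS constant is an invented example of a list constant.

```perl
use strict;
use warnings;

use constant PI     => 3.14159;
use constant COLORS => qw(red green blue);   # list constant

# Constants have no sigil, so they do not interpolate in strings.
my $doubled = 2 * PI;
my $label   = "PI is @{[ PI ]}";             # interpolate via an anonymous array
my $count   = scalar(my @all = COLORS);      # 3
my $second  = (COLORS)[1];                   # "green"

print "$doubled | $label | $count | $second\n";
# prints: 6.28318 | PI is 3.14159 | 3 | green
```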

Compile-Time Constant Folding

PetaPerl performs constant folding at compile time:

my $x = 2 + 3;               # Folded to 5
my $len = length("hello");   # Folded to 5
my $sub = substr("text", 0, 2);  # Folded to "te"

This optimization eliminates runtime computation for constant expressions.

Special Types

Code References

Subroutines can be referenced and called dynamically:

my $coderef = sub { return $_[0] * 2 };
my $result = $coderef->(21);  # 42

my $coderef = \&existing_sub;
$coderef->(@args);
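
Two common uses of code references are dispatch tables and closures. A minimal sketch with hypothetical names (%ops, apply, make_counter):

```perl
use strict;
use warnings;

# A hash of code references acts as a dispatch table.
my %ops = (
    add => sub { $_[0] + $_[1] },
    mul => sub { $_[0] * $_[1] },
);

# Hypothetical helper: look up an operation by name and call it.
sub apply {
    my ($name, @args) = @_;
    my $code = $ops{$name} or die "unknown operation: $name\n";
    return $code->(@args);
}

# Closures capture variables from the scope that created them.
sub make_counter {
    my $n = shift;
    return sub { return $n++ };
}
my $next = make_counter(5);

my @out = (apply(add => 2, 3), apply(mul => 4, 5), $next->(), $next->());
print "@out\n";               # prints: 5 20 5 6
```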

Filehandles

Filehandles are special scalars:

open my $fh, '<', $file or die $!;
my $line = <$fh>;
close $fh;
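
Reading from a filehandle is context-sensitive as well. The sketch below opens an in-memory filehandle on a scalar reference, so it does not depend on any file on disk:

```perl
use strict;
use warnings;

# Opening a reference to a scalar gives an in-memory filehandle.
my $content = "first\nsecond\nthird\n";
open my $fh, '<', \$content or die "cannot open in-memory file: $!";

my $head = <$fh>;             # scalar context: one line
chomp $head;

my @rest = <$fh>;             # list context: all remaining lines
chomp @rest;

close $fh;

print "$head | @rest\n";      # prints: first | second third
```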

Globs

Typeglobs reference symbol table entries:

*alias = *original;          # Alias all types
*func = sub { ... };         # Install subroutine

See Also

  • perlop - Operators that work with these types
  • perlfunc - Functions for manipulating data
  • perlref - More on references (Perl 5 docs)