---
name: read
signature: 'read FILEHANDLE,SCALAR,LENGTH,OFFSET'
signature_alt: 'read FILEHANDLE,SCALAR,LENGTH'
since: 5.0
status: documented
categories: ["I/O", "Fixed-length data"]
---

```{index} single: read; Perl built-in
```

*[I/O](../perlfunc-by-category) · [Fixed-length data](../perlfunc-by-category)*

# read

Read a fixed amount of buffered input from a filehandle into a scalar.

`read` pulls up to `LENGTH` characters from `FILEHANDLE` and stores
them in `SCALAR`, returning how many were actually read. It goes
through the handle's PerlIO stack and is therefore buffered on top of
the underlying OS read; contrast with [`sysread`](sysread), which bypasses the
buffer and calls `read(2)` directly. The optional `OFFSET` argument
places the incoming data at a chosen position in `SCALAR` instead of
at the start, which is how you append to an existing buffer.

## Synopsis

```perl
read FILEHANDLE, SCALAR, LENGTH
read FILEHANDLE, SCALAR, LENGTH, OFFSET
```

## What you get back

- The **number of characters read**, which may be less than `LENGTH`.
- `0` at end of file.
- [`undef`](undef) on error, with [`$!`](../perlvar) set.

`SCALAR` is grown or shrunk so that the last character actually read
becomes the last character of the scalar. This holds with `OFFSET`
too: the data lands at `OFFSET`, everything before it is kept, and
anything that previously followed the newly read data is discarded
(see *The OFFSET argument* below).
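
The return values can be seen with a small sketch that reads from an in-memory filehandle. The sample string and variable names are made up for illustration; an `undef` return needs a real I/O error and is not shown here.

```perl
use strict;
use warnings;

# Sample input: an in-memory filehandle over a 10-byte string.
my $data = "abcdefghij";
open my $fh, "<", \$data or die "open: $!";

my $buf = "previous contents";

my $n1    = read($fh, $buf, 4);     # full read: 4 characters
my $first = $buf;                   # "abcd" (the old contents are gone)

my $n2     = read($fh, $buf, 100);  # short read: only 6 characters remain
my $second = $buf;                  # "efghij"

my $n3    = read($fh, $buf, 100);   # end of file: returns 0
my $third = $buf;                   # "" (nothing was read)

printf "%d '%s'\n", $n1, $first;    # 4 'abcd'
printf "%d '%s'\n", $n2, $second;   # 6 'efghij'
printf "%d '%s'\n", $n3, $third;    # 0 ''
```

Note that the scalar is emptied by the final call, so copy out anything you still need before reading again.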

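A short read is **not** an error. On a blocking handle `read` keeps
refilling its buffer until it has `LENGTH` characters or hits end of
file or an error, so a short count normally means end of file. On a
non-blocking pipe, socket, or terminal it can also mean that no more
data is available right now. If you must have the full count, loop
until you either have the bytes you need or `read` returns `0` /
[`undef`](undef):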

```perl
my $buf = "";
my $want = 4096;
while ($want > 0) {
    my $got = read($fh, $buf, $want, length $buf);
    die "read error: $!" unless defined $got;
    last if $got == 0;                 # EOF
    $want -= $got;
}
```

## Global state it touches

- [`$!`](../perlvar): set when `read` returns [`undef`](undef).
- The handle's PerlIO layers (per-handle state rather than a global
  variable): determine whether `LENGTH` is counted in bytes or in
  characters (see *Character vs byte semantics* below).

`read` does not interact with [`$_`](../perlvar), [`$/`](../perlvar), [`$\`](../perlvar), or [`$,`](../perlvar). Unlike
[`readline`](readline), it does not care about the input record separator.

## The OFFSET argument

`OFFSET` controls **where in `SCALAR`** the incoming data lands. It
does not seek the filehandle.

- **Omitted**: data replaces the entire contents of `SCALAR`.
- **Positive, within length**: data is written starting at position
  `OFFSET`. Characters before `OFFSET` are preserved; everything from
  `OFFSET` onward is replaced by the incoming data, and `SCALAR` ends
  at the last character read.
- **Positive, beyond length**: `SCALAR` is first padded with `"\0"`
  bytes out to `OFFSET`, then the read is appended. Useful for
  reading into a fixed slot inside a larger buffer you are assembling.
- **Negative**: counts backwards from the end of `SCALAR`. `-1`
  means "overwrite the last character and append from there."

```perl
my $buf = "HEADER";
read($fh, $buf, 16, length $buf);      # append 16 chars after "HEADER"

my $slab = "";
read($fh, $slab, 512, 1024);           # pad with "\0" out to position 1024,
                                       # then read up to 512 chars; $slab is
                                       # 1536 chars long after a full read
```
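
To make the placement rules concrete, here is a minimal sketch that runs each `OFFSET` case against an in-memory filehandle. The input string and variable names are illustrative only.

```perl
use strict;
use warnings;

my $src = "0123456789";
open my $fh, "<", \$src or die "open: $!";

# Positive OFFSET inside the scalar: "ABC" is kept, the rest is replaced.
my $buf = "ABCDEFGH";
read($fh, $buf, 2, 3);
my $within = $buf;                  # "ABC01" (the old tail "FGH" is gone)

# Negative OFFSET: -1 starts at the last character of "ABC01".
read($fh, $buf, 3, -1);
my $negative = $buf;                # "ABC0234"

# OFFSET beyond the end: the gap is filled with "\0" bytes first.
my $padded = "xy";
read($fh, $padded, 2, 4);
my $expected = "xy\0\0" . "56";     # two NUL bytes, then the data

printf "%s %s %d\n", $within, $negative, length $padded;   # ABC01 ABC0234 6
```

In every case the scalar ends at the last character read, so data that used to follow the write position does not survive.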

## Character vs byte semantics

`LENGTH` is measured in whatever unit the handle deals in:

- **Byte-mode handle** (the default when no encoding layer is in
  effect, whether from [`open`](open), `binmode`, the `open` pragma, or
  `-C` / `PERL_UNICODE`): `LENGTH` is a byte count.
  `read($fh, $buf, 10)` pulls 10 bytes and `length $buf` is 10.
- **`:utf8` layer**: `LENGTH` is a **character** count. Perl decodes
  UTF-8 on the way in, and `$buf` holds decoded codepoints. The
  number of bytes consumed from the file can be anywhere from
  `LENGTH` to `4 * LENGTH`, depending on the text.
- **`:encoding(...)` layer**: same rule as `:utf8`, for any encoding
  the layer knows.

```perl
open my $fh, "<:utf8", "greek.txt" or die $!;
read($fh, my $buf, 5);                 # 5 characters, not 5 bytes
```

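A byte-counted read of UTF-8 data can stop in the middle of a
multi-byte character. Printing the undecoded bytes as text produces
mojibake, and decoding a chunk that ends mid-character yields a
replacement character or a malformed-data error, depending on how you
decode. Pick the layer at [`open`](open) time and stick with it.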
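
The difference between the two counting modes can be sketched with the same octets read through two in-memory handles. The sample text and variable names are assumptions chosen for the illustration.

```perl
use strict;
use warnings;
use Encode qw(encode);

# Five Greek letters (alpha..epsilon); each takes two bytes in UTF-8.
my $octets = encode("UTF-8", "\x{3b1}\x{3b2}\x{3b3}\x{3b4}\x{3b5}");

# Byte-mode handle: LENGTH counts bytes.
open my $raw, "<", \$octets or die "open: $!";
my $raw_n = read($raw, my $raw_buf, 5);
# $raw_buf holds two whole characters plus the first byte of a third.

# Encoding layer: LENGTH counts decoded characters.
open my $txt, "<:encoding(UTF-8)", \$octets or die "open: $!";
my $txt_n = read($txt, my $txt_buf, 3);
# $txt_buf holds 3 characters, which took 6 bytes of input.

printf "byte mode: returned %d, length %d\n", $raw_n, length $raw_buf;  # 5, 5
printf "char mode: returned %d, length %d\n", $txt_n, length $txt_buf;  # 3, 3
```

The byte-mode buffer ends with a lone lead byte (`"\xCE"`), which is exactly the split-character hazard described above.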

## Buffered vs unbuffered I/O

`read` is **buffered** through PerlIO: internally it uses PerlIO's
equivalent of `fread(3)` (or the system's own) against the handle's
buffer. That has two consequences worth remembering:

- You can mix `read`, [`readline`](readline) / `<$fh>`, [`getc`](getc), and [`seek`](seek) freely
  on the same handle. They all see the same buffer.
- You must **not** mix `read` with [`sysread`](sysread) on the same handle.
  [`sysread`](sysread) bypasses the buffer and goes straight to `read(2)`; any
  bytes already pulled into the buffer by a previous `read` become
  invisible to [`sysread`](sysread), and vice versa. If you need raw syscall
  semantics, use [`sysread`](sysread) exclusively on that handle.

For byte-accurate, non-buffered input — for example on a non-blocking
socket, or when implementing a protocol where a short read is
meaningful rather than "try again" — reach for [`sysread`](sysread).
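
As a rough illustration of the shared buffer, the sketch below interleaves `read`, `readline`, `getc`, `tell`, and `seek` on one in-memory handle. The input layout (a 4-byte tag, one text line, then trailing data) is invented for the example.

```perl
use strict;
use warnings;

# Made-up input: a 4-byte tag, one text line, then trailing data.
my $src = "HDR1line one\nXrest";
open my $fh, "<", \$src or die "open: $!";

read($fh, my $tag, 4) == 4 or die "short tag";   # "HDR1"
my $line = readline($fh);                         # "line one\n"
my $char = getc($fh);                             # "X"
my $pos  = tell($fh);                             # 14 bytes consumed so far
read($fh, my $rest, 100);                         # "rest" (short read at EOF)

seek($fh, 4, 0) or die "seek: $!";                # back to just after the tag
read($fh, my $again, 4);                          # "line"

print "$tag|$char|$pos|$rest|$again\n";           # HDR1|X|14|rest|line
```

Each call picks up exactly where the previous one stopped, because they all consume from the same PerlIO buffer.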

## Examples

Read a fixed-size header from a binary file:

```perl
open my $fh, "<", "packet.bin" or die "open: $!";
binmode $fh;
my $header;
my $n = read($fh, $header, 16);
die "read: $!" unless defined $n;               # undef signals an I/O error
die "short header: got $n bytes" unless $n == 16;
```

Append 16 bytes to the end of an existing buffer by using `OFFSET`
equal to the current length:

```perl
my $buf = "PRELUDE:";
read($fh, $buf, 16, length $buf);      # $buf is now "PRELUDE:" . 16 new bytes
```

Read into position 1024 of a scalar, padding the gap with `"\0"`:

```perl
my $slot = "";
read($fh, $slot, 64, 1024);            # length($slot) == 1088
                                       # substr($slot, 0, 1024) is "\0" x 1024
```

Loop until you have exactly `N` bytes or hit EOF — the correct pattern
for pipes and sockets where a single `read` often returns fewer bytes
than requested:

```perl
sub read_exact {
    my ($fh, $n) = @_;
    my $buf = "";
    while (length($buf) < $n) {
        my $got = read($fh, $buf, $n - length($buf), length $buf);
        return unless defined $got;     # I/O error; $! holds the reason
        return $buf if $got == 0;       # EOF; caller inspects length
    }
    return $buf;
}
```

Character-counted read through a UTF-8 layer:

```perl
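use Encode qw(encode_utf8);

open my $fh, "<:encoding(UTF-8)", "notes.txt" or die "open: $!";
read($fh, my $chunk, 100);             # up to 100 characters
# Re-encode to count UTF-8 bytes; avoids the discouraged `bytes` pragma.
printf "chars=%d bytes=%d\n", length $chunk, length encode_utf8($chunk);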
```
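
Walk a stream of fixed-length records. This is a sketch under assumptions: the 12-byte record layout (a 4-byte big-endian id followed by an 8-byte space-padded name) and the sample data are hypothetical, and an in-memory handle stands in for a real file.

```perl
use strict;
use warnings;

# Hypothetical layout: 4-byte big-endian id + 8-byte space-padded name.
my $RECORD_SIZE = 12;
my $FORMAT      = "N A8";

# Two complete records followed by 5 stray bytes.
my $data = pack($FORMAT, 1, "alpha") . pack($FORMAT, 2, "beta") . "trunc";
open my $fh, "<", \$data or die "open: $!";
binmode $fh;

my @records;
my $truncated = 0;                      # size of a trailing partial record
while (1) {
    my $got = read($fh, my $rec, $RECORD_SIZE);
    die "read error: $!" unless defined $got;
    last if $got == 0;                  # clean EOF on a record boundary
    if ($got < $RECORD_SIZE) {          # EOF in the middle of a record
        $truncated = $got;
        last;
    }
    my ($id, $name) = unpack($FORMAT, $rec);
    push @records, [$id, $name];
}

printf "%d records, %d trailing bytes\n", scalar @records, $truncated;
# prints "2 records, 5 trailing bytes"
```

Distinguishing `0` from a short non-zero count is what lets the loop tell a clean end of file from a truncated final record.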

## Edge cases

- **Closed filehandle**: returns [`undef`](undef) and sets [`$!`](../perlvar) to
  `"Bad file descriptor"`. Under `use warnings` a
  `read() on closed filehandle` warning is emitted.
- **Unopened filehandle**: same as closed — [`undef`](undef) and [`$!`](../perlvar) set.
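- **`LENGTH` of `0`**: `read` returns `0` immediately without
  consuming any input. `SCALAR` is still truncated to `OFFSET`
  (emptied when `OFFSET` is omitted). Because `0` is also the EOF
  return value, this is **not** a reliable EOF probe; use [`eof`](eof)
  for that.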
- **Negative `LENGTH`**: a fatal runtime error
  (`Negative length at ...`). Validate `LENGTH` before calling.
- **Negative `OFFSET` whose magnitude exceeds the current length of
  `SCALAR`**: a fatal runtime error
  (`Offset outside string`). Clamp with `max($offset, -length $buf)`
  (`max` comes from `List::Util`) when the offset is computed.
- **Short read on a pipe or socket**: not an error. On a blocking
  handle `read` waits for the full `LENGTH` and comes back short only
  at EOF; on a non-blocking handle it returns whatever had arrived
  before the underlying read would have blocked. Loop if you need the
  full count.
- **EOF mid-read**: returns the partial count. The next call returns
  `0`. After that, `$fh` stays at EOF until you [`seek`](seek) or call
  `$fh->clearerr` (an `IO::Handle` method).
- **Reading from a tied handle**: `read` dispatches to the tie
  class's `READ` method, which is responsible for honouring
  `LENGTH` and `OFFSET`. Misbehaving tie classes can violate the
  "grow `SCALAR` so the last character read is the last character"
  contract.
- **Interaction with [`sysread`](sysread)**: do not mix them on one handle.
  `read` fills the PerlIO buffer in chunks of its own choosing;
  [`sysread`](sysread) ignores that buffer entirely.
- **Binary data on a text-mode handle**: on Unix-like systems there
  is no distinct text mode, but an encoding layer still transforms
  bytes. `binmode $fh` (or `open ..., "<:raw", ...`) before reading
  binary data.
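- **`FILEHANDLE` as an expression**: unlike the indirect-object slot
  of `print`, the first argument of `read` is an ordinary expression.
  A bareword, a scalar, or something more complex such as an array
  element all work without extra parentheses or a block:
  `read($handles[$i], $buf, $len)`.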
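
Several of the cases above can be checked with a short sketch. It uses an in-memory handle and `eval` to trap the fatal errors; the variable names are illustrative, and the exact `$!` text for the closed handle is platform-dependent, so only the `undef` return is recorded.

```perl
use strict;
use warnings;

my $src = "abc";
open my $fh, "<", \$src or die "open: $!";
my $buf = "xy";

# Negative LENGTH is fatal; eval traps the exception.
my $neg_len_ok  = eval { read($fh, $buf, -1); 1 };
my $neg_len_err = $@;               # starts with "Negative length"

# Negative OFFSET reaching before the start of a 2-character scalar.
my $bad_off_ok  = eval { read($fh, $buf, 1, -5); 1 };
my $bad_off_err = $@;               # starts with "Offset outside string"

# LENGTH 0 returns 0 without consuming input, but still truncates $buf.
my $zero           = read($fh, $buf, 0);
my $buf_after_zero = $buf;          # ""

# Reading from a closed handle returns undef and sets $!.
close $fh or die "close: $!";
my $closed;
{
    no warnings qw(closed);         # silence "read() on closed filehandle"
    $closed = read($fh, $buf, 1);
}

print "negative length: $neg_len_err";
print "bad offset: $bad_off_err";
printf "zero-length read returned %d\n", $zero;
print "closed handle returned ", defined $closed ? $closed : "undef", "\n";
```

Both fatal errors are raised before the scalar is modified, so `$buf` is still `"xy"` when the zero-length read runs.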

## Differences from upstream

Fully compatible with upstream Perl 5.42.

## See also

- [`open`](open) — acquires the filehandle and decides whether
  subsequent `read`s are byte-counted or character-counted via the
  I/O layer
- [`sysread`](sysread) — the unbuffered counterpart, a direct
  `read(2)` system call; use it for non-blocking I/O or when a short
  read is meaningful
- [`readline`](readline) / `<$fh>` — record-oriented input that
  respects [`$/`](../perlvar) instead of a byte/character count
- [`getc`](getc) — read a single character; roughly
  `read($fh, $c, 1)` but with different EOF/undef reporting
- [`binmode`](binmode) — remove or add I/O layers so that `LENGTH`
  is unambiguously a byte count or a character count
- [`eof`](eof) — the right way to test for end of file, rather than
  reading a zero-length chunk