<!-- GENERATED by docs/sphinx/bin/20-reference (p5) from Encode.rs. Do not edit. -->


# Encode

<div class="pperl-badges"><span class="pperl-badge pperl-tier-std" title="Minimum feature tier: std. Available in all builds from \`--features native-std\` upward.">📦 std</span></div>

Convert between Perl character strings and bytes in any named encoding:
UTF-8, UTF-16, Latin-1, CP1252, Shift_JIS, EUC-JP, and the other
character sets known to the encoding registry.

Encode works in two directions. `decode($name, $bytes)` takes raw bytes
in the named encoding and returns a Perl character string (the `SVf_UTF8`
flag is set on the result). `encode($name, $string)` goes the other
way: it takes a character string and returns raw bytes in the target
encoding. Keep the two operations mentally distinct — strings are
sequences of Unicode codepoints, bytes are what you read from and
write to files, sockets, and pipes.
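
A minimal round trip in both directions, sketched on the assumption that `decode` and `encode` are imported the same way as in standard Perl `Encode`. The sample bytes and the choice of ISO-8859-1 as the second encoding are illustrative only.

```perl
use strict;
use warnings;
use Encode qw(decode encode);

# Raw bytes as they might arrive from a file or socket:
# "caf" followed by the two UTF-8 bytes of U+00E9.
my $bytes = "caf\xC3\xA9";

# Bytes -> characters: five bytes become four characters.
my $string = decode('UTF-8', $bytes);
printf "%d bytes in, %d characters out\n", length($bytes), length($string);

# Characters -> bytes in a different encoding.
# Latin-1 stores U+00E9 as the single byte 0xE9.
my $latin1 = encode('ISO-8859-1', $string);
printf "Latin-1 form: %d bytes, last byte 0x%02X\n",
    length($latin1), ord(substr($latin1, -1));
```

Decoding once at the input boundary and encoding once at the output boundary keeps everything in between working on characters rather than bytes.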

Because UTF-8 is the internal form Perl uses for character strings,
it has a fast path: `encode_utf8` and `decode_utf8` skip the full
encoding machinery and just flip or validate the `SVf_UTF8` flag.
Three low-level helpers let you poke at that flag directly:
`is_utf8` queries it, `_utf8_on` forces it on, `_utf8_off` forces
it off. Use those only when you know what you are doing — they
change how Perl interprets the bytes already in the scalar without
touching the bytes themselves.
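
The sketch below contrasts the validated fast path with the raw flag helper. It assumes the standard Perl `Encode` calling conventions; `_utf8_on` is called by its fully qualified name, and the sample bytes are illustrative.

```perl
use strict;
use warnings;
use Encode qw(decode_utf8 encode_utf8 is_utf8);

my $bytes  = "\xE2\x82\xAC";        # the three UTF-8 bytes of U+20AC (euro sign)
my $string = decode_utf8($bytes);   # checked and returned as a character string

printf "bytes:  length %d, flag %s\n", length($bytes),  is_utf8($bytes)  ? 'on' : 'off';
printf "string: length %d, flag %s\n", length($string), is_utf8($string) ? 'on' : 'off';

# encode_utf8 returns the byte form again, with the flag off.
my $again = encode_utf8($string);
print "round trip ok\n" if $again eq $bytes && !is_utf8($again);

# _utf8_on reinterprets the same buffer without validating it.
# That is only safe here because $copy is known to hold well-formed UTF-8.
my $copy = $bytes;
Encode::_utf8_on($copy);
printf "after _utf8_on: length %d\n", length($copy);
```

Prefer `decode_utf8` for untrusted input: it checks the bytes, whereas `_utf8_on` takes them on faith.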

Every conversion takes an optional `$check` bitmask that controls
what happens when a character cannot be represented in the target
encoding or, when decoding, when the input contains an invalid byte
sequence. The predefined values are `FB_DEFAULT` (substitute with
`?` or the encoding’s replacement character), `FB_CROAK` (die),
`FB_QUIET` (stop and return the converted prefix), `FB_WARN`
(like `FB_QUIET`, but warn before stopping: its value is
`FB_QUIET` OR’d with `WARN_ON_ERR`), `FB_HTMLCREF` (substitute with
`&#NNNN;`), `FB_XMLCREF` (substitute with `&#xHHHH;`), and
`FB_PERLQQ` (substitute with `\x{HHHH}`). `LEAVE_SRC`,
`STOP_AT_PARTIAL`, `PERLQQ`, `WARN_ON_ERR`, and
`ONLY_PRAGMA_WARNINGS` are the raw bits you OR together to build
custom check values.
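
To make the fallbacks concrete, the following sketch encodes one string containing a character that Latin-1 cannot represent, once per check value. It assumes the constants are importable by name as in standard Perl `Encode`; the sample string is illustrative, and the letter case of the hexadecimal digits in the escaped forms may vary between implementations.

```perl
use strict;
use warnings;
use Encode qw(encode FB_CROAK FB_PERLQQ FB_HTMLCREF FB_XMLCREF LEAVE_SRC);

my $string = "price: \x{20AC}5";    # U+20AC (euro sign) is not in Latin-1

# No $check argument means FB_DEFAULT: the character is substituted.
my $default = encode('ISO-8859-1', $string);

# The escaping fallbacks keep the codepoint recoverable from the output.
my $perlqq = encode('ISO-8859-1', $string, FB_PERLQQ);
my $html   = encode('ISO-8859-1', $string, FB_HTMLCREF);
my $xml    = encode('ISO-8859-1', $string, FB_XMLCREF);
print "$_\n" for $default, $perlqq, $html, $xml;

# FB_CROAK dies instead. LEAVE_SRC is OR'd in so that $string is
# never consumed, whether or not the call succeeds.
my $bytes = eval { encode('ISO-8859-1', $string, FB_CROAK | LEAVE_SRC) };
print "encode failed: $@" unless defined $bytes;
```

Wrapping the `FB_CROAK` call in `eval` turns the failure into a value the caller can test, which is usually easier to handle than a partially converted result.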

Encoding names are resolved through a registry. `find_encoding($name)`
returns a blessed encoding object you can call methods on;
`resolve_alias($name)` returns the canonical name as a string;
`encodings()` lists every name the registry knows about.
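
A short tour of the registry, sketched under the assumption that lookups behave as in standard Perl `Encode`. The `latin1` alias and the class-method call style for `encodings` are taken from that interface and are not confirmed by this page.

```perl
use strict;
use warnings;
use Encode qw(find_encoding resolve_alias);

# Look up by alias and stop early if the name is not registered.
my $enc = find_encoding('latin1') or die "unknown encoding\n";
print "object name:    ", $enc->name, "\n";
print "resolved alias: ", resolve_alias('latin1'), "\n";

# The object offers method forms of encode and decode.
my $char   = "\x{E9}";                # U+00E9
my $bytes  = $enc->encode($char);
my $string = $enc->decode($bytes);
print "round trip ok\n" if $string eq $char;

# List the names the registry currently reports.
my @names = Encode->encodings();
printf "%d encodings listed\n", scalar @names;
```

Holding on to the object returned by `find_encoding` avoids repeating the name lookup when the same encoding is used many times.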

`from_to($octets, $from, $to)` is a one-shot in-place conversion
useful when all you want is to reencode a byte string — for
example rewriting a file body from Latin-1 to UTF-8 without
unpacking it into characters first. It handles BOM-tagged and
MIME-tagged inputs when paired with `find_mime_encoding`.
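
An in-place conversion might look like the sketch below. It assumes `from_to` is imported on request and, as in standard Perl `Encode`, returns the new length in octets or `undef` on failure; the sample bytes are illustrative.

```perl
use strict;
use warnings;
use Encode qw(from_to);

# "na\xEFve": U+00EF stored as a single Latin-1 byte, 5 octets in total.
my $octets = "na\xEFve";

my $length = from_to($octets, 'ISO-8859-1', 'UTF-8');
defined $length or die "conversion failed\n";

# $octets was rewritten in place and is still a byte string.
printf "%d octets: %s\n", $length,
    join ' ', map { sprintf '%02X', ord } split //, $octets;
```

Because the first argument is modified, pass a variable rather than a literal or an expression.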

## Functions

### Encode/decode

#### [`native_encode`](Encode/native_encode.md)

Turn a character string into a byte string in the named encoding.

#### [`native_decode`](Encode/native_decode.md)

Turn a byte string in the named encoding into a Perl character string.

#### [`native_encode_utf8`](Encode/native_encode_utf8.md)

Fast path for encoding a string to UTF-8 bytes.

#### [`native_decode_utf8`](Encode/native_decode_utf8.md)

Fast path for decoding UTF-8 bytes to a Perl character string.

### UTF-8 flags

#### [`native_is_utf8`](Encode/native_is_utf8.md)

Return true if the scalar carries the `SVf_UTF8` flag.

#### [`native_utf8_on`](Encode/native_utf8_on.md)

Force `SVf_UTF8` on in place, without touching the underlying bytes.

#### `native_utf8_off`

Force `SVf_UTF8` off in place, without touching the underlying bytes.

### Encoding registry

#### [`native_find_encoding`](Encode/native_find_encoding.md)

Look up an encoding by name and return an object you can call methods on.

#### `native_resolve_alias`

Return the canonical encoding name for an alias, or `undef` if unknown.

#### [`native_encodings`](Encode/native_encodings.md)

Return the list of encoding names the registry knows about.

#### [`native_obj_encode`](Encode/native_obj_encode.md)

Method form of `encode` on an encoding object.

#### [`native_obj_decode`](Encode/native_obj_decode.md)

Method form of `decode` on an encoding object.

#### `native_obj_name`

Return the canonical name of an encoding object as a string.

#### `native_obj_renew`

Return an encoding object ready for a new conversion; effectively a no-op that returns `$self`.

#### `native_obj_perlio_ok`

Return true if the encoding is safe to stack as a PerlIO layer.

### MIME/XML helpers

#### `native_fb_htmlcref`

`CHECK` value `520` — replace unencodable characters with HTML decimal character references (`&#NNNN;`).

#### `native_fb_xmlcref`

`CHECK` value `1032` — replace unencodable characters with XML hexadecimal character references (`&#xHHHH;`).

### Conversion

#### [`native_from_to`](Encode/native_from_to.md)

Reencode a byte string in place from one encoding to another.

### Utilities

#### `native_fb_default`

`CHECK` value `0` — substitute unencodable characters with the encoding’s default replacement (usually `?` or U+FFFD).

#### `native_fb_croak`

`CHECK` value `1` — die on the first unencodable character or invalid byte sequence.

#### `native_fb_quiet`

`CHECK` value `4` — stop at the first unencodable character and return the encoded prefix; the input scalar is then trimmed to the part that was not consumed (method form only).

#### `native_fb_warn`

`CHECK` value `6` — same as `FB_QUIET` (`4`) with `WARN_ON_ERR` (`2`) added: warn on unencodable input, then stop and return the converted prefix.

#### `native_fb_perlqq`

`CHECK` value `264` — replace unencodable characters with Perl `\x{HHHH}` escape sequences.

#### `native_leave_src`

`CHECK` bit `8` — when OR’d into `$check`, keeps the input scalar untouched; the default is to consume its successfully converted prefix.

#### `native_stop_at_partial`

`CHECK` bit `2048` — stop at a partial trailing multi-byte sequence rather than reporting it as an error. Useful for streaming decoders.

#### `native_perlqq`

`CHECK` bit `256` — the raw bit behind `FB_PERLQQ`; OR it into your own check mask for `\x{HHHH}` substitution.

#### `native_warn_on_err`

`CHECK` bit `2` — emit a warning on encoding errors. Combine it with a substitution bit to build custom fallback behaviour.

#### `native_only_pragma_warnings`

`CHECK` bit `16` — emit encoding warnings only when the caller has `use warnings 'utf8'` (or equivalent) active, rather than unconditionally.
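
The numeric values listed above can be cross-checked, and the raw bits combined into a custom mask, as in this sketch. It assumes the constants are importable by name as in standard Perl `Encode`; the `$SIG{__WARN__}` handler is only there to collect the warning instead of printing it, and the sample string is illustrative.

```perl
use strict;
use warnings;
use Encode qw(encode FB_QUIET FB_WARN FB_PERLQQ PERLQQ LEAVE_SRC WARN_ON_ERR);

# The predefined values decompose into the raw bits.
printf "FB_PERLQQ = %d, PERLQQ | LEAVE_SRC = %d\n",
    FB_PERLQQ, PERLQQ | LEAVE_SRC;
printf "FB_WARN   = %d, FB_QUIET | WARN_ON_ERR = %d\n",
    FB_WARN, FB_QUIET | WARN_ON_ERR;

# A custom mask: escape unencodable characters, warn about each one,
# and leave the input scalar alone.
my $check = PERLQQ | LEAVE_SRC | WARN_ON_ERR;

my @warnings;
local $SIG{__WARN__} = sub { push @warnings, @_ };   # collect, do not print

my $string = "smile \x{263A}";      # U+263A is not in Latin-1
my $bytes  = encode('ISO-8859-1', $string, $check);
printf "%s (%d warning(s) captured)\n", $bytes, scalar @warnings;
```

Including `LEAVE_SRC` in any hand-built mask is a reasonable default when the input string is still needed after the call.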