```{index} single: native_encode; Encode function
```

```{index} single: Encode::native_encode; Perl function
```

# native_encode

Turn a character string into a byte string in the named encoding.

## Synopsis

```perl
use Encode qw(encode);

my $bytes = encode($encoding, $string);
my $bytes = encode($encoding, $string, $check);
```

## What you get back

A scalar holding raw bytes. The `SVf_UTF8` flag on the result is always off: this is the form you write to files, sockets, and pipes.

The input `$string` is treated as a sequence of Unicode codepoints regardless of how it is internally represented. If `$encoding` is unknown, `encode` croaks with `Unknown encoding '...'`.

The optional `$check` argument controls what happens when a character cannot be represented in the target encoding:

- `FB_DEFAULT` (0, the default): substitute with `?` or the encoding's replacement character.
- `FB_CROAK`: die on the first unencodable character.
- `FB_QUIET`: stop at the first unencodable character and return the encoded prefix. In the method form, the consumed prefix is also removed from `$string`.
- `FB_WARN`: warn and substitute.
- `FB_PERLQQ`: substitute with `\x{HHHH}`.
- `FB_HTMLCREF`: substitute with `&#NNNN;`.
- `FB_XMLCREF`: substitute with `&#xHHHH;`.

## Examples

Encode a string to UTF-8 bytes for writing to a file:

```perl
use Encode qw(encode);

my $bytes = encode('UTF-8', "caf\x{e9}");
## $bytes is "caf\xc3\xa9": 5 bytes, no SVf_UTF8
```

Encode to Latin-1, losing characters that don't fit:

```perl
use Encode qw(encode);

my $bytes = encode('iso-8859-1', "\x{20ac}");   # Euro sign
## $bytes is "?": U+20AC has no Latin-1 byte
```

Die if anything can't be encoded:

```perl
use Encode qw(encode FB_CROAK);

my $bytes = encode('ascii', "caf\x{e9}", FB_CROAK);
## dies: "\x{e9}" does not map to ascii
```

## Edge cases

- `undef` input returns an empty byte string.
- Input without `SVf_UTF8` is treated as Latin-1: each byte is read as the codepoint of the same value and then encoded.
- Encoding `"null"` passes input bytes through unchanged.
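
The `$check` modes described above can be compared side by side on a single input. The following is a minimal sketch, not part of the test suite: it assumes the constants are exported by `Encode` as in upstream, and `show` is a hypothetical helper added here only to print non-ASCII bytes as `\xNN`. Each call works on a fresh copy because a non-zero `$check` may modify the source string.

```perl
use strict;
use warnings;
use Encode qw(encode FB_CROAK FB_PERLQQ FB_HTMLCREF FB_XMLCREF);

# Hypothetical helper: render non-printable and high bytes as \xNN.
sub show {
    my ($bytes) = @_;
    (my $s = $bytes) =~ s/([^\x20-\x7e])/sprintf '\\x%02x', ord $1/ge;
    return $s;
}

# U+00E9 fits in Latin-1; U+20AC (euro sign) does not.
my $text = "caf\x{e9} \x{20ac}5";

# FB_DEFAULT: the unencodable character is replaced by "?".
my $default = encode('iso-8859-1', $text);

# Escaping modes keep the codepoint visible in the output.
my ($c1, $c2, $c3) = ($text) x 3;
my $perlqq  = encode('iso-8859-1', $c1, FB_PERLQQ);
my $htmlref = encode('iso-8859-1', $c2, FB_HTMLCREF);
my $xmlref  = encode('iso-8859-1', $c3, FB_XMLCREF);

# FB_CROAK: the same input raises an exception instead.
my $c4      = $text;
my $croaked = eval { encode('iso-8859-1', $c4, FB_CROAK) };
my $error   = $@;

print 'default : ', show($default), "\n";   # caf\xe9 ?5
print 'perlqq  : ', show($perlqq),  "\n";   # caf\xe9 \x{20ac}5
print 'htmlcref: ', show($htmlref), "\n";   # caf\xe9 &#8364;5
print 'xmlcref : ', show($xmlref),  "\n";   # caf\xe9 &#x20ac;5
print 'croak   : ', (defined $croaked ? 'ok' : 'died'), "\n";   # died
```

The substituting modes suit output that must always be produced, such as logs or HTML, while `FB_CROAK` suits pipelines where silently losing a character would be worse than failing.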

## Differences from upstream

Fully compatible with upstream for ASCII, Latin-1, CP1252, the ISO-8859 family, and UTF-8. Shift_JIS, EUC-JP, and other multi-byte encodings are not yet registered in the static table and fall back to the Latin-1 identity mapping.

Covered by `t/81-xs-native/Encode/040-encode-utf8-latin1.t` and `t/81-xs-native/Encode/060-check-parameter.t`.

## See also

- `decode`: the inverse, bytes to string.
- `encode_utf8`: the UTF-8-only fast path.
- `from_to`: reencode in place without materialising a character string in between.
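
How the related functions fit together can be seen in a short round trip. This is a sketch that assumes `decode`, `encode_utf8`, and `from_to` follow upstream `Encode` semantics for UTF-8 and Latin-1, the encodings this page lists as fully compatible. The variable names are illustrative only.

```perl
use strict;
use warnings;
use Encode qw(encode decode encode_utf8 from_to);

my $text  = "caf\x{e9}";                 # 4 characters
my $bytes = encode('UTF-8', $text);      # 5 bytes: "caf\xc3\xa9"

# decode is the inverse: bytes back to the original characters.
my $back = decode('UTF-8', $bytes);

# encode_utf8 yields the same bytes for this input.
my $fast = encode_utf8($text);

# from_to rewrites a byte string in place, here from UTF-8 to Latin-1.
my $latin1 = $bytes;                     # work on a copy
from_to($latin1, 'UTF-8', 'iso-8859-1'); # now "caf\xe9", 4 bytes

printf "utf8 bytes: %d, latin1 bytes: %d\n", length $bytes, length $latin1;
print "round trip ok\n" if $back eq $text && $fast eq $bytes;
```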