Digest::MD5#

📦 std

Compute MD5 message digests: fixed 128-bit fingerprints of arbitrary byte strings, as specified in RFC 1321.

Digest::MD5 offers two idioms. Pick whichever fits the shape of the data you have at hand.

Functional one-shot#

When the whole message is already in memory as a single string (or a few pieces you can list as arguments), the functional form is the shortest path:

use Digest::MD5 qw(md5 md5_hex md5_base64);

my $bin = md5($data);          # 16 raw bytes
my $hex = md5_hex($data);      # 32 lowercase hex chars
my $b64 = md5_base64($data);   # 22 base64 chars, no padding

All three take any number of arguments and concatenate them before hashing, so md5("a", "b", "c") equals md5("abc").
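As a quick check of the concatenation rule, the sketch below compares a multi-argument call with a single-string call. The variable names are illustrative only; the expected value is the RFC 1321 test vector for "abc".

```perl
use strict;
use warnings;
use Digest::MD5 qw(md5_hex);

# Several arguments are hashed as if they had been joined first.
my $split  = md5_hex("a", "b", "c");
my $joined = md5_hex("abc");

print "$split\n";    # 900150983cd24fb0d6963f7d28e17f72 (RFC 1321 test vector)
print "same digest\n" if $split eq $joined;
```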

OO incremental#

When the message arrives in pieces — streaming bytes, a file read chunk by chunk, or data assembled across multiple code paths — build a context object, feed it, and ask for the digest at the end:

use Digest::MD5;

my $ctx = Digest::MD5->new;
$ctx->add($chunk1);
$ctx->add($chunk2, $chunk3);
$ctx->addfile($fh);
my $hex = $ctx->hexdigest;     # also resets the context

add returns the object, so calls chain: $ctx->add("a")->add("b"). Calling any of the digest / hexdigest / b64digest methods is destructive — the context is reset to an empty state afterwards, so the same object is immediately reusable. To peek at the digest without resetting, clone first: $ctx->clone->hexdigest.
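The chaining, peeking, and reset behaviour described above can be sketched as follows. This is a minimal illustration with made-up input strings, not a prescribed usage pattern.

```perl
use strict;
use warnings;
use Digest::MD5 qw(md5_hex);

my $ctx = Digest::MD5->new;
$ctx->add("hello, ")->add("world");    # add returns the object, so calls chain

# Peek at the digest on a copy, leaving the running state untouched.
my $peek = $ctx->clone->hexdigest;

# Finalising returns the same value and then resets $ctx.
my $final = $ctx->hexdigest;

# After the reset the context holds the empty message.
my $after_reset = $ctx->hexdigest;

print "peek matches final\n" if $peek eq $final;
print "matches one-shot\n"   if $final eq md5_hex("hello, world");
print "context was reset\n"  if $after_reset eq md5_hex("");
```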

Output encodings#

Every function and method comes in three encodings of the same 128-bit digest:

  • Raw binary (md5, digest) — 16 bytes. Suitable for packing into fixed-width fields or feeding into another binary protocol.

  • Hex (md5_hex, hexdigest) — 32 lowercase characters from 0-9a-f. This is the form you usually see in ETag headers and checksum files.

  • Base64 (md5_base64, b64digest) — 22 characters from the standard alphabet A-Z a-z 0-9 + /. The result is not padded to a multiple of 4; append "==" yourself if an interop partner expects padding.
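To illustrate that the three encodings carry the same 16 bytes, the sketch below derives the hex and padded base64 forms from the raw digest and compares them with the library's own output. It assumes the core MIME::Base64 module is available; the variable names are illustrative.

```perl
use strict;
use warnings;
use Digest::MD5 qw(md5 md5_hex md5_base64);
use MIME::Base64 qw(encode_base64);

my $data = "abc";
my $raw  = md5($data);                    # 16 raw bytes

# Hex is the raw digest written as two lowercase hex digits per byte.
my $hex_from_raw = unpack("H*", $raw);

# Padded base64 of 16 bytes is 24 characters ending in "==".
my $padded = encode_base64($raw, "");     # "" suppresses the trailing newline
my $fixed  = md5_base64($data) . "==";    # add the padding back by hand

printf "%d raw bytes\n", length $raw;
print "hex agrees\n"    if $hex_from_raw eq md5_hex($data);
print "base64 agrees\n" if $padded eq $fixed;
```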

File-handle ingestion#

addfile reads an open file handle until EOF and feeds every byte into the context without buffering the whole thing in memory, so it scales to files of arbitrary size. Make sure the handle is in binmode first — MD5 is defined over bytes, not decoded text.
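A minimal sketch of hashing a file through addfile follows. It writes a small scratch file with File::Temp so that it is self-contained; the file content and variable names are assumptions made for the example.

```perl
use strict;
use warnings;
use Digest::MD5 qw(md5_hex);
use File::Temp qw(tempfile);

# Create a scratch file containing bytes that a text layer could mangle.
my $content = "line one\r\nline two\n\x00\xff";
my ($out, $path) = tempfile(UNLINK => 1);
binmode $out;
print {$out} $content;
close $out or die "close failed: $!";

# Open in raw mode so no newline translation or decoding layer is applied.
open my $fh, '<:raw', $path or die "cannot open $path: $!";
my $file_hex = Digest::MD5->new->addfile($fh)->hexdigest;
close $fh;

print "file digest matches in-memory digest\n"
    if $file_hex eq md5_hex($content);
```

Opening with the :raw layer has the same effect as calling binmode on the handle after a plain open.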

Security context#

MD5 is cryptographically broken for collision resistance: attackers can construct two different messages with the same digest. Do not use it for new digital signatures, certificate fingerprints, password hashing, or any setting where an adversary chooses the input. For those, reach for Digest::SHA (SHA-256 or better) or a purpose-built password hash.

MD5 remains perfectly adequate — and widely used — for non-adversarial integrity checks: ETags, cache keys, change detection in build systems, deduplication of trusted data, and checksums over storage where the threat model is accidental corruption rather than forgery.

Bytes only — no wide characters#

MD5 hashes bytes, not characters. Feeding a string that contains code points above U+00FF croaks with Wide character in subroutine entry. To hash a Unicode string, decide on an encoding first and pass the encoded bytes, for example md5_hex(encode_utf8($str)) with encode_utf8 imported from the Encode module.
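The failure and its fix can be sketched as below. The sample string is an assumption chosen for illustration, and UTF-8 is only one possible encoding; a different encoding yields a different digest.

```perl
use strict;
use warnings;
use Digest::MD5 qw(md5_hex);
use Encode qw(encode_utf8);

my $str = "snowman \x{2603}";    # contains a code point above U+00FF

# Hashing the character string directly croaks, so trap it with eval.
my $ok    = eval { md5_hex($str); 1 };
my $error = $ok ? '' : $@;
print "direct call failed: $error" unless $ok;

# Encoding first yields plain bytes that MD5 can consume.
my $hex = md5_hex(encode_utf8($str));
print "$hex\n";
```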

Functions#

Functional digests#

md5#

One-shot MD5 digest, returned as 16 raw bytes.

md5_hex#

One-shot MD5 digest, returned as a 32-character lowercase hex string.

md5_base64#

One-shot MD5 digest, returned as a 22-character base64 string.

OO lifecycle#

new#

Create a fresh MD5 context object, or reset an existing one to empty.

clone#

Make an independent copy of the current digest state.

DESTROY#

Release the native MD5 state buffer when a context object is reaped.

reset#

Clear an existing context so it behaves like a freshly created one.

OO input#

add_bits#

Add data given as a bit string or as data plus a bit count. MD5 works on whole bytes, so the number of bits must be a multiple of 8.
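The sketch below assumes the two calling forms of the standard Perl Digest interface: a string of "0" and "1" characters, or data followed by an explicit bit count. Treat it as an illustration under that assumption rather than a definitive description of this implementation.

```perl
use strict;
use warnings;
use Digest::MD5 qw(md5_hex);

# Form 1: a string of "0" and "1" characters; 01100001 is the byte "a".
my $from_bitstring = Digest::MD5->new->add_bits("01100001")->hexdigest;

# Form 2: data plus an explicit bit count.
my $from_count = Digest::MD5->new->add_bits("a", 8)->hexdigest;

# A bit count that is not a multiple of 8 is rejected, so trap it with eval.
my $ok = eval { Digest::MD5->new->add_bits("0110"); 1 };

print "bit string ok\n"         if $from_bitstring eq md5_hex("a");
print "bit count ok\n"          if $from_count eq md5_hex("a");
print "partial byte rejected\n" unless $ok;
```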

add#

Feed one or more pieces of data into the MD5 context.

addfile#

Read an open file handle to EOF and feed every byte into the context.

OO output#

digest#

Finalise the context and return the 16-byte raw binary digest.

hexdigest#

Finalise the context and return the digest as 32 lowercase hex chars.

b64digest#

Finalise the context and return the digest as a 22-char base64 string.

Utility#

context#

Save or restore the raw internal MD5 state — for advanced use only.