---
name: Network protocols
---

# Network protocols: a DNS query

**By the end of this chapter you will be able to** build and parse a DNS query header and question section using `pack` and `unpack`, and take the same approach to any RFC-defined protocol.

A "network protocol" is, in the end, a sequence of byte-exact fields described in English prose. If you can read the field table, you can write the template. This chapter walks one concrete example, the DNS query packet, from the RFC wire diagram to a fully working pair of encode / decode subroutines.

## The problem

We want to ask a DNS server for the A record of `example.com`. The UDP payload we send is a **DNS message** (RFC 1035), consisting of:

1. A 12-byte **header**.
2. A **question section**: the domain we are asking about, and the record type we want.

(A real client also parses the **answer section** in the response. We stop at encoding and showing a sketch of the decode.)

## The header

From RFC 1035 section 4.1.1, the header is six consecutive 16-bit big-endian fields:

| Offset | Field   | Meaning                               |
|--------|---------|---------------------------------------|
| 0      | ID      | Arbitrary 16-bit identifier we choose |
| 2      | Flags   | Opcode, RD bit, response flags, rcode |
| 4      | QDCOUNT | Number of entries in question section |
| 6      | ANCOUNT | Number of resource records in answer  |
| 8      | NSCOUNT | Authority records                     |
| 10     | ARCOUNT | Additional records                    |

Six unsigned big-endian shorts means six `n` directives:

```perl
sub dns_header {
    my (%opts) = @_;
    pack "n6",
        $opts{id}    // 0,
        $opts{flags} // 0,
        $opts{qd}    // 0,
        $opts{an}    // 0,
        $opts{ns}    // 0,
        $opts{ar}    // 0;
}
```

That is 12 bytes, exactly as the spec requires. `n6` is shorthand for `n n n n n n`.

## The question section

A question is:

1. A **name**, encoded as a sequence of length-prefixed labels terminated by a zero-length label.
2. A 16-bit **QTYPE** (1 = A record).
3. A 16-bit **QCLASS** (1 = IN, Internet).

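
The fixed-width parts described so far, the header and the QTYPE / QCLASS pair, can be inspected as hex before tackling the name. The following is a minimal sketch; the ID `0x1234` and the flags value `0x0100` are arbitrary illustrative choices, not values the protocol requires.

```perl
use strict;
use warnings;

# Six big-endian 16-bit fields: ID, Flags, QDCOUNT, ANCOUNT, NSCOUNT, ARCOUNT.
# 0x1234 (ID) and 0x0100 (flags) are illustrative assumptions.
my $header = pack "n6", 0x1234, 0x0100, 1, 0, 0, 0;

# The fixed-width tail of a question: QTYPE = 1 (A), QCLASS = 1 (IN).
my $tail = pack "n n", 1, 1;

# unpack "H*" renders a byte string as hex, two digits per byte.
my $header_hex = unpack "H*", $header;
my $tail_hex   = unpack "H*", $tail;

printf "header (%d bytes): %s\n", length($header), $header_hex;
printf "tail   (%d bytes): %s\n", length($tail),   $tail_hex;
```

Each field occupies four hex digits with the high byte first, so the header prints as `123401000001000000000000` and the tail as `00010001`.
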
For `example.com` the name encodes as:

```text
\x07 e x a m p l e \x03 c o m \x00
```

Each label begins with a length byte; the whole name ends with a zero-length label. Two things to notice: the length is one byte (`C`), not two; and although the name ends in a `NUL` byte, that byte is not the string terminator `Z` would produce: it is a zero-length label that ends the list of labels.

We can build this by hand from a domain name:

```perl
sub encode_name {
    my ($name) = @_;
    my $out = "";
    for my $label (split /\./, $name) {
        # An empty label (as in "a..b") would end the name early.
        die "empty label"    if $label eq "";
        die "label too long" if length($label) > 63;
        $out .= pack "C/a*", $label;   # length byte, then the label bytes
    }
    $out .= "\0";    # zero-length terminator
    return $out;
}
```

The loop body needs a single template, `C/a*`: a one-byte length followed by the label bytes, with the length computed automatically (see the [grouping-and-counts chapter](grouping-and-counts) for the `/` form). After the loop, a literal `"\0"` terminates the list.

With `encode_name` in place, the whole question section is one concatenation and one `pack`:

```perl
sub dns_question {
    my ($name, $qtype, $qclass) = @_;
    return encode_name($name) . pack("n n", $qtype, $qclass);
}
```

## Assembling the packet

```perl
use constant {
    QR_QUERY     => 0,
    OPCODE_QUERY => 0,
    RD           => 1 << 8,    # recursion desired
    TYPE_A       => 1,
    CLASS_IN     => 1,
};

sub dns_query_for_A {
    my ($name) = @_;
    my $id    = int rand 65536;
    my $flags = RD;    # standard query, recursion desired
    my $pkt = dns_header(
        id    => $id,
        flags => $flags,
        qd    => 1,    # one question
    );
    $pkt .= dns_question($name, TYPE_A, CLASS_IN);
    return ($id, $pkt);
}

my ($id, $pkt) = dns_query_for_A("example.com");
# send $pkt over a UDP socket to port 53
```

29 bytes total: 12 header + 13 for the name (`\x07example\x03com\x00` is 13) + 2 + 2 for QTYPE and QCLASS. Run `length $pkt` to confirm.

## Parsing the response header

The response has the same header shape; only the flag bits change.
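
To make the flag difference concrete, here is a minimal sketch comparing the flags word of our query with that of a hypothetical successful reply from a recursive server. The constant names `QR_BIT`, `RD_BIT` and `RA_BIT` are invented for this illustration; the bit positions are those given in RFC 1035 section 4.1.1.

```perl
use strict;
use warnings;

# Bit positions within the 16-bit flags word (names are hypothetical).
use constant {
    QR_BIT => 1 << 15,    # 0 = query, 1 = response
    RD_BIT => 1 << 8,     # recursion desired
    RA_BIT => 1 << 7,     # recursion available
};

my $query_flags    = RD_BIT;
my $response_flags = QR_BIT | RD_BIT | RA_BIT;    # assumed reply, rcode 0

my $query_hex    = sprintf "%04x", $query_flags;
my $response_hex = sprintf "%04x", $response_flags;

print "query:    0x$query_hex\n";
print "response: 0x$response_hex\n";
```

The query carries `0x0100`; the assumed reply carries `0x8180`, the same word with the QR and RA bits set. A real reply may also set other bits or a non-zero rcode.
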
Decoding is the reverse template:

```perl
sub parse_dns_header {
    my ($buf) = @_;
    my ($id, $flags, $qd, $an, $ns, $ar) = unpack "n6", $buf;
    # Split the 16-bit flags word into its individual fields.
    my %hdr = (
        id    => $id,
        qr    => ($flags >> 15) & 1,
        op    => ($flags >> 11) & 0x0f,
        aa    => ($flags >> 10) & 1,
        tc    => ($flags >> 9)  & 1,
        rd    => ($flags >> 8)  & 1,
        ra    => ($flags >> 7)  & 1,
        rcode => $flags & 0x0f,
        qd    => $qd,
        an    => $an,
        ns    => $ns,
        ar    => $ar,
    );
    return \%hdr;
}
```

`n6` pulls the six shorts out in one go; the individual flag bits come out of shifts and masks on `$flags`. This pattern is universal: use `pack` / `unpack` for the byte-level layout, plain Perl for bit-level decoding of the flag fields.

## Parsing a name

Names in the answer section use the same length-prefix encoding, plus a **pointer compression** mechanism we will skip. For a fresh name (no compression) the decoder mirrors the encoder:

```perl
sub decode_name {
    my ($buf, $offset) = @_;
    my @labels;
    while (1) {
        # Read the one-byte length at the current offset.
        my $len = unpack "x$offset C", $buf;
        last if $len == 0;
        die "compression pointer" if $len >= 0xc0;
        # Skip past the length byte, then read $len bytes of label.
        my $label = unpack "x${\ ($offset + 1)} a$len", $buf;
        push @labels, $label;
        $offset += 1 + $len;
    }
    # $offset + 1 steps over the zero-length terminator.
    return (join(".", @labels), $offset + 1);
}
```

Two `unpack` calls per label, each with an `x` skip prefix to reach the right byte. The name alternates between one-byte lengths and variable-width labels, and the loop keeps reading until it hits a zero-length label.

The `x$offset a$len` trick (compose a template from Perl values, then call `unpack`) is the standard pattern when the next field's width depends on the previous field's value.

## What to carry forward

- **An RFC byte-layout table maps directly to a template string.** Big-endian shorts → `n`, big-endian longs → `N`, bytes → `C`.
- **Length-prefixed fields use `C/a*` / `n/a*` / …**: one expression, no off-by-one counting by hand.
- **Bit-level fields inside a byte get decoded in plain Perl** after `unpack` has handed you the byte or short that contains them.
- **Variable-width data requires staged unpacking.** Read a length, then build the template for the data from that length, then call `unpack` again.

The next chapter applies the same approach to a real file format, a GIF image header, and introduces the "magic number check" idiom.
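
Before moving on, the staged-unpacking point can be tried out with a round trip. The sketch below repeats simplified versions of this chapter's encoder and decoder so that it runs on its own. The all-zero 12-byte stand-in header and the `$start` helper variable are assumptions made for the illustration.

```perl
use strict;
use warnings;

# Encoder from this chapter, repeated so the sketch is self-contained.
sub encode_name {
    my ($name) = @_;
    my $out = "";
    for my $label (split /\./, $name) {
        die "label too long" if length($label) > 63;
        $out .= pack "C/a*", $label;
    }
    return $out . "\0";
}

# Decoder from this chapter; $start is a helper introduced here.
sub decode_name {
    my ($buf, $offset) = @_;
    my @labels;
    while (1) {
        my $len = unpack "x$offset C", $buf;      # stage 1: read the length
        last if $len == 0;
        die "compression pointer" if $len >= 0xc0;
        my $start = $offset + 1;
        my $label = unpack "x$start a$len", $buf; # stage 2: read the data
        push @labels, $label;
        $offset += 1 + $len;
    }
    return (join(".", @labels), $offset + 1);
}

# A zeroed 12-byte stand-in header puts the name at offset 12,
# where it would sit in a real message.
my $buf = ("\0" x 12) . encode_name("example.com") . pack("n n", 1, 1);

my ($name, $next)    = decode_name($buf, 12);
my ($qtype, $qclass) = unpack "x$next n n", $buf;

print "name=$name next=$next qtype=$qtype qclass=$qclass\n";
```

The decoder returns the name together with the offset just past it (25 here), and that offset is what lets the caller go on to read QTYPE and QCLASS with one more composed template.
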