Acceptance conditions and race — ID / port / TTL / time window
Capture the response acceptance conditions and the race window through concrete numbers for ID, source port, and TTL.
Recap of the previous chapter: In Chapter 2 we sorted out referral / glue / bailiwick and how far the additional data should be trusted. Building on that, this chapter captures "under what conditions a resolver accepts a response" and the relationship between race window and entropy in concrete numbers.
Acceptance is not a single condition
A resolver decides whether a returned packet is "a response to the query I am currently waiting for" by matching several attributes. It does not look only at the ID. Only when the question, the source / destination address, and the source / destination port all line up does it treat the packet as belonging to the same transaction.
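The multi-attribute match described above can be sketched as a single predicate. This is a minimal illustration, not how any particular resolver implements it: the class names `OutstandingQuery` and `IncomingResponse`, their fields, and the `matches` function are all hypothetical names chosen for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class OutstandingQuery:
    """Hypothetical record of a query the resolver is still waiting on."""
    txid: int          # 16-bit transaction ID
    qname: str         # question name
    qtype: str         # question type, e.g. "A"
    local_addr: str    # address the query was sent from
    local_port: int    # UDP source port the query was sent from
    server_addr: str   # address the query was sent to
    server_port: int   # port the query was sent to (normally 53)


@dataclass(frozen=True)
class IncomingResponse:
    """Hypothetical view of a packet that just arrived."""
    txid: int
    qname: str
    qtype: str
    src_addr: str
    src_port: int
    dst_addr: str
    dst_port: int


def matches(q: OutstandingQuery, r: IncomingResponse) -> bool:
    """True only if every attribute lines up with the outstanding query."""
    return (
        r.txid == q.txid
        and r.qname.lower() == q.qname.lower()  # plain case-insensitive compare
        and r.qtype == q.qtype
        and r.src_addr == q.server_addr   # must come from where we asked
        and r.src_port == q.server_port
        and r.dst_addr == q.local_addr    # must arrive where we sent from
        and r.dst_port == q.local_port
    )
```

A packet with the correct ID but the wrong destination port fails this check, which is the point: the ID is only one of several values an off-path sender has to get right.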
Practice 3-1 — What is being matched
Stopping at "transaction ID is only 16 bits, therefore weak" hides the role of port and address. RFC 5452, "Measures for Making DNS More Resilient against Forged Answers," defines what should be matched at response acceptance time.
Q1. Following the description in RFC 5452, which of the following is not among the attributes a recursive resolver matches when accepting a response?
DNS does not use HTTP headers in its acceptance decision.
RFC 5452 requires matching the question section, the transaction ID, the source / destination address, and the source / destination port. The HTTP Host header is a value at the HTTP layer and is not part of the DNS response acceptance conditions.
Q2. If the transaction ID is only 16 bits, how many possible values are there?
It is 2 to the 16th power.
A 16-bit transaction ID gives 65,536 values. On its own, that space is too small to resist off-path guessing, which is why other matching conditions such as source port matter as well.
The race window is short, but not zero
An off-path forged response only matters during the short window before the real authoritative response has arrived. Real network conditions vary, but conceptually the idea is "if something that matches arrives first during that waiting period, you are in trouble."
| Point on the timeline | Meaning |
|---|---|
| query sent | The resolver starts holding an outstanding query |
| forged responses | Arriving fake responses. If they satisfy the acceptance conditions, they can enter the cache |
| authoritative response | Once the real response arrives, the query is no longer outstanding, so later forged responses for it are no longer accepted |
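The timeline can be walked through with a toy table of outstanding queries. This is a sketch under simplifying assumptions: `PendingTable` is a hypothetical class, and the key is reduced to a tuple of (transaction ID, source port, question name) rather than the full set of matched attributes.

```python
class PendingTable:
    """Toy table of outstanding queries (hypothetical, for illustration only)."""

    def __init__(self):
        self._pending = set()

    def send(self, key):
        """Query sent: the resolver starts holding an outstanding query."""
        self._pending.add(key)

    def receive(self, key):
        """Return True if the packet is accepted.

        The first packet whose key matches closes the window; anything
        arriving afterwards finds no outstanding query and is dropped.
        """
        if key in self._pending:
            self._pending.remove(key)
            return True
        return False


table = PendingTable()
real_key = (0x1A2B, 40123, "www.example.com")    # values the resolver chose
table.send(real_key)

wrong_guess = (0x0001, 40123, "www.example.com")  # forged, wrong ID
forged_early = table.receive(wrong_guess)  # inside the window, does not match
authoritative = table.receive(real_key)    # real response closes the window
forged_late = table.receive(real_key)      # right values, but too late
```

In this run `forged_early` and `forged_late` are both rejected while `authoritative` is accepted. The dangerous case is the one not shown: a forged packet that carries the right values and arrives before the real response.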
Practice 3-2 — Race window and entropy
Q3. For an off-path forged response to win, which time range is particularly important?
Focus on the "waiting period" before the real response has arrived.
What matters is the short race window during which the query is outstanding, that is, before the real authoritative response arrives. If a forged response that meets the matching conditions slips in first during this window, the situation becomes dangerous.
A simplified model for race and entropy
This is a conceptual model of the RFC 5452 idea, used purely for explanation. Ignoring implementation differences and network conditions, it helps you understand intuitively that "the wider the space to match, the harder to guess" and "the more attempts, the greater the risk."
| Input | Role |
|---|---|
| transaction ID bits | Up to 16 bits of matching space |
| source port bits | Up to nearly 16 more bits from the high ephemeral-port range |
| additional entropy (e.g. 0x20) | A few extra bits from case variation in labels and similar tricks |
| forged packets per window | How many fake packets an attacker can throw into one race window |
| identical outstanding queries | How many parallel copies of the same question are being held |
| attempts | How many times a fresh miss can be forced to repeat the race |
What is 0x20 encoding: DNS name comparison is case-insensitive. For example, www.example.com and WwW.ExAmPlE.cOm are treated as the same name. Exploiting this property, a resolver can mix upper- and lowercase letters at random when sending the query, and verify that the response echoes the same letter pattern. This adds up to one bit of entropy per alphabetic character in the name; digits, hyphens, and dots contribute nothing, so short or mostly numeric names gain little. The technique is called 0x20 encoding because the ASCII codes of an uppercase letter and its lowercase counterpart differ by 0x20.
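As a rough sketch of the idea, the following randomizes letter case on the way out and checks for an exact echo on the way back. The function names `encode_0x20`, `echo_matches`, and `extra_bits` are hypothetical, and the bit count is an upper bound that assumes every letter's case is chosen independently and echoed faithfully.

```python
import secrets


def encode_0x20(name: str) -> str:
    """Randomly flip the case of each ASCII letter; leave other characters alone."""
    return "".join(
        (ch.upper() if secrets.randbits(1) else ch.lower())
        if ch.isascii() and ch.isalpha()
        else ch
        for ch in name
    )


def echo_matches(sent: str, echoed: str) -> bool:
    """Case-sensitive comparison: the response must repeat the exact pattern."""
    return sent == echoed


def extra_bits(name: str) -> int:
    """Upper bound on added entropy: one bit per ASCII letter in the name."""
    return sum(1 for ch in name if ch.isascii() and ch.isalpha())


sent = encode_0x20("www.example.com")
accepted = echo_matches(sent, sent)                 # faithful echo
bits = extra_bits("www.example.com")                # 3 + 7 + 3 letters
```

For www.example.com this gives 13 letters, so at most 13 extra bits. A name such as 1.2.3.4.in-addr.arpa has far fewer letters relative to its length and gains correspondingly less.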
Even if the success probability of a single attempt is small, the cumulative success probability rises as attempts accumulate. Keep that relationship in mind.
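The inputs in the table can be combined into a deliberately crude calculation. This is an explanatory simplification, not the formula from RFC 5452: it assumes every forged packet is an independent uniform guess, treats identical outstanding queries as a simple multiplier, and ignores timing and network effects. The function names are hypothetical.

```python
def per_window_success(id_bits: int, port_bits: int, extra_bits: int,
                       forged_packets: int,
                       identical_outstanding: int = 1) -> float:
    """Chance that at least roughly one forged packet matches in one race window.

    Linear approximation: guesses * targets / size of the matching space,
    capped at 1.0.
    """
    space = 2 ** (id_bits + port_bits + extra_bits)
    return min(1.0, forged_packets * identical_outstanding / space)


def cumulative_success(p_window: float, attempts: int) -> float:
    """Chance that at least one of `attempts` independent windows is won."""
    return 1.0 - (1.0 - p_window) ** attempts


# ID only versus ID plus a randomized source port, 100 forged packets per window.
p_id_only = per_window_success(16, 0, 0, forged_packets=100)
p_id_port = per_window_success(16, 16, 0, forged_packets=100)

# The same per-window odds, repeated over many forced cache misses.
risk_id_only = cumulative_success(p_id_only, attempts=1000)
risk_id_port = cumulative_success(p_id_port, attempts=1000)
```

With these assumed numbers, adding 16 bits of port space divides the per-window probability by 65,536, and the cumulative figure for the ID-only case ends up far above the ID-plus-port case. The absolute values mean little; the direction of each input is what the model is meant to show.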
Port entropy and NAT
Source port randomization is an effective hardening, but if the externally visible port space is narrow, its benefit shrinks. Watch out for NATs and middleboxes that rewrite source ports into a sequential or small range.
Practice 3-3 — Port entropy and NAT
Q4. What is the main effect of randomizing the source port on every query?
Think about what else the attacker has to guess correctly, beyond the ID.
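Source port randomization adds more space an attacker has to match on top of the transaction ID, making forged responses harder to guess. It is a separate hardening from DNSSEC, which authenticates the data itself with signatures rather than making the transaction harder to guess.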
Q5. If a NAT rewrites the recursive resolver's outbound UDP source port into a small sequential range, what tends to happen?
What matters is whether the port diversity actually seen on the wire is preserved.
When a NAT folds outbound source ports into a small, predictable range, effective entropy shrinks. Even if the internal choice is random, a narrow externally visible port space tends to erode the hardening benefit.
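The shrinkage can be put in numbers by counting bits from the ports visible on the wire rather than from the resolver's internal choice. This is a sketch with a hypothetical function name, and it assumes the visible ports are used uniformly at random; a sequential range is more predictable than that, so the result is an optimistic upper bound.

```python
import math


def effective_bits(id_bits: int, visible_ports: int, extra_bits: int = 0) -> float:
    """Bits an off-path attacker must guess, given the externally visible port count."""
    if visible_ports < 1:
        raise ValueError("visible_ports must be at least 1")
    return id_bits + math.log2(visible_ports) + extra_bits


full_range = effective_bits(16, 65536)  # idealized: every port value possible
small_nat = effective_bits(16, 64)      # NAT maps everything into 64 ports
single_port = effective_bits(16, 1)     # fixed port: only the ID is left
```

Here the idealized case yields 32 bits, a 64-port NAT range yields 22, and a single fixed port falls back to the 16 bits of the transaction ID alone.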
Review column — Tracking remaining TTL in raw seconds
This is slightly off the main thread of this chapter (acceptance conditions, entropy, race window), but it is a chance to revisit the TTL counting we covered in Chapter 1. Being able to work out remaining seconds quickly pays off repeatedly when triaging "stale" versus "abnormal" in the operations work of Chapter 6.
Q6. A wrong answer was cached at 13:00:00 with a TTL of 240 seconds. How many seconds of TTL remain at 13:02:30?
2 minutes 30 seconds is 150 seconds.
Subtract 150 seconds of elapsed time from the initial 240 seconds, leaving 90 seconds of TTL. In operations, being able to track "how many seconds the currently visible value has left" in raw seconds makes it easier to separate "stale" from "abnormal."
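The same subtraction can be checked mechanically. This is a small sketch: `remaining_ttl` is a hypothetical helper, and the calendar date is arbitrary because only the time of day appears in the question.

```python
from datetime import datetime


def remaining_ttl(cached_at: datetime, now: datetime, ttl_seconds: int) -> int:
    """Seconds of TTL left at `now`, never below zero."""
    elapsed = int((now - cached_at).total_seconds())
    return max(0, ttl_seconds - elapsed)


cached_at = datetime(2024, 1, 1, 13, 0, 0)   # arbitrary date, 13:00:00
checked_at = datetime(2024, 1, 1, 13, 2, 30)  # same date, 13:02:30
left = remaining_ttl(cached_at, checked_at, ttl_seconds=240)
```

Here `left` is 90, matching the hand calculation of 240 minus 150. Once the elapsed time reaches the TTL, the helper returns 0 instead of a negative number.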
Key takeaways from this chapter
- DNS response acceptance is not just the ID; it includes question / address / port
- What matters is the short race window before the authoritative response
- Source port randomization helps, but NAT can crush that entropy