KomuraSoft LLC
Chapter 7

Final review and case studies

Eight case-based questions that span bailiwick, entropy, NAT, DNSSEC, and operations to wrap up the course.

Tie it all together here

In this chapter we work through eight case-based questions that cut across glue, entropy, NAT, the AD bit, negative caching, and split-horizon. The goal is not to test chapter-by-chapter knowledge but to confirm that you can make judgments that combine viewpoints from several chapters.

Passing mark: If you get 6 or more of the 8 case questions right, you are in good shape to handle first-line triage of cache poisoning defense and observation in the field.

Viewpoints we bring into this chapter

Chapter | Viewpoint used here
Chapter 1 | Separate "whose cache" from "who put the answer in"
Chapter 2 | The difference between in-bailiwick glue and out-of-bailiwick additional data
Chapter 3 | Matching on ID / source port / question / time window, and entropy
Chapter 4 | Lessons from 2008 and short-term mitigations: patches and recursion control
Chapter 5 | What DNSSEC covers and the last-hop trust problem
Chapter 6 | Triaging with dig and logs, ruling out negative caching / split-horizon
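
As a reminder of the Chapter 2 viewpoint, the in-bailiwick test can be sketched as a label-by-label suffix comparison. This is a minimal illustration, not how any particular resolver implements it: the function names `normalize` and `in_bailiwick` are hypothetical, and the zone names passed in are assumptions chosen to match the examples in this chapter.

```python
def normalize(name: str) -> list[str]:
    """Split a domain name into lowercase labels, ignoring a trailing dot."""
    return [label for label in name.lower().rstrip(".").split(".") if label]


def in_bailiwick(record_name: str, zone: str) -> bool:
    """Return True if record_name equals zone or is a subdomain of it.

    Comparing whole labels (not raw string suffixes) avoids treating
    'badexample.com' as if it were inside 'example.com'.
    """
    name_labels = normalize(record_name)
    zone_labels = normalize(zone)
    if not zone_labels:
        return True  # the root zone contains every name
    return name_labels[-len(zone_labels):] == zone_labels


# Hypothetical delegation point used only for illustration.
print(in_bailiwick("ns1.shop.example.com", "shop.example.com"))  # True
print(in_bailiwick("ns.partner.net", "shop.example.com"))        # False
```

The check only classifies a name. Whether out-of-bailiwick data is discarded, ignored, or re-resolved independently is a resolver policy decision that this sketch does not model.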

Case studies 7-1 — bailiwick, entropy, DNSSEC (mechanisms axis)

Cases for confirming "how far the mechanisms can defend you": referral handling, port entropy, AD bit and last hop, and so on.

Q1. A referral from the parent zone comes back with NS ns1.shop.example.com together with A ns1.shop.example.com 192.0.2.53. What is the most reasonable way to treat it?

Q2. A referral comes with NS ns.partner.net and A ns.partner.net 198.51.100.10 attached in Additional. What is the safest judgment?

Q3. A resolver is behind NAT, and its outbound source port is effectively narrowed to a sequential range of about 256 values. Which is the most correct description?

Q4. A stub resolver receives a response with AD=1, but it does not sufficiently trust either the upstream recursive resolver or the communication channel. What is the most appropriate interpretation?

Q5. The target zone is unsigned. Which is the most correct description?

Case studies 7-2 — operations, monitoring, and overall policy (operations axis)

Confirm "how to run this in the field": monitoring signals, cache TTL lifetimes, and organizational policy.

Q6. A large number of queries for random, non-existent labels under a particular zone keeps arriving over a short period. From a cache poisoning perspective, which is the most relevant reason for concern?

Q7. A patch was just applied, yet users sharing the same resolver are expected to keep seeing the same wrong answer for about two more minutes. What is the most direct reason?

Q8. Which is the most appropriate overall policy for an organization's recursive resolver?

Where this course lands

  • You can explain how a shared recursive resolver keeps an incorrect RRset in its cache and hands the same wrong answer to all of its users.
  • You can separate referral / final answer / glue, and handle them with an in-bailiwick / out-of-bailiwick mindset.
  • You can describe the acceptance conditions (question / ID / address / port) together with the relationship between entropy and the number of attempts, using formulas and numbers.
  • You can explain that DNSSEC provides origin authentication / integrity / authenticated denial of existence and that confidentiality is a separate concern.
  • You can read the meaning of the AD bit together with the last-hop trust problem.
  • When something goes wrong, you can first observe responses with dig @resolver / dig +trace / dig +dnssec, and rule out negative caching or split-horizon before anything else.
  • Finally, you can reason about defense as a combination of patches, entropy, bailiwick, DNSSEC, and operational monitoring rather than as any single measure.
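
The relationship between entropy and the number of attempts mentioned in the list above can be put into numbers with a short sketch. It rests on simplifying assumptions that are not from the course text: the transaction ID and source port are uniformly random and independent, the attacker sends distinct guesses, and all guesses land within a single acceptance window. The function names are hypothetical.

```python
import math


def entropy_bits(id_values: int, port_values: int) -> float:
    """Bits an off-path attacker must guess, assuming uniform, independent values."""
    return math.log2(id_values * port_values)


def success_probability(guesses: int, id_values: int, port_values: int) -> float:
    """Chance that one of `guesses` distinct forged responses matches in one window."""
    space = id_values * port_values
    return min(1.0, guesses / space)


ID_VALUES = 2 ** 16  # the DNS transaction ID is a 16-bit field

print(entropy_bits(ID_VALUES, 1))    # 16.0 (fixed source port)
print(entropy_bits(ID_VALUES, 256))  # 24.0 (about 256 usable ports)
print(success_probability(10_000, ID_VALUES, 256))
```

If the ports are handed out sequentially, as in the NAT case, they are also predictable, so the real search space can be smaller than the 24 bits this uniform model suggests.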

What helps this stick after you finish

  1. On a resolver you operate yourself, run dig +trace and dig +dnssec, and look at the referral and signature-validation output with your own eyes.
  2. Pick one unsigned zone and one signed zone, and observe the difference in the ad flag.
  3. Deliberately set up a situation where a negative caching TTL is still alive, and reproduce the classic "I created it but I can't see it" pattern yourself.
  4. On the operations side, confirm whether you have monitoring that catches surges in random-label queries or runs of signature-validation failures.
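
For exercises 1 and 2 above, a small helper like the following could assemble the three dig invocations and pull the header flags out of captured output. It is a sketch under assumptions: the resolver address and query name are placeholders from documentation ranges, the function names are hypothetical, and the sample header line is hand-written in the shape dig normally prints rather than captured from a real run. The code builds command lines but does not execute them.

```python
import re


def dig_commands(resolver: str, name: str, rrtype: str = "A") -> list[list[str]]:
    """Build argument lists for the three observations named in this chapter."""
    return [
        ["dig", f"@{resolver}", name, rrtype],             # what this resolver answers
        ["dig", "+trace", name, rrtype],                   # follow referrals from the root
        ["dig", f"@{resolver}", "+dnssec", name, rrtype],  # request DNSSEC records
    ]


def header_flags(dig_output: str) -> set[str]:
    """Extract the flags from the ';; flags: ...;' header line of dig output."""
    match = re.search(r"^;; flags:([^;]*);", dig_output, flags=re.MULTILINE)
    return set(match.group(1).split()) if match else set()


# Placeholder resolver and name; replace with ones you operate.
for argv in dig_commands("192.0.2.53", "www.example.com"):
    print(" ".join(argv))

# Hand-written sample in the usual header format, for illustration only.
sample = ";; flags: qr rd ra ad; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 1\n"
print("ad" in header_flags(sample))  # True
```

Comparing the flags for a signed and an unsigned zone shows whether the resolver set ad. Whether that bit can be trusted still depends on the last hop, as discussed in Chapter 5.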