Internet-Draft cds-consistency August 2023
Thomassen Expires 2 February 2024 [Page]
Workgroup:
DNSOP Working Group
Internet-Draft:
draft-ietf-dnsop-cds-consistency-03
Updates:
7344, 7477 (if approved)
Published:
Intended Status:
Standards Track
Expires:
Author:
P. Thomassen
SSE - Secure Systems Engineering GmbH

Consistency for CDS/CDNSKEY and CSYNC is Mandatory

Abstract

Maintenance of DNS delegations requires occasional changes of the DS and NS record sets on the parent side of the delegation. RFC 7344 automates this for DS records by having the child publish CDS and/or CDNSKEY records which hold the prospective DS parameters. Similarly, RFC 7477 specifies CSYNC records to indicate a desired update of the delegation's NS (and glue) records. Parent-side entities (e.g. Registries, Registrars) typically discover these records by querying them from the child, and then use them to update the delegation's DS RRset accordingly.

This document specifies that when performing such queries, parent-side entities MUST ensure that updates triggered via CDS/CDNSKEY and CSYNC records are consistent across the child's authoritative nameservers, before taking any action based on these records.

Status of This Memo

This Internet-Draft is submitted in full conformance with the provisions of BCP 78 and BCP 79.

Internet-Drafts are working documents of the Internet Engineering Task Force (IETF). Note that other groups may also distribute working documents as Internet-Drafts. The list of current Internet-Drafts is at https://datatracker.ietf.org/drafts/current/.

Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress."

This Internet-Draft will expire on 2 February 2024.

Table of Contents

1. Introduction

[RFC7344] automates DNSSEC delegation trust maintenance by having the child publish CDS and/or CDNSKEY records which hold the prospective DS parameters. Similarly, [RFC7477] specifies CSYNC records indicating a desired update of the delegation's NS and associated glue records. Parent-side entities (e.g. Registries, Registrars) can use these records to update the corresponding records of the delegation.

A common method for discovering these signals is to periodically query them from the child zone ("polling"). For CSYNC, this is described in [RFC7477] Section 3.1 which advocates limiting queries to just one authoritative nameserver. The corresponding Section 6.1 of [RFC7344] (CDS/CDNSKEY) contains no such provision for how specifically polling of these records should be done.

Implementations are thus likely to retrieve records from just one authoritative server, typically by directing queries towards a trusted validating resolver. While that may be fine if all authoritative nameservers are controlled by the same entity (typically the Child DNS Operator), it does pose a problem as soon as multiple providers are involved. (Note that it is generally impossible for the parent to determine whether all authoritative nameservers are controlled by the same entity.)

In such cases, CDS/CDNSKEY/CSYNC records retrieved "naively" from one nameserver only may be entirely inconsistent with those of other authoritative servers. When no consistency check is done, each provider may unilaterally trigger a roll of the DS or NS record set at the parent.

As a result, adverse consequences can arise in conjunction with the multi-signer scenarios laid out in [RFC8901], both when deployed temporarily (during a provider change) and permanently (in a multi-homing setup). For example, a single provider may (accidentally or maliciously) cause another provider's trust anchors and/or nameservers to be removed from the delegation. Similar breakage can occur when the delegation has lame nameservers. More detailed examples are given in Appendix A.

A single provider should not be in the position to remove the other providers' records from the delegation. To address this issue, this document specifies that parent-side entities MUST ensure that the updates indicated by CDS/CDNSKEY and CSYNC record sets are consistent across all of the child's authoritative nameservers, before taking any action based on these records.

Readers are expected to be familiar with DNSSEC, including [RFC4033], [RFC4034], [RFC4035], [RFC6781], [RFC7344], [RFC7477], and [RFC8901].

1.1. Requirements Notation

The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in BCP 14 [RFC2119] [RFC8174] when, and only when, they appear in all capitals, as shown here.

1.2. Terminology

The terminology in this document is as defined in [RFC7344].

2. Processing Requirements

This section defines consistency requirements for CDS/CDNSKEY/CSYNC queries in the context of automatic delegation maintenance, updating [RFC7344] Section 4.1 and [RFC7477] Sections 3.1 and 4.2. Common ones are listed first, with type-specific consistency criteria described in each subsection.

In all cases, consistency is REQUIRED across received responses only. When a response cannot be obtained from a given nameserver, the Parental Agent SHOULD attempt obtaining it at a later time, before concluding that the nameserver is permanently unreachable and removing it from consideration. A retry schedule with exponential back-off is RECOMMENDED (such as after 5, 10, 20, 40, ... minutes). To sidestep localized routing issues, the Parental Agent MAY also attempt contacting the nameserver from another vantage point.

If an inconsistent state is encountered, the Parental Agent MUST abort the operation. Specifically, it MUST NOT delete or alter any existing RRset that would have been deleted or altered, and MUST NOT create any RRsets that would have been created, had the polling state been consistent.

To accommodate transient inconsistencies (e.g. replication delays), the Parental Agent MAY retry the full process, repeating all queries. A schedule with exponential back-off is RECOMMENDED.

Any pending queries can immediately be dequeued when encountering a response that confirms the status quo (i.e. indicates no update). This is because any subsequent responses could only confirm that nothing needs to happen, or give an inconsistent result in which case nothing needs to happen. Queries MAY be continued across all nameservers for inconsistency reporting purposes.

Existing requirements for ensuring integrity remain in effect. In particular, DNSSEC signatures MUST be requested and validated for all queries unless otherwise noted.

2.1. CDS and CDNSKEY

To retrieve a Child's CDS/CDNSKEY RRset for DNSSEC delegation trust maintenance, the Parental Agent, knowing both the Child zone name and its NS hostnames, MUST ascertain that queries are made against all of the nameservers listed in the Child's delegation from the Parent, and ensure that each key referenced in any of the received answers is also referenced in all other received responses.

In other words, CDS/CDNSKEY records at the Child zone apex MUST be fetched directly from each of the authoritative servers as determined by the delegation's NS record set. When a key is referenced in a CDS or CDNSKEY record set returned by one nameserver, but is missing from a least one other nameserver's answer, the CDS/CDNSKEY state MUST be considered inconsistent.

When CDS/CDNSKEY queries are performed for deploying the initial DS record set (DNSSEC bootstrapping), responses cannot be directly validated. In this case, integrity checks according to [RFC8078] Section 3 (or its successors) continue to apply.

2.2. CSYNC

A CSYNC-based update consists of (1) querying the CSYNC (and possibly SOA) record to determine which data records shall be synchronized from child to parent; (2) querying for these data records (e.g. NS) and placing them in the parent zone. If the below conditions are not met during these steps, the CSYNC state MUST be considered inconsistent.

When querying the CYSNC record, the Parental Agent MUST ascertain that queries are made against all of the nameservers listed in the Child's delegation from the Parent, and ensure that the record's immediate flag and type bitmap are equal across received responses.

The CSYNC record's SOA serial field and soaminimum flag might legitimately differ across nameservers (such as in multi-provider setups); equality is thus not required across responses. Instead, for a given response, processing of these values thus MUST occur with respect to the SOA record as obtained from the same nameserver (preferably in the same connection). The resulting per-response assessments of whether the CSYNC update is permissible MUST match across received responses.

Further, when retrieving the data record sets as indicated in the CSYNC record (such as NS or A/AAAA records), the Parental Agent MUST ascertain that all queries are made against all nameservers from which CSYNC responses were received (preferably in the same connection), and ensure that all return responses with equal rdata sets (including all empty).

CSYNC updates may cause validation or even insecure resolution to break (e.g. by changing the delegation to a set of nameservers that do not serve required DNSKEY records or do not know the zone at all). Parental Agents SHOULD check that CSYNC updates, if applied, do not break the delegation.

3. IANA Considerations

This document has no IANA actions.

4. Security Considerations

The level of rigor mandated by this document is needed to prevent publication of half-baked DS or delegation NS RRsets (authorized only under an insufficient subset of authoritative nameservers), ensuring that an operator in a (functioning) multi-homing setup cannot unilaterally modify the delegation (add or remove trust anchors or nameservers). This applies both to intentional and unintentional multi-homing setups (such as in the case of lame delegation hijacking).

As a consequence, the delegation's records can only be modified when there is consensus across operators, which is expected to reflect the domain owner's intentions. Both availability and integrity of the domain's DNS service benefit from this policy.

In order to resolve situations in which consensus about child zone contents cannot be reached (e.g. because one of the nameserver providers is uncooperative), Parental Agents SHOULD continue to accept DS and NS/glue update requests from the domain owner via an authenticated out-of-band channel (such as EPP [RFC5730]), irrespective of the rise of automated delegation maintenance. Availability of such an interface also enables recovery from a situation where the private key is no longer available for signing the CDS/CDNSKEY or CSYNC records in the child zone.

5. Acknowledgments

Viktor Dukhovni, Wes Hardaker, Libor Peltan, Oli Schacher

6. Normative References

[RFC2119]
Bradner, S., "Key words for use in RFCs to Indicate Requirement Levels", BCP 14, RFC 2119, DOI 10.17487/RFC2119, , <https://www.rfc-editor.org/info/rfc2119>.
[RFC4033]
Arends, R., Austein, R., Larson, M., Massey, D., and S. Rose, "DNS Security Introduction and Requirements", RFC 4033, DOI 10.17487/RFC4033, , <https://www.rfc-editor.org/info/rfc4033>.
[RFC4034]
Arends, R., Austein, R., Larson, M., Massey, D., and S. Rose, "Resource Records for the DNS Security Extensions", RFC 4034, DOI 10.17487/RFC4034, , <https://www.rfc-editor.org/info/rfc4034>.
[RFC4035]
Arends, R., Austein, R., Larson, M., Massey, D., and S. Rose, "Protocol Modifications for the DNS Security Extensions", RFC 4035, DOI 10.17487/RFC4035, , <https://www.rfc-editor.org/info/rfc4035>.
[RFC5730]
Hollenbeck, S., "Extensible Provisioning Protocol (EPP)", STD 69, RFC 5730, DOI 10.17487/RFC5730, , <https://www.rfc-editor.org/info/rfc5730>.
[RFC7344]
Kumari, W., Gudmundsson, O., and G. Barwood, "Automating DNSSEC Delegation Trust Maintenance", RFC 7344, DOI 10.17487/RFC7344, , <https://www.rfc-editor.org/info/rfc7344>.
[RFC7477]
Hardaker, W., "Child-to-Parent Synchronization in DNS", RFC 7477, DOI 10.17487/RFC7477, , <https://www.rfc-editor.org/info/rfc7477>.
[RFC8078]
Gudmundsson, O. and P. Wouters, "Managing DS Records from the Parent via CDS/CDNSKEY", RFC 8078, DOI 10.17487/RFC8078, , <https://www.rfc-editor.org/info/rfc8078>.
[RFC8174]
Leiba, B., "Ambiguity of Uppercase vs Lowercase in RFC 2119 Key Words", BCP 14, RFC 8174, DOI 10.17487/RFC8174, , <https://www.rfc-editor.org/info/rfc8174>.

7. Informative References

[LAME1]
Akiwate, G., Jonker, M., Sommese, R., Foster, I., Voelker, G. M., Savage, S., Claffy, K., and ACM, "Unresolved Issues", DOI 10.1145/3419394.3423623, , <http://dx.doi.org/10.1145/3419394.3423623>.
[LAME2]
Akiwate, G., Savage, S., Voelker, G. M., Claffy, K. C., and ACM, "Risky BIZness", DOI 10.1145/3487552.3487816, , <http://dx.doi.org/10.1145/3487552.3487816>.
[RFC6781]
Kolkman, O., Mekking, W., and R. Gieben, "DNSSEC Operational Practices, Version 2", RFC 6781, DOI 10.17487/RFC6781, , <https://www.rfc-editor.org/info/rfc6781>.
[RFC8901]
Huque, S., Aras, P., Dickinson, J., Vcelak, J., and D. Blacka, "Multi-Signer DNSSEC Models", RFC 8901, DOI 10.17487/RFC8901, , <https://www.rfc-editor.org/info/rfc8901>.

Appendix A. Failure Scenarios

The following scenarios are examples of how things can go wrong when consistency is not enforced by the parent during CDS/CDNSKEY/CSYNC processing. Other scenarios that cause similar (or perhaps even more) harm may exist.

The common feature of these scenarios is that if one nameserver steps out of line and the parent is not careful, DNS resolution and/or validation will break down. When several DNS providers are involved, this undermines the very guarantees of operator independence that multi-homing configurations are expected to provide.

A.1. DS Breakage due to Replication Lag

If an authoritative nameserver is lagging behind during a key rollover, the parent may see different CDS/CDNSKEY RRsets depending on the nameserver contacted. This may cause old and new DS RRsets to be deployed in an alternating fashion. The zone maintainer, having detected that the DS deployment was successful, may then confidently remove the old DNSKEY from the zonefile, only to find out later that the DS RRset has been turned back, breaking the delegation's DNSSEC chain of trust.

Checking for consistency minimizes this risk. In case the parent reports consistency errors downstream, it can also help detect the replication issue on the child side.

A.2. Lame Delegations

A delegation may include a non-existent NS hostname, for example due to a typo or when the nameserver's domain registration has expired. (Re-)registering such a non-resolvable nameserver domain allows a third party to run authoritative DNS service for all domains delegated to that NS hostname, serving responses different from those in the legitimate zonefile.

This strategy for hijacking (at least part of the) DNS traffic and spoofing responses is not new, but surprisingly common [LAME1][LAME2]. It is also known that DNSSEC reduces the impact of such an attack, as validating resolvers will reject illegitimate responses due to lack of signatures consistent with the delegation's DS records.

On the other hand, if the delegation is not protected by DNSSEC, the rogue nameserver is not only able to serve unauthorized responses without detection; it is even possible for the attacker to escalate the nameserver takeover to a full domain takeover.

In particular, the rogue nameserver can publish CDS/CDNSKEY records. If those are processed by the parent without ensuring consistency with other authoritative nameservers, the delegation will be secured with the attacker's DNSSEC keys. As responses served by the remaining legitimate nameservers are not signed with these keys, validating resolvers will start rejecting them.

Once DNSSEC is established, the attacker can use CSYNC to remove other nameservers from the delegation at will (and potentially add new ones under their control). This enables the attacker to position themself as the only party providing authoritiative DNS service for the victim domain, significantly augmenting the attack's impact.

A.3. Multi-Homing (Permanent Multi-Signer)

A.3.1. DS Breakage

While performing a key rollover and adjusting the corresponding CDS/CDNSKEY records, a provider could accidentally publish CDS/CDNSKEY records that only include its own keys.

When the parent happens to retrieve the records from a nameserver controlled by this provider, the other providers' DS records would be removed from the delegation. As a result, the zone is broken at least for some queries.

A.3.2. NS Breakage

A similar scenario affects the CSYNC record, which is used to update the delegation's NS record set at the parent. The issue occurs, for example, when a provider accidentally includes only their own set of hostnames in the local NS record set, or publishes an otherwise flawed NS record set.

If the parent then observes a CSYNC signal and fetches the flawed NS record set without ensuring consistency across nameservers, the delegation may be updated in a way that breaks resolution or silently reduces the multi-homing setup to a single-provider setup.

A.4. Provider Change (Temporary Multi-Signer)

Transferring DNS service for a domain name from one (signing) DNS provider to another, without going insecure, necessitates a brief period during which the domain is operated in multi-signer mode: First, the providers include each other's signing keys as DNSKEY and CDS/CDNSKEY records in their copy of the zone. Once the parent detects the updated CDS/CDNSKEY record set at the old provider, the delegation's DS record set is updated. Then, after waiting for cache expiration, the new provider's NS hostnames can be added to the zone's NS record set, so that queries start balancing across both providers. (To conclude the hand-over, the old provider is removed by inverting these steps with swapped roles.)

The multi-signer phase of this process breaks when the new provider, perhaps unaware of the situation and its intricacies, fails to include the old provider's keys in the DNSKEY (and CDS/CDNSKEY) record sets. One obvious consequence of that is that whenever the resolver happens to retrieve the DNSKEY record set from the new provider, the old provider's RRSIGs do no longer validate, causing SERVFAIL to be returned.

However, an even worse consequence can occur when the parent performs their next CDS/CDNSKEY scan: It may then happen that the incorrect CDS/CDNSKEY record set is fetched from the new provider and used to update the delegation's DS record set. As a result, the old provider (who still appears in the delegation) is prematurely removed from the domain's DNSSEC chain of trust. The new DS record set authenticates the new provider's DNSKEYs only, and DNSSEC validation fails for all answers served by the old provider.

Appendix B. Change History (to be removed before publication)

Clarify that CSYNC updates should not break delegations

Describe consistency requirements for CSYNC soaminimum

Editorial changes

Retry before assuming a nameserver is permanently unreachable

Make nits tool happy

New failure mode: DS Breakage due to Replication Lag

Point out zero overhead if nothing changed, and need for OOB interface

Editorial changes

Moved Failure Scenarios to appendix

Point out zero overhead if nothing changed, and need for OOB interface

Editorial changes.

Describe risk from lame delegations

Acknowledgments

Say what is being updated

Editorial changes.

Retry mechanism to resolve inconsistencies

Don't ignore DoE responses from individual nameservers (instead, require consistency across all responses received)

Allow for nameservers that don't respond or provide DoE (i.e. require consistency only among the non-empty answers received)

Define similar requirements for CSYNC.

Editorial changes.

Initial public draft.

Author's Address

Peter Thomassen
SSE - Secure Systems Engineering GmbH
Hauptstraße 3
10827 Berlin
Germany