Some Considerations on the Choice of Naming in IETF Protocols

From time to time, networking protocols need to be able to name things used within the protocol, and resolve the names created or referenced. Necessary operations tend to include creating, modifying, and deleting names and accessing values and relationships that correspond to them. It's common for protocol designers in this predicament to attempt to use domain names as the starting point for their systems of names, and the DNS as the starting point for name resolution. This is completely understandable-- domain names, and DNS resolution, are well-established in both the expectations of network users and developers, and well-supported by fielded software.

However, there are some risks when the protocol designer attempts to re-use domain names and DNS, even (or especially) with modifications, to support a specific use case or protocol design or deployment constraint. These have been touched upon in several RFCs, and in the long history of struggles to keep evolving DNS itself and the use of domain names as new needs and constraints appear. See in particular RFC 6055 ("IAB Thoughts on Encodings for Internationalized Domain Names") and RFC 6943 ("Issues in Identifier Comparison for Security Purposes").

This document deals principally with the questions a protocol designer or software developer should ask themselves about what behavior they want from the names they use in the context of a new protocol or scope for names. Depending on the answers to these questions, the designer may find that domain names will not meet the constraints at hand. Future versions of this draft will provide some comments on alternatives. Required reading includes draft-lewis-domain-names.txt.

The Domain Name System is a critical part of the global internet infrastructure. From the protocol standards perspective, it's comprised of a number of standards-track documents and BCPs, but roughly speaking, it includes a description of naming syntax and semantics, some operational rules for constructing a globally shared database of such names, and a specification of a wire protocol for maintaining, querying, and generating responses from that database. It has always been the case that all three need to be maintained in a coordinated fashion for the DNS to function properly and the DNS database to remain useful. In an even larger sense, however, domain names and the DNS protocol provide one answer to some fundamental questions for any computer system: naming and the manipulation of names are fundamental topics in computer science. Thus, DNS names and the DNS protocol exist as a common and highly useful solution to the basic need for naming "stuff" in certain applications and activities on the internet. We do occasionally have to notice, however, that they're not the final and complete solutions-- they have weaknesses-- even as they've proven so useful they tend to be re-used where possible. Domain names considerably predate the Domain Name System. The set of domain names is, however, a superset of the DNS namespace, and the characteristics of the DNS namespace are inherited from it. In particular, part of the abstraction that describes domain names is a tree with an identified root and identified semantics for labeling nodes. The basic structure of the domain namespace is a tree, with a domain name as a list of nodes in the tree. Such a tree must have a single root in order to maintain the uniqueness of each node. In 2002, the IAB wrote to clarify that the existence of this root is inherent in the design of the DNS and requires coordination of changes to the root of the global namespace. 
This remains true-- the mathematics in particular have not changed!-- but is not as simple as it sounds. This root domain isn't limited to names instantiated in the DNS namespace, but of both mathematical and operational necessity, includes them.

For application and protocol designers, then, domain names come with desirable properties such as relatively straightforward structure and widespread conventions for interpretation (such as IDNA to internationalize a name in cases where human-friendliness is important). This apparent ease of use has been increased in recent years by the publication of RFC 6761, which specifies a registry of domain names for special uses. In a case where a protocol uses domain names and a DNS-like protocol such as mDNS (see RFC 6762), the registry marks a portion of the abstract domain name space as associated with that use. This allows a protocol- or application-specific node or subtree to be associated with a location in the global domain namespace, offering a degree of assurance that such names are globally unique-- also often a valuable property.

However, there are also risks in this approach. For all the useful properties that come with domain names, they can be tricky to use, and interoperation can be subtle. There's no historically accepted definition of "domain name," and in some cases people use more restricted subsets of domain names, such as host names, with idiosyncratic limitations of their own. There are security and interoperability risks in comparison of such identifiers (see RFC 6943). Domain names allow people to think of domain name labels as "words" and other natural language analogs, but they don't behave as people expect in such contexts. Thus any choice to re-use the DNS namespace, even without the DNS protocol for resolving names, requires some decisions to be made about namespace management and potential collisions or overlaps between the DNS namespace and others.
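As one illustration of why comparing such identifiers is subtle, here is a minimal Python sketch of a naive comparison of two names in presentation format. The function names are hypothetical and the rules applied (drop one trailing dot, fold only ASCII letters) are simplifying assumptions, not a complete or authoritative matching algorithm. In particular, it does not attempt any IDNA processing, so two spellings of what a user might consider the same internationalized name compare as different.

```python
from typing import Tuple


def ascii_lower(label: str) -> str:
    """Lowercase only the ASCII letters A-Z; leave every other character alone."""
    return "".join(chr(ord(c) + 32) if "A" <= c <= "Z" else c for c in label)


def normalize_domain_name(name: str) -> Tuple[str, ...]:
    """Split a presentation-format name into ASCII-case-folded labels.

    Assumption for this sketch: a single trailing dot (the root) is dropped,
    so "example.com." and "example.com" are treated as the same name.
    """
    if name.endswith("."):
        name = name[:-1]
    if not name:
        return ()
    return tuple(ascii_lower(label) for label in name.split("."))


def same_domain_name(a: str, b: str) -> bool:
    """Naive equality check: label-by-label, ASCII case-insensitive."""
    return normalize_domain_name(a) == normalize_domain_name(b)


if __name__ == "__main__":
    print(same_domain_name("Example.COM.", "example.com"))
    # A Unicode label and an ASCII-encoded label are different strings to
    # this comparison, whatever a human reader might expect.
    print(same_domain_name("bücher.example", "xn--bcher-kva.example"))
```

A protocol that re-uses domain names has to state which comparison rules it applies; leaving that to each implementation is where the interoperability and security risks tend to appear.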

Some words, and framework. The primary references for this section will be RFC 7719, draft-lewis-domain-names, and RFC 1034; the primary elements probably include:

* domain name (and domain namespace)
* DNS name (and DNS namespace)
* DNS global, distributed database as instantiation of DNS namespace
* root zone
* probably others....

This section will offer some questions that should be considered in analysis of a candidate naming scheme for a new or revised protocol.

For the protocol designer who thinks they want to use domain names, RFC 6761 lists a set of questions to be answered for a special use name, discussing how users, DNS name registries, and DNS operators should treat such a name in order to maintain compatibility with the public DNS. However, those questions largely leave undefined how to tell if a special use domain name is really what's required, or how to choose an appropriate string if it is, and don't touch at all on the underlying fundamentals of choosing a naming scheme in the first place.

In general, it's important to discuss separately:

* What behavior the protocol designer wants to occur around use of the name
* What name format and composition rules the designer wants to use

Some questions follow, not yet in any particular order, about how the protocol will use names; they start with the assumption that domain names may be suitable, but may lead to the conclusion that domain names won't solve the problem at hand:

* Do you expect to use a non-DNS resolution protocol? What is it?
* Are all domain names legal for the protocol? If not, how are they limited?
* Do you expect to have a limited or qualified (non-global) scope? How are you specifying it?
* Do your names need to be resolvable in the global public DNS? Do they need to *not* be resolvable in the global public DNS?
* Do you care about collisions with the global public DNS? (It's quite generally the case that domain names that aren't intended to be resolved in the global public DNS nonetheless result in DNS queries, since the default context for a domain name in many, many applications and generic resolution engines is in fact the global DNS.)
* What happens to your application/protocol if names are ambiguous or resolvable with multiple protocols/scopes? Do you assume a precedence or ordering of possible resolution methods? Do you signal it explicitly?
* How will names be created, allocated, de-allocated, and destroyed? How long are they likely to persist? What authorization are these activities likely to require?
* Do names need to be authenticable? By what mechanism?
* What are the security and privacy implications of name disclosure to those on the intended resolution or routing paths, or leakage to those outside?

There are also some questions that arise, once a protocol has taken shape, in making a choice of what names are suitable. If the choice is domain names, some analysis still needs to be done. Of the extremely large set of possible domain names, the list of acceptable ones may be quite long, or quite short, depending on the constraints imposed by the protocol and the preferences of the protocol designers. Such questions include:

* Will these names be human-visible? What humans will see them? In normal operations, or only in geeky places like URLs or error messages?
* If human-visible, do they need to be mnemonic or otherwise meaningful?
* If names are to be human-visible, is internationalization a concern? It's easy to pick a domain name string that seems to represent a "word" in a particular language, or an acronym or expression that's meaningful to the designers. However, translation or otherwise extending the meaning of the string beyond that initial human context is usually far more challenging than it first appears.
* Do you need (not just want) a single label ("a TLD")? Why?

Decades of experience with naming in computer programming and network protocols, and with the DNS and domain names in the internet, suggest a few observations that may be relevant for those looking for a suitable naming system and name resolution protocol for network applications and protocols. As a starting point, most of them pertain to the challenges of using domain names and DNS conventions in internet protocols. Later revisions of this document will add some observations on other ways of approaching names.

It's increasingly common for protocol designers to denote a specific name resolution context for a domain in the domain namespace by using a special string, intended to be interpreted as a domain name and then used as a switch into another name resolution context. This is usually done by designating a string to be used as a "special use name" in the rightmost label in a domain name (presentation format) or the node closest to the root in a canonical FQDN. This solution may or may not involve a delegation for the name in the global DNS, or an expectation that the string will not be delegated. (See questions above regarding the assumptions made in a new protocol about potential collision between domain names in its context and domain names in the public DNS or elsewhere.) As described above, this practice has some benefits. It allows the protocol to take advantage of a number of existing features of the internet environment, including widespread availability of libraries for parsing domain names and a reasonable degree of comfort that names in a subtree of the domain namespace are globally unique. It's commonly referred to as obtaining or reserving "a TLD". This usage is deceptive, however, and this apparently simple solution hides some risks. Problems with this approach include: The IETF can't get you "a TLD"-- a single-label DNS name to be added to the root zone of the global public DNS. That authority was delegated to ICANN in RFC 2860. The IETF has no role in ICANN's decisions about what to put into the global public DNS root outside of the IETF's authority over the DNS protocol standard. In recent years ICANN has dramatically expanded the number of names actually delegated in the root zone of the DNS namespace, and since the rules for doing so are determined in a widely consultative public process, there are no guarantees about how the root zone might change in the future. 
If DNS resolution to a specific DNS name is required, this can be accomplished at the direction of the IAB, which is the administrator of record for the .arpa TLD and can get you a DNS name under .arpa. (See RFC 3172 for more on this.) The IAB can also commit that a domain name intended for resolution outside of the DNS under .arpa will not collide with a DNS name there. It's also been proposed that a special use name be set aside specifically as the root domain label for "domain names not to be used in the DNS" so that protocol designers and implementers can be reasonably sure that names used in that domain will not collide with names in the global DNS namespace. (Reference alt-tld draft.)
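The "switch" described in this section, where the label closest to the root selects a name resolution context, can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the table, the function name, and the context strings "mdns" and "dns" are hypothetical, and the single entry reflects the association of "local" with mDNS in RFC 6762. Note the default: a name whose rightmost label is not recognized falls through to the global DNS, which is how names meant for another context end up as DNS queries.

```python
from typing import Dict

# Hypothetical table mapping a rightmost label to a resolution context.
# "local" is associated with mDNS by RFC 6762; nothing else is assumed here.
SPECIAL_USE_CONTEXTS: Dict[str, str] = {
    "local": "mdns",
}


def resolution_context(name: str,
                       table: Dict[str, str] = SPECIAL_USE_CONTEXTS,
                       default: str = "dns") -> str:
    """Pick a resolution context from the label closest to the root.

    Any name whose rightmost label is not in the table falls through to
    the default context, the global DNS in this sketch.
    """
    labels = [label for label in name.rstrip(".").lower().split(".") if label]
    if not labels:
        return default
    return table.get(labels[-1], default)


if __name__ == "__main__":
    for candidate in ("printer.local", "www.example.com.", "local.example.com"):
        print(candidate, "->", resolution_context(candidate))
```

Software that lacks the table entry, or an operator who configures the same string differently, will resolve the name in a different context, so a switch like this works only as far as its table is actually shared.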

An IETF standard cannot force a name to be resolved in a given context, or not. That authority belongs to the operators of name resolvers, for the DNS protocol and otherwise. In the case of DNS, DNS operators determine what names can and can't be resolved with the DNS protocol by users sending queries to their resolvers. In other words, having the IETF document in an RFC that a particular name is to be used for a particular purpose or protocol does not prevent network operators from using the same string as a name for other purposes or in other protocols. An RFC is accepted as guidance by many DNS operators and implementers, however. RFC 6761 establishes a registry of names that the IETF has designated as "special use domain names." An entry in this registry does not prevent local operators from configuring their environments as they see fit, including allowing such names to leak into the global DNS even if they're not supposed to (often considered a privacy risk). An entry in this registry discourages others from attempting to re-use the same domain names for other purposes or protocols, particularly within the set of IETF protocols. Concerns are frequently expressed that spurious queries into the DNS are to be avoided in order to avoid leakage of potentially sensitive information into the global internet, challenges in debugging provided by giving up control of where such queries go, and extra load on the DNS root name servers. The first two concerns are well within the scope of operational concern. However, root name servers are configured for abnormal environmental conditions, not normal loads, and are probably not a big concern here. It's been the case for decades that most of the load on the root name servers is already spurious, in much the same way that load on email services is a concern only after one has considered that the vast bulk of email is spam. 
Human-readable names may pose problems that random strings do not, such as internationalization and intellectual property concerns. "Human readable" is not a constraint to be added casually to the choice of domain names for a protocol or application. Global uniqueness is also a constraint that comes at a higher price than may be obvious. The contents of the DNS root zone are evolving on a relatively short time scale, and the number of protocols and applications that assume their choices of strings will meet with universal respect from potentially colliding other uses seems to be growing.

This document has no action for IANA. It might, in fact, help make some possible future IANA actions unnecessary.

This document poses no specific security considerations. However, a poorly specified naming scheme at the base of a protocol poses significant security risks and should be avoided.

This draft is the outcome of many conversations over many months, including discussions in the DNSOP WG, the IAB, and the ICANN SSAC. Particular thanks to Ed Lewis, Wendy Seltzer, Ralph Droms, Lyman Chapin, David Conrad, Andrew Sullivan, and everyone who's expressed exasperation to the author with respect to the issues discussed here.