IAB                                                  T. Hain, Microsoft
Internet Draft
Document: draft-iab-nat-implications-02.txt                October 1998

                   Architectural Implications of NAT

Status of this Memo

This document is an Internet-Draft. Internet-Drafts are working documents of the Internet Engineering Task Force (IETF), its areas, and its working groups. Note that other groups may also distribute documents as Internet-Drafts. Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress."

To view the entire list of current Internet-Drafts, please check the "1id-abstracts.txt" listing contained in the Internet-Drafts Shadow Directories on ftp.is.co.za (Africa), ftp.nordu.net (Northern Europe), ftp.nis.garr.it (Southern Europe), munnari.oz.au (Pacific Rim), ftp.ietf.org (US East Coast), or ftp.isi.edu (US West Coast).

A revised version of this draft document will be submitted to the RFC editor as an Informational RFC for the Internet Community. Discussion and suggestions for improvement are requested. This document will expire before October 1998. Distribution of this draft is unlimited.

Abstract

In light of the growing interest in, and deployment of, network address translation (NAT) [RFC-1631], this paper will discuss some of the architectural implications and guidelines for implementations. It is assumed the reader is familiar with the address translation concepts presented in [RFC-1631].

Conventions used in this document

The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in [RFC-2119].
Hain              Informational - Expires January 1999                1

                  Architectural Implications of NAT           July 1998

Introduction

In discussing the architectural impact of NATs [RFC-1631] on the Internet, the first task is defining the scope of the Internet. The most basic definition is: a concatenation of networks built using IETF defined technologies. This simple description does not distinguish between the public network known as the Internet, and the private networks built using the same technologies. An approach resolving this would be including the resources of Names or Addresses administered through IANA or its delegates. While this is more accurate, it still includes many private networks that have coordinated their names or addresses with the public Internet.

Rekhter, et al [RFC-1918] defined hosts as public when they need network layer access outside the enterprise, using a globally unambiguous address. Those that need limited or no access are defined as private. Another way to view this is the transparency of the connection between any given node and the rest of the Internet. True transparency could be stated as: an unambiguous locator known by a node and identifiable by any other node participating in the public Internet, with no restrictions on packet delivery.

The ultimate resolution of public or private is found in the intent of the network in question. Generally, networks that use coordinated names and addresses, but do not intend to be part of the greater Internet, will use some screening technology to insert a barrier. Historically, barrier devices between the public and private networks were known as Firewalls or Application Gateways, and were managed to allow approved traffic while blocking everything else. Increasingly the screening technology is becoming a simple NAT, which manages the network locator between the public and private use address spaces.
As noted by Carpenter, et al [RFC-2101], once private use addresses [RFC-1918] were deployed in the network, addresses were guaranteed to be ambiguous. At the same time when NATs were attached to the network, the process of resolving names to or from addresses gained a dependency on where the question was asked; thus both names and addresses became globally ambiguous.

As private use addresses are by definition not part of the public infrastructure, and an unambiguous locator is required within a routing realm, NATs are clearly left sitting at the boundary of the Internet. Here they become another screening technology for connecting private networks.

In one view, NAT is the feature which finally breaks the semantic overload of the IP address as both a locator and the end point identifier (EID). Another view of NAT is that of 'necessary evil', where there is a real concern that the technology is the weed which is destined to choke out continued development. In either case, there is no direct impact on the public Internet, since NATs sit at the boundary. This leaves the discussion focused on the impact on end-to-end communications between hosts that use the public Internet as a transport medium.

A significant factor in the success of the Internet is the flexibility derived from a few basic tenets. First and foremost is the End-to-End principle, which assumes the end points are in control of the communication and the network simply moves bits between these points. Restated, the data stream delivered by the transport protocol of the end points is of no concern to the lower layer packet routing devices and therefore may contain anything the end point applications consider appropriate. Another is that the network does not maintain per connection state information to allow fast rerouting around failures through parallel paths.
Lack of state also removes any requirement for the network nodes to notify each other as connections are formed or dropped and enables connectionless transports. Furthermore, the end points are not, and need not be, aware of any network components other than the first hop router(s), name resolution service, and destination. Packet integrity is preserved through the network, and transport checksums are valid end to end. NATs (particularly the port multiplexing variety) break most of these, reducing overall flexibility, increasing operational complexity, and impeding diagnostic capabilities.

Terminology

Locator - the address within a packet directing its delivery within a routing realm.

Routing realm - unambiguous address pool used by a contiguous collection of routers and end systems.

End point identifier (EID) - used by one end system to identify the other end of a communication. Uniqueness is required only within the context of the originating end. Using the Domain Name System as a common database requires uniqueness throughout the Internet.

NAT - segregates realms of routing information by connecting between and rewriting packet headers as necessary. A NAT does not interpret packet contents in any way.

Application Gateway - segregates realms of transport information by terminating an application data stream then reconstructing packet contents as necessary before forwarding to the next destination.

Firewall - blocks unauthorized end-to-end connections. Often used within a routing realm, firewalls adhere to forwarding rules but do not modify packet headers or contents.

VPN - Virtual Private Network which technically treats an IP infrastructure as a multiplexing substrate allowing the end points to build virtual circuits to run another instance of IP over.

Utility of NATs

A quick look at the popularity of NAT technology shows that it addresses several real world problems.
- Masking the address changes that take place, from either dial-access or provider changes, improves stability in the local network.
- Globally routable addresses can be reused for intermittent access customers. This lowers the demand and utilization of addresses to the number of active nodes rather than the total number of nodes.
- There is a potential that NATs would lower an ISP's support burden since there could be a consistent, simple device with a known configuration at the customer end of the access interface.
- Breaking the Internet into a collection of routing realms would limit the scope of routing knowledge, and thereby the workload on the routers within each realm.
- For applications which don't care about the integrity of the packet header, there are no changes necessary in the hosts.

Taken together these are strong motivations for moving quickly with NAT deployment. Removing hosts that are not currently active lowers address demands of the public Internet, which reduces the load on the routing system as well as lengthens the lifetime of the IPv4 address space. While this is a natural byproduct of the existing dynamic allocation dial access devices, in the dedicated connection case this service could be provided through a NAT. In the case of a port multiplexing NAT, the aggregation potential is even greater as multiple end systems share a single public address.

Compartmentalizing routing knowledge and distributing the overhead by breaking a network into a collection of routing realms might improve its stability. It could also alleviate some of the pressure on the routing infrastructure that causes ISPs to enforce artificial boundaries on how much detail they are willing to accept in routing updates. Since the details of adjacent routing realms could be completely masked, the level of aggregation possible would dwarf all prior efforts.
The number of entries in the routing table would be reduced to the number of external attachments (albeit at the expense of increasing the NAT mapping table at each attachment point). Determination of the proper exit point is left as an exercise for the reader.

NAT deployments should raise the awareness of protocol designers who are interested in ensuring that their protocols work end to end. Breaking the semantic overload of the IP address will force applications to find a more appropriate mechanism for end point identification and discourage carrying the locator in the data stream. Since this will not work for legacy applications, RFC-1631 discusses how to look into the packet and make NAT transparent to the application (i.e., create an application gateway).

Another popular practice is hiding a collection of hosts behind a single IP address. In many implementations this is architecturally a NAT, since the addresses are mapped to the real destination on the fly. When packet header integrity is not an issue, this type of virtual host requires no modifications to the remote applications since the end client is unaware of the mapping activity. While the virtual host has the CPU performance characteristics of the total set of machines, the overall performance is bounded by the processing and I/O capabilities of the NAT device as it funnels the packets back and forth.

Repercussion of NATs

As noted earlier, NATs break the basic tenet of the Internet that the end points are in control of the communication. The greatest concern from the explosion of NATs is the impact on the fledgling efforts at deploying IP security. For lack of another globally unique EID, the traditional use of the IP address was assumed. This combination of required global uniqueness of the address, and assured ambiguity by NAT, leaves the IPsec effort with a severely restricted working set.
In a statement about the use of IPv4 today, RFC-2101 details architectural issues and notes:

   "... it has been considered more useful to deliver the packet, then worry about how to identify the end points, than to provide identity in a packet that cannot be delivered."

This argument presumes that delivering the packet has an inherent value, even if the end points can't be identified. In a self-fulfillment of that prophecy, the applications developed to date are structured to assume packets will be delivered and identity is only assured in controlled environments. In many ways, this fundamental impediment to basic trust has been the stalling factor in deploying security across the Internet. In another note from RFC-2101:

   "Since IP Security authentication headers assume that the addresses in the network header are preserved end-to-end, it is not clear how one could support IP Security-based authentication between a pair of hosts communicating through either an ALG (ed: Application Level Gateway) or a NAT."

A feature of stateful devices like NATs is the creation of a single point of failure. Attempts to avoid this by establishing redundant NATs create a new set of problems related to timely communication of the state. This encompasses several issues such as update frequency, performance impact of frequent updates, reliability of the state update transaction, a priori knowledge of all nodes needing this state information, and notification to end nodes of the appropriate path for each transaction.

It has been observed that operational management of networks incorporating stateful packet modifying devices is considerably easier if inbound and outbound packets traverse the same path. While easy to say, it is difficult to manage even with careful planning.
The problem is ensuring that routes advertised to the private side reach the end nodes and map to the same device as the public side route advertisements. In many cases this borders on the impossible, given the internal and external topology churn.

Another major drawback of NAT technology is the process of resolving addresses from names, or names from addresses. When the public DNS is required to resolve a given host name on both sides of a NAT there is no obviously correct answer. In the example below it is not clear what answer DNS should return for Host D. Returning the local address will assure global invisibility, while returning the global address will prevent local access from Host C. If DNS were to return both, the results would be unpredictable. By knowing which side the request came from the DNS server could provide the correct answer, but significant development would be required to add the capability to DNS for source specific responses. (note: since Host A has no access to the DNS service it is required to maintain a local table, but the others may be expecting DNS to provide the appropriate resolution.)

In the case where Hosts C & D share an address (either time shared or port multiplexed), there is no way Host B could know which it was connecting to. DNS would return a public side address for either, then it is up to the NAT to decide where the packets are eventually directed. Since Host B cannot rely on the fully qualified domain name to uniquely identify a specific host, the name space is fragmented, resulting in pockets of validity.

    --------    ---        ---    --------
   | Host A |--|NAT|------|NAT|--| Host B |
    --------    ---        ---    --------
                     \
                      \
             ---     --------      ---
            |NAT|---|Internet|----|DNS|
             ---     --------      ---
              |
      -----------------
      |               |
   --------        --------
  | Host C |      | Host D |
   --------        --------

Even if forward mappings are working, implementations that require an unambiguous reverse mapping from the in-addr.arpa tree will fail (diagnostic tools come to mind).
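The "source specific responses" mentioned above can be sketched roughly as follows. This is a minimal illustration rather than a description of any DNS implementation: the realm prefix, the host name, both addresses, and the function name resolve are assumptions made up for the example, and the sketch assumes the name server can see the requester's original source address.

```python
import ipaddress

# Hypothetical private realm and records, chosen only for illustration.
PRIVATE_REALM = ipaddress.ip_network("10.0.0.0/8")
RECORDS = {
    "hostd.example.com": {
        "private": "10.1.1.4",   # locator valid only inside the private realm
        "public": "192.0.2.4",   # locator the NAT presents to the public side
    },
}


def resolve(name, source_address):
    """Return the locator that is valid in the realm the query came from."""
    views = RECORDS.get(name)
    if views is None:
        return None
    source = ipaddress.ip_address(source_address)
    view = "private" if source in PRIVATE_REALM else "public"
    return views[view]


# A query from inside the realm (a Host C-like requester) gets the local address,
# while a query from outside (a Host B-like requester) gets the public one.
print(resolve("hostd.example.com", "10.1.1.3"))
print(resolve("hostd.example.com", "198.51.100.7"))
```

Note that if the name server sits on the public side of the NAT, a query from the private side would normally arrive carrying the NAT's public address, so this selection would not work unmodified; that is part of why the text calls for significant development.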
Discussions about an arbitrary mesh of NAT connections will ultimately exaggerate the issue of name space integration with the routing infrastructure and show that the only resolution to appropriately answer name queries in a NAT environment is to locate the DNS service within each NAT. This brings the additional complexity of knowing which NAT to look to for remote resolutions. Since most NATs are engineered to be auto configuring turnkey devices, and DNS has not been known for its auto configuring properties, this is not a particularly viable approach.

One proposal to deal with locating the DNS service in each NAT is the DNS ALG (1). Rather than running the full DNS server in the NAT, it provides a mapping service by intercepting DNS messages and modifying the contents appropriately.

The recent mass growth of the Internet has been driven by support of low cost publication via the web. The next big push appears to be support of Virtual Private Networks (VPNs). Technically VPNs treat an IP infrastructure as a multiplexing substrate allowing the end points to build what appear to be clear pathways from end to end. VPNs redefine network visibility and increase the likelihood of address collision when traversing NATs. Address management in the hidden space behind NATs will become a significant burden, as there is no central body capable of, or willing to do it. The lower burden for the ISP is actually a transfer of burden to the local level, because administration of addresses and names becomes both distributed and more complicated.

As noted in RFC-1918, the merging of private address spaces can cause an overlap in address use, which creates a problem. VPNs will increase the likelihood and frequency of that merging through the simplicity of their establishment.
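As a rough illustration of the overlap problem, the following sketch checks whether two uncoordinated private realms use colliding prefixes before they are joined. The prefix lists and the function name find_overlaps are assumptions invented for the example; only the standard Python ipaddress module is used.

```python
import ipaddress


def find_overlaps(realm_a, realm_b):
    """Return pairs of prefixes, one from each realm, that overlap."""
    nets_a = [ipaddress.ip_network(p) for p in realm_a]
    nets_b = [ipaddress.ip_network(p) for p in realm_b]
    return [(str(a), str(b)) for a in nets_a for b in nets_b if a.overlaps(b)]


# Hypothetical prefixes in use at two sites about to be joined by a VPN.
site_a = ["10.1.0.0/16", "10.10.10.0/24"]
site_b = ["10.10.0.0/16", "192.168.5.0/24"]

print(find_overlaps(site_a, site_b))
# prints [('10.10.10.0/24', '10.10.0.0/16')]
```

A check like this only helps when each side is willing to disclose its private addressing plan, which private use addresses by definition do not require.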
There are several configurations of address overlap which will cause failure, but in the simple example shown below the private use address of Host B matches the private use address of the VPN pool used by Host A for inbound connections. When Host B tries to establish the VPN, Host A will assign it an address from its pool for inbound connections, and identify the gateway for Host B to use. In the example, Host B will not be able to distinguish the VPN address of Host A from its own address so the connection will fail. Since private use addresses are by definition not coordinated, as the complexity of the VPN mesh increases so does the likelihood that there will be a collision which cannot be resolved.

    ---------------                     ----------------
   |  10.10.10.10  |--------VPN--------| Assigned by A  |
   |    Host A     |   ---       ---   |     Host B     |
   |   10.1.1.1    |--|NAT|-----|NAT|--|  10.10.10.10   |
    ---------------    ---       ---    ----------------

----------
1 draft-ietf-nat-dns-alg-00.txt (work in progress 7/98)

The primary feature of NATs is the ability to simply connect private networks to the public Internet. When the private network exists prior to installing the NAT, it is unlikely and unnecessary that its name resolver would use a registered domain. Connecting the NAT device, and reconfiguring the resolver to proxy for all external requests allows access to the public network by hosts on the private network. Configuring the public DNS for the set of private hosts that need inbound connections would require a registered domain (either private, or from the connecting ISP) and a unique name. At this point the name space is partitioned as hosts would have different names based on inside vs. outside queries.
    --------       --------
   | Host A |     | Host B |
   |  Foo   |-----|  Bar   |
    --------   |   --------       ---
               |-----------------|DNS|
              ---                 ---
             |NAT|
              ---
               |
           --------     ---
          |Internet|---|DNS|
           --------     ---
               |
              ---
             |NAT|
              ---                 ---
               |-----------------|DNS|
    --------   |   --------       ---
   | Host C |-----| Host D |
   |  Foo   |     |  Bar   |
    --------       --------

Everything in this simple example will work until an application embeds a name. For example, a Web service running on Host D might present embedded URLs of the form http://bar/*.html, which would work from Host C, but would thoroughly confuse Host A. If the embedded name matched the public DNS, Host A would be happy, but Host C would not. To establish a connection from Host C, the NAT would have to look at the destination rather than simply forwarding the packet to a router (which would not send it back on the same interface it came from).

NATs place constraints on the deployment of applications that carry IP addresses in the data stream. Applications or protocols that assume end to end integrity of the address will fail when traversing a NAT. The resolution to this is to provide an Application Level Gateway within or alongside each NAT. An additional gateway service is necessary for each application that may embed an address. Even this approach will fail when the requirement is end to end encryption, since only the end points have access to the keys.

Finally, while the port multiplexing NATs (popular because they allow Internet access through a single address, thus lowering cost) work modestly well for private toward public connections, they create management problems for applications connecting from public toward private.
Since only one private side system can be mapped through the single public side port number, applications like multi-player Internet games can only be played on one system at a time, and X-windows services relying on port 6000 need to be manually mapped to each target prior to connection.

IPv6 Considerations

It has been argued that IPv6 is no longer necessary because NATs relieve the address space constraints and allow the Internet to continue growing. The reality is they point out the need for IPv6 more clearly than ever. People are trying to connect multiple machines through a single access line to their ISP and have been willing to give up some functionality to get that at minimum cost. Frequently the reason for cost increases is the perceived scarcity (therefore increased value) of IPv4 addresses, which would be eliminated through deployment of IPv6. This crisis mentality is creating a market for a solution to a problem already solved with greater flexibility by IPv6.

Beyond all of the above issues, the existence of NATs will complicate the integration of IPv6 in the Internet because the name space and end point addresses are not consistent and globally unique. While multiple addresses are less of a concern to an IPv6 node, the disjoint name space will certainly make management interesting. If IPv6 nodes are willing to continue in private networks behind a NAT, they will only need a link local address and all of the issues become the same as IPv4. If the intent is to move into a public space as a feature of moving to IPv6, address and name administration will require explicit effort.

Security Considerations

NATs break most implementation modes of IPsec, and therefore may stall further deployment of enhanced security across the Internet. It is difficult to identify all the combinations of header orderings and options that are possible using NATs, VPNs, and IPsec.
It is even more difficult to clearly state which of those are applicable, or workable in any given context.

For example, use of AH is not possible via NAT as the hash protects the IP address in the header. In some cases, authenticated certificates may contain the IP address as part of the subject name for authentication purposes. Encrypted Quick Mode structures may contain IP addresses and ports for policy verifications, while the Revised Mode of public key encryption includes the peer identity in the encrypted payload. It may be possible to engineer and work around NATs for IPsec on a case by case basis, but attempts to retrofit IPsec over an existing NAT infrastructure can be problematic. With all of the restrictions placed on deployment flexibility, NATs present the greatest single threat to security integration being deployed in the Internet today.

Security mechanisms that do not protect or rely on IP addresses as identifiers, such as TLS (2), SSL (3), or SSH (4) may operate in environments containing NATs. For applications that can establish and make use of this type of transport connection, NATs do not create any additional complications.

Guidelines

Given that NAT devices are being deployed at a fairly rapid pace, some guidelines are in order. Most of these amount to 'think before you leap', then think again.

- Determine the mechanism for name resolution, and ensure the appropriate answer is given for each routing realm. Embedding the DNS server, or its ALG, in the NAT device will be more manageable than trying to synchronize independent DNS systems across realms.
- Is the NAT configured for static one to one mappings, or will it dynamically manage them? If dynamic, make sure the TTL of the DNS responses is set to 0, and that the clients pay attention to the don't cache notification.
- Examine the applications that will need to traverse the NAT and verify their immunity to address changes. If necessary provide an appropriate ALG or establish a VPN to isolate the application from the NAT.
- Determine need for public toward private connections, variability of destinations on the private side, and potential for simultaneous use of public side port numbers. Avoid port NATs if these apply.
- If there are encrypted payloads, the contents cannot be modified unless the NAT is a security end point acting as a gateway between security realms. This precludes end to end confidentiality, as the NAT becomes the security end point.
- When using VPNs over NATs, identify a clearinghouse for addresses to avoid collisions.
- Assure that applications that will be used both internally and externally either avoid embedding names, or use globally unique ones.

----------
2 draft-ietf-tls-protocol-05.txt (work in progress 11/97)
3 http://home.netscape.com/eng/ssl3/ssl-toc.html March 1996
4 draft-ietf-secsh-architecture-02.txt

Summary

Wherever they are located topologically, NATs break a long-standing architectural principle that applications can trust packets to move between end points without modification beyond TTL or TOS. Another design principle, 'keep-it-simple', is being overlooked as more features are added to the network to work around the complications created by NATs. In the end the overall flexibility and manageability are lowered while support costs go up dealing with the problems introduced.

NATs are a 'fact of life', and will proliferate as an enhancement that sustains the existing IPv4 infrastructure. At the same time, they require strong applicability statements, clearly stating what works and what does not. NATs are a 'necessary evil' as well, and by fragmenting the Name Space, NATs create an administrative burden that is not easily resolved.
More significantly, they inhibit the roll out of IPsec, which will in turn slow growth of applications that require a secure infrastructure. As such, NATs represent the single greatest threat to a secure Internet.

An overview of the pluses and minuses:

NAT utility                          NAT repercussions
--------------------------------     --------------------------------
Masks Global Address Changes         Mandates Multiple Name Spaces
Routing realms distribute overhead   Requires source specific DNS reply
Lowers Address Utilization           Allows end-to-end address conflicts
Lowers ISP support burden            Increases local support burden
Breaks end-to-end association        Breaks end-to-end association
Transparent to end systems           Unique development for each app
Load sharing as virtual host         Performance impact with scale
Delays need for IPv4 replacement     Complicates integration of IPv6

There have been many discussions lately about the value of continuing with IPv6 development when the market place is widely deploying IPv4 NATs. A short sighted view would miss the point that both have a role, because NATs address some real-world issues today, while IPv6 is targeted at solving fundamental problems, as well as moving forward. It should be recognized that there will be a long co-existence as applications and services develop for IPv6, while the lifetime of the existing IPv4 systems will likely be measured in decades. At their best, NATs are a diversion from forward motion, but they do enable wider participation at the present state. At their worst, they break an association that arguably should never have been made to begin with.

There have also been many questions about the probability of VPNs being established which would raise some of the concerns listed above. While it is hard to predict the future, one way to avoid ALGs for each application is to establish a VPN over the NATs.
This restricts the NAT visibility to the headers of the tunnel packets, and removes its effects from all applications. While this solves the ALG issues, it raises the likelihood that there will be address collisions as arbitrary connections are established between uncoordinated address spaces.

The IAB wants to remind everyone to focus on the goal, which is continued evolution of the Internet, and recognize continued development of IP (in all current and future versions) is the path. It has been noted that the success of the Internet is based on the 'living' characteristic of IP. As in life, when growth, evolution, and forward progress stop, decay overtakes and destroys. History has shown that protocols that were 'complete and finished' as presented have had very short lifetimes, while those still 'a work in progress' manage to survive and continue moving ahead. All parties need to understand the significant role they are playing in pursuing the goal, and that none can get there without all the others.

References

[RFC-1631] Egevang, K., Francis, P., "The IP Network Address Translator", RFC 1631, May 1994

[RFC-1918] Rekhter, et al, "Address Allocation for Private Internets", RFC 1918, February 1996

[RFC-2101] Carpenter, et al, "IPv4 Address Behavior Today", RFC 2101, February 1997

[RFC-2119] Bradner, S., "Key words for use in RFCs to Indicate Requirement Levels", RFC 2119, March 1997

draft-ietf-nat-dns-alg-00.txt, P. Srisuresh, et al, "DNS extensions to Network Address Translators", July 1998

draft-ietf-tls-protocol-05.txt, T. Dierks, C. Allen, "The TLS Protocol", November 1997

draft-ietf-secsh-architecture-02.txt, T. Ylonen, et al, "SSH Protocol Architecture", August 1998

Acknowledgments

Valuable contributions to this draft came from the IAB, Yakov Rekhter (cisco) and Eliot Lear (cisco).
Author's Addresses

Tony Hain
Microsoft
One Microsoft Way            Phone: 1-425-703-6619
Redmond, Wa. USA             Email: tonyhain@microsoft.com