Network Working Group                        Ping Pan (Juniper Networks)
Internet Draft                          Nischal Sheth (Juniper Networks)
Expiration Date: January 2002              Dave Cooper (Global Crossing)
                                          George Swallow (Cisco Systems)
                                      Sanjay Wadhwa (Unisphere Networks)
                                                               July 2001

              Detecting Data Plane Liveliness in RSVP-TE

                      draft-pan-lsp-ping-01.txt

Status of this Memo

This document is an Internet-Draft and is in full conformance with all provisions of Section 10 of RFC2026.

Internet-Drafts are working documents of the Internet Engineering Task Force (IETF), its areas, and its working groups. Note that other groups may also distribute working documents as Internet-Drafts.

Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as ``work in progress.''

The list of current Internet-Drafts can be accessed at http://www.ietf.org/ietf/1id-abstracts.txt

The list of Internet-Draft Shadow Directories can be accessed at http://www.ietf.org/shadow.html.

Abstract

This document describes a simple and efficient mechanism that can be used to detect data plane failures in MPLS LSPs. The proposed mechanism requires a new optional RSVP object. The processing overhead imposed on the LSR control plane is kept to a minimum.

Sub-IP Summary ID

This document describes a simple and efficient mechanism that can be used to detect data plane failures in MPLS LSPs. The proposed mechanism requires a new optional RSVP object. The processing overhead imposed on the LSR control plane is kept to a minimum.

RELATED DOCUMENTS

May be found in the "references" section.

WHERE DOES IT FIT IN THE PICTURE OF THE SUB-IP WORK

Fits the MPLS box.

WHY IS IT TARGETED AT THIS WG

MPLS WG is currently looking at MPLS-specific error detection and recovery mechanisms.
This work presents a simple mechanism to detect a specific MPLS data plane failure that cannot be detected by the MPLS control plane. One possible cause of such a failure is memory corruption.

JUSTIFICATION

The WG should consider this document, as it allows network operators to detect MPLS LSP data plane failures in the network. Failures of this type have occurred in MPLS networks.

1. Introduction

This document describes a simple and efficient mechanism that can be used to detect data plane failures in MPLS LSPs. The proposed mechanism requires a new optional RSVP object. The processing overhead imposed on the LSR control plane is kept to a minimum.

2. Motivation

When an LSP has failed to deliver user traffic, the failure cannot always be detected by the MPLS control plane. In the case of this draft we are addressing the RSVP-TE component. There is a need to provide a tool that would enable users to detect such traffic "black holes" within a reasonable period of time. Such a tool should additionally have the following characteristics. It should not introduce a heavy processing overhead on LSRs. It should not open the door to potential DOS attacks.

In this document, we describe a mechanism, termed "LSP-ping", that accomplishes these goals.

3. LSP-ping Extension

3.1. LSP-ping message

During the LSP liveliness test, an ingress LSR sends probe packets to the egress LSR's control plane over the LSP that is being tested. Each such packet must be encapsulated in UDP with a well-known port number. The reason for choosing UDP is described below. We call the UDP-encapsulated packet "an LSP-ping message".

Each LSP-ping message must carry sufficient information to identify the LSP under test. At minimum, it must contain an RSVP SESSION object, an RSVP SENDER_TEMPLATE object, and an LSP_ECHO object, which is defined below.

3.2.
RSVP-TE Extension

To test an LSP's liveliness, an ingress LSR sends LSP-ping messages that contain an LSP_ECHO object over the LSP being tested. When an egress LSR receives the message, it needs to acknowledge the ingress LSR by copying the LSP_ECHO object into an RSVP Resv message.

The object has the following format:

   Class = LSP_ECHO (use form 11bbbbbb for compatibility)
   C-Type = 1

   +-------------+-------------+-------------+-------------+
   |                   Source Identifier                   |
   +-------------+-------------+-------------+-------------+

   Source Identifier

      This value is assigned by the ingress LSR to uniquely identify the sending process. This allows an ingress LSR to identify the returned responses if there are multiple instances of LSP-ping running.

4. Operation

For the sake of brevity, in the context of this document "the control plane" means "the RSVP-TE component of the control plane".

Consider an LSP between an ingress LSR and an egress LSR spanning multiple LSR hops.

4.1. Procedures at the ingress LSR

Before initiating the liveliness test, the user must make sure that both the ingress and egress LSRs support LSP-ping.

When an LSP needs to be tested, the ingress LSR sends ICMP ECHO_REQUEST messages [ICMP] over the LSP periodically. The period is controlled by a timer. The value of the time interval should be configurable.

If there are multiple LSPs between the ingress and egress LSRs, the ECHO_REQUEST messages MUST be differentiated by using unique identifiers in the Identifier field of the ECHO_REQUEST message.

If the ingress LSR does not receive ICMP ECHO_REPLY messages from the egress for a long period of time, it is likely that there is an LSP failure on either the forward path (from ingress to egress) or the reverse path (from egress to ingress), or both.
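
As an illustration of the ICMP stage described above, the following is a minimal sketch of the bookkeeping an ingress LSR might keep per LSP. It is not part of the proposed mechanism. The class names, the method names, and the use of an explicit "now" argument are assumptions made for the example, and no packets are actually sent or received.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class EchoProbeState:
    """Per-LSP state for the ICMP ECHO_REQUEST stage (hypothetical)."""
    identifier: int                     # value for the ICMP Identifier field
    started: float                      # time at which probing began
    last_reply: Optional[float] = None  # time of the latest ECHO_REPLY


class EchoProbeTracker:
    """Tracks ECHO_REPLY arrivals so a silent LSP can be flagged (sketch)."""

    def __init__(self, reply_timeout: float) -> None:
        # reply_timeout stands in for the "long period of time" in the
        # text; the draft does not specify a value.
        self.reply_timeout = reply_timeout
        self._by_lsp: Dict[str, EchoProbeState] = {}
        self._next_id = 1

    def start(self, lsp_name: str, now: float) -> int:
        """Begin probing an LSP and return its unique Identifier value."""
        if self._next_id > 0xFFFF:
            # The ICMP Identifier field is 16 bits wide.
            raise OverflowError("no unused ICMP Identifier values left")
        ident = self._next_id
        self._next_id += 1
        self._by_lsp[lsp_name] = EchoProbeState(identifier=ident, started=now)
        return ident

    def on_echo_reply(self, identifier: int, now: float) -> None:
        """Record an ECHO_REPLY against the LSP that owns the Identifier."""
        for state in self._by_lsp.values():
            if state.identifier == identifier:
                state.last_reply = now

    def is_suspect(self, lsp_name: str, now: float) -> bool:
        """True if no ECHO_REPLY has arrived within reply_timeout."""
        state = self._by_lsp[lsp_name]
        reference = state.started if state.last_reply is None else state.last_reply
        return now - reference > self.reply_timeout
```

An LSP for which is_suspect() returns True is only a candidate for failure; the text that follows describes how the ingress LSR confirms the failure.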

When the ingress LSR suspects that the LSP may have failed and the RSVP control plane shows the LSP as operational, the ingress LSR MUST send LSP-ping messages to the egress over the LSP, periodically. The value of the time interval should be configurable.

The ingress LSR selects a unique Source_Identifier value for this particular test and places it in the LSP_ECHO object. The ingress LSR includes the LSP_ECHO object along with the SESSION and SENDER_TEMPLATE objects of the LSP under test.

If the ingress LSR does not receive, within a period of time, a Resv message from the egress LSR that contains an LSP_ECHO object, it declares the LSP as "down". At this point, the ingress LSR should apply the necessary procedures to fix the LSP. This may include generating a message to network management, tearing down and rebuilding the LSP, and/or rerouting user traffic to a backup LSP.

During the test, ICMP ECHO_REQUEST and LSP-ping messages MUST set the TTL field in the IP header to one. This is to prevent misbehavior at egress LSRs.

To test an LSP that carries non-IP traffic, the IPv4 Explicit NULL label should be prepended to ICMP and LSP-ping messages before they are injected into the LSP. The ingress and egress LSRs must follow the procedures defined in [LABEL-STACKING].

4.2. Procedures at the egress LSR

When the egress LSR receives an ICMP ECHO_REQUEST message, it handles the message according to the procedures defined in [ICMP] (this is irrespective of whether the message is used for an LSP liveliness test or not). It is possible that the ICMP processing is done entirely by the hardware or in the IP fast data path; thus, the initial ICMP "ping" messages have little impact on the control plane's performance.

When the egress LSR receives an LSP-ping message, it needs to deliver the message to the control plane.
To avoid potential DOS attacks, it is recommended to regulate the LSP-ping traffic going to the control plane. A rate limiter should be applied to the well-known UDP port defined above.

At the control plane, based on the RSVP SESSION and SENDER_TEMPLATE objects carried in the LSP-ping message, the LSR can find the corresponding LSP in its RSVP-TE database. The LSR then checks to see if the Resv message for this LSP contains an LSP_ECHO object with the same Source_Identifier value. If not, the LSR adds or updates the LSP_ECHO object and refreshes the Resv message.

4.3. Procedures for the intermediate LSRs

At intermediate LSRs, normal RSVP processing procedures will cause the LSP_ECHO object to be forwarded as RSVP messages are refreshed. At the LSRs that support LSP-ping, the Resv messages that carry the LSP_ECHO object MUST be delivered upstream immediately.

Note that at an intermediate LSR using RSVP refresh reduction [RSVP-REFRESH], the new or changed LSP_ECHO object will cause the LSR to classify the RSVP message as a trigger message.

5. Security Considerations

The mechanism introduced in this document limits exposure to potential DOS attacks by rate-limiting the LSP-ping traffic destined to the control plane (Section 4.2). The security considerations pertaining to the original RSVP protocol remain relevant.

6. Intellectual Property Considerations

Juniper Networks, Inc. is seeking patent protection on technology described in this Internet-Draft. If technology in this Internet-Draft is adopted as a standard, Juniper Networks agrees to license, on reasonable and non-discriminatory terms, any patent rights it obtains covering such technology to the extent necessary to comply with the standard.

7. Acknowledgments

This is the outcome of many discussions among many people, including Manoj Leelanivas, Paul Traina, Kireeti Kompella, Yakov Rekhter, Der-Hwa Gan, Brook Bailey, Eric Rosen and Ron Bonica.

8. References

[ICMP] J.
Postel, "Internet Control Message Protocol", RFC792.

[RSVP] R. Braden, Ed., et al, "Resource ReSerVation protocol (RSVP) -- version 1 functional specification", RFC2205.

[RSVP-TE] D. Awduche, et al, "RSVP-TE: Extensions to RSVP for LSP tunnels", Internet Draft.

[LABEL-STACKING] E. Rosen, et al, "MPLS Label Stack Encoding", RFC3032.

[RSVP-REFRESH] L. Berger, et al, "RSVP Refresh Overhead Reduction Extensions", RFC2961.

9. Author Information

Ping Pan
Juniper Networks
1194 N.Mathilda Ave
Sunnyvale, CA 94089
e-mail: pingpan@juniper.net
phone: 408.745.3704

Nischal Sheth
Juniper Networks
1194 N.Mathilda Ave
Sunnyvale, CA 94089
e-mail: nsheth@juniper.net
phone: 408.745.2068

Dave Cooper
Global Crossing
960 Hamlin Court
Sunnyvale, CA 94089
e-mail: dcooper@gblx.net
phone: 916.415.0437

George Swallow
Cisco Systems, Inc.
250 Apollo Drive
Chelmsford, MA 01824
e-mail: swallow@cisco.com
phone: 978.244.8143

Sanjay Wadhwa
Unisphere Networks, Inc.
10 Technology Park Drive
Westford, MA 01886-3146
e-mail: swadhwa@unispherenetworks.com
phone: 978.589.0697
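
For illustration only, the LSP_ECHO object of Section 3.2 could be encoded and decoded as sketched below. The sketch assumes the standard RSVP object header of [RSVP] (a 16-bit length, an 8-bit Class-Num and an 8-bit C-Type) and reads the Source Identifier in the diagram as a single 32-bit word. The Class-Num value 0xC1 is a placeholder chosen only because it has the form 11bbbbbb; this document does not assign a value. The function names are hypothetical.

```python
import struct

# Placeholder Class-Num of the form 11bbbbbb; NOT an assigned value.
LSP_ECHO_CLASS_NUM = 0xC1
LSP_ECHO_C_TYPE = 1

# RSVP object header (length, Class-Num, C-Type) followed by one 32-bit
# Source Identifier, all in network byte order.
_LSP_ECHO_FORMAT = "!HBBI"
_LSP_ECHO_LENGTH = struct.calcsize(_LSP_ECHO_FORMAT)  # 8 bytes


def encode_lsp_echo(source_identifier: int) -> bytes:
    """Build an LSP_ECHO object carrying the given Source Identifier."""
    if not 0 <= source_identifier <= 0xFFFFFFFF:
        raise ValueError("Source Identifier must fit in 32 bits")
    return struct.pack(
        _LSP_ECHO_FORMAT,
        _LSP_ECHO_LENGTH,
        LSP_ECHO_CLASS_NUM,
        LSP_ECHO_C_TYPE,
        source_identifier,
    )


def decode_lsp_echo(data: bytes) -> int:
    """Return the Source Identifier from an encoded LSP_ECHO object."""
    if len(data) < _LSP_ECHO_LENGTH:
        raise ValueError("object too short")
    length, class_num, c_type, source_identifier = struct.unpack(
        _LSP_ECHO_FORMAT, data[:_LSP_ECHO_LENGTH]
    )
    if length != _LSP_ECHO_LENGTH:
        raise ValueError("unexpected object length")
    if class_num != LSP_ECHO_CLASS_NUM or c_type != LSP_ECHO_C_TYPE:
        raise ValueError("not an LSP_ECHO object")
    return source_identifier
```

An ingress LSR running several tests could compare the decoded Source Identifier in a received Resv message against the values it has outstanding.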