<?xml version="1.0" encoding="US-ASCII"?>
<!DOCTYPE rfc SYSTEM "rfc2629.dtd">
<?rfc toc="yes"?>
<?rfc tocompact="yes"?>
<?rfc tocdepth="3"?>
<?rfc tocindent="yes"?>
<?rfc symrefs="yes"?>
<?rfc sortrefs="yes"?>
<?rfc comments="yes"?>
<?rfc inline="yes"?>
<?rfc compact="yes"?>
<?rfc subcompact="no"?>
<rfc category="exp" docName="draft-ietf-idr-bgp-fwd-rr-00" ipr="trust200902">
  <front>
    <title abbrev="BGP Forwarding Route Reflector">BGP Route Reflector in
    Forwarding Path</title>

    <author fullname="Kaliraj Vairavakkalai" initials="K." role="editor"
            surname="Vairavakkalai">
      <organization>Juniper Networks, Inc.</organization>

      <address>
        <postal>
          <street>1133 Innovation Way,</street>

          <city>Sunnyvale</city>

          <region>CA</region>

          <code>94089</code>

          <country>US</country>
        </postal>

        <email>kaliraj@juniper.net</email>
      </address>
    </author>

    <author fullname="Natrajan Venkataraman" initials="N." role="editor"
            surname="Venkataraman">
      <organization>Juniper Networks, Inc.</organization>

      <address>
        <postal>
          <street>1133 Innovation Way,</street>

          <city>Sunnyvale</city>

          <region>CA</region>

          <code>94089</code>

          <country>US</country>
        </postal>

        <email>natv@juniper.net</email>
      </address>
    </author>

    <date day="16" month="February" year="2024"/>

    <abstract>
      <t>The procedures in the BGP Route Reflection (RR) specification <xref
      target="RFC4456"/> primarily address scenarios where the RR is not in
      the forwarding path and reflects BGP routes with the next hop
      unchanged.</t>

      <t>These procedures can sometimes result in traffic forwarding loops in
      deployments where the RR is in the forwarding path and reflects BGP
      routes with the next hop set to self.</t>

      <t>This document specifies approaches to minimize the possibility of
      such traffic forwarding loops. One of those approaches updates the path
      selection procedures specified in <xref target="RFC4456">Section 9 of
      BGP RR</xref>.</t>
    </abstract>

    <note title="Requirements Language">
      <t>The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
      "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and
      "OPTIONAL" in this document are to be interpreted as described in BCP 14
      <xref target="RFC2119">RFC 2119</xref> <xref target="RFC8174">RFC
      8174</xref> when, and only when, they appear in all capitals, as shown
      here.</t>
    </note>
  </front>

  <middle>
    <section title="Introduction">
      <t>The procedures in the BGP Route Reflection (RR) specification <xref
      target="RFC4456"/> primarily address scenarios where the RR is not in
      the forwarding path and reflects BGP routes with the next hop
      unchanged.</t>

      <t>These procedures can sometimes result in traffic forwarding loops in
      deployments where the RR is in the forwarding path and reflects BGP
      routes with the next hop set to self.</t>

      <t>This document specifies approaches to minimize the possibility of
      such traffic forwarding loops. One of those approaches updates the path
      selection procedures specified in <xref target="RFC4456">Section 9 of
      BGP RR</xref>.</t>
    </section>

    <section title="Terminology">
      <t>AS: Autonomous System</t>

      <t>NLRI: Network Layer Reachability Information</t>

      <t>AFI: Address Family Identifier</t>

      <t>SAFI: Subsequent Address Family Identifier</t>

      <t>SN: Service Node</t>

      <t>BN: Border Node</t>

      <t>PE: Provider Edge</t>

      <t>EP: Endpoint, e.g. a loopback address in the network</t>

      <t>MPLS: Multiprotocol Label Switching</t>
    </section>

    <section title="Avoiding Loops Between Route Reflectors in Forwarding Path">
      <figure anchor="RefTopo" suppress-title="false"
              title="Reference Topology: Inter-domain BGP Transport Network">
        <artwork align="left" xml:space="preserve">
                [RR26]      [RR27]                       [RR16]
                 |            |                             |
                 |            |                             |
                 |+-[ABR23]--+|+--[ASBR21]---[ASBR13]-+|+--[PE11]--+
                 ||          |||          `  /        |||          |
[CE41]--[PE25]--[P28]       [P29]          `/        [P15]     [CE31]
                 |           | |           /`         | |          |
                 |           | |          /  `        | |          |
                 |           | |         /    `       | |          |
                 +--[ABR24]--+ +--[ASBR22]---[ASBR14]-+ +--[PE12]--+


       |      AS2       |         AS2      |                   |
   AS4 +    region-1    +      region-2    +       AS1         + AS3
       |                |                  |                   |


203.0.113.41  ------------ Traffic Direction ----------&gt;  203.0.113.31

</artwork>
      </figure>

      <list>
        <t>A pair of redundant ABRs (ABR23 and ABR24 in <xref
        target="RefTopo"/>), each acting as an RR with next hop self, may
        each choose the other as the best path towards egress PE11, instead
        of the upstream ASBR (ASBR21 or ASBR22), causing a traffic forwarding
        loop.</t>

        <t>This happens because the path selection rule specified
        in <xref target="RFC4456">Section 9 of BGP RR</xref> tie-breaks
        on ORIGINATOR_ID before CLUSTER_LIST. RFC 4456 considers a pure RR
        that is not in the forwarding path. When an RR is in the
        forwarding path and reflects routes with next hop self, as is the case
        for ABR BNs in a BGP transport network, this rule may cause loops.</t>

        <t>This problem can occur for routes of any BGP address family,
        including BGP LU (AFI/SAFIs: 1/4 or 2/4) and BGP CT (AFI/SAFIs: 1/76
        or 2/76).</t>

        <t>Using one or more of the following approaches reduces the
        possibility of such loops in a network with redundant ABRs.</t>
      </list>

      <section title="Path selection change">
        <list>
          <t>Implementations SHOULD provide a way to alter the tie-breaking
          rule specified in <xref target="RFC4456">Section 9 of BGP RR</xref>
          so that the CLUSTER_LIST step is evaluated before the ORIGINATOR_ID
          step when performing path selection for BGP routes.</t>

          <t>This document suggests the following modification to the BGP
          Decision Process Tie Breaking rules (Section 9.1.2.2 of <xref
          target="RFC4271"/>) that can be applied to path selection of BGP
          routes:</t>

          <t>The following rule SHOULD be inserted between Steps e) and f): a
          BGP Speaker SHOULD prefer a route with the shorter CLUSTER_LIST
          length. The CLUSTER_LIST length is zero if a route does not carry
          the CLUSTER_LIST attribute.</t>
        </list>
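
        <t>The effect of the reordering can be illustrated with a minimal
        Python sketch. It is not a definitive implementation: it assumes
        the candidate paths are already tied on Steps a) through e) of the
        BGP Decision Process, it ignores the later peer-address step, and
        the Path type, the key functions, and the 192.0.2.x router IDs are
        all hypothetical names and values chosen for illustration.</t>

        <figure>
          <artwork align="left" xml:space="preserve"><![CDATA[
```python
from dataclasses import dataclass, field
from ipaddress import IPv4Address
from typing import List, Tuple


@dataclass
class Path:
    """Simplified, hypothetical view of a BGP path that is still
    tied after steps a) to e) of RFC 4271 Section 9.1.2.2."""
    name: str
    # ORIGINATOR_ID, or the advertiser's BGP Identifier if absent.
    originator_id: str
    # CLUSTER_LIST entries; empty if the attribute is absent.
    cluster_list: List[str] = field(default_factory=list)


def rfc4456_key(p: Path) -> Tuple[int, int]:
    # RFC 4456 order: ORIGINATOR_ID first, then CLUSTER_LIST length.
    return (int(IPv4Address(p.originator_id)), len(p.cluster_list))


def cluster_list_first_key(p: Path) -> Tuple[int, int]:
    # Order suggested here: CLUSTER_LIST length first.
    return (len(p.cluster_list), int(IPv4Address(p.originator_id)))


# Assumed paths towards PE11 as seen at ABR24 (router IDs invented).
paths_at_abr24 = [
    Path("via ASBR22", "192.0.2.22"),
    Path("via ABR23", "192.0.2.21", ["192.0.2.23"]),
]

best_rfc4456 = min(paths_at_abr24, key=rfc4456_key)
best_modified = min(paths_at_abr24, key=cluster_list_first_key)
print(best_rfc4456.name)   # via ABR23
print(best_modified.name)  # via ASBR22
```
]]></artwork>
        </figure>

        <t>In this assumed scenario, the path reflected by ABR23 carries a
        lower ORIGINATOR_ID and therefore wins under the RFC 4456 order,
        even though it has a longer CLUSTER_LIST. Comparing CLUSTER_LIST
        length first selects the path received from the upstream ASBR.</t>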
      </section>

      <section title="Other mechanisms">
        <list>
          <t>Other deployment considerations can also
          help avoid this problem, for example: <list>
              <t>IGP metrics should be assigned such that the "ABR to
              redundant ABR" cost is inferior to (that is, higher than) the
              "ABR to upstream ASBR" cost.</t>

              <t>Using procedures described in <xref target="BGP-CT"/>,
              tunnels belonging to non 'best effort' Transport Classes SHOULD
              NOT be provisioned between ABRs. This ensures that a BGP
              CT route received from an ABR with next hop self is not
              usable at a redundant ABR.</t>
            </list></t>
        </list>
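
        <t>The IGP metric consideration can be checked with a short Python
        sketch. This is only an illustration under stated assumptions: the
        cost table, its values, and the function name are hypothetical, and
        the check takes the conservative reading that every upstream ASBR
        must have a strictly lower IGP cost than the redundant ABR.</t>

        <figure>
          <artwork align="left" xml:space="preserve"><![CDATA[
```python
from typing import Dict, Iterable, Tuple

# Hypothetical IGP costs between node pairs (values invented).
IgpCosts = Dict[Tuple[str, str], int]


def abr_prefers_upstream(costs: IgpCosts, abr: str,
                         redundant_abr: str,
                         upstream_asbrs: Iterable[str]) -> bool:
    """Return True if every upstream ASBR has a strictly lower IGP
    cost from `abr` than the redundant ABR has."""
    to_redundant = costs[(abr, redundant_abr)]
    return all(costs[(abr, asbr)] < to_redundant
               for asbr in upstream_asbrs)


costs = {
    ("ABR23", "ABR24"): 30,
    ("ABR23", "ASBR21"): 10,
    ("ABR23", "ASBR22"): 20,
}
ok = abr_prefers_upstream(costs, "ABR23", "ABR24",
                          ["ASBR21", "ASBR22"])
print(ok)  # True
```
]]></artwork>
        </figure>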
      </section>
    </section>

    <section anchor="IANA" title="IANA Considerations">
      <t>This document makes no new requests of IANA.</t>
    </section>

    <section anchor="Security" title="Security Considerations">
      <t>This document does not change the underlying security issues inherent
      in the existing BGP protocol, such as those described in <xref
      target="RFC4271"/>, <xref target="RFC4272"/> and <xref
      target="RFC4456"/>.</t>
    </section>
  </middle>

  <back>
    <references title="Normative References">
      <?rfc include="https://xml2rfc.tools.ietf.org/public/rfc/bibxml/reference.RFC.4456.xml"?>

      <?rfc include="https://xml2rfc.tools.ietf.org/public/rfc/bibxml/reference.RFC.2119.xml"?>

      <?rfc include="https://xml2rfc.tools.ietf.org/public/rfc/bibxml/reference.RFC.4271.xml"?>

      <?rfc include="https://xml2rfc.tools.ietf.org/public/rfc/bibxml/reference.RFC.4272.xml"?>

      <?rfc include="https://xml2rfc.tools.ietf.org/public/rfc/bibxml/reference.RFC.8174.xml"?>
    </references>

    <references title="Informative References">
      <reference anchor="BGP-CT"
                 target="https://datatracker.ietf.org/doc/html/draft-ietf-idr-bgp-ct-16">
        <front>
          <title abbrev="BGP CT">BGP Classful Transport Planes</title>

          <author fullname="Kaliraj Vairavakkalai" initials="K." role="editor"
                  surname="Vairavakkalai"/>

          <author fullname="Natrajan Venkataraman" initials="N." role="editor"
                  surname="Venkataraman"/>

          <date day="22" month="09" year="2023"/>
        </front>
      </reference>
    </references>

    <section anchor="Appendix A" numbered="true" title="Appendix">
      <section title="Document History">
        <t>The content in this document was introduced as part of <xref
        target="BGP-CT"/>. Because the described problem is not specific
        to BGP CT and the solution is useful for other BGP address families
        as well, the content has been extracted into this separate
        document.</t>
      </section>
    </section>

    <section anchor="Contributors" numbered="false" title="Contributors">
      <section anchor="Co-Authors" numbered="false" title="Co-Authors">
        <author fullname="Reshma Das" initials="R." surname="Das">
          <organization>Juniper Networks, Inc.</organization>

          <address>
            <postal>
              <street>1133 Innovation Way,</street>

              <city>Sunnyvale</city>

              <region>CA</region>

              <code>94089</code>

              <country>US</country>
            </postal>

            <email>dreshma@juniper.net</email>
          </address>
        </author>

        <author fullname="Israel Means" initials="I" surname="Means">
          <organization abbrev="">AT&amp;T</organization>

          <address>
            <postal>
              <street>2212 Avenida Mara,</street>

              <city>Chula Vista</city>

              <region>California</region>

              <code>91914</code>

              <country>USA</country>
            </postal>

            <email>israel.means@att.com</email>
          </address>
        </author>

        <author fullname="Csaba Mate" initials="CS" surname="Mate">
          <organization abbrev="">KIFU, Hungarian NREN</organization>

          <address>
            <postal>
              <street>35 Vaci street,</street>

              <city>Budapest</city>

              <region/>

              <code>1134</code>

              <country>Hungary</country>
            </postal>

            <email>ietf@nop.hu</email>
          </address>
        </author>

        <author fullname="Deepak J Gowda" initials="J" surname="Gowda">
          <organization abbrev="">Extreme Networks</organization>

          <address>
            <postal>
              <street>55 Commerce Valley Drive West, Suite 300,</street>

              <city>Thornhill, Toronto</city>

              <region>Ontario</region>

              <code>L3T 7V9</code>

              <country>Canada</country>
            </postal>

            <email>dgowda@extremenetworks.com</email>
          </address>
        </author>
      </section>

      <section anchor="Other Contributors" numbered="false"
               title="Other Contributors">
        <author fullname="Balaji Rajagopalan" initials="B."
                surname="Rajagopalan">
          <organization>Juniper Networks, Inc.</organization>

          <address>
            <postal>
              <street>Electra, Exora Business Park~Marathahalli - Sarjapur
              Outer Ring Road,</street>

              <city>Bangalore</city>

              <region>KA</region>

              <code>560103</code>

              <country>India</country>
            </postal>

            <email>balajir@juniper.net</email>
          </address>
        </author>

        <author fullname="Rajesh M" initials="M">
          <organization>Juniper Networks, Inc.</organization>

          <address>
            <postal>
              <street>Electra, Exora Business Park~Marathahalli - Sarjapur
              Outer Ring Road,</street>

              <city>Bangalore</city>

              <region>KA</region>

              <code>560103</code>

              <country>India</country>
            </postal>

            <email>mrajesh@juniper.net</email>
          </address>
        </author>

        <author fullname="Chaitanya Yadlapalli" initials="C"
                surname="Yadlapalli">
          <organization abbrev="">AT&amp;T</organization>

          <address>
            <postal>
              <street>200 S Laurel Ave,</street>

              <city>Middletown</city>

              <region>NJ</region>

              <code>07748</code>

              <country>USA</country>
            </postal>

            <email>cy098d@att.com</email>
          </address>
        </author>

        <author fullname="Mazen Khaddam" initials="M" surname="Khaddam">
          <organization abbrev="">Cox Communications Inc.</organization>

          <address>
            <postal>
              <street/>

              <city>Atlanta</city>

              <region>GA</region>

              <code/>

              <country>USA</country>
            </postal>

            <email>mazen.khaddam@cox.com</email>
          </address>
        </author>

        <author fullname="Rafal Jan Szarecki" initials="R" surname="Szarecki">
          <organization abbrev="">Google</organization>

          <address>
            <postal>
              <street>1160 N Mathilda Ave, Bldg 5,</street>

              <city>Sunnyvale</city>

              <region>CA</region>

              <code>94089</code>

              <country>USA</country>
            </postal>

            <email>szarecki@google.com</email>
          </address>
        </author>

        <author fullname="Xiaohu Xu" initials="X" surname="Xu">
          <organization abbrev="">China Mobile</organization>

          <address>
            <postal>
              <street/>

              <city>Beijing</city>

              <region/>

              <code/>

              <country>China</country>
            </postal>

            <email>xuxiaohu@cmss.chinamobile.com</email>
          </address>
        </author>
      </section>
    </section>

    <section anchor="Acknowledgements" numbered="false"
             title="Acknowledgements">
      <t>The authors thank Jeff Haas, John Scudder, Susan Hares, Dongjie
      (Jimmy), Moses Nagarajah, Jeffrey (Zhaohui) Zhang, Joel Halpern,
      Jingrong Xie, Mohamed Boucadair, Greg Skinner, Simon Leinen, Navaneetha
      Krishnan, Ravi M R, Chandrasekar Ramachandran, Shraddha Hegde, Colby
      Barth, Vishnu Pavan Beeram, Sunil Malali, William J Britto, R Shilpa,
      Ashish Kumar (FE), Sunil Kumar Rawat, Abhishek Chakraborty, Richard
      Roberts, Krzysztof Szarkowicz, John E Drake, Srihari Sangli, Jim Uttaro,
      Luay Jalil, Keyur Patel, Ketan Talaulikar, Dhananjaya Rao, Swadesh
      Agarwal, Robert Raszuk, Ahmed Darwish, Aravind Srinivas Srinivasa
      Prabhakar, Moshiko Nayman, Chris Tripp, Gyan Mishra, Vijay Kestur,
      and Santosh Kolenchery for all the valuable discussions, constructive
      criticism, and review comments.</t>

      <t>The decision not to reuse SAFI 128, and instead to create a new
      address family to carry these transport routes, was based on a
      suggestion made by Richard Roberts and Krzysztof Szarkowicz.</t>
    </section>
  </back>
</rfc>
