<?xml version="1.0" encoding="US-ASCII"?>
<!DOCTYPE rfc SYSTEM "rfc2629.dtd">
<?rfc toc="yes"?>
<?rfc tocompact="yes"?>
<?rfc tocdepth="3"?>
<?rfc tocindent="yes"?>
<?rfc symrefs="yes"?>
<?rfc sortrefs="yes"?>
<?rfc comments="yes"?>
<?rfc inline="yes"?>
<?rfc compact="yes"?>
<?rfc subcompact="no"?>
<rfc category="std" docName="draft-heitz-bess-evpn-option-b-00" ipr="trust200902">
  <front>
    <title abbrev="EVPN Inter-AS Option B">Multi-homing in EVPN with Inter-AS Option B</title>

    <author fullname="Jakob Heitz" initials="J." surname="Heitz">
      <organization>Cisco</organization>
      <address>
        <postal>
          <street>170 West Tasman Drive</street>
          <city>San Jose</city>
          <region>CA</region>
          <code>95054</code>
          <country>USA</country>
        </postal>
        <email>jheitz@cisco.com</email>
      </address>
    </author>

    <author fullname="Ali Sajassi" initials="A." surname="Sajassi">
      <organization>Cisco</organization>
      <address>
        <postal>
          <street>170 West Tasman Drive</street>
          <city>San Jose</city>
          <region>CA</region>
          <code>95054</code>
          <country>USA</country>
        </postal>
        <email>sajassi@cisco.com</email>
      </address>
    </author>

    <author fullname="John Drake" initials="J." surname="Drake">
      <organization>Juniper</organization>
      <address>
        <postal>
          <street></street>
        </postal>
        <email>jdrake@juniper.net</email>
      </address>
    </author>

    <author fullname="Jorge Rabadan" initials="J." surname="Rabadan">
      <organization>Nokia</organization>
      <address>
        <postal>
          <street></street>
        </postal>
        <email>jorge.rabadan@nokia.com</email>
      </address>
    </author>

    <date/>

    <area>Routing</area>
    <workgroup>BESS</workgroup>
    <keyword>BGP</keyword>
    <keyword>communities</keyword>

    <abstract>
      <t>The BGP speaker that originates an EVPN Ethernet A-D per ES route is identified by the next-hop of the route. When the route is propagated by an ASBR as an Inter-AS Option B route, the ASBR overwrites the next-hop. This document describes a method to identify the originator of the route.</t>
    </abstract>

    <note title="Requirements Language">
      <t>The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in <xref target="RFC2119"/>.</t>
    </note>

  </front>

  <middle>
    <section title="Terminology">
      <t>Inter-AS Option B: This is described in Section 10.b of <xref target="RFC4364"/>.</t>
      <t>EAD-per-ES: Ethernet A-D per Ethernet Segment Route.</t>
      <t>EAD-per-EVI: Ethernet A-D per EVPN Instance Route.</t>
      <t>EAD: EVPN Type 1 route: Ethernet Auto-discovery Route. Either an EAD-per-ES or an EAD-per-EVI route.</t>
      <t>Type 2/5: either the EVPN Type 2 route: MAC/IP Advertisement Route or the EVPN Type 5 route: IP Prefix Route described in <xref target="I-D.ietf-bess-evpn-prefix-advertisement"/>.</t>
      <t>Mass Withdraw: To withdraw a route from the forwarding table while keeping it in the BGP table. For example, a MAC route that is mass withdrawn remains in the BGP table, because it is required for directing packets with the specified MAC destination address to a matching backup or alias route. When a MAC route is completely withdrawn, the matching backup or alias routes can no longer be used for the given MAC address. The withdrawal of an EAD-per-ES route causes the mass withdrawal of the associated Type 2/5 routes as well as the associated EAD-per-EVI routes.</t>
    </section>

    <section title="Introduction">
      <t>Inter-AS Option B is illustrated in Figure 1.
<figure align="center"><artwork><![CDATA[
        PE1
       /   \
     CE1    ASBR1---ASBR2---PE3--CE2
       \   /
        PE2

        Figure 1: Inter-AS Option B
]]></artwork></figure></t>
      <t>Traffic flow is from CE2 to CE1 where PE3 is an imposition PE, and PE1 and PE2 are disposition PEs.</t>
      <t>In a multi-homing scenario, the router that performs the redundancy switchover or the load balancing (e.g., PE3) must know which router originated the Ethernet A-D routes. These redundancy functions are normally implemented on a PE, but not on an ASBR.</t>
      <t>Quote from <xref target="RFC7432"/>:
      <list><t>"A remote PE that receives a MAC/IP Advertisement route with a non-reserved ESI SHOULD consider the advertised MAC address to be reachable via all PEs that have advertised reachability to that MAC address's EVI/ES via the combination of an Ethernet A-D per EVI route for that EVI/ES (and Ethernet tag, if applicable) AND Ethernet A-D per ES routes for that ES."</t></list>
      </t>
      <t>In the Intra-AS case, the remote PE identifies the "PEs that have advertised reachability" by the next-hops of the Ethernet A-D routes. In the Inter-AS Option B case, ASBR1 and ASBR2 rewrite the next-hops to themselves on all EVPN route advertisements, so the identity of the PE that originated an advertisement is lost.</t>
      <t>As a result, PE3 is unable to distinguish an EAD-per-ES route that originated at PE1 from one that originated at PE2.</t>
    </section>

    <section title="Solution using the Tunnel Encapsulation Attribute">

      <t>The Tunnel Encapsulation Attribute is specified in <xref target="I-D.ietf-idr-tunnel-encaps"/>. A new TLV to identify the PE of Origin is specified here. It is called PEO. The tunnel type for the PEO (suggested value 15) is to be assigned by IANA. The PEO MUST contain the Remote Endpoint Sub-TLV. The PEO must be able to uniquely identify the PE of origin within all ASes that participate in an EVPN instance.</t>

      <t>If a BGP speaker, such as a route reflector or an ASBR, is about to re-advertise a Type 2/5 or EAD route that does not have a PEO, and will change the next-hop of that route, then it MUST add a PEO by putting the received next-hop into the Remote Endpoint Sub-TLV of the PEO. This ensures that every such EVPN route carries the information that imposition PEs need to perform aliasing and mass withdraw.</t>

      <t>Any router that re-advertises a route that contains a PEO may modify some TLVs in the Tunnel Encapsulation Attribute. However, it MUST keep the PEO unchanged. ASBR1 and ASBR2 in Figure 1 are examples of such routers.</t>
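      <t>The re-advertisement rules above can be sketched as follows. This is a minimal illustration, not a normative procedure: the EvpnRoute class, its field names, the readvertise function and the documentation addresses are hypothetical, and the PEO is reduced to the address carried in its Remote Endpoint Sub-TLV.</t>
<figure><artwork><![CDATA[
```python
from dataclasses import dataclass, replace
from typing import Optional


@dataclass(frozen=True)
class EvpnRoute:
    """Hypothetical, simplified route; not a wire format."""
    route_type: int            # 1 = EAD, 2 = MAC/IP, 5 = IP Prefix
    next_hop: str
    peo: Optional[str] = None  # address in the PEO, if present


# Route types that carry a PEO in this sketch (EAD and Type 2/5).
PEO_ROUTE_TYPES = {1, 2, 5}


def readvertise(route: EvpnRoute, new_next_hop: str) -> EvpnRoute:
    """Model re-advertisement by an ASBR or route reflector."""
    peo = route.peo
    if (peo is None
            and new_next_hop != route.next_hop
            and route.route_type in PEO_ROUTE_TYPES):
        # No PEO yet and the next-hop changes: record the received
        # next-hop as the PE of origin.
        peo = route.next_hop
    # An existing PEO is passed through unchanged.
    return replace(route, next_hop=new_next_hop, peo=peo)


from_pe1 = EvpnRoute(route_type=1, next_hop="192.0.2.1")
at_asbr1 = readvertise(from_pe1, "198.51.100.1")
at_asbr2 = readvertise(at_asbr1, "203.0.113.1")
print(at_asbr2.next_hop, at_asbr2.peo)
```
]]></artwork></figure>
      <t>In this sketch, a route from PE1 gains a PEO when the first ASBR rewrites the next-hop, and the second ASBR rewrites the next-hop again while leaving the PEO as it was.</t>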

    </section>

    <section title="Operation">
      <t>For an Inter-AS Option B scenario, when a PE receives EVPN routes with a PEO from an ASBR, both the aliasing function and mass withdraw work as specified in <xref target="RFC7432"/>; that is, the imposition PE (e.g., PE3) can process mass withdraw messages (Ethernet A-D per ES routes). However, if a PE receives EVPN routes without a PEO from an ASBR, then the mass withdraw function operates in a degenerate mode in which only Ethernet A-D per EVI routes can be processed (each for its corresponding MAC-VRF), but not Ethernet A-D per ES routes (which correspond to all the impacted MAC-VRFs). The following sections detail the procedures associated with PEO processing.</t>
    </section>

    <section title="Procedures at the Imposition PE">
       <section title="Primer for subsequent sections">
         <t>When routes are being compared, they must exist in the same MAC-VRF and have the same non-reserved ESI. In addition, when Type 2/5 routes and EAD-per-EVI routes are being compared, they must have the same Ethernet Tag. Type 2/5 routes with ESI==0 do not use mass withdrawal or aliasing.</t>
       </section>

       <section title="PEO exists on all Type 2/5 and EAD Routes">
         <t>If all Type 2/5 and EAD routes have a PEO, then "PEs that have advertised reachability" can be identified by the PEO and the procedures of <xref target="RFC7432"/> can be applied without modification.</t>
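         <t>As an illustration of identifying the advertising PEs by PEO rather than by next-hop, consider the sketch below. The Ead record, the advertising_pes function and the sample values are hypothetical, and all routes are assumed to belong to the same MAC-VRF.</t>
<figure><artwork><![CDATA[
```python
from typing import Iterable, NamedTuple, Set


class Ead(NamedTuple):
    """Hypothetical EAD route as seen by the imposition PE."""
    per_es: bool      # True: EAD-per-ES, False: EAD-per-EVI
    esi: str
    next_hop: str     # rewritten by the ASBR
    peo: str          # PE of origin
    eth_tag: int = 0  # only compared for EAD-per-EVI routes


def advertising_pes(esi: str, eth_tag: int,
                    eads: Iterable[Ead]) -> Set[str]:
    """PEs that sent both an EAD-per-ES and an EAD-per-EVI route."""
    eads = list(eads)
    per_es = {r.peo for r in eads if r.per_es and r.esi == esi}
    per_evi = {r.peo for r in eads
               if not r.per_es and r.esi == esi
               and r.eth_tag == eth_tag}
    return per_es & per_evi


ASBR2 = "203.0.113.1"
table = [
    Ead(True, "ES1", ASBR2, "PE1"),
    Ead(True, "ES1", ASBR2, "PE2"),
    Ead(False, "ES1", ASBR2, "PE1", eth_tag=10),
    Ead(False, "ES1", ASBR2, "PE2", eth_tag=10),
]
both = advertising_pes("ES1", 10, table)
# Withdrawing PE1's EAD-per-ES route leaves only PE2.
after = advertising_pes("ES1", 10, table[1:])
print(sorted(both), sorted(after))
```
]]></artwork></figure>
         <t>Every route in the sample table has the same next-hop, so only the PEO lets the imposition PE tell PE1 and PE2 apart.</t>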
       </section>

       <section title="Some routes do not contain PEO">

         <t>The routes that have a PEO are handled as per the previous section. The routes that do not have a PEO need the following procedures.</t>

         <t>Type 2/5 routes without a PEO and EAD-per-EVI routes without a PEO are valid if at least one EAD-per-ES route without a PEO exists with the same next-hop. In other words, if multiple EAD-per-ES routes exist with the same next-hop as a Type 2/5 route, then the Type 2/5 route is mass withdrawn only once all of those EAD-per-ES routes are withdrawn. This rule is necessary because a BGP speaker may serve dual roles as ASBR and PE.</t>
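         <t>The validity rule above might be expressed as in the following sketch. The Route record and the is_valid function are hypothetical names, and the routes are assumed to be in the same MAC-VRF. In a real table the two EAD-per-ES routes would differ in RD, which is not modelled here.</t>
<figure><artwork><![CDATA[
```python
from typing import Iterable, NamedTuple, Optional


class Route(NamedTuple):
    """Hypothetical route record within one MAC-VRF."""
    kind: str                  # "EAD-ES", "EAD-EVI" or "TYPE-2/5"
    esi: str
    next_hop: str
    peo: Optional[str] = None


def is_valid(route: Route, table: Iterable[Route]) -> bool:
    """Validity of a PEO-less Type 2/5 or EAD-per-EVI route."""
    return any(
        r.kind == "EAD-ES" and r.peo is None
        and r.esi == route.esi and r.next_hop == route.next_hop
        for r in table
    )


ASBR = "203.0.113.1"
mac = Route("TYPE-2/5", "ES1", ASBR)
# EAD-per-ES routes from PE1 and PE2 look alike behind the ASBR.
table = [Route("EAD-ES", "ES1", ASBR), Route("EAD-ES", "ES1", ASBR)]

print(is_valid(mac, table))      # both EAD-per-ES routes present
print(is_valid(mac, table[:1]))  # one withdrawn
print(is_valid(mac, []))         # all withdrawn
```
]]></artwork></figure>
         <t>The Type 2/5 route stays valid while any one of the EAD-per-ES routes remains, and becomes subject to mass withdrawal only when the last one is gone.</t>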

         <t>[Editorial note: If it is determined that no BGP speakers exist that do not normally follow the procedures in this document (Legacy speakers), then the following subsections may be omitted.]</t>

         <t>If an EAD-per-EVI route without a PEO is withdrawn, it will mass withdraw all Type 2/5 routes without a PEO that have the same next-hop and the same RD as the EAD-per-EVI route. This is called mass-withdraw per EVI. Note that it is not the absence of the EAD-per-EVI route that causes the mass withdrawal, but the actual withdrawal itself. If the route was never there to begin with, then no withdrawal took place.</t>
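         <t>Because mass-withdraw per EVI is triggered by the withdrawal event and not by the absence of the route, it is naturally modelled as an event handler. The sketch below is one possible shape. The Type25 record, the MacVrf class and its method names are hypothetical, and all routes are assumed to share one non-reserved ESI.</t>
<figure><artwork><![CDATA[
```python
from typing import Dict, NamedTuple, Optional


class Type25(NamedTuple):
    """Hypothetical Type 2/5 route key; fields are illustrative."""
    mac: str
    rd: str
    next_hop: str
    peo: Optional[str] = None


class MacVrf:
    def __init__(self) -> None:
        # route -> True while the route is mass withdrawn
        self.routes: Dict[Type25, bool] = {}

    def add(self, route: Type25) -> None:
        self.routes[route] = False

    def on_ead_per_evi_withdraw(self, rd: str,
                                next_hop: str) -> None:
        """Handle the withdrawal of a PEO-less EAD-per-EVI route."""
        for route in list(self.routes):
            if (route.peo is None and route.rd == rd
                    and route.next_hop == next_hop):
                # Removed from forwarding, kept in the BGP table.
                self.routes[route] = True


vrf = MacVrf()
a = Type25("00:00:5e:00:53:01", "192.0.2.1:10", "203.0.113.1")
b = Type25("00:00:5e:00:53:02", "192.0.2.2:10", "203.0.113.1")
vrf.add(a)
vrf.add(b)
# No EAD-per-EVI route was ever received: nothing is withdrawn.
print(vrf.routes[a], vrf.routes[b])
vrf.on_ead_per_evi_withdraw("192.0.2.1:10", "203.0.113.1")
print(vrf.routes[a], vrf.routes[b])
```
]]></artwork></figure>
         <t>Only the route whose RD and next-hop both match the withdrawn EAD-per-EVI route is marked. The other route, which shares the next-hop but has a different RD, is unaffected.</t>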

         <t>If any entity in the network rewrites an RD, then all entities must rewrite the RD in a consistent manner, such that routes with the same RD continue to have the same RD and routes with different RDs continue to have different RDs. Note that if this condition is violated, then other network functions would also break.</t>
       </section>

       <section title="PEO exists on EAD routes, but not on Type 2/5 routes">
         <t>If a Type 2/5 route exists without a PEO, and an EAD-per-EVI route exists with a PEO and has the same next-hop and the same RD as the Type 2/5 route, then the Type 2/5 route shall inherit the PEO from the EAD-per-EVI route. Thereafter, Section 5.2 applies.</t>
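         <t>The inheritance rule can be illustrated with the sketch below. The Route record and the inherit_peo function are hypothetical, and the routes being compared are assumed to already share the same MAC-VRF, ESI and Ethernet Tag.</t>
<figure><artwork><![CDATA[
```python
from typing import Iterable, NamedTuple, Optional


class Route(NamedTuple):
    """Hypothetical record; same MAC-VRF, ESI and Ethernet Tag."""
    kind: str                  # "EAD-EVI" or "TYPE-2/5"
    rd: str
    next_hop: str
    peo: Optional[str] = None


def inherit_peo(route: Route, eads: Iterable[Route]) -> Route:
    """Copy the PEO of a matching EAD-per-EVI route, if any."""
    if route.peo is not None:
        return route
    for ead in eads:
        if (ead.kind == "EAD-EVI" and ead.peo is not None
                and ead.rd == route.rd
                and ead.next_hop == route.next_hop):
            return route._replace(peo=ead.peo)
    return route


ASBR = "203.0.113.1"
eads = [Route("EAD-EVI", "192.0.2.1:10", ASBR, peo="192.0.2.1")]
mac1 = Route("TYPE-2/5", "192.0.2.1:10", ASBR)
mac2 = Route("TYPE-2/5", "192.0.2.2:10", ASBR)
print(inherit_peo(mac1, eads).peo, inherit_peo(mac2, eads).peo)
```
]]></artwork></figure>
         <t>A Type 2/5 route whose RD does not match any EAD-per-EVI route with a PEO is returned unchanged and continues to be handled by the procedures for routes without a PEO.</t>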
       </section>

    </section>

    <section title="Security Considerations">
       <t>TBD</t>
    </section>

    <section title="IANA Considerations">
       <t>A Tunnel Encapsulation Attribute Tunnel Type for the PEO is required.</t>
    </section>

    <section title="Acknowledgements">
       <t>Thanks to Kiran Pillai, Patrice Brissette, Satya Mohanty and Keyur Patel for careful review and suggestions.</t>
    </section>

    <section title="Appendix">
    <section title="Alternative Ways to Signal PEO">
       <t>[Note to RFC editor: This appendix to be removed before publication]</t>

       <section title="Extended Community holding the IP address">
         <t>The Extended Community to use must be transitive and either IPv4 Specific or IPv6 Specific as described in <xref target="RFC5701"/>. Thus, if it is IPv4 Specific, it will be of type 0x01 and if IPv6 Specific, it will be of type 0x00.</t>
         <t>The extended community will hold the IP address of the PE that originates the EVPN routes.</t>
       </section>

       <section title="Large Community holding the BGP Identifier">
         <t>A PE can be uniquely identified by its BGP identifier (also called Router ID) and its AS number. A Large Community is a 4-octet AS specific extended community with a 6-octet local administrator field. The local administrator field should carry the BGP identifier.</t>
       </section>

    </section>

    <section title="Considerations">
       <t>It may be possible to associate the EAD-per-ES route with the Type 2/5 route by matching the Administrator Subfield of the RD. However, there are too many constraints that need to be met to make this method reliable. Basically, the RD was emphatically designed to distinguish routes, not to identify them. The constraints that need to be met are:
       <list style='symbols'>
          <t>The RD MUST be of Type 1. <xref target="RFC7432"/> recommends Type 1, but does not mandate it.</t>
          <t>The Administrator subfield of the RD MUST be the same for each of these routes originated by one PE. <xref target="RFC7432"/> does not require this. It just says "The value field comprises an IP address of the PE", but does not say that it must be the same IP address for all routes. In an IPv6-only scenario, other ways will be used to assign the RD.</t>
          <t>The Administrator subfield of the RD MUST be unique among all PEs participating in the Inter-AS EVPN. This is likely, but not guaranteed.</t>
          <t>If RDs are rewritten at AS boundaries, then the Administrator subfield MUST be rewritten in a consistent way such as to preserve the above properties.</t>
       </list></t>
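       <t>To make the first two constraints concrete, the sketch below extracts the Administrator subfield from an 8-octet RD, using the Type 1 layout of <xref target="RFC4364"/> (2-octet type, 4-octet IP address, 2-octet assigned number). The function names are hypothetical, and the sketch does not address the uniqueness and RD-rewriting constraints.</t>
<figure><artwork><![CDATA[
```python
import ipaddress
import struct
from typing import Optional


def rd_type1_admin(rd: bytes) -> Optional[ipaddress.IPv4Address]:
    """Return the Administrator subfield of a Type 1 RD, else None."""
    if len(rd) != 8:
        raise ValueError("an RD is 8 octets long")
    rd_type, admin, _assigned = struct.unpack("!HIH", rd)
    if rd_type != 1:
        return None  # first constraint not met
    return ipaddress.IPv4Address(admin)


def same_originator(rd_a: bytes, rd_b: bytes) -> bool:
    """True only if both RDs are Type 1 with equal Administrator."""
    a, b = rd_type1_admin(rd_a), rd_type1_admin(rd_b)
    return a is not None and a == b


def make_type1_rd(admin: str, assigned: int) -> bytes:
    """Build a Type 1 RD for the examples below."""
    packed = ipaddress.IPv4Address(admin).packed
    return struct.pack("!H4sH", 1, packed, assigned)


ead_rd = make_type1_rd("192.0.2.1", 1)
mac_rd = make_type1_rd("192.0.2.1", 10)
other_rd = make_type1_rd("192.0.2.2", 10)
type0_rd = struct.pack("!HHI", 0, 64500, 10)  # Type 0: AS + number

print(same_originator(ead_rd, mac_rd))
print(same_originator(ead_rd, other_rd))
print(same_originator(ead_rd, type0_rd))
```
]]></artwork></figure>
       <t>The match fails whenever either RD is not of Type 1, which is one reason this method cannot be relied upon in general.</t>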

       <t>By allowing a single EAD-per-ES route to validate all EAD-per-EVI routes and all Type 2/5 routes, some of those routes may be falsely validated. However, that is the best possible outcome without a PEO. The false validation is transient and lasts only until the Type 2/5 route can be withdrawn.</t>

       <t>The possibility of the address space of PE next-hops in one AS overlapping that of another AS was raised. In such a case, the IP address of a PE in one AS may be the same as the IP address of a different PE in another AS. Because an ASBR overwrites next-hops, this can work. The PEO contains both the ASN and the IP address of the originating PE, so this works too. However, EVPN route types 3 and 4 contain only the originating router's IP address, but not the originating router's ASN. Therefore, EVPN route types 3 and 4 may also need a PEO.</t>

       <t>The possibility of making the EAD-per-EVI route mandatory was raised. This would make some of the procedures easier, because the RD of the EAD-per-EVI route can be matched with the RD of the Type 2/5 route.</t>

    </section>

    </section>

  </middle>

  <back>
    <references title="Normative References">
      <?rfc include='reference.I-D.draft-ietf-idr-tunnel-encaps-02'?>
      <?rfc include='reference.I-D.draft-ietf-bess-evpn-prefix-advertisement-02'?>
      <?rfc include="reference.RFC.2119"?>
      <?rfc include="reference.RFC.4364"?>
      <?rfc include="reference.RFC.5701"?>
      <?rfc include="reference.RFC.7432"?>
    </references>

  </back>
</rfc>
