<?xml version="1.0" encoding="US-ASCII"?>
<!DOCTYPE rfc SYSTEM "rfc2629.dtd">
<?xml-stylesheet type='text/xsl' href='http://xml.resource.org/authoring/rfc2629.xslt' ?>
<!-- Alterations to I-D/RFC boilerplate -->
<?rfc private="" ?>
<!-- Default private="" Produce an internal memo 2.5pp shorter than an I-D or RFC -->
<?rfc rfcprocack="yes" ?>
<!-- Default rfcprocack="no" add a short sentence acknowledging xml2rfc -->
<?rfc strict="no" ?>
<!-- Default strict="no" Don't check I-D nits -->
<?rfc rfcedstyle="yes" ?>
<!-- Default rfcedstyle="yes" attempt to closely follow finer details from the latest observable RFC-Editor style -->
<!-- IETF process -->
<?rfc iprnotified="no" ?>
<!-- Default iprnotified="no" I haven't disclosed existence of IPR to IETF -->
<!-- ToC format -->
<?rfc toc="yes" ?>
<!-- Default toc="no" No Table of Contents -->
<!-- Cross referencing, footnotes, comments -->
<?rfc symrefs="yes"?>
<!-- Default symrefs="no" Don't use anchors, but use numbers for refs -->
<?rfc sortrefs="yes"?>
<!-- Default sortrefs="no" Don't sort references into order -->
<?rfc comments="yes" ?>
<!-- Default comments="no" Don't render comments -->
<?rfc inline="no" ?>
<!-- Default inline="no" if comments is "yes", then render comments inline; otherwise render them in an `Editorial Comments' section -->
<!-- Pagination control -->
<?rfc compact="yes"?>
<!-- Default compact="no" Start sections on new pages -->
<?rfc subcompact="no"?>
<!-- Default subcompact="(as compact setting)" yes/no is not quite as compact as yes/yes -->
<!-- HTML formatting control -->
<?rfc emoticonic="yes" ?>
<!-- Default emoticonic="no" Doesn't prettify HTML format -->
<rfc category="exp" docName="draft-ietf-tcpm-accurate-ecn-10"
     ipr="trust200902" updates="">
  <front>
    <title abbrev="Accurate TCP-ECN Feedback">More Accurate ECN Feedback in
    TCP</title>

    <author fullname="Bob Briscoe" initials="B." surname="Briscoe">
      <organization>Independent</organization>

      <address>
        <postal>
          <street/>

          <city/>

          <country>UK</country>
        </postal>

        <email>ietf@bobbriscoe.net</email>

        <uri>http://bobbriscoe.net/</uri>
      </address>
    </author>

    <author fullname="Mirja K&uuml;hlewind" initials="M."
            surname="K&uuml;hlewind">
      <organization>Ericsson</organization>

      <address>
        <postal>
          <street/>

          <country>Germany</country>
        </postal>

        <email>ietf@kuehlewind.net</email>
      </address>
    </author>

    <author fullname="Richard Scheffenegger" initials="R."
            surname="Scheffenegger">
      <organization>NetApp</organization>

      <address>
        <postal>
          <street/>

          <city>Vienna</city>

          <region/>

          <code/>

          <country>Austria</country>
        </postal>

        <email>Richard.Scheffenegger@netapp.com</email>
      </address>
    </author>

    <date year="2020"/>

    <area>Transport</area>

    <workgroup>TCP Maintenance &amp; Minor Extensions (tcpm)</workgroup>

    <keyword>Congestion Control and Management</keyword>

    <keyword>Congestion Notification</keyword>

    <keyword>Feedback</keyword>

    <keyword>Reliable</keyword>

    <keyword>Ordered</keyword>

    <keyword>Protocol</keyword>

    <keyword>ECN</keyword>

    <abstract>
      <t>Explicit Congestion Notification (ECN) is a mechanism where network
      nodes can mark IP packets instead of dropping them to indicate incipient
      congestion to the end-points. Receivers with an ECN-capable transport
      protocol feed back this information to the sender. ECN is specified for
      TCP in such a way that only one feedback signal can be transmitted per
      Round-Trip Time (RTT). Recent new TCP mechanisms like Congestion
      Exposure (ConEx), Data Center TCP (DCTCP) or Low Latency Low Loss
      Scalable Throughput (L4S) need more accurate ECN feedback information
      whenever more than one marking is received in one RTT. This document
      specifies an experimental scheme to provide more than one feedback
      signal per RTT in the TCP header. Given TCP header space is scarce, it
      allocates a reserved header bit previously used for the ECN-Nonce,
      which has now been declared historic. It also overloads the
      two existing ECN flags in the TCP header. The resulting extra space is
      exploited to feed back the IP-ECN field received during the 3-way
      handshake as well. Supplementary feedback information can optionally be
      provided in a new TCP option, which is never used on the TCP SYN.</t>
    </abstract>
  </front>

  <!-- ================================================================ -->

  <middle>
    <!-- ================================================================ -->

    <section anchor="accecn_Introduction" title="Introduction">
      <t>Explicit Congestion Notification (ECN) <xref target="RFC3168"/> is a
      mechanism where network nodes can mark IP packets instead of dropping
      them to indicate incipient congestion to the end-points. Receivers with
      an ECN-capable transport protocol feed back this information to the
      sender. ECN is specified for TCP in such a way that only one feedback
      signal can be transmitted per Round-Trip Time (RTT). Recently proposed
      mechanisms like Congestion Exposure (ConEx <xref target="RFC7713"/>),
      DCTCP <xref target="RFC8257"/> or L4S <xref
      target="I-D.ietf-tsvwg-l4s-arch"/> need to know when more than one
      marking is received in one RTT, which is information that cannot be
      provided by the feedback scheme as specified in <xref
      target="RFC3168"/>. This document specifies an alternative feedback
      scheme that provides more accurate information and could be used by
      these new TCP extensions. A fuller treatment of the motivation for this
      specification is given in the associated requirements document <xref
      target="RFC7560"/>.</t>

      <t>This document specifies an experimental scheme for ECN feedback in
      the TCP header to provide more than one feedback signal per RTT. It will
      be called the more accurate ECN feedback scheme, or AccECN for short. If
      AccECN progresses from experimental to the standards track, it is
      intended to be a complete replacement for classic TCP/ECN feedback, not
      a fork in the design of TCP. AccECN feedback complements TCP's loss
      feedback and it supplements classic TCP/ECN feedback, so its
      applicability is intended to include all public and private IP networks
      (and even any non-IP networks over which TCP is used today), whether or
      not any nodes on the path support ECN of whatever flavour.</t>

      <t>Until the AccECN experiment succeeds, <xref target="RFC3168"/> will
      remain as the only standards track specification for adding ECN to TCP.
      To avoid confusion, in this document we use the term 'classic ECN' for
      the pre-existing ECN specification <xref target="RFC3168"/>.</t>

      <t>AccECN feedback overloads the two existing ECN flags and allocates
      the currently reserved flag (previously called NS) in the TCP header, to
      be used as one field indicating the number of congestion experienced
      marked packets. Given the new definitions of these three bits, both ends
      have to support the new wire protocol before it can be used. Therefore
      during the TCP handshake the two ends use these three bits in the TCP
      header to negotiate the most advanced feedback protocol that they can
      both support, in a way that is backward compatible with <xref
      target="RFC3168"/>.</t>

      <t>AccECN is solely an (experimental) change to the TCP wire protocol;
      it only specifies the negotiation and signaling of more accurate ECN
      feedback from a TCP Data Receiver to a Data Sender. It is completely
      independent of how TCP might respond to congestion feedback, which is
      out of scope. For that we refer to <xref target="RFC3168"/> or any RFC
      that specifies a different response to TCP ECN feedback, for example:
      <xref target="RFC8257"/>; or ECN experiments such as those referred to
      in <xref target="RFC8311"/>, namely: a TCP-based Low Latency Low Loss
      Scalable (L4S) congestion control <xref
      target="I-D.ietf-tsvwg-l4s-arch"/>; ECN-capable TCP control packets
      <xref target="I-D.ietf-tcpm-generalized-ecn"/>; or Alternative Backoff
      with ECN (ABE) <xref target="RFC8511"/>.</t>

      <t>It is recommended that the AccECN protocol is implemented alongside
      SACK <xref target="RFC2018"/> and the experimental ECN++ protocol <xref
      target="I-D.ietf-tcpm-generalized-ecn"/>, which allows the ECN
      capability to be used on TCP control packets. Therefore, this
      specification does not discuss implementing AccECN alongside <xref
      target="RFC5562"/>, which was an earlier experimental protocol with
      narrower scope than ECN++.</t>

      <section title="Document Roadmap">
        <t>The following introductory sections outline the goals of AccECN
        (<xref target="accecn_Goals"/>) and the goal of experiments with ECN
        (<xref target="accecn_Expt_Goals"/>) so that it is clear what success
        would look like. Then terminology is defined (<xref
        target="accecn_Terminology"/>) and a recap of existing prerequisite
        technology is given (<xref target="accecn_Recap"/>).</t>

        <t><xref target="accecn_Overview"/> gives an informative overview of
        the AccECN protocol. Then <xref target="accecn_Spec"/> gives the
        normative protocol specification. <xref
        target="accecn_Interact_Variants"/> assesses the interaction of AccECN
        with commonly used variants of TCP, whether standardized or not. <xref
        target="accecn_Properties"/> summarizes the features and properties of
        AccECN.</t>

        <t><xref target="accecn_IANA_Considerations"/> summarizes the protocol
        fields and numbers that IANA will need to assign and <xref
        target="accecn_Security_Considerations"/> points to the aspects of the
        protocol that will be of interest to the security community.</t>

        <t><xref target="accecn_Algo_Examples"/> gives pseudocode examples for
        the various algorithms that AccECN uses.</t>

        <!-- <t>Three further appendices are included for use during document development {Delete this list before publication}:<list style="symbols">
            <t><xref target="accecn_Alt_Designs"/>: Protocol design
            alternatives that could be considered for inclusion in the main
            specification;</t>

            <t><xref target="accecn_Open_Issues"/>: a 'To Do' list of open
            protocol design issues;</t>

            <t><xref target="accecn_Doc_Changes"/>: Document change log.</t>
          </list></t>-->
      </section>

      <section anchor="accecn_Goals" title="Goals">
        <t><xref target="RFC7560"/> enumerates requirements that a candidate
        feedback scheme will need to satisfy, under the headings: resilience,
        timeliness, integrity, accuracy (including ordering and lack of bias),
        complexity, overhead and compatibility (both backward and forward). It
        recognizes that a perfect scheme that fully satisfies all the
        requirements is unlikely and trade-offs between requirements are
        likely. <xref target="accecn_Properties"/> presents the properties of
        AccECN against these requirements and discusses the trade-offs
        made.</t>

        <t>The requirements document recognizes that a protocol as ubiquitous
        as TCP needs to be able to serve as-yet-unspecified requirements.
        Therefore an AccECN receiver aims to act as a generic (dumb) reflector
        of congestion information so that in future new sender behaviours can
        be deployed unilaterally.</t>
      </section>

      <section anchor="accecn_Expt_Goals" title="Experiment Goals">
        <t>TCP is critical to the robust functioning of the Internet,
        therefore any proposed modifications to TCP need to be thoroughly
        tested. The present specification describes an experimental protocol
        that adds more accurate ECN feedback to the TCP protocol. The
        intention is to specify the protocol sufficiently so that more than
        one implementation can be built in order to test its function,
        robustness and interoperability (with itself and with previous
        versions of ECN and TCP).</t>

        <!--<t> <list style="hanging">
            <t hangText="Success criteria: "> The experimental protocol
        will be considered successful if it is deployed and if it satisfies
        the requirements of <xref target="RFC7560"/> in the consensus opinion
        of the IETF tcpm working group. In short, this requires that it
        improves the accuracy and timeliness of TCP's ECN feedback, as claimed
        in <xref target="accecn_Properties"/>, while striking a balance
        between the conflicting requirements of resilience, integrity and
        minimization of overhead. It also requires that it is not unduly
        complex, and that it is compatible with prevalent equipment behaviours
        in the current Internet (e.g. hardware offloading and middleboxes),
        whether or not they comply with standards.</t>-->

        <!-- <t hangText="Duration: ">To be credible, the experiment will need
            to last at least 12 months from publication of the present
            specification. At that time, a report on the experiment will be
            written up. If successful, it would then be appropriate to work on
            a standards track specification that adds more accurate ECN
            feedback to TCP.</t>
          </list></t>
ToDo: Why is this timescale point commented out? It means that, if successful after 12 months, the chairs cannot delay moving from expt to stds track. But we are still allowed to take longer to be successful.
-->

        <t>The experimental protocol will be considered successful if testing
        confirms that the proposed mechanism can be deployed at large scale.
        Testing will mostly focus on fall-back strategies in case of middlebox
        interference. Current recommended strategies are specified in Sections
        <xref format="counter" target="accecn_sec_SYN_rexmt"/>, <xref
        format="counter" target="accecn_sec_ACE_init_invalid"/>, <xref
        format="counter" target="accecn_sec_ecn-mangling"/> and <xref
        format="counter" target="accecn_Mbox_Interference"/>. The
        effectiveness of these strategies depends on the actual deployment
        situation of middleboxes. Therefore experimental verification to
        confirm large-scale path traversal in the Internet is needed before
        finalizing this specification on the Standards Track.</t>

        <t>Another focus of experimentation is the implementation feasibility
        of change-triggered ACKs as described in section <xref format="counter"
        target="accecn_option_usage"/>. While on average this should not lead
        to a higher ACK rate, it changes the ACK pattern, which can have an
        impact on hardware offload in particular. It is currently
        specified as a hard requirement, because the sender can exploit the
        predictability of the receiver's behaviour. However, further
        experimentation is needed to determine whether it will have to become
        just preferred behaviour.</t>
      </section>

      <section anchor="accecn_Terminology" title="Terminology">
        <t><list style="hanging">
            <t hangText="AccECN:">The more accurate ECN feedback scheme will
            be called AccECN for short.</t>

            <t hangText="Classic ECN:">the ECN protocol specified in <xref
            target="RFC3168"/>.</t>

            <t hangText="Classic ECN feedback:">the feedback aspect of the ECN
            protocol specified in <xref target="RFC3168"/>, including
            generation, encoding, transmission and decoding of feedback, but
            not the Data Sender's subsequent response to that feedback.</t>

            <t hangText="ACK:">A TCP acknowledgement, with or without a data
            payload (ACK=1).</t>

            <t hangText="Pure ACK:">A TCP acknowledgement without a data
            payload.</t>

            <t hangText="Acceptable packet / segment:">A packet or segment
            that passes the acceptability tests in <xref target="RFC0793"/>
            and <xref target="RFC5961"/>.</t>

            <!-- <t hangText="SupAccECN:">The Supplementary Accurate ECN field that
            provides additional resilience as well as information about the
            ordering of ECN markings covered by a delayed ACK.</t> -->

            <t hangText="TCP client:">The TCP stack that originates a
            connection.</t>

            <t hangText="TCP server:">The TCP stack that responds to a
            connection request.</t>

            <t hangText="Data Receiver:">The endpoint of a TCP half-connection
            that receives data and sends AccECN feedback.</t>

            <t hangText="Data Sender:">The endpoint of a TCP half-connection
            that sends data and receives AccECN feedback.</t>

            <!-- <t
            hangText="Outgoing AccECN Protocol Handler (or, Outgoing Protocol Handler):">The
            protocol handler at the Data Receiver that marshals the AccECN
            fields when sending an ACK.</t>

            <t
            hangText="Incoming AccECN Protocol Handler (or, Incoming Protocol Handler):">The
            protocol handler at the Data Sender that reads the AccECN fields
            when receiving an ACK.</t> -->
          </list></t>

        <t>The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
        "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
        document are to be interpreted as described in BCP 14 <xref
        target="RFC2119"/> <xref target="RFC8174"/> when, and only when, they
        appear in all capitals, as shown here.</t>
      </section>

      <section anchor="accecn_Recap"
               title="Recap of Existing ECN feedback in IP/TCP">
        <t>ECN <xref target="RFC3168"/> uses two bits in the IP header. Once
        ECN has been negotiated with the receiver at the transport layer, an
        ECN sender can set two possible codepoints (ECT(0) or ECT(1)) in the
        IP header to indicate an ECN-capable transport (ECT). <!-- It is
        prohibited from doing so unless it has checked that the receiver will
        understand ECN and be able to feed it back.--> If both ECN bits are
        zero, the packet is considered to have been sent by a Not-ECN-capable
        Transport (Not-ECT). When a network node experiences congestion, it
        will occasionally either drop or mark a packet, with the choice
        depending on the packet's ECN codepoint. If the codepoint is Not-ECT,
        only drop is appropriate. If the codepoint is ECT(0) or ECT(1), the
        node can mark the packet by setting both ECN bits, which is termed
        'Congestion Experienced' (CE), or loosely a 'congestion mark'. <xref
        target="accecn_Tab_ECN"/> summarises these codepoints.</t>

        <texttable anchor="accecn_Tab_ECN"
                   title="The ECN Field in the IP Header">
          <ttcol>IP-ECN codepoint (binary)</ttcol>

          <ttcol>Codepoint name</ttcol>

          <ttcol>Description</ttcol>

          <c>00</c>

          <c>Not-ECT</c>

          <c>Not&nbsp;ECN-Capable&nbsp;Transport</c>

          <c>01</c>

          <c>ECT(1)</c>

          <c>ECN-Capable&nbsp;Transport (1)</c>

          <c>10</c>

          <c>ECT(0)</c>

          <c>ECN-Capable&nbsp;Transport (0)</c>

          <c>11</c>

          <c>CE</c>

          <c>Congestion&nbsp;Experienced</c>
        </texttable>
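
        <t>As an illustration only, the codepoints in <xref
        target="accecn_Tab_ECN"/> could be decoded as in the following
        Python sketch. It assumes the caller passes the octet that carries
        the ECN field in its two least significant bits (the former IPv4 ToS
        octet or the IPv6 Traffic Class). The names IpEcn, ip_ecn and mark_ce
        are hypothetical and are not defined by this specification.</t>

        <figure>
          <artwork><![CDATA[
```python
from enum import IntEnum


class IpEcn(IntEnum):
    """The four IP-ECN codepoints (binary values as in the table)."""
    NOT_ECT = 0b00
    ECT_1 = 0b01
    ECT_0 = 0b10
    CE = 0b11


def ip_ecn(tos_octet: int) -> IpEcn:
    """Extract the ECN codepoint from the 2 least significant bits."""
    return IpEcn(tos_octet & 0b11)


def mark_ce(tos_octet: int) -> int:
    """Set CE on an ECN-capable packet.

    A Not-ECT packet is returned unchanged; a congested node would
    have to drop it instead (dropping is not modelled here).
    """
    if ip_ecn(tos_octet) is IpEcn.NOT_ECT:
        return tos_octet
    return tos_octet | 0b11
```
]]></artwork>
        </figure>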

        <t>In the TCP header the first two bits in byte 14 are defined as
        flags for the use of ECN (CWR and ECE in <xref
        target="accecn_Fig_TCPHdr"/> <xref target="RFC3168"/>). A TCP client
        indicates it supports ECN by setting ECE=CWR=1 in the SYN, and an
        ECN-enabled server confirms ECN support by setting ECE=1 and CWR=0 in
        the SYN/ACK. On reception of a CE-marked packet at the IP layer, the
        Data Receiver starts to set the Echo Congestion Experienced (ECE) flag
        continuously in the TCP header of ACKs, which ensures the signal is
        received reliably even if ACKs are lost. The TCP sender confirms that
        it has received at least one ECE signal by responding with the
        congestion window reduced (CWR) flag, which allows the TCP receiver to
        stop repeating the ECN-Echo flag. This always leads to a full RTT of
        ACKs with ECE set. Thus any additional CE markings arriving within
        this RTT cannot be fed back.</t>
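
        <t>The loss of information can be seen in the following minimal
        Python sketch of the classic ECN echo behaviour described above. It
        is a simplification that assumes one ACK per arriving packet; the
        function name ece_flags is hypothetical.</t>

        <figure>
          <artwork><![CDATA[
```python
def ece_flags(arrivals):
    """Return the ECE flag a classic ECN receiver sets on each ACK.

    arrivals: list of (ce, cwr) booleans, one pair per arriving
    packet. One ACK per packet is assumed for simplicity.
    """
    ece = False
    flags = []
    for ce, cwr in arrivals:
        if cwr:  # the sender has confirmed, so stop echoing
            ece = False
        if ce:   # a CE mark starts (or restarts) the echo
            ece = True
        flags.append(ece)
    return flags


one_mark = [(True, False), (False, False), (False, False)]
three_marks = [(True, False), (True, False), (True, False)]
# Both arrival patterns yield identical feedback, so the sender
# cannot tell how many CE marks arrived within the round trip.
assert ece_flags(one_mark) == ece_flags(three_marks)
```
]]></artwork>
        </figure>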

        <t>The last bit in byte 13 of the TCP header was defined as the Nonce
        Sum (NS) for the ECN Nonce <xref target="RFC3540"/>. In the absence of
        widespread deployment, RFC 3540 has been reclassified as historic <xref
        target="RFC8311"/> and the respective flag has been marked as
        "reserved", making this TCP flag available for use by the AccECN
        experiment instead.</t>

        <?rfc needLines="8" ?>

        <figure align="center" anchor="accecn_Fig_TCPHdr"
                title="The (post-ECN Nonce) definition of the TCP header flags">
          <artwork align="center"><![CDATA[             
  0   1   2   3   4   5   6   7   8   9  10  11  12  13  14  15
+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+
|               |           | N | C | E | U | A | P | R | S | F |
| Header Length | Reserved  | S | W | C | R | C | S | S | Y | I |
|               |           |   | R | E | G | K | H | T | N | N |
+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+
]]></artwork>
        </figure>
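
        <t>For illustration, the three flags of interest could be read from
        a raw TCP header as in the following Python sketch. It assumes the
        layout in <xref target="accecn_Fig_TCPHdr"/>, with bit 0 as the most
        significant bit of the 16-bit word formed by bytes 13 and 14 of the
        header (offsets 12 and 13 when counting from zero). The function
        name ecn_flags and the dictionary keys are hypothetical.</t>

        <figure>
          <artwork><![CDATA[
```python
import struct


def ecn_flags(tcp_header: bytes) -> dict:
    """Read the header length and the NS, CWR and ECE flags."""
    # Bytes 13-14 of the header (zero-based offsets 12-13),
    # in network byte order.
    (word,) = struct.unpack_from("!H", tcp_header, 12)
    return {
        "header_len_words": word >> 12,  # bits 0-3
        "ns": (word >> 8) & 1,           # bit 7
        "cwr": (word >> 7) & 1,          # bit 8
        "ece": (word >> 6) & 1,          # bit 9
    }


# A 20-byte header with Header Length = 5, CWR = ECE = SYN = 1,
# as a classic ECN client would send on a SYN.
syn = bytes(12) + bytes([0x50, 0xC2]) + bytes(6)
flags = ecn_flags(syn)
```
]]></artwork>
        </figure>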
      </section>
    </section>

    <!-- ================================================================ -->

    <section anchor="accecn_Overview"
             title="AccECN Protocol Overview and Rationale">
      <t>This section provides an informative overview of the AccECN protocol
      that will be normatively specified in <xref target="accecn_Spec"/>.</t>

      <t>Like the original TCP approach, the Data Receiver of each TCP
      half-connection sends AccECN feedback to the Data Sender on TCP
      acknowledgements, reusing data packets of the other half-connection
      whenever possible.</t>

      <!--<section title="Essential and Supplementary Parts">-->

      <t>The AccECN protocol has had to be designed in two parts:<list
          style="symbols">
          <t>an essential part that re-uses ECN TCP header bits to feed back
          the number of arriving CE marked packets. This provides more
          accuracy than classic ECN feedback, but limited resilience against
          ACK loss;</t>

          <t>a supplementary part using a new AccECN TCP Option that provides
          additional feedback on the number of bytes that arrive marked with
          each of the three ECN codepoints (not just CE marks). This provides
          greater resilience against ACK loss than the essential feedback, but
          it is more likely to suffer from middlebox interference. <!-- <t>a supplementary part that serves three functions:<list
                style="symbols">
                <t>it greatly improves the resilience of AccECN feedback
                information against loss of ACKs;</t>

                <t>it provides information about the order in which ECN
                markings in the IP header arrived at the Data Receiver;</t>

                <t>it improves the timeliness of AccECN feedback when a
                delayed ACK covers multiple congestion signals.</t>
              </list> --></t>
        </list>The two-part design was necessary, given limitations on the
      space available for TCP options and given the possibility that certain
      incorrectly designed middleboxes prevent TCP using any new options.</t>

      <t>The essential part overloads the previous definition of the three
      flags in the TCP header that had been assigned for use by ECN. This
      design choice deliberately replaces the classic ECN feedback protocol,
      rather than leaving classic ECN feedback intact and adding more accurate
      feedback separately because:<list style="symbols">
          <t>this efficiently reuses scarce TCP header space, given TCP option
          space is approaching saturation;</t>

          <t>a single upgrade path for the TCP protocol is preferable to a
          fork in the design;</t>

          <t>otherwise classic and accurate ECN feedback could give
          conflicting feedback on the same segment, which could open up new
          security concerns and make implementations unnecessarily
          complex;</t>

          <t>middleboxes are more likely to faithfully forward the TCP ECN
          flags than newly defined areas of the TCP header.</t>
        </list></t>

      <t>AccECN is designed to work even if the supplementary part is removed
      or zeroed out, as long as the essential part gets through.</t>

      <section title="Capability Negotiation">
        <t>AccECN is a change to the wire protocol of the main TCP header,
        therefore it can only be used if both endpoints have been upgraded to
        understand it. The TCP client signals support for AccECN on the
        initial SYN of a connection and the TCP server signals whether it
        supports AccECN on the SYN/ACK. The TCP flags on the SYN that the
        client uses to signal AccECN support have been carefully chosen so
        that a TCP server will interpret them as a request to support the most
        recent variant of ECN feedback that it supports. Then the client falls
        back to the same variant of ECN feedback.</t>

        <t>An AccECN TCP client does not send the new AccECN Option on the SYN
        as SYN option space is limited. The TCP server sends the AccECN Option
        on the SYN/ACK and the client sends it on the first ACK to test
        whether the network path forwards the option correctly.</t>
      </section>

      <section title="Feedback Mechanism">
        <t>A Data Receiver maintains four counters initialized at the start of
        the half-connection. Three count the number of arriving payload bytes
        marked CE, ECT(1) and ECT(0) respectively. The fourth counts the
        number of packets arriving marked with a CE codepoint (including
        control packets without payload if they are CE-marked).</t>

        <t>The Data Sender maintains four equivalent counters for the
        half-connection, and the AccECN protocol is designed to ensure they will
        match the values in the Data Receiver's counters, albeit after a
        little delay.</t>

        <t>Each ACK carries the three least significant bits (LSBs) of the
        packet-based CE counter using the ECN bits in the TCP header, now
        renamed the Accurate ECN (ACE) field (see <xref
        target="accecn_Fig_ACE_ACK"/> later). The 24 LSBs of each byte counter
        are carried in the AccECN Option.</t>
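
        <t>The following Python sketch illustrates the four counters at the
        Data Receiver and the truncation to 3 and 24 least significant bits.
        It is not a normative algorithm. The class and attribute names are
        hypothetical, the initial counter values are placeholders (the
        actual values are given in the normative specification), and the
        order of the tuple returned by option_fields() is chosen only for
        the example.</t>

        <figure>
          <artwork><![CDATA[
```python
NOT_ECT, ECT_1, ECT_0, CE = 0b00, 0b01, 0b10, 0b11


class ReceiverCounters:
    """Sketch of the Data Receiver's counters for one half-connection."""

    def __init__(self, cep=0, ceb=0, e0b=0, e1b=0):
        # Placeholder initial values; an assumption of this sketch.
        self.cep = cep  # packets that arrived marked CE
        self.ceb = ceb  # payload bytes that arrived marked CE
        self.e0b = e0b  # payload bytes that arrived marked ECT(0)
        self.e1b = e1b  # payload bytes that arrived marked ECT(1)

    def on_packet(self, ecn: int, payload_len: int) -> None:
        if ecn == CE:
            self.cep += 1  # counted even if payload_len is zero
            self.ceb += payload_len
        elif ecn == ECT_0:
            self.e0b += payload_len
        elif ecn == ECT_1:
            self.e1b += payload_len

    def ace(self) -> int:
        """The 3 LSBs of the CE packet counter."""
        return self.cep & 0b111

    def option_fields(self) -> tuple:
        """The 24 LSBs of each byte counter (order is illustrative)."""
        return tuple(c & 0xFFFFFF
                     for c in (self.ceb, self.e0b, self.e1b))


r = ReceiverCounters()
for _ in range(9):
    r.on_packet(CE, 1000)  # nine CE-marked data packets
r.on_packet(CE, 0)         # one CE-marked packet without payload
r.on_packet(ECT_0, 500)
```
]]></artwork>
        </figure>

        <t>In this example the packet counter has reached 10, so the ACE
        field would carry the value 2, while the CE byte counter is
        unaffected by the packet that carried no payload.</t>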
      </section>

      <section title="Delayed ACKs and Resilience Against ACK Loss">
        <t>With both the ACE and the AccECN Option mechanisms, the Data
        Receiver continually repeats the current LSBs of each of its
        respective counters. There is no need to acknowledge these continually
        repeated counters, so the congestion window reduced (CWR) mechanism is
        no longer used. Even if some ACKs are lost, the Data Sender should be
        able to infer how much to increment its own counters, even if the
        protocol field has wrapped.</t>

        <t>The 3-bit ACE field can wrap fairly frequently. Therefore, even if
        it appears to have incremented by one (say), the field might have
        actually cycled completely then incremented by one. The Data Receiver
        is not allowed to delay sending an ACK to such an extent that the ACE
        field would cycle. However cycling is still a possibility at the Data
        Sender because a whole sequence of ACKs carrying intervening values of
        the field might all be lost or delayed in transit.</t>
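
        <t>The ambiguity can be illustrated with the following Python
        sketch, which computes the smallest increment consistent with two
        successive ACE values and lists the larger increments that would
        look identical after one or more complete cycles. The function names
        and the max_cycles parameter are hypothetical; how a Data Sender
        chooses between the candidates is not shown here.</t>

        <figure>
          <artwork><![CDATA[
```python
def field_delta(prev: int, new: int, bits: int = 3) -> int:
    """Smallest non-negative increment that maps prev onto new."""
    return (new - prev) % (1 << bits)


def candidate_increments(prev, new, max_cycles=2, bits=3):
    """All increments that look the same after 0..max_cycles wraps."""
    base = field_delta(prev, new, bits)
    return [base + k * (1 << bits) for k in range(max_cycles + 1)]


# The field moved from 7 to 0: an increment of 1 across the wrap.
wrapped = field_delta(7, 0)

# The field moved from 2 to 3: it may have advanced by 1, or by
# 9 or 17 if every intervening ACK was lost or delayed.
options = candidate_increments(2, 3)
```
]]></artwork>
        </figure>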

        <!-- "Further, if the lost ACKs included no payload they would never be retransmitted." Commented out, because even data ACks would be retransmitted with a different ACE field anyway.-->

        <t>The fields in the AccECN Option are larger, but they will increment
        in larger steps because they count bytes not packets. Nonetheless,
        their size has been chosen such that a whole cycle of the field would
        never occur between ACKs unless there had been an infeasibly long
        sequence of ACK losses. Therefore, as long as the AccECN Option is
        available, it can be treated as a dependable feedback channel.</t>

        <t>If the AccECN Option is not available, e.g. it is being stripped by
        a middlebox, the AccECN protocol will only feed back information on CE
        markings (using the ACE field). Although not ideal, this will be
        sufficient, because it is envisaged that neither ECT(0) nor ECT(1)
        will ever indicate more severe congestion than CE, even though future
        uses for ECT(0) or ECT(1) are still unclear <xref target="RFC8311"/>.
        Because the 3-bit ACE field is so small, when it is the only field
        available the Data Sender has to interpret it assuming the most likely
        wrap, but with a degree of conservatism.</t>

        <t>Certain specified events trigger the Data Receiver to include an
        AccECN Option on an ACK. The rules are designed to ensure that the
        order in which different markings arrive at the receiver is
        communicated to the sender (as long as options are reaching the sender
        and as long as there is no ACK loss). Implementations are encouraged
        to send an AccECN Option more frequently, but this is left up to the
        implementer.</t>

        <!--As one ACK might acknowledge multiple data segments at the same time the 
proposed scheme providing accumulated information does not preserve the 
order at which the marking were received.This decision was taken 
deliberately to reduce complexity.-->
      </section>

      <section title="Feedback Metrics">
        <t>The CE packet counter in the ACE field and the CE byte counter in
        the AccECN Option both provide feedback on received CE-marks. The CE
        packet counter includes control packets that do not have payload data,
        while the CE byte counter solely includes marked payload bytes. If
        both are present, the byte counter in the option will provide the more
        accurate information needed for modern congestion control and policing
        schemes, such as L4S, DCTCP or ConEx. If the option is stripped, a
        simple algorithm to estimate the number of marked bytes from the ACE
        field is given in <xref target="accecn_Algo_ACE_Bytes"/>.</t>

        <t>Feedback in bytes is recommended in order to protect against the
        receiver using attacks similar to 'ACK-Division' to artificially
        inflate the congestion window, which is why <xref target="RFC5681"/>
        now recommends that TCP counts acknowledged bytes not packets.</t>
      </section>

      <section anchor="accecn_demb_reflector" title="Generic (Dumb) Reflector">
        <t>The ACE field provides information about CE markings on both data
        and control packets. According to <xref target="RFC3168"/> the Data
        Sender is meant to set control packets to Not-ECT. However, mechanisms
        in certain private networks (e.g. data centres) set control packets to
        be ECN capable because they are precisely the packets that performance
        depends on most.</t>

        <t>For this reason, AccECN is designed to be a generic reflector of
        whatever ECN markings it sees, whether or not they are compliant with
        a current standard. Then as standards evolve, Data Senders can upgrade
        unilaterally without any need for receivers to upgrade too. It is also
        useful to be able to rely on generic reflection behaviour when senders
        need to test for unexpected interference with markings (for instance
        <xref target="accecn_sec_ACE_init_invalid"/>, <xref
        target="accecn_sec_ecn-mangling"/> and <xref
        target="accecn_Mbox_Interference"/> of the present document, para 2 of
        Section 20.2 of <xref target="RFC3168"/> and <xref
        target="I-D.kuehlewind-tcpm-ecn-fallback"/>).</t>

        <t>The initial SYN is the most critical control packet, so AccECN
        provides feedback on its ECN marking. Although RFC 3168 prohibits an
        ECN-capable SYN, providing feedback of ECN marking on the SYN supports
        future scenarios in which SYNs might be ECN-enabled (without
        prejudging whether they ought to be). For instance, <xref
        target="RFC8311"/> updates this aspect of RFC 3168 to allow
        experimentation with ECN-capable TCP control packets.</t>

        <t>Even if the TCP client (or server) has set the SYN (or SYN/ACK) to
        Not-ECT in compliance with RFC 3168, feedback on the state of the ECN
        field when it arrives at the receiver could still be useful, because
        middleboxes have been known to overwrite the IP-ECN field as if it
        were still part of the old Type of Service (ToS) field <xref
        target="Mandalari18"/>. If a TCP client has set the SYN to Not-ECT,
        but receives feedback that the ECN field on the SYN arrived with a
        different codepoint, it can detect such middlebox interference and
        send Not-ECT for the rest of the connection (see <xref
        target="I-D.kuehlewind-tcpm-ecn-fallback"/>). Today, if a TCP server
        receives ECT or CE on a SYN, it cannot know whether it is invalid (or
        valid) because only the TCP client knows whether it originally marked
        the SYN as Not-ECT (or ECT). Therefore, prior to AccECN, the server's
        only safe course of action was to disable ECN for the connection.
        Instead, the AccECN protocol allows the server to feed back the
        received ECN field to the client, which then has all the information
        to decide whether the connection has to fall back from supporting ECN
        (or not).</t>
      </section>
    </section>

    <!-- ================================================================ -->

    <section anchor="accecn_Spec" title="AccECN Protocol Specification">
      <section anchor="accecn_Negotiation" title="Negotiating to use AccECN">
        <t/>

        <section anchor="accecn_Negotiation_3WHS"
                 title="Negotiation during the TCP handshake">
          <t>Given the ECN Nonce <xref target="RFC3540"/> has been
          reclassified as historic <xref target="RFC8311"/>, the present
          specification re-allocates the TCP flag at bit 7 of the TCP header,
          which was previously called NS (Nonce Sum), as the AE (Accurate ECN)
          flag (see IANA Considerations in <xref
          target="accecn_IANA_Considerations"/>) as shown below.</t>

          <figure align="center" anchor="accecn_Fig_TCPHdr_AE"
                  title="The (post-AccECN) definition of the TCP header flags during the TCP handshake">
            <artwork align="center"><![CDATA[             
  0   1   2   3   4   5   6   7   8   9  10  11  12  13  14  15
+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+
|               |           | A | C | E | U | A | P | R | S | F |
| Header Length | Reserved  | E | W | C | R | C | S | S | Y | I |
|               |           |   | R | E | G | K | H | T | N | N |
+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+
]]></artwork>
          </figure>

          <t>During the TCP handshake at the start of a connection, to request
          more accurate ECN feedback the TCP client (host A) MUST set the TCP
          flags AE=1, CWR=1 and ECE=1 in the initial SYN segment.</t>

          <t>If a TCP server (B) that is AccECN-enabled receives a SYN with
          the above three flags set, it MUST set both its half connections
          into AccECN mode. Then it MUST set the TCP flags on the SYN/ACK to
          one of the 4 values shown in the top block of <xref
          target="accecn_Tab_Negotiation"/> to confirm that it supports
          AccECN. The TCP server MUST NOT set one of these 4 combinations of
          flags on the SYN/ACK unless the preceding SYN requested support for
          AccECN as above.</t>

          <t>A TCP server in AccECN mode MUST set the AE, CWR and ECE TCP
          flags on the SYN/ACK to the value in <xref
          target="accecn_Tab_Negotiation"/> that feeds back the IP-ECN field
          that arrived on the SYN. This applies whether or not the server
          itself supports setting the IP-ECN field on a SYN or SYN/ACK (see
          <xref target="accecn_demb_reflector"/> for rationale).</t>
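
          <t>As an informal illustration only, the mapping from the IP-ECN
          field that arrived on the SYN to the flags of the SYN/ACK in the top
          block of <xref target="accecn_Tab_Negotiation"/> could be sketched as
          follows. The names SYNACK_FLAGS and synack_flags, and the use of
          strings to label codepoints, are hypothetical choices made for this
          example and are not part of the specification.</t>

          <figure>
            <artwork><![CDATA[
```python
# Illustrative sketch; names are hypothetical. Values are
# copied from the top block of the negotiation table.
# Each entry is (AE, CWR, ECE) to set on the SYN/ACK.
SYNACK_FLAGS = {
    "Not-ECT": (0, 1, 0),
    "ECT(1)": (0, 1, 1),
    "ECT(0)": (1, 0, 0),
    "CE": (1, 1, 0),
}


def synack_flags(ip_ecn_on_syn):
    """Return (AE, CWR, ECE) reflecting the SYN's IP-ECN."""
    return SYNACK_FLAGS[ip_ecn_on_syn]
```
]]></artwork>
          </figure>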

          <!--Bob: Out of scope: move to fall-back draft.-->

          <!--If the sending host (A) indicated AccECN support, the receiving host (B) may set the IP ECN field of the SYN/ACK to ECT.

 <t>If the SYN was ECT and the SYN/ACK indicates that a CE mark was received
         (NS=1), the originating host (A) MUST react to this congestion
         indication e.g. by selecting a lower initial sending window.</t> 
         
         <t>If the SYN was ECT marked, but the receiving host is not AccECN enabled
         (ECE=0 and CWR=0 in SYN/ACK), the originating host (A) SHOULD conservatively
         reduce its initial window as if the SYN had been CE-marked.</t> -->

          <t>Once a TCP client (A) has sent the above SYN to declare that it
          supports AccECN, and once it has received the above SYN/ACK segment
          that confirms that the TCP server supports AccECN, the TCP client
          MUST set both its half connections into AccECN mode.</t>

          <t>Once in AccECN mode, a TCP client or server has the rights and
          obligations to participate in the ECN protocol defined in <xref
          target="accecn_implications_accecn_mode"/>.</t>

          <t>The procedure for the client to follow if a SYN/ACK does not
          arrive before its retransmission timer expires is given in <xref
          target="accecn_sec_SYN_rexmt"/>.</t>
        </section>

        <section anchor="accecn_sec_backward_compat"
                 title="Backward Compatibility">
          <t>The three flags set to 1 to indicate AccECN support on the SYN
          have been carefully chosen to enable natural fall-back to prior
          stages in the evolution of ECN, as above. <xref
          target="accecn_Tab_Negotiation"/> tabulates all the negotiation
          possibilities for ECN-related capabilities that involve at least one
          AccECN-capable host. The entries in the first two columns have been
          abbreviated, as follows: <list hangIndent="4" style="hanging">
              <t hangText="AccECN:">More Accurate ECN Feedback (the present
              specification)</t>

              <t hangText="Nonce:">ECN Nonce feedback <xref
              target="RFC3540"/></t>

              <t hangText="ECN:">'Classic' ECN feedback <xref
              target="RFC3168"/></t>

              <t hangText="No ECN:">Not-ECN-capable. Implicit congestion
              notification using packet drop.</t>
            </list></t>

          <!--Could turn first 4 columns into 2 columns headed A & B, with Ac, N, E, I within the columns.-->

          <!-- <?rfc needLines="22" ?> -->

          <texttable align="center" anchor="accecn_Tab_Negotiation"
                     title="ECN capability negotiation between Client (A) and Server (B)">
            <ttcol align="left">A</ttcol>

            <ttcol align="left">B</ttcol>

            <ttcol align="center">SYN A-&gt;B</ttcol>

            <ttcol align="center" width="10">SYN/ACK B-&gt;A</ttcol>

            <ttcol align="left">Feedback Mode</ttcol>

            <c/>

            <c/>

            <c>AE CWR ECE</c>

            <c>AE CWR ECE</c>

            <c/>

            <c>AccECN</c>

            <c>AccECN</c>

            <c>1 &nbsp; 1 &nbsp; 1</c>

            <c>0 &nbsp; 1 &nbsp; 0</c>

            <c>AccECN (no ECT on SYN)</c>

            <!-- new -->

            <c>AccECN</c>

            <c>AccECN</c>

            <c>1 &nbsp; 1 &nbsp; 1</c>

            <c>0 &nbsp; 1 &nbsp; 1</c>

            <c>AccECN (ECT1 on SYN)</c>

            <!-- changed to AccECN (CU) instead of RSVD -->

            <c>AccECN</c>

            <c>AccECN</c>

            <c>1 &nbsp; 1 &nbsp; 1</c>

            <c>1 &nbsp; 0 &nbsp; 0</c>

            <!-- changed to AccECN (CU) instead of RSVD -->

            <c>AccECN (ECT0 on SYN)</c>

            <c>AccECN</c>

            <c>AccECN</c>

            <c>1 &nbsp; 1 &nbsp; 1</c>

            <c>1 &nbsp; 1 &nbsp; 0</c>

            <c>AccECN (CE on SYN)</c>

            <!-- new end -->

            <c/>

            <c/>

            <c/>

            <c/>

            <c/>

            <c>AccECN</c>

            <c>Nonce</c>

            <c>1 &nbsp; 1 &nbsp; 1</c>

            <c>1 &nbsp; 0 &nbsp; 1</c>

            <c>(Reserved)</c>

            <c>AccECN</c>

            <c>ECN</c>

            <c>1 &nbsp; 1 &nbsp; 1</c>

            <c>0 &nbsp; 0 &nbsp; 1</c>

            <c>classic ECN</c>

            <c>AccECN</c>

            <c>No ECN</c>

            <c>1 &nbsp; 1 &nbsp; 1</c>

            <c>0 &nbsp; 0 &nbsp; 0</c>

            <c>Not ECN</c>

            <c/>

            <c/>

            <c/>

            <c/>

            <c/>

            <c>Nonce</c>

            <c>AccECN</c>

            <c>0 &nbsp; 1 &nbsp; 1</c>

            <c>0 &nbsp; 0 &nbsp; 1</c>

            <c>classic ECN</c>

            <c>ECN</c>

            <c>AccECN</c>

            <c>0 &nbsp; 1 &nbsp; 1</c>

            <c>0 &nbsp; 0 &nbsp; 1</c>

            <c>classic ECN</c>

            <c>No ECN</c>

            <c>AccECN</c>

            <c>0 &nbsp; 0 &nbsp; 0</c>

            <c>0 &nbsp; 0 &nbsp; 0</c>

            <c>Not ECN</c>

            <c/>

            <c/>

            <c/>

            <c/>

            <c/>

            <!-- moved -->

            <c>AccECN</c>

            <c>Broken</c>

            <c>1 &nbsp; 1 &nbsp; 1</c>

            <c>1 &nbsp; 1 &nbsp; 1</c>

            <c>Not ECN</c>

            <!-- moved  end -->
          </texttable>

          <t><xref target="accecn_Tab_Negotiation"/> is divided into blocks
          each separated by an empty row.<list style="numbers">
              <t>The top block shows the case already described in <xref
              target="accecn_Negotiation"/> where both endpoints support
              AccECN and how the TCP server (B) indicates congestion
              feedback.</t>

              <t>The second block shows the cases where the TCP client (A)
              supports AccECN but the TCP server (B) supports some earlier
              variant of TCP feedback, indicated in its SYN/ACK. Therefore, as
              soon as an AccECN-capable TCP client (A) receives the SYN/ACK
              shown it MUST set both its half connections into the feedback
              mode shown in the rightmost column. If it has set itself into
              classic ECN feedback mode it MUST then comply with <xref
              target="RFC3168"/>.<vspace blankLines="1"/>The server response
              called 'Nonce' in the table is now historic. For an AccECN
              implementation, there is no need to recognize or support ECN
              Nonce feedback <xref target="RFC3540"/>, which has been
              reclassified as historic <xref target="RFC8311"/>. AccECN is
              compatible with alternative ECN feedback integrity approaches
              (see <xref target="accecn_Integrity"/>).</t>

              <t>The third block shows the cases where the TCP server (B)
              supports AccECN but the TCP client (A) supports some earlier
              variant of TCP feedback, indicated in its SYN.<vspace
              blankLines="1"/>When an AccECN-enabled TCP server (B) receives a
              SYN with AE,CWR,ECE = 0,1,1 it MUST do one of the
              following:<list style="symbols">
                  <t>set both its half connections into the classic ECN
                  feedback mode and return a SYN/ACK with AE, CWR, ECE = 0,0,1
                  as shown. Then it MUST comply with <xref
                  target="RFC3168"/>.</t>

                  <t>set both its half connections into Not ECN mode and return
                  a SYN/ACK with AE,CWR,ECE = 0,0,0, then continue with ECN
                  disabled. This latter case is unlikely to be desirable, but
                  it is allowed as a possibility, e.g. for minimal TCP
                  implementations.</t>
                </list>When an AccECN-enabled TCP server (B) receives a SYN
              with AE,CWR,ECE = 0,0,0 it MUST set both its half connections
              into the Not ECN feedback mode, return a SYN/ACK with AE,CWR,ECE
              = 0,0,0 as shown and continue with ECN disabled.</t>

              <t>The fourth block displays a combination labelled 'Broken'.
              Some older TCP server implementations incorrectly set the
              reserved flags in the SYN/ACK by reflecting those in the SYN.
              Such broken TCP servers (B) cannot support ECN, so as soon as an
              AccECN-capable TCP client (A) receives such a broken SYN/ACK it
              MUST fall back to Not ECN mode for both its half connections and
              continue with ECN disabled.</t>
            </list></t>
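
          <t>As a non-normative sketch, the client-side reading of <xref
          target="accecn_Tab_Negotiation"/> (first, second and fourth blocks)
          could look like the following, assuming the client's SYN carried
          AE=CWR=ECE=1. The function name and the returned strings are
          hypothetical. The row marked "(Reserved)" in the table is returned
          as such, without any further handling being implied.</t>

          <figure>
            <artwork><![CDATA[
```python
# Illustrative sketch; names are hypothetical. Maps the
# (AE, CWR, ECE) flags on a SYN/ACK to the feedback mode
# in the negotiation table, for a client that sent a SYN
# with AE=CWR=ECE=1.
ACCECN_SYNACKS = {(0, 1, 0), (0, 1, 1), (1, 0, 0), (1, 1, 0)}


def client_feedback_mode(ae, cwr, ece):
    flags = (ae, cwr, ece)
    if flags in ACCECN_SYNACKS:
        return "AccECN"
    if flags == (0, 0, 1):
        return "classic ECN"
    if flags == (1, 0, 1):
        # Listed as "(Reserved)" in the table.
        return "Reserved"
    # (0,0,0) is a server without ECN; (1,1,1) is the
    # 'Broken' reflection of the SYN. ECN is disabled.
    return "Not ECN"
```
]]></artwork>
          </figure>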

          <t>The following additional rules do not fit the structure of the
          table, but they complement it:<list style="hanging">
              <t hangText="Simultaneous Open:">An originating AccECN Host (A),
              having sent a SYN with AE=1, CWR=1 and ECE=1, might receive
              another SYN from host B. Host A MUST then enter the same
              feedback mode as it would have entered had it been a responding
              host and received the same SYN. Then host A MUST send the same
              SYN/ACK as it would have sent had it been a responding host.</t>

              <t hangText="In-window SYN during TIME-WAIT:">Many TCP
              implementations create a new TCP connection if they receive an
              in-window SYN packet during TIME-WAIT state. When a TCP host
              enters TIME-WAIT or CLOSED state, it should ignore any previous
              state about the negotiation of AccECN for that connection and
              renegotiate the feedback mode according to <xref
              target="accecn_Tab_Negotiation"/>.</t>
            </list></t>
        </section>

        <section anchor="accecn_sec_forward_compat"
                 title="Forward Compatibility">
          <t>If a TCP server that implements AccECN receives a SYN with the
          three TCP header flags (AE, CWR and ECE) set to any combination
          other than 000, 011 or 111, it MUST negotiate the use of AccECN as
          if they had been set to 111. This ensures that future uses of the
          other combinations on a SYN can rely on consistent behaviour from
          the installed base of AccECN servers.</t>

          <t>For the avoidance of doubt, the behaviour described in the
          present specification applies whether or not the three remaining
          reserved TCP header flags are zero.</t>
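
          <t>The server-side rules, including the forward compatibility rule
          above, might be sketched as follows. This is an illustration under
          stated assumptions, not a definitive implementation: the function
          name and the classic_ecn parameter (which models the server's choice
          between the two permitted responses to a SYN with AE,CWR,ECE =
          0,1,1) are hypothetical.</t>

          <figure>
            <artwork><![CDATA[
```python
# Illustrative sketch; names are hypothetical. Chooses the
# feedback mode of an AccECN-enabled server from the
# (AE, CWR, ECE) flags on the received SYN.
def server_feedback_mode(ae, cwr, ece, classic_ecn=True):
    flags = (ae, cwr, ece)
    if flags == (0, 0, 0):
        return "Not ECN"
    if flags == (0, 1, 1):
        # Either response is permitted for this SYN.
        return "classic ECN" if classic_ecn else "Not ECN"
    # 111 and every other combination: negotiate AccECN
    # as if the flags had been set to 111.
    return "AccECN"
```
]]></artwork>
          </figure>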
        </section>

        <section anchor="accecn_sec_SYN_rexmt"
                 title="Retransmission of the SYN">
          <!--Bob: Out of scope: move to fall-back draft.
        <t> In AccECN mode the originating host (A) MAY set the IP ECN field to
        ECT in the first ACK that finalizes the three way handshake (3WSH). 
        E.g. to test ECN support of the path, setting the SYN/ACK as well as
        the first ACK to ECT allows each end to determine as soon as possible
        whether the path passes ECT or a middlebox bleaches or overwrites the
        IP ECN field.</t>
-->

          <t>If the sender of an AccECN SYN times out before receiving the
          SYN/ACK, the sender SHOULD attempt to negotiate the use of AccECN at
          least one more time by continuing to set all three TCP ECN flags on
          the first retransmitted SYN (using the usual retransmission
          time-outs). If this first retransmission also fails to be
          acknowledged, the sender SHOULD send subsequent retransmissions of
          the SYN with the three TCP-ECN flags cleared (AE=CWR=ECE=0). A
          retransmitted SYN MUST use the same ISN as the original SYN.</t>
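
          <t>A minimal sketch of this default fall-back schedule is given
          below. It only models the recommended ("SHOULD") behaviour; the
          function name and the convention of numbering the original SYN as
          transmission 0 are assumptions made for the example.</t>

          <figure>
            <artwork><![CDATA[
```python
# Illustrative sketch; names are hypothetical. Returns the
# (AE, CWR, ECE) flags for the n-th transmission of a SYN,
# where 0 is the original SYN and 1 is the first
# retransmission.
def syn_ecn_flags(transmission):
    if transmission <= 1:
        return (1, 1, 1)  # still requesting AccECN
    return (0, 0, 0)      # fall back: all flags cleared
```
]]></artwork>
          </figure>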

          <t>Retrying once before fall-back adds delay in the case where a
          middlebox drops an AccECN (or ECN) SYN deliberately. However,
          current measurements imply that a drop is less likely to be due to
          middlebox interference than other intermittent causes of loss, e.g.
          congestion, wireless interference, etc.</t>

          <t>Implementers MAY use other fall-back strategies if they are found
          to be more effective (e.g. attempting to negotiate AccECN on the SYN
          only once or more than twice (most appropriate during high levels of
          congestion)). However, other fall-back strategies will need to follow
          all the rules in <xref target="accecn_implications_accecn_mode"/>,
          which concern behaviour when SYNs or SYN/ACKs negotiating different
          types of feedback have been sent within the same connection.</t>

          <t>Further, it may make sense to also remove any other new or
          experimental fields or options on the SYN in case a middlebox might
          be blocking them, although the required behaviour will depend on the
          specification of the other option(s) and any attempt to co-ordinate
          fall-back between different modules of the stack.</t>

          <t>Whichever fall-back strategy is used, the TCP initiator SHOULD
          cache failed connection attempts. If it does, it SHOULD NOT give up
          attempting to negotiate AccECN on the SYN of subsequent connection
          attempts until it is clear that the blockage is persistently and
          specifically due to AccECN. The cache should be arranged to expire
          so that the initiator will infrequently attempt to check whether the
          problem has been resolved.</t>

          <t>The fall-back procedure if the TCP server receives no ACK to
          acknowledge a SYN/ACK that tried to negotiate AccECN is specified in
          <xref target="accecn_Mbox_Interference"/>.</t>
        </section>

        <section anchor="accecn_implications_accecn_mode"
                 title="Implications of AccECN Mode">
          <t><xref target="accecn_Negotiation_3WHS"/> describes the only ways
          that a host can enter AccECN mode, whether as a client or as a
          server.</t>

          <t>As a Data Sender, a host in AccECN mode has the rights and
          obligations concerning the use of ECN defined below, which build on
          those in <xref target="RFC3168"/> as updated by <xref
          target="RFC8311"/>:<list style="symbols">
              <t>Using ECT:<list style="symbols">
                  <t>It can set an ECT codepoint in the IP header of packets
                  to indicate to the network that the transport is capable and
                  willing to participate in ECN for this packet.</t>

                  <t>It does not have to set ECT on any packet (for instance
                  if it has reason to believe such a packet would be
                  blocked).</t>

                  <t>If for any reason it is not willing to provide ECN
                  feedback on a particular TCP connection, to indicate this
                  unwillingness it SHOULD clear the AE, CWR and ECE flags in
                  all SYN and/or SYN/ACK packets that it sends.</t>
                </list></t>

              <t>Switching feedback negotiation (e.g. fall-back):<list
                  style="symbols">
                  <t>It SHOULD NOT set ECT on any packet if it has received at
                  least one valid SYN or Acceptable SYN/ACK with AE=CWR=ECE=0.
                  A "valid SYN" has the same port numbers and the same ISN as
                  the SYN that caused the server to enter AccECN mode.</t>

                  <t>It MUST NOT send an ECN-setup SYN <xref
                  target="RFC3168"/> within the same connection as it has sent
                  a SYN requesting AccECN feedback.</t>

                  <t>It MUST NOT send an ECN-setup SYN/ACK <xref
                  target="RFC3168"/> within the same connection as it has sent
                  a SYN/ACK agreeing to use AccECN feedback.</t>
                </list>The above rules are necessary because, when one peer
              negotiates the feedback mode in two different types of
              handshake, it is not possible for the other peer to know for
              certain which handshake packet(s) the other end eventually
              receives or in which order it receives them. So the two peers
              can end up using different feedback modes without knowing
              it.</t>

              <t>Congestion response:<list style="symbols">
                  <t>It is still obliged to respond appropriately to AccECN
                  feedback with congestion indications on packets it had
                  previously sent, as defined in Section 6.1 of <xref
                  target="RFC3168"/> and updated by Sections 2.1 and 4.1 of
                  <xref target="RFC8311"/>.</t>

                  <t>The commitment to respond appropriately to incoming
                  indications of congestion remains even if it sends a SYN
                  packet with AE=CWR=ECE=0, in a later transmission within the
                  same TCP connection.</t>

                  <t>Unlike an RFC 3168 data sender, it MUST NOT set CWR to
                  indicate it has received and responded to indications of
                  congestion (for the avoidance of doubt, this does not
                  preclude it from setting the bits of the ACE counter field,
                  which includes an overloaded use of the same bit).</t>
                </list></t>
            </list></t>

          <t>As a Data Receiver:<list style="symbols">
              <t>a host in AccECN mode MUST feed back the information in the
              IP-ECN field on incoming packets using Accurate ECN feedback, as
              specified in <xref target="accecn_feedback"/> below.</t>

              <t>if it receives an ECN-setup SYN or ECN-setup SYN/ACK <xref
              target="RFC3168"/> during the same connection as it receives a
              SYN requesting AccECN feedback or a SYN/ACK agreeing to use
              AccECN feedback, it MUST reset the connection with a RST
              packet.</t>

              <t>it MUST NOT use reception of packets with ECT set in the
              IP-ECN field as an implicit signal that the peer is ECN-capable.
              Reason: ECT at the IP layer does not explicitly confirm the peer
              has the correct ECN feedback logic, and the packets could have
              been mangled at the IP layer.</t>
            </list></t>
        </section>
      </section>

      <section anchor="accecn_feedback" title="AccECN Feedback">
        <t>Each Data Receiver of each half connection maintains four counters,
        r.cep, r.ceb, r.e0b and r.e1b:<list style="symbols">
            <t>The Data Receiver MUST increment the CE packet counter (r.cep),
            for every Acceptable packet that it receives with the CE code
            point in the IP ECN field, including CE marked control packets but
            excluding CE on SYN packets (SYN=1; ACK=0).</t>

            <t>The Data Receiver MUST increment the r.ceb, r.e0b or r.e1b byte
            counters by the number of TCP payload octets in Acceptable packets
            marked respectively with the CE, ECT(0) and ECT(1) codepoint in
            their IP-ECN field, including any payload octets on control
            packets, but not including any payload octets on SYN packets
            (SYN=1; ACK=0).</t>
          </list></t>
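
        <t>The two counting rules above could be sketched as follows. This is
        an illustration only: the class and method names are hypothetical, the
        caller is assumed to pass only Acceptable packets, initial values are
        supplied by the caller, and wrapping of the counters to the width of
        the fields that carry them is not modelled.</t>

        <figure>
          <artwork><![CDATA[
```python
# Illustrative sketch; names are hypothetical.
from dataclasses import dataclass


@dataclass
class ReceiverCounters:
    cep: int  # CE packet counter (r.cep)
    ceb: int  # CE byte counter (r.ceb)
    e0b: int  # ECT(0) byte counter (r.e0b)
    e1b: int  # ECT(1) byte counter (r.e1b)

    def on_packet(self, ip_ecn, payload_len,
                  syn=False, ack=True):
        # Caller is assumed to pass only Acceptable packets.
        if syn and not ack:
            return  # SYN packets (SYN=1; ACK=0) are excluded
        if ip_ecn == "CE":
            self.cep += 1  # counts control packets too
            self.ceb += payload_len
        elif ip_ecn == "ECT(0)":
            self.e0b += payload_len
        elif ip_ecn == "ECT(1)":
            self.e1b += payload_len
```
]]></artwork>
        </figure>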

        <t>Each Data Sender of each half connection maintains four counters,
        s.cep, s.ceb, s.e0b and s.e1b intended to track the equivalent
        counters at the Data Receiver.</t>

        <t>A Data Receiver feeds back the CE packet counter using the Accurate
        ECN (ACE) field, as explained in <xref target="accecn_ACE"/>. It
        feeds back all the byte counters using the AccECN TCP Option, as
        specified in <xref target="accecn_option"/>.</t>

        <t>Whenever a host feeds back the value of any counter, it MUST report
        the most recent value, no matter whether it is in a pure ACK, an ACK
        with new payload data or a retransmission. Therefore the feedback
        carried on a retransmitted packet is unlikely to be the same as the
        feedback on the original packet.</t>

        <section anchor="accecn_init_counters"
                 title="Initialization of Feedback Counters">
          <t>When a host first enters AccECN mode, in its role as a Data
          Receiver it initializes its counters to r.cep = 5 and r.ceb = 0. The
          initial values of the other two byte counters depend on the Data
          Receiver's choice of the order of fields it will use in the AccECN
          TCP Option (see <xref target="accecn_option"/>). If it uses field
          order 0, it will initialize the remaining counters to r.e0b = 1 and
          r.e1b = 0. If it uses field order 1, it will initialize them to
          r.e0b = 0 and r.e1b = 0x800001.</t>

          <t>Non-zero initial values are used to support a stateless handshake
          (see <xref target="accecn_Interaction_SYN_Cookies"/>) and to be
          distinct from cases where the fields are incorrectly zeroed (e.g. by
          middleboxes - see <xref target="accecn_sec_zero_option"/>).</t>

          <t>When a host enters AccECN mode, in its role as a Data Sender it
          initializes its counters to s.cep = 5 and s.ceb = 0. The initial
          values of the other two byte counters depend on the peer's choice of
          the order of fields it will use in the AccECN TCP Option (see <xref
          target="accecn_option"/>). If the peer uses field order 0, it will
          initialize the remaining counters to s.e0b = 1 and s.e1b = 0. If the
          peer uses field order 1, it will initialize them to s.e0b = 0 and
          s.e1b = 0x800001.</t>
        </section>

        <section anchor="accecn_ACE" title="The ACE Field">
          <t>After AccECN has been negotiated on the SYN and SYN/ACK, both
          hosts overload the three TCP flags (AE, CWR and ECE) in the main TCP
          header as one 3-bit field. Then the field is given a new name, ACE,
          as shown in <xref target="accecn_Fig_ACE_ACK"/>.</t>

          <!-- <?rfc needLines="9" ?> -->

          <figure align="center" anchor="accecn_Fig_ACE_ACK"
                  title="Definition of the ACE field within bytes 13 and 14 of the TCP Header (when AccECN has been negotiated and SYN=0).">
            <artwork align="center"><![CDATA[  
  0   1   2   3   4   5   6   7   8   9  10  11  12  13  14  15
+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+
|               |           |           | U | A | P | R | S | F |
| Header Length | Reserved  |    ACE    | R | C | S | S | Y | I |
|               |           |           | G | K | H | T | N | N |
+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+
]]></artwork>
          </figure>

          <t>The original definition of these three flags in the TCP header,
          including the addition of support for the ECN Nonce, is shown for
          comparison in <xref target="accecn_Fig_TCPHdr"/>. This specification
          does not rename these three TCP flags to ACE unconditionally; it
          merely overloads them with another name and definition once an
          AccECN connection has been established.</t>

          <t>With one exception (<xref target="accecn_ACE_3rdACK"/>), a host
          with both of its half-connections in AccECN mode MUST interpret the
          AE, CWR and ECE flags as the 3-bit ACE counter on a segment with the
          SYN flag cleared (SYN=0). On such a packet, a Data Receiver MUST
          encode the three least significant bits of its r.cep counter into
          the ACE field that it feeds back to the Data Sender. A host MUST NOT
          interpret the 3 flags as a 3-bit ACE field on any segment with SYN=1
          (whether ACK is 0 or 1), or if AccECN negotiation is incomplete or
          has not succeeded.</t>

          <t>Both parts of each of these conditions are equally important. For
          instance, even if AccECN negotiation has been successful, the ACE
          field is not defined on any segments with SYN=1 (e.g. a
          retransmission of an unacknowledged SYN/ACK, or when both ends send
          SYN/ACKs after AccECN support has been successfully negotiated
          during a simultaneous open).</t>
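
          <t>As a non-normative sketch, encoding r.cep into the ACE field on a
          segment with SYN=0 could look like the following. The function names
          are hypothetical. The bit order (AE as the most significant bit,
          ECE as the least) is inferred from the position of the flags in
          <xref target="accecn_Fig_ACE_ACK"/>, and the exception for the ACK
          of the SYN/ACK is not modelled.</t>

          <figure>
            <artwork><![CDATA[
```python
# Illustrative sketch; names are hypothetical.
def ace_field(r_cep, syn):
    """Return the 3-bit ACE value for a segment."""
    if syn:
        # The ACE field is not defined when SYN=1.
        raise ValueError("ACE is not defined when SYN=1")
    return r_cep & 0b111  # 3 least significant bits


def ace_to_flags(ace):
    """Split ACE into (AE, CWR, ECE); AE is the top bit."""
    return ((ace >> 2) & 1, (ace >> 1) & 1, ace & 1)
```
]]></artwork>
          </figure>

          <t>For example, with r.cep at its initial value of 5 the ACE field
          carries 0b101, and after three further CE-marked packets (r.cep = 8)
          it wraps to 0b000.</t>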

          <section anchor="accecn_ACE_3rdACK"
                   title="ACE Field on the ACK of the SYN/ACK">
            <t>A TCP client (A) in AccECN mode MUST feed back which of the 4
            possible values of the IP-ECN field was on the SYN/ACK by writing
            it into the ACE field of a pure ACK with no SACK blocks using the
            binary encoding in <xref target="accecn_Tab_SYN-ACK_fb2"/> (which
            is the same as that used on the SYN/ACK in <xref
            target="accecn_Tab_Negotiation"/>). This shall be called the
            handshake encoding of the ACE field, and it is the only exception
            to the rule that the ACE field carries the 3 least significant
            bits of the r.cep counter on packets with SYN=0.</t>

            <t>Normally, a TCP client acknowledges a SYN/ACK with an ACK that
            satisfies the above conditions anyway (SYN=0, no data, no SACK
            blocks). If an AccECN TCP client intends to acknowledge the
            SYN/ACK with a packet that does not satisfy these conditions (e.g.
            it has data to include on the ACK), it SHOULD first send a pure
            ACK that does satisfy these conditions (see <xref
            target="accecn_Interaction_Other"/>), so that it can feed back
            which of the four values of the IP-ECN field arrived on the
            SYN/ACK. A valid exception to this "SHOULD" would be where the
            implementation will only be used in an environment where mangling
            of the ECN field is unlikely.</t>

            <texttable anchor="accecn_Tab_SYN-ACK_fb2"
                       title="The encoding of the ACE field in the ACK of the SYN/ACK to reflect the SYN/ACK's IP-ECN field">
              <ttcol>IP-ECN codepoint on SYN/ACK</ttcol>

              <ttcol>ACE on pure ACK of SYN/ACK</ttcol>

              <ttcol>r.cep of client in AccECN mode</ttcol>

              <c>Not-ECT</c>

              <c>0b010</c>

              <c>5</c>

              <c>ECT(1)</c>

              <c>0b011</c>

              <c>5</c>

              <c>ECT(0)</c>

              <c>0b100</c>

              <c>5</c>

              <c>CE</c>

              <c>0b110</c>

              <c>6</c>
            </texttable>

            <t>When an AccECN server in SYN-RCVD state receives a pure ACK
            with SYN=0 and no SACK blocks, instead of treating the ACE field
            as a counter, it MUST infer the meaning of each possible value of
            the ACE field from <xref target="accecn_Tab_SYN-ACK_fb"/>, which
            also shows the value that an AccECN server MUST set s.cep to as a
            result.</t>

            <t>Given this encoding of the ACE field on the ACK of a SYN/ACK is
            exceptional, an AccECN server using large receive offload (LRO)
            might prefer to disable LRO until such an ACK has transitioned it
            out of SYN-RCVD state.</t>

            <texttable anchor="accecn_Tab_SYN-ACK_fb"
                       title="Meaning of the ACE field on the ACK of the SYN/ACK">
              <ttcol>ACE on ACK of SYN/ACK</ttcol>

              <ttcol>IP-ECN codepoint on SYN/ACK inferred by server</ttcol>

              <ttcol>s.cep of server in AccECN mode</ttcol>

              <c>0b000</c>

              <c>{Notes 1, 3}</c>

              <c>Disable ECN</c>

              <c>0b001</c>

              <c>{Notes 2, 3}</c>

              <c>5</c>

              <c>0b010</c>

              <c>Not-ECT</c>

              <c>5</c>

              <c>0b011</c>

              <c>ECT(1)</c>

              <c>5</c>

              <c>0b100</c>

              <c>ECT(0)</c>

              <c>5</c>

              <c>0b101</c>

              <c>Currently Unused {Note 2}</c>

              <c>5</c>

              <c>0b110</c>

              <c>CE</c>

              <c>6</c>

              <c>0b111</c>

              <c>Currently Unused {Note 2}</c>

              <c>5</c>
            </texttable>

            <t>{Note 1}: If the server is in AccECN mode, the value of zero
            raises suspicion of zeroing of the ACE field on the path (see
            <xref target="accecn_sec_ACE_init_invalid"/>).</t>

            <t>{Note 2}: If the server is in AccECN mode, these values are
            Currently Unused but the AccECN server's behaviour is still
            defined for forward compatibility. Then the designer of a future
            protocol can know for certain what AccECN servers will do with
            these codepoints.</t>

            <t>{Note 3}: In the case where a server that implements AccECN is
            also using a stateless handshake (termed a SYN cookie) it will not
            remember whether it entered AccECN mode. The values 0b000 or 0b001
            will remind it that it did not enter AccECN mode, because AccECN
            does not use them (see <xref
            target="accecn_Interaction_SYN_Cookies"/> for details). If a
            stateless server that implements AccECN receives either of these
            two values in the ACK, its action is implementation-dependent and
            outside the scope of this spec. It will certainly not take the
            action in the third column because, after it receives either of
            these values, it is not in AccECN mode. I.e., it will not disable
            ECN (at least not just because ACE is 0b000) and it will not set
            s.cep.</t>
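
            <t>As a non-normative illustration, the two tables in this
            section can be read as a pair of lookups. The Python sketch below
            assumes a stateful server that is already in AccECN mode and in
            SYN-RCVD state, so it does not model the SYN cookie case of {Note
            3}. The names HANDSHAKE_ACE, client_ace_for_synack and
            server_decode_handshake_ace are hypothetical.</t>

            <figure>
              <artwork><![CDATA[
```python
# IP-ECN codepoint on the SYN/ACK -> (ACE on pure ACK, client r.cep)
HANDSHAKE_ACE = {
    "Not-ECT": (0b010, 5),
    "ECT(1)": (0b011, 5),
    "ECT(0)": (0b100, 5),
    "CE": (0b110, 6),
}

# Reverse view used by the server: ACE value -> inferred codepoint
ACE_TO_CODEPOINT = {ace: cp for cp, (ace, _) in HANDSHAKE_ACE.items()}


def client_ace_for_synack(ip_ecn):
    """Return (ACE, r.cep) for the pure ACK of the SYN/ACK."""
    return HANDSHAKE_ACE[ip_ecn]


def server_decode_handshake_ace(ace):
    """Return (inferred codepoint or None, s.cep or None, ecn_on).

    Assumes a stateful server in AccECN mode and SYN-RCVD state.
    """
    if ace == 0b000:
        # {Note 1}: suspected zeroing on the path, so disable ECN.
        return (None, None, False)
    # 0b001, 0b101 and 0b111 carry no codepoint but still set s.cep=5.
    codepoint = ACE_TO_CODEPOINT.get(ace)
    s_cep = 6 if codepoint == "CE" else 5
    return (codepoint, s_cep, True)
```
]]></artwork>
            </figure>

            <t>For instance, under these assumptions a client that saw CE on
            the SYN/ACK would send ACE=0b110 with r.cep=6, and the server
            would infer CE and set s.cep to 6.</t>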
          </section>

          <section anchor="accecn_sec_ACE_feedback"
                   title="Encoding and Decoding Feedback in the ACE Field">
            <t>Whenever the Data Receiver sends an ACK with SYN=0 (with or
            without data), unless the handshake encoding in <xref
            target="accecn_ACE_3rdACK"/> applies, the Data Receiver MUST
            encode the least significant 3 bits of its r.cep counter into the
            ACE field (see <xref target="accecn_Algo_ACE_Wrap"/>).</t>

            <t>Whenever the Data Sender receives an ACK with SYN=0 (with or
            without data), it first checks whether the ACK has already been
            superseded by another ACK, in which case it ignores the ECN
            feedback. If the ACK has not been superseded, and if the special
            handshake encoding in <xref target="accecn_ACE_3rdACK"/> does not
            apply, the Data Sender decodes the ACE field as follows (see <xref
            target="accecn_Algo_ACE_Wrap"/> for examples).<list
                style="symbols">
                <t>It takes the least significant 3 bits of its local s.cep
                counter and subtracts them from the incoming ACE counter to
                work out the minimum positive increment it could apply to
                s.cep (assuming the ACE field only wrapped at most once).</t>

                <t>It then follows the safety procedures in <xref
                target="accecn_ACE_Safety_S"/> to calculate or estimate how
                many packets the ACK could have acknowledged under the
                prevailing conditions to determine whether the ACE field might
                have wrapped more than once.</t>
              </list></t>
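
            <t>The first decoding step above is modulo-8 arithmetic. The
            following Python sketch is illustrative only; the function names
            are hypothetical, and it covers only the assumption that the ACE
            field wrapped at most once.</t>

            <figure>
              <artwork><![CDATA[
```python
ACE_BITS = 3
ACE_MOD = 1 << ACE_BITS  # the 3-bit ACE field wraps modulo 8


def encode_ace(r_cep):
    """Data Receiver: the 3 least significant bits of r.cep."""
    return r_cep % ACE_MOD


def min_ace_increment(s_cep, ace):
    """Data Sender: smallest non-negative increment to s.cep that
    is consistent with the incoming ACE field."""
    return (ace - (s_cep % ACE_MOD)) % ACE_MOD
```
]]></artwork>
            </figure>

            <t>For example, if s.cep is 7 and the Data Receiver's r.cep has
            reached 9, the ACE field carries 1 and the sketch yields an
            increment of 2.</t>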

            <t>The encode/decode procedures during the three-way handshake are
            exceptions to the general rules given so far, so they are spelled
            out step by step below for clarity:<list style="symbols">
                <t>If a TCP server in AccECN mode receives a CE mark in the
                IP-ECN field of a SYN (SYN=1, ACK=0), it MUST NOT increment
                r.cep (it remains at its initial value of 5). <vspace
                blankLines="1"/>Reason: It would be redundant for the server
                to include CE-marked SYNs in its r.cep counter, because it
                already reliably delivers feedback of any CE marking on the
                SYN/ACK using the encoding in <xref
                target="accecn_Tab_Negotiation"/>. This also ensures that,
                when the server starts using the ACE field, it has not
                unnecessarily consumed more than one initial value, given they
                can be used to negotiate variants of the AccECN protocol (see
                <xref target="accecn_space_evolution"/>).</t>

                <t>If a TCP client in AccECN mode receives CE feedback in the
                TCP flags of a SYN/ACK, it MUST NOT increment s.cep (it
                remains at its initial value of 5), so that it stays in step
                with r.cep on the server. Nonetheless, the TCP client still
                triggers the congestion control actions necessary to respond
                to the CE feedback.</t>

                <t>If a TCP client in AccECN mode receives a CE mark in the
                IP-ECN field of a SYN/ACK, it MUST increment r.cep, but no
                more than once no matter how many CE-marked SYN/ACKs it
                receives (i.e. incremented from 5 to 6, but no further).
                <vspace blankLines="1"/>Reason: Incrementing r.cep ensures the
                client will eventually deliver any CE marking to the server
                reliably when it starts using the ACE field. Even though the
                client also feeds back any CE marking on the ACK of the
                SYN/ACK using the encoding in <xref
                target="accecn_Tab_SYN-ACK_fb2"/>, this ACK is not delivered
                reliably, so it can be considered as a timely notification
                that is redundant but unreliable. The client does not
                increment r.cep more than once, because the server can only
                increment s.cep once (see next bullet). Also, this limits the
                unnecessarily consumed initial values of the ACE field to
                two.</t>

                <t>If a TCP server in AccECN mode and in SYN-RCVD state
                receives CE feedback in the TCP flags of a pure ACK with no
                SACK blocks, it MUST increment s.cep (from 5 to 6). The TCP
                server then triggers the congestion control actions necessary
                to respond to the CE feedback.<vspace
                blankLines="1"/>Reason: The TCP server can only increment
                s.cep once, because the first ACK it receives will cause it to
                transition out of SYN-RCVD state. The server's congestion
                response would be no different even if it could receive
                feedback of more than one CE-marked SYN/ACK.<vspace
                blankLines="1"/>Once the TCP server transitions to ESTABLISHED
                state, it might later receive other pure ACK(s) with the
                handshake encoding in the ACE field. The conditions for this
                to occur are quite unusual, but not impossible, e.g. a SYN/ACK
                (or ACK of the SYN/ACK) that is delayed for longer than the
                server's retransmission timeout; or packet duplication by the
                network. Nonetheless, once in the ESTABLISHED state, the
                server will consider the ACE field to be encoded as the normal
                ACE counter on all packets with SYN=0 (given it will be
                following the above rule in this bullet). The server MAY
                include a test to avoid this case.</t>
              </list></t>
          </section>

          <section anchor="accecn_sec_ACE_init_invalid"
                   title="Testing for Zeroing of the ACE Field">
            <t><xref target="accecn_ACE"/> required the Data Receiver to
            initialize the r.cep counter to a non-zero value. Therefore, in
            either direction the initial value of the ACE counter ought to be
            non-zero.</t>

            <t>If AccECN has been successfully negotiated, the Data Sender
            SHOULD check the value of the ACE counter in the first packet
            (with or without data) that arrives with SYN=0. If the value of
            this ACE field is zero (0b000), the Data Sender disables sending
            ECN-capable packets for the remainder of the half-connection by
            setting the IP/ECN field in all subsequent packets to Not-ECT.
            <!--There is no need to say the following for forward compatibility:
"If a data receiver negotiates AccECN but then zeros the ACE field in its first segment with SYN=0, it MUST continue the connection 
even if the data sender does not disable sending ECN-capable packets."--></t>

            <t>Usually, the server checks the ACK of the SYN/ACK from the
            client, while the client checks the first data segment from the
            server. However, if reordering occurs, "the first packet ... that
            arrives" will not necessarily be the same as the first packet in
            sequence order. The test has been specified loosely like this to
            simplify implementation, and because it would not have been any
            more precise to have specified the first packet in sequence order,
            which would not necessarily be the first ACE counter that the Data
            Receiver fed back anyway, given it might have been a
            retransmission.</t>

            <t>The possibility of reordering means that there is a small
            chance that the ACE field on the first packet to arrive is
            genuinely zero (without middlebox interference). This would cause
            a host to unnecessarily disable ECN for a half-connection.
            Therefore, in environments where there is no evidence of the ACE
            field being zeroed, implementations can skip this test.</t>

            <t>Note that the Data Sender MUST NOT test whether the arriving
            counter in the initial ACE field has been initialized to a
            specific valid value; the above check solely tests whether the
            ACE field has been incorrectly zeroed. This allows hosts to use
            different initial values as an additional signalling channel in
            future.</t>
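
            <t>A minimal Python sketch of this test follows, for
            illustration only. The class name ZeroAceCheck and its attributes
            are hypothetical. It examines only the first packet to arrive
            with SYN=0, and it deliberately does not compare the ACE field
            against any specific initial value.</t>

            <figure>
              <artwork><![CDATA[
```python
class ZeroAceCheck:
    """Per half-connection state for the zeroing test (sketch)."""

    def __init__(self):
        self.checked = False   # has the first SYN=0 packet been seen?
        self.send_ect = True   # may we still send ECN-capable packets?

    def on_packet(self, syn, ace):
        """Feed in each arriving packet; return whether sending
        ECN-capable packets remains enabled."""
        if syn or self.checked:
            return self.send_ect
        self.checked = True
        if ace == 0b000:
            # Suspected zeroing: fall back to Not-ECT from now on.
            self.send_ect = False
        return self.send_ect
```
]]></artwork>
            </figure>

            <t>Once the first SYN=0 packet has passed the test, a later ACE
            value of zero is treated as a normal counter wrap and does not
            disable ECN in this sketch.</t>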
          </section>

          <!--{ToDo: Consider reordering sections: Mangling, Zeroing (matches the order the tests are conducted)}-->

          <section anchor="accecn_sec_ecn-mangling"
                   title="Testing for Mangling of the IP/ECN Field">
            <t>The value of the ACE field on the SYN/ACK indicates the value
            of the IP/ECN field when the SYN arrived at the server. The client
            can compare this with how it originally set the IP/ECN field on
            the SYN. If this comparison implies an unsafe transition (see
            below) of the IP/ECN field, for the remainder of the connection
            the client MUST NOT send ECN-capable packets, but it MUST continue
            to feed back any ECN markings on arriving packets.<!--There is no need to say the following for forward compatibility:
"If the server deliberately sends false feedback in the ACE field that implies an unsafe transition, it MUST continue the connection 
even if the client does not disable sending ECN-capable packets"--></t>

            <t>The value of the ACE field on the last ACK of the 3WHS
            indicates the value of the IP/ECN field when the SYN/ACK arrived
            at the client. The server can compare this with how it originally
            set the IP/ECN field on the SYN/ACK. If this comparison implies an
            unsafe transition of the IP/ECN field, for the remainder of the
            connection the server MUST NOT send ECN-capable packets, but it
            MUST continue to feed back any ECN markings on arriving
            packets.<!--There is no need to say the following for forward compatibility:
"If the client deliberately sends false feedback in the ACE field that implies an unsafe transition, it MUST continue the connection 
even if the server does not disable sending ECN-capable packets"--></t>

            <t>The ACK of the SYN/ACK is not reliably delivered (nonetheless,
            the count of CE marks is still eventually delivered reliably). If
            this ACK does not arrive, the server can continue to send
            ECN-capable packets without having tested for mangling of the
            IP/ECN field on the SYN/ACK. Experiments with AccECN deployment
            will assess whether this limitation has any effect in
            practice.</t>

            <t>Invalid transitions of the IP/ECN field are defined in <xref
            target="RFC3168"/> and repeated here for convenience:<list
                style="symbols">
                <t>the Not-ECT codepoint changes;</t>

                <t>either ECT codepoint transitions to Not-ECT;</t>

                <t>the CE codepoint changes.</t>
              </list></t>

            <t>RFC 3168 says that a router changing ECT to Not-ECT makes an
            invalid but safe transition. However, from a host's viewpoint, this
            transition is unsafe because it could be the result of two
            transitions at different routers on the path: ECT to CE (safe)
            then CE to Not-ECT (unsafe). This scenario could well happen where
            an ECN-enabled home router congests its upstream mobile broadband
            bottleneck link, then the ingress to the mobile network clears the
            ECN field <xref target="Mandalari18"/>.</t>
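
            <t>The comparison a host makes can be sketched as follows
            (illustrative Python, with a hypothetical function name). The
            sketch encodes only the three listed transitions as seen from a
            host; any transition not listed, such as ECT(0) to ECT(1), is
            treated as not unsafe, which is an assumption of the sketch.</t>

            <figure>
              <artwork><![CDATA[
```python
ECT_CODEPOINTS = ("ECT(0)", "ECT(1)")
ALL_CODEPOINTS = ("Not-ECT", "ECT(0)", "ECT(1)", "CE")


def is_unsafe_transition(sent, fed_back):
    """Compare the IP-ECN codepoint a host sent on its SYN or
    SYN/ACK with the codepoint fed back in the ACE field."""
    if sent not in ALL_CODEPOINTS or fed_back not in ALL_CODEPOINTS:
        raise ValueError("unknown IP-ECN codepoint")
    if sent == "Not-ECT":
        return fed_back != "Not-ECT"   # the Not-ECT codepoint changes
    if sent == "CE":
        return fed_back != "CE"        # the CE codepoint changes
    # sent is ECT(0) or ECT(1): only clearing to Not-ECT is unsafe
    return fed_back == "Not-ECT"
```
]]></artwork>
            </figure>

            <t>A host for which this returns true would stop sending
            ECN-capable packets but continue to feed back ECN markings on
            arriving packets.</t>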

            <t>The above fall-back behaviours are necessary in case mangling
            of the IP/ECN field is asymmetric, which is currently common over
            some mobile networks <xref target="Mandalari18"/>. Then one end
            might see no unsafe transition and continue sending ECN-capable
            packets, while the other end sees an unsafe transition and stops
            sending ECN-capable packets.</t>
          </section>

          <section anchor="accecn_ACE_Safety"
                   title="Safety against Ambiguity of the ACE Field">
            <t>If too many CE-marked segments are acknowledged at once, or if
            a long run of ACKs is lost or thinned out, the 3-bit counter in
            the ACE field might have cycled between two ACKs arriving at the
            Data Sender. The following safety procedures minimize this
            ambiguity.</t>

            <section anchor="accecn_ACE_Safety_R"
                     title="Data Receiver Safety Procedures">
              <t>An AccECN Data Receiver:<list style="symbols">
                  <t>SHOULD immediately send an ACK whenever a data packet
                  marked CE arrives after the previous data packet was not
                  CE.</t>

                  <t>MUST immediately send an ACK once 'n' CE marks have
                  arrived since the previous ACK, where 'n' SHOULD be 2 and
                  MUST be no greater than 6.</t>
                </list>These rules for when to send an ACK are designed to be
              complemented by those in <xref target="accecn_option_usage"/>,
              which concern whether the AccECN TCP Option ought to be included
              on ACKs.</t>
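
              <t>The two bullets above can be sketched as a small state
              machine (illustrative Python; the class name AckTrigger and its
              attributes are hypothetical). As a simplification, the sketch
              only considers data packets and leaves out any handling of
              CE-marked control packets and of ordinary delayed ACKs.</t>

              <figure>
                <artwork><![CDATA[
```python
class AckTrigger:
    """Decides when a data packet forces an immediate ACK."""

    def __init__(self, n=2):
        if not 1 <= n <= 6:
            raise ValueError("n MUST be no greater than 6")
        self.n = n
        self.ce_since_ack = 0          # CE marks since the last ACK
        self.prev_data_was_ce = False  # marking of previous data pkt

    def on_ack_sent(self):
        """Call whenever any ACK is sent, for whatever reason."""
        self.ce_since_ack = 0

    def on_data_packet(self, ce):
        """Return True if an ACK should be sent immediately."""
        changed_to_ce = ce and not self.prev_data_was_ce
        self.prev_data_was_ce = ce
        if ce:
            self.ce_since_ack += 1
        if changed_to_ce or self.ce_since_ack >= self.n:
            self.on_ack_sent()
            return True
        return False
```
]]></artwork>
              </figure>

              <t>With the default n=2, a run of CE-marked data packets
              following an unmarked one triggers an ACK on the first CE mark
              (the change) and then on every second CE mark after that.</t>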

              <t>For the avoidance of doubt, the change-triggered ACK
              mechanism is deliberately worded to solely apply to data
              packets, and to ignore the arrival of a control packet with no
              payload, because it is important that TCP does not acknowledge
              pure ACKs. The change-triggered ACK approach can lead to some
              additional ACKs but it feeds back the timing and the order in
              which ECN marks are received with minimal additional complexity.
              If CE marks are infrequent, or there are multiple marks in
              a row, the additional load will be low. Other marking patterns
              could increase the load significantly. Investigating the
              additional load is a goal of the proposed experiment.</t>

              <t>Even though the first bullet is stated as a "SHOULD", it is
              important for a transition to immediately trigger an ACK if at
              all possible, so that the Data Sender can rely on
              change-triggered ACKs to detect queue growth as soon as
              possible, e.g. at the start of a flow. This requirement can only
              be relaxed if certain offload hardware needed for high
              performance cannot support change-triggered ACKs (although high
              performance protocols such as DCTCP already successfully use
              change-triggered ACKs). One possible experimental compromise
              would be for the receiver to heuristically detect whether the
              sender is in slow-start, then to implement change-triggered ACKs
              while the sender is in slow-start, and offload otherwise.</t>
            </section>

            <section anchor="accecn_ACE_Safety_S"
                     title="Data Sender Safety Procedures">
              <t>If the Data Sender has not received AccECN TCP Options to
              give it more dependable information, and it detects that the ACE
              field could have cycled, it SHOULD decide whether it cycled by
              assuming the safest likely case under the prevailing conditions.
              It can detect if the counter could have cycled by using the jump
              in the acknowledgement number since the last ACK to calculate or
              estimate how many segments could have been acknowledged. An
              example algorithm to implement this policy is given in <xref
              target="accecn_Algo_ACE_Wrap"/>. An implementer MAY develop an
              alternative algorithm as long as it satisfies these
              requirements.</t>

              <t>If missing acknowledgement numbers arrive later (reordering)
              and prove that the counter did not cycle, the Data Sender MAY
              attempt to neutralize the effect of any action it took based on
              a conservative assumption that it later found to be
              incorrect.</t>

              <t>The Data Sender can estimate how many packets (of any
              marking) an ACK acknowledges. If the ACE counter on an ACK seems
              to imply that the minimum number of newly CE-marked packets is
              greater than the number of newly acknowledged packets, the Data
              Sender SHOULD believe the ACE counter, unless it can be sure
              that it is counting all control packets correctly.</t>
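
              <t>One possible conservative policy, sketched below in Python
              for illustration, is to assume the largest number of CE marks
              that is consistent with both the ACE field and the number of
              newly acknowledged packets. This policy and the function name
              are assumptions of the sketch, which ignores any information
              from AccECN TCP Options and is not a substitute for the example
              algorithm referenced above.</t>

              <figure>
                <artwork><![CDATA[
```python
ACE_MOD = 8  # the 3-bit ACE field wraps modulo 8


def safest_ace_increment(s_cep, ace, newly_acked_pkts):
    """Return the increment to apply to s.cep.

    newly_acked_pkts is the sender's estimate of how many packets
    (of any marking) this ACK newly acknowledges.
    """
    d_cep = (ace - (s_cep % ACE_MOD)) % ACE_MOD
    if newly_acked_pkts >= ACE_MOD:
        # The field could have cycled: take the largest value that
        # is <= newly_acked_pkts and congruent to d_cep modulo 8.
        surplus = (newly_acked_pkts - d_cep) % ACE_MOD
        d_cep = newly_acked_pkts - surplus
    # Otherwise believe the ACE counter, even if d_cep exceeds the
    # estimate of newly acknowledged packets.
    return d_cep
```
]]></artwork>
              </figure>

              <t>For example, with s.cep=5 and ACE=1 the minimum increment is
              4. If the ACK newly acknowledges 12 packets, the sketch assumes
              12 CE marks; if it acknowledges only 3, the sketch still
              applies 4.</t>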
            </section>
          </section>
        </section>

        <section anchor="accecn_option" title="The AccECN Option">
          <t>The AccECN Option is defined as shown in <xref
          target="accecn_Fig_TCPopt"/>. The initial 'E' of each field name
          stands for 'Echo'.</t>

          <figure align="center" anchor="accecn_Fig_TCPopt"
                  title="The AccECN TCP Option">
            <artwork><![CDATA[ 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|  Kind = TBD1  |  Length = 11  |          EE0B field           |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| EE0B (cont'd) |           ECEB field                          |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                  EE1B field                   |             Order 0
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|  Kind = TBD1  |  Length = 11  |          EE1B field           |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| EE1B (cont'd) |           ECEB field                          |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                  EE0B field                   |             Order 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
]]></artwork>
          </figure>

          <t>When a Data Receiver sends an AccECN Option, it MUST set the Kind
          field to TBD1, which is registered in <xref
          target="accecn_IANA_Considerations"/> as a new TCP option Kind
          called AccECN. An experimental TCP option with Kind=254 MAY be used
          for initial experiments, with magic number 0xACCE.</t>

          <t><xref target="accecn_Fig_TCPopt"/> shows two option field orders;
          order 0 and order 1. They both consist of three 24-bit fields.
          Order 0 provides the 24 least significant bits of the r.e0b, r.ceb
          and r.e1b counters, respectively. Order 1 provides the same fields,
          but in the opposite order. Each half-connection can use a different
          field order, but a Data Receiver MUST consistently send the same
          field order within the same half-connection.</t>

          <t>The field order to use for each half-connection is up to the Data
          Receiver implementation. It might use the same hard-coded order for
          all half-connections, or it might make a different choice for each
          half-connection. For instance, the implementation of a Data Receiver
          might default to using order 0, unless the ECN field in the IP
          header of the packet it received during the 3WHS is ECT(1). A Data
          Receiver just starts using its chosen field order and the field
          immediately after the length field in the first AccECN TCP Option of
          a half-connection will intrinsically indicate which order it is
          using, because the initial counter values that it is required to use
          depend on its chosen field order (see <xref
          target="accecn_init_counters"/>).</t>

          <t>A Data Sender can know which field order the Data Receiver is
          using for a half-connection from the most significant bit (MSB) of
          the counter in the field immediately after the length field in the
          first non-empty AccECN TCP Option to arrive. If this MSB = 0, field
          order 0 is being used, and if MSB = 1, field order 1 is being used.
          Note that the Data Sender only tests the most significant bit, not
          the value of the whole field, because the counters in the first
          packet to arrive might have started to increment (e.g. if the first
          packet to arrive is not the first packet sent due to loss or
          reordering).</t>
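
          <t>The test on the most significant bit can be sketched as follows
          (illustrative Python; the function name is hypothetical, and the
          caller is assumed to have already extracted the 24-bit field that
          immediately follows the length field).</t>

          <figure>
            <artwork><![CDATA[
```python
def field_order(first_field):
    """Return 0 or 1 from the MSB of the first 24-bit field of the
    first non-empty AccECN Option to arrive."""
    if not 0 <= first_field < (1 << 24):
        raise ValueError("expected a 24-bit value")
    return (first_field >> 23) & 1
```
]]></artwork>
          </figure>

          <t>Only bit 23 is examined, so the result is unaffected by how far
          the counter has advanced in its lower bits.</t>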

          <t>Note that there is no field to feed back Not-ECT bytes.
          Nonetheless an algorithm for the Data Sender to calculate the number
          of payload bytes received as Not-ECT is given in <xref
          target="accecn_Algo_Not-ECT"/>.</t>

          <t>Whenever a Data Receiver sends an AccECN Option, the rules in
          <xref target="accecn_option_usage"/> expect it to usually send a
          full-length option. To cope with option space limitations, it can
          omit unchanged fields from the tail of the option, as long as it
          preserves the order of the remaining fields and includes any field
          that has changed. The length field MUST indicate which fields are
          present as follows:</t>

          <texttable>
            <ttcol>Length</ttcol>

            <ttcol>Order 0</ttcol>

            <ttcol>Order 1</ttcol>

            <c>11</c>

            <c>EE0B, ECEB, EE1B</c>

            <c>EE1B, ECEB, EE0B</c>

            <c>8</c>

            <c>EE0B, ECEB</c>

            <c>EE1B, ECEB</c>

            <c>5</c>

            <c>EE0B</c>

            <c>EE1B</c>

            <c>2</c>

            <c>(empty)</c>

            <c>(empty)</c>
          </texttable>

          <t>The empty option of Length=2 is provided to allow for a case
          where an AccECN Option has to be sent (e.g. on the SYN/ACK to test
          the path), but there is very limited space for the option. For
          initial experiments, the Length field MUST be 2 greater to
          accommodate the 16-bit magic number.</t>

          <t>All implementations of a Data Sender that read any AccECN Option
          MUST be able to read in AccECN Options of any of the above lengths.
          For forward compatibility, if the AccECN Option is of any other
          length, implementations MUST use those whole 3-octet fields that fit
          within the length and ignore the remainder of the option.<!--ToDo: I'm sure we can make this more flexible, so we can introduce a 1 byte initial field later.--></t>
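
          <t>The length rules above might be implemented along the following
          lines (an illustrative Python sketch with hypothetical names). It
          assumes the field order is already known, that the Length value
          counts the Kind and Length octets, and that no more than the three
          defined fields are read from a longer option. It does not handle
          the extra 16-bit magic number used in initial experiments.</t>

          <figure>
            <artwork><![CDATA[
```python
FIELD_NAMES = {
    0: ("EE0B", "ECEB", "EE1B"),
    1: ("EE1B", "ECEB", "EE0B"),
}


def parse_accecn_fields(length, body, order):
    """Return a dict of the 24-bit fields present in the option.

    length: value of the option's Length field
    body:   the octets that follow the Length field
    order:  0 or 1, the field order of this half-connection
    """
    if length < 2 or len(body) < length - 2:
        raise ValueError("truncated or malformed option")
    # Use only whole 3-octet fields; ignore any remainder.
    n_fields = min((length - 2) // 3, 3)
    fields = {}
    for i in range(n_fields):
        name = FIELD_NAMES[order][i]
        chunk = body[3 * i:3 * i + 3]
        fields[name] = int.from_bytes(chunk, "big")  # network order
    return fields
```
]]></artwork>
          </figure>

          <t>In this sketch an option of Length=2 yields an empty result,
          and an option of Length=6 yields one field, with the trailing octet
          ignored.</t>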

          <t>The AccECN Option has to be optional to implement, because both
          sender and receiver have to be able to cope without the option
          anyway, in cases where it does not traverse a network path. It is
          RECOMMENDED to implement both sending and receiving of the AccECN
          Option. If sending of the AccECN Option is implemented, the
          fall-backs described in this document will need to be implemented as
          well (unless solely for a controlled environment where path
          traversal is not considered a problem). Even if a developer does not
          implement sending of the AccECN Option, it is RECOMMENDED that they
          still implement logic to receive and understand any AccECN Options
          sent by remote peers.</t>

          <t>If a Data Receiver intends to send the AccECN Option at any time
          during the rest of the connection, it is strongly recommended to also
          test path traversal of the AccECN Option as specified in <xref
          target="accecn_Mbox_Interference"/>.</t>

          <section title="Encoding and Decoding Feedback in the AccECN Option Fields">
            <t>Whenever the Data Receiver includes any of the counter fields
            (ECEB, EE0B, EE1B) in an AccECN Option, it MUST encode the 24
            least significant bits of the current value of the associated
            counter into the field (respectively r.ceb, r.e0b, r.e1b).</t>

            <t>Whenever the Data Sender receives an ACK carrying an AccECN
            Option, it first checks whether the ACK has already been
            superseded by another ACK, in which case it ignores the ECN
            feedback. If the ACK has not been superseded, the Data Sender MUST
            decode the fields in the AccECN Option as follows. For each field,
            it takes the least significant 24 bits of its associated local
            counter (s.ceb, s.e0b or s.e1b) and subtracts them from the
            counter in the associated field of the incoming AccECN Option
            (respectively ECEB, EE0B, EE1B), to work out the minimum positive
            increment it could apply to s.ceb, s.e0b or s.e1b (assuming the
            field in the option only wrapped at most once).</t>
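
            <t>The per-field arithmetic is the same modulo operation as for
            the ACE field, but over 24 bits. The following Python sketch is
            illustrative only and uses hypothetical function names.</t>

            <figure>
              <artwork><![CDATA[
```python
FIELD_BITS = 24
FIELD_MOD = 1 << FIELD_BITS  # each option field wraps modulo 2**24


def encode_field(counter):
    """Data Receiver: 24 least significant bits of a byte counter
    (r.ceb, r.e0b or r.e1b)."""
    return counter % FIELD_MOD


def min_field_increment(local_counter, field):
    """Data Sender: smallest non-negative increment to the local
    counter (s.ceb, s.e0b or s.e1b) consistent with the field,
    assuming the field wrapped at most once."""
    return (field - (local_counter % FIELD_MOD)) % FIELD_MOD
```
]]></artwork>
            </figure>

            <t>For example, a local counter of 0xFFFFFF and an incoming field
            of 0x000010 give an increment of 0x11 bytes.</t>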

            <t><xref target="accecn_Algo_Option_Coding"/> gives an example
            algorithm for the Data Receiver to encode its byte counters into
            the AccECN Option, and for the Data Sender to decode the AccECN
            Option fields into its byte counters.</t>

            <t>Note that, as specified in <xref target="accecn_feedback"/>,
            any data on the SYN (SYN=1, ACK=0) is not included in any of the
            locally held octet counters nor in the AccECN Option on the
            wire.</t>
          </section>

          <section anchor="accecn_Mbox_Interference"
                   title="Path Traversal of the AccECN Option">

            <section anchor="accecn_AccECN_Option_3WHS"
                     title="Testing the AccECN Option during the Handshake">
              <t>The TCP client MUST NOT include the AccECN TCP Option on the
              SYN. (A fall-back strategy for the loss of the SYN (possibly due
              to middlebox interference) is specified in <xref
              target="accecn_sec_SYN_rexmt"/>.)</t>

              <t>A TCP server that confirms its support for AccECN (in
              response to an AccECN SYN from the client as described in <xref
              target="accecn_Negotiation"/>) SHOULD include an AccECN TCP
              Option on the SYN/ACK.</t>

              <t>A TCP client that has successfully negotiated AccECN SHOULD
              include an AccECN Option in the first ACK at the end of the
              3WHS. However, this first ACK is not delivered reliably, so the
              TCP client SHOULD also include an AccECN Option on the first
              data segment it sends (if it ever sends one).</t>

              <t>A host MAY omit the AccECN Option in any of these three
              cases if it has cached knowledge that the packet would be likely
              to be blocked on the path to the other host if it included an
              AccECN Option.</t>
            </section>

            <section anchor="accecn_AccECN_Option_Loss"
                     title="Testing for Loss of Packets Carrying the AccECN Option">
              <t><!--Should we make rexmt of SYN/ACK with AccECN flags, but not AccECN Option that default?-->If
              after the normal TCP timeout the TCP server has not received an
              ACK to acknowledge its SYN/ACK, the SYN/ACK might just have been
              lost, e.g. due to congestion, or a middlebox might be blocking
              the AccECN Option. To expedite connection setup, the TCP server
              SHOULD retransmit the SYN/ACK repeating the same AE, CWR and ECE
              TCP flags as on the original SYN/ACK but with no AccECN Option.
              If this retransmission times out, to expedite connection setup,
              the TCP server SHOULD disable AccECN and ECN for this connection
              by retransmitting the SYN/ACK with AE=CWR=ECE=0 and no AccECN
              Option.</t>
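
              <t>The recommended sequence of SYN/ACK transmissions can be
              sketched as below (illustrative Python; the function name and
              the attempt numbering are hypothetical). The AccECN flag
              settings of the original SYN/ACK are passed in by the caller
              because they are determined elsewhere in this specification.
              The sketch assumes the server included the option on the
              original SYN/ACK.</t>

              <figure>
                <artwork><![CDATA[
```python
def synack_plan(attempt, accecn_flags):
    """Return ((AE, CWR, ECE), include_accecn_option).

    attempt:      0 for the original SYN/ACK, 1 for the first
                  retransmission, 2 or more for later ones
    accecn_flags: (AE, CWR, ECE) as set on the original SYN/ACK
    """
    if attempt < 0:
        raise ValueError("attempt must be non-negative")
    if attempt == 0:
        return (accecn_flags, True)    # test the option on the path
    if attempt == 1:
        return (accecn_flags, False)   # same flags, no option
    return ((0, 0, 0), False)          # disable AccECN and ECN
```
]]></artwork>
              </figure>

              <t>Other fall-back strategies would change only the mapping
              from attempt number to the returned pair.</t>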

              <t>Implementers MAY use other fall-back strategies if they are
              found to be more effective (e.g. retrying the AccECN Option for
              a second time before fall-back, which is most appropriate during
              high levels of congestion). However, other fall-back strategies will
              need to follow all the rules in <xref
              target="accecn_implications_accecn_mode"/>, which concern
              behaviour when SYNs or SYN/ACKs negotiating different types of
              feedback have been sent within the same connection.</t>

              <t>If the TCP client detects that the first data segment it sent
              with the AccECN Option was lost, it SHOULD fall back to no
              AccECN Option on the retransmission. Again, implementers MAY use
              other fall-back strategies such as attempting to retransmit a
              second segment with the AccECN Option before fall-back, and/or
              caching whether the AccECN Option is blocked for subsequent
              connections. <xref target="I-D.ietf-tcpm-2140bis"/> further
              discusses caching of TCP parameters and status information.</t>

              <t>If a host falls back to not sending the AccECN Option, it
              will continue to process any incoming AccECN Options as
              normal.</t>

              <t>Either host MAY include the AccECN Option in a subsequent
              segment to retest whether the AccECN Option can traverse the
              path.</t>

              <t>If the TCP server receives a second SYN with a request for
              AccECN support, it should resend the SYN/ACK, again confirming
              its support for AccECN, but this time without the AccECN Option.
              This approach rules out any interference by middleboxes that may
              drop packets with unknown options, even though it is more likely
              that the SYN/ACK would have been lost due to congestion. The TCP
              server MAY try to send another packet with the AccECN Option at
              a later point during the connection but should monitor if that
              packet got lost as well, in which case it SHOULD disable the
              sending of the AccECN Option for this half-connection.</t>

              <t>Similarly, an AccECN end-point MAY separately memorize which
              data packets carried an AccECN Option and disable the sending of
              AccECN Options if the loss probability of those packets is
              significantly higher than that of all other data packets in the
              same connection.</t>
            </section>

            <section title="Testing for Absence of the AccECN Option">
              <t>If the TCP client has successfully negotiated AccECN but does
              not receive an AccECN Option on the SYN/ACK (e.g. because it has
              been stripped by a middlebox or not sent by the server), the
              client switches into a mode that assumes that the AccECN Option
              is not available for this half connection.</t>

              <t>Similarly, if the TCP server has successfully negotiated
              AccECN but does not receive an AccECN Option on the first
              segment that acknowledges sequence space at least covering the
              ISN, it switches into a mode that assumes that the AccECN Option
              is not available for this half connection.</t>

              <t>While a host is in this mode that assumes incoming AccECN
              Options are not available, it MUST adopt the conservative
              interpretation of the ACE field discussed in <xref
              target="accecn_ACE_Safety"/>. However, it cannot make any
              assumption about support of outgoing AccECN Options on the other
              half connection, so it SHOULD continue to send the AccECN Option
              itself (unless it has established that sending the AccECN Option
              is causing packets to be blocked as in <xref
              target="accecn_AccECN_Option_Loss"/>).</t>

              <t>If a host is in the mode that assumes incoming AccECN Options
              are not available, but it receives an AccECN Option at any later
              point during the connection, this clearly indicates that the
              AccECN Option is not blocked on the respective path, and the
              AccECN endpoint MAY switch out of the mode that assumes the
              AccECN Option is not available for this half connection.</t>
            </section>

            <section anchor="accecn_sec_zero_option"
                     title="Test for Zeroing of the AccECN Option">
              <t>For a related test for invalid initialization of the ACE
              field, see <xref target="accecn_sec_ACE_init_invalid"/>.</t>

              <t><xref target="accecn_feedback"/> requires the Data Receiver
              to initialize the r.e0b counter to a non-zero value. Therefore,
              in either direction the initial value of the EE0B field in the
              AccECN Option (if one exists) ought to be non-zero. If AccECN
              has been negotiated:<list style="symbols">
                  <t>the TCP server MAY check the initial value of the EE0B
                  field in the first segment that acknowledges sequence space
                  that at least covers the ISN plus 1. If the initial value of
                  the EE0B field is zero, the server will switch into a mode
                  that ignores the AccECN Option for this half connection.</t>

                  <t>the TCP client MAY check the initial value of the EE0B
                  field on the SYN/ACK. If the initial value of the EE0B field
                  is zero, the client will switch into a mode that ignores the
                  AccECN Option for this half connection.</t>
                </list></t>

              <t>While a host is in the mode that ignores the AccECN Option it
              MUST adopt the conservative interpretation of the ACE field
              discussed in <xref target="accecn_ACE_Safety"/>.</t>

              <t>Note that the Data Sender MUST NOT test whether the arriving
              byte counters in the initial AccECN Option have been initialized
              to specific valid values - the above checks solely test whether
              these fields have been incorrectly zeroed. This allows hosts to
              use different initial values as an additional signalling channel
              in future. Also note that the initial value of either field
              might be greater than its expected initial value, because the
              counters might already have been incremented. Nonetheless, the
              initial values of the counters have been chosen so that they
              cannot wrap to zero on these initial segments.</t>
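
              <t>As a non-normative illustration, the zeroing test could be
              sketched as below. The function name and the use of None to
              represent a segment without an AccECN Option are assumptions
              made only for this sketch; the absence of the option is handled
              by a separate test.</t>

              <figure>
                <artwork><![CDATA[
```python
from typing import Optional


def ignore_option_after_initial_check(ee0b: Optional[int]) -> bool:
    """Return True if the host should ignore the AccECN Option.

    `ee0b` is the EE0B field of the initial AccECN Option, or
    None if that segment carried no AccECN Option. Only zero is
    treated as invalid. Any non-zero value is accepted, because
    the test must not check for specific initial values.
    """
    return ee0b is not None and ee0b == 0
```
]]></artwork>
              </figure>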
            </section>

            <section title="Consistency between AccECN Feedback Fields">
              <t>When the AccECN Option is available it supplements but does
              not replace the ACE field. An endpoint using AccECN feedback
              MUST always consider the information provided in the ACE field
              whether or not the AccECN Option is also available.</t>

              <t>If the AccECN Option is present, the s.cep counter might
              increase while the s.ceb counter does not (e.g. due to a
              CE-marked control packet). The sender's response to such a
              situation is out of scope, and needs to be dealt with in a
              specification that uses ECN-capable control packets.
              Theoretically, this situation could also occur if a middlebox
              mangled the AccECN Option but not the ACE field. However, the
              Data Sender has to assume that the integrity of the AccECN
              Option is sound, based on the above test of the well-known
              initial values and optionally other integrity tests (<xref
              target="accecn_Integrity"/>).</t>

              <t>If either end-point detects that the s.ceb counter has
              increased but the s.cep counter has not (and by testing ACK
              coverage it is certain how much the ACE field has wrapped), this
              invalid
              protocol transition has to be due to some form of feedback
              mangling. So, the Data Sender MUST disable sending ECN-capable
              packets for the remainder of the half-connection by setting the
              IP/ECN field in all subsequent packets to Not-ECT.<!--There is no need to say the following for forward compatibility:
"If a data receiver negotiates AccECN but then deliberately makes the counters inconsistent, it MUST continue the connection 
even if the data sender does not disable sending ECN-capable packets."--></t>
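
              <t>The asymmetry between the two cases above can be
              illustrated with the following non-normative sketch. The
              function and parameter names are hypothetical; the deltas are
              assumed to be the increases in the s.cep and s.ceb counters
              since the previous ACK, already corrected for wrapping.</t>

              <figure>
                <artwork><![CDATA[
```python
def feedback_is_mangled(delta_cep: int, delta_ceb: int,
                        ace_wrap_certain: bool) -> bool:
    """Detect the invalid case: s.ceb grew but s.cep did not.

    `ace_wrap_certain` is True only if ACK coverage has shown
    how many times the ACE field wrapped. The reverse case
    (s.cep grew, s.ceb did not) is deliberately not flagged,
    because a CE-marked control packet could cause it.
    """
    return ace_wrap_certain and delta_ceb > 0 and delta_cep == 0
```
]]></artwork>
              </figure>

              <t>When such a check returns True, the Data Sender would set
              the IP/ECN field of all subsequent packets to Not-ECT.</t>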
            </section>
          </section>

          <section anchor="accecn_option_usage"
                   title="Usage of the AccECN TCP Option">
            <t>If the Data Receiver intends to use the AccECN TCP Option to
            provide feedback, the following rules determine when a Data
            Receiver in AccECN mode sends an ACK with the AccECN TCP Option,
            and which fields to include:<list style="hanging">
                <!--ToDo: Has to be a MUST in some way (e.g. if send some MUST send all) for sender to guuess which counter is changing when its not receiving anything.
See emails with Ilpo, 3/1/20.

-->

                <t hangText="Change-Triggered ACKs:">If an arriving packet
                increments a different byte counter to that incremented by the
                previous packet, the Data Receiver SHOULD immediately send an
                ACK with an AccECN Option, without waiting for the next
                delayed ACK (this is in addition to the safety recommendation
                in <xref target="accecn_ACE_Safety"/> against ambiguity of the
                ACE field). <vspace blankLines="1"/>Even though this bullet is
                stated as a "SHOULD", it is important for a transition to
                immediately trigger an ACK if at all possible, as already
                argued when specifying change-triggered ACKs for the ACE.</t>

                <t hangText="Continual Repetition:">Otherwise, if arriving
                packets continue to increment the same byte counter, the Data
                Receiver can include an AccECN Option on most or all (delayed)
                ACKs, but it does not have to.<list style="symbols">
                    <t>It SHOULD include a counter that has continued to
                    increment on the next scheduled ACK following a
                    change-triggered ACK;</t>

                    <t>While the same counter continues to increment, it
                    SHOULD include the counter every n ACKs as consistently as
                    possible, where n can be chosen by the implementer;</t>

                    <t>It SHOULD always include an AccECN Option if the r.ceb
                    counter is incrementing and it MAY include an AccECN
                    Option if r.e0b or r.e1b is incrementing;</t>

                    <t>It SHOULD include each counter at least once for every
                    2^22 bytes incremented to prevent overflow during
                    continual repetition.</t>
                  </list>If the smallest allowed AccECN Option would leave
                insufficient space for two SACK blocks on a particular ACK,
                the Data Receiver MUST give precedence to the SACK option
                (total 18 octets), because loss feedback is more critical.</t>

                <t hangText="Necessary Option Length:">It MAY exclude
                counter(s) that have not changed for the whole connection (but
                beacons still include all fields - see below). It SHOULD
                include counter(s) that have incremented at some time during
                the connection. It MUST include the counter(s) that have
                incremented since the previous AccECN Option and it MUST only
                truncate fields from the right-hand tail of the option to
                preserve the order of the remaining fields (see <xref
                target="accecn_option"/>);</t>

                <t hangText="Beaconing Full-Length Options:">Nonetheless, it
                MUST include a full-length AccECN TCP Option on at least three
                ACKs per RTT, or on all ACKs if there are fewer than three per
                RTT (see <xref target="accecn_Algo_Beacon"/> for an example
                algorithm that satisfies this requirement).</t>
              </list>The above rules complement those in <xref
            target="accecn_ACE_Safety"/>, which determine when to generate an
            ACK irrespective of whether an AccECN TCP Option is to be
            included.</t>

            <!--Further an AccECN host MAY send the AccECN TCP Option immediately if a different counter changes than triggered by the previous received segment. 'immediate' in this case does not only mean 
that the AccECN Option will be included in the next ACK, but also means that the host might send an ACK immediately after reception of the current segment and does not wait for the next delayed ACK. 
Note that the ACK for the next segment could be delayed again if it carries the same ECN mark. -->

            <t>The following example series of arriving IP/ECN fields
            illustrates when a Data Receiver will emit an ACK with an AccECN
            Option if it is using a delayed ACK factor of 2 segments and
            change-triggered ACKs: 01 -&gt; ACK, 01, 01 -&gt; ACK, 10 -&gt;
            ACK, 10, 01 -&gt; ACK, 01, 11 -&gt; ACK, 01 -&gt; ACK.</t>
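
            <t>The example series above can be reproduced with the following
            non-normative sketch. It assumes that every arriving packet
            carries data and an ECN-capable codepoint, that the first packet
            of the series counts as a change, and that a change-triggered ACK
            also covers any segment still awaiting a delayed ACK. The
            function name acks_emitted() is hypothetical.</t>

            <figure>
              <artwork><![CDATA[
```python
from typing import List, Optional


def acks_emitted(ecn_fields: List[str],
                 ack_factor: int = 2) -> List[bool]:
    """For each arriving packet, report whether an ACK is sent.

    An ACK is sent at once when the IP/ECN field differs from
    that of the previous packet (change-triggered ACK), and
    otherwise on every `ack_factor`-th packet (delayed ACK).
    """
    emitted: List[bool] = []
    previous: Optional[str] = None
    unacked = 0  # packets received since the last ACK
    for field in ecn_fields:
        if field != previous:
            # A different byte counter incremented: ACK now.
            emitted.append(True)
            unacked = 0
        else:
            unacked += 1
            if unacked >= ack_factor:
                emitted.append(True)
                unacked = 0
            else:
                emitted.append(False)
        previous = field
    return emitted


series = ["01", "01", "01", "10", "10", "01", "01", "11", "01"]
result = acks_emitted(series)
```
]]></artwork>
            </figure>

            <t>With the series given in the text, the sketch emits an ACK
            on the 1st, 3rd, 4th, 6th, 8th and 9th packets, which matches
            the pattern shown above.</t>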

            <t>Even though the first bullet is stated as a "SHOULD", it is
            important for a transition to immediately trigger an ACK if at all
            possible, so that the Data Sender can rely on change-triggered
            ACKs to detect queue growth as soon as possible, e.g. at the start
            of a flow. This requirement can only be relaxed if certain offload
            hardware needed for high performance cannot support
            change-triggered ACKs (although high performance protocols such as
            DCTCP already successfully use change-triggered ACKs). One
            possible experimental compromise would be for the receiver to
            heuristically detect whether the sender is in slow-start, then to
            implement change-triggered ACKs while the sender is in slow-start,
            and offload otherwise.</t>

            <t>For the avoidance of doubt, this change-triggered ACK mechanism
            is deliberately worded to ignore the arrival of a control packet
            with no payload, which therefore does not alter any byte counters,
            because it is important that TCP does not acknowledge pure ACKs.
            The change-triggered ACK approach can lead to some additional ACKs
            but it feeds back the timing and the order in which ECN marks are
            received with minimal additional complexity. If CE marks are
            infrequent, or if multiple marks arrive in a row, the additional
            load will be low. Other marking patterns could increase the load
            significantly. Investigating the additional load is a goal of the
            proposed experiment.</t>

            <t>Implementation note: sending an AccECN Option each time a
            different counter changes and including a full-length AccECN
            Option on every delayed ACK will satisfy the requirements
            described above and might be the easiest implementation, as long
            as sufficient space is available in each ACK (in total and in the
            option space).</t>

            <t><xref target="accecn_Algo_ACE_Bytes"/> gives an example
            algorithm to estimate the number of marked bytes from the ACE
            field alone, if the AccECN Option is not available.</t>

            <t>If a host has determined that segments with the AccECN Option
            always seem to be discarded somewhere along the path, it is no
            longer obliged to follow the above rules.</t>
          </section>
        </section>
      </section>

      <!-- <section anchor="accecn_Rcvr_Operation"
               title="Accurate ECN Receiver Operation">
        <t>A TCP receiver MUST only feedback ECN information arriving in a
        segment that it deems is part of the flow, by using regular TCP
        techniques based on sequence numbers.</t>

        <t>{ToDo: It might be useful to describe receiver end of the feedback
        process, including special cases, e.g. pure ACKs, retransmissions,
        window probes, partial ACKs, etc. Does AccECN feed back each ECN
        codepoint when a data packet is duplicated?}</t>
      </section>

      <section anchor="accecn_Sndr_Operation"
               title="Accurate ECN Sender Operation">
        <t>A TCP sender MUST only accept ECN feedback on ACKs that it deems is
        part of the flow, by using regular TCP techniques based on sequence
        numbers.</t>

        <t>{ToDo: It might be useful to describe the sender end of the
        feedback process, including special cases, e.g. pure ACKs,
        retransmissions, window probes, partial ACKs, etc.}</t>
      </section> -->

      <!-- Comment by Mirja: not sure if the following section is needed. Of
       course a proxy should comply to the spec. Just writing this down explicitly
       doesn't help the problem; especially as the problem is old boxes that
       never get updated...! 
       Bob adds: Of course it doesn't stop legacy middleboxes being wrong, 
       but it allows us (or an operator that buys a middlebox) to say a middlebox 
       does not comply with this RFC, which can be important if the contract 
       to maintain the box says it has to comply with updated standards -->

      <section anchor="accecn_Mbox_Operation"
               title="Requirements for TCP Proxies, Offload Engines and other Middleboxes on AccECN Compliance">
        <t>A large class of middleboxes splits TCP connections. Such a
        middlebox would be compliant with the AccECN protocol if the TCP
        implementation on each side complied with the present AccECN
        specification and each side negotiated AccECN independently of the
        other side.</t>

        <t>Another large class of middleboxes intervenes to some degree at the
        transport layer, but attempts to be transparent (invisible) to the
        end-to-end connection. A subset of this class of middleboxes attempts
        to `normalize' the TCP wire protocol by checking that all values in
        header fields comply with a rather narrow interpretation of the TCP
        specifications. To comply with the present AccECN specification, such
        a middlebox MUST NOT change the ACE field or the AccECN Option and it
        SHOULD preserve the timing of each ACK (for example, if it coalesced
        ACKs it would not be AccECN-compliant) as these can be used by the
        Data Sender to infer further information about the path congestion
        level.<!-- This includes the explicitly stated requirements to forward
        Reserved (Rsvd) and Currently Unused (CU) values unaltered. 
An 'ideal' TCP normalizer would not have to change to accommodate AccECN, because AccECN does not directly contravene any existing TCP specifications, 
even though it uses existing TCP fields in unorthodox ways.
--> A middlebox claiming to be transparent at the transport layer MUST forward
        the AccECN TCP Option unaltered, whether or not the length value
        matches one of those specified in <xref target="accecn_option"/>, and
        whether or not the initial values of the byte-counter fields are
        correct. This is because blocking apparently invalid values does not
        improve security (because AccECN hosts are required to ignore invalid
        values anyway), while it prevents the standardized set of values being
        extended in future (because outdated normalizers would block updated
        hosts from using the extended AccECN standard).</t>

        <t>Hardware to offload certain TCP processing represents another large
        class of middleboxes, even though it is often a function of a host's
        network interface and rarely in its own 'box'. Leeway has been allowed
        in the present AccECN specification in the expectation that offload
        hardware could comply and still serve its function. Nonetheless, such
        hardware SHOULD also preserve the timing of each ACK (for example, if
        it coalesced ACKs it would not be AccECN-compliant).</t>

        <t>The ACE field changes with every received CE marking, so today's
        receive offloading could lead to many interrupts in high congestion
        situations. Although that would be useful (because congestion
        information is received sooner), it could also significantly increase
        processor load, particularly in scenarios such as DCTCP or L4S where
        the marking rate is generally higher.</t>

        <t>In data centres it has been fortunate for offload hardware that
        DCTCP-style feedback changes less often when there are long sequences
        of CE marks, which is more common with a step marking threshold. In
        order to enable DCTCP to improve its responsiveness, DCs will need to
        move beyond step marking. Before this can happen, offload hardware
        will have to explicitly address the variability of ECN feedback.</t>

        <t>ECN encodes a varying signal in the ACK stream, so it is inevitable
        that offload hardware will ultimately need to handle any form of ECN
        feedback exceptionally. The purpose of working towards standardized
        TCP ECN feedback is to reduce the risk for hardware developers, who
        would otherwise have to guess which scheme is likely to become
        dominant.</t>
      </section>
    </section>

    <section anchor="accecn_Interact_Variants"
             title="Interaction with Other TCP Variants">
      <t>This section is informative, not normative.</t>

      <section anchor="accecn_Interaction_SYN_Cookies"
               title="Compatibility with SYN Cookies">
        <t>A TCP server can use SYN Cookies (see Appendix A of <xref
        target="RFC4987"/>) to protect itself from SYN flooding attacks. It
        places minimal commonly used connection state in the SYN/ACK, and
        deliberately does not hold any state while waiting for the subsequent
        ACK (e.g. it closes the thread). Therefore it cannot record the fact
        that it entered AccECN mode for both half-connections. Indeed, it
        cannot even remember whether it negotiated the use of classic ECN
        <xref target="RFC3168"/>.</t>

        <t>Nonetheless, such a server can determine that it negotiated AccECN
        as follows. If a TCP server using SYN Cookies supports AccECN and if
        it receives a pure ACK that acknowledges an ISN that is a valid SYN
        cookie, and if the ACK contains an ACE field with the value 0b010 to
        0b111 (decimal 2 to 7), it can assume that:<list style="symbols">
            <t>the TCP client must have requested AccECN support on the
            SYN</t>

            <t>it (the server) must have confirmed that it supported
            AccECN</t>
          </list>Therefore the server can switch itself into AccECN mode, and
        continue as if it had never forgotten that it switched itself into
        AccECN mode earlier.</t>

        <t>If the pure ACK that acknowledges a SYN cookie contains an ACE
        field with the value 0b000 or 0b001, these values indicate that the
        client did not request support for AccECN and therefore the server
        does not enter AccECN mode for this connection. Further, 0b001 on the
        ACK implies that the server sent an ECN-capable SYN/ACK, which was
        marked CE in the network, and the non-AccECN client fed this back by
        setting ECE on the ACK of the SYN/ACK.</t>
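
        <t>The inference described above can be sketched as follows. This is
        a non-normative illustration: the function name is hypothetical, and
        it assumes the caller has already verified that the pure ACK
        acknowledges an ISN that is a valid SYN cookie.</t>

        <figure>
          <artwork><![CDATA[
```python
def accecn_negotiated_from_ace(ace: int) -> bool:
    """Infer AccECN mode from the 3-bit ACE field of the ACK.

    0b010 to 0b111: the client requested AccECN and the server
    confirmed it, so the server can enter AccECN mode.
    0b000 or 0b001: the client did not request AccECN.
    """
    if not 0 <= ace <= 0b111:
        raise ValueError("ACE is a 3-bit field")
    return ace >= 0b010
```
]]></artwork>
        </figure>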
      </section>

      <section anchor="accecn_Interaction_Other"
               title="Compatibility with Other TCP Options and Experiments">
        <t>AccECN is compatible (at least on paper) with the most commonly
        used TCP options: MSS, time-stamp, window scaling, SACK and TCP-AO. It
        is also compatible with the recent promising experimental TCP options
        TCP Fast Open (TFO <xref target="RFC7413"/>) and Multipath TCP (MPTCP
        <xref target="RFC6824"/>). AccECN is friendly to all these protocols,
        because space for TCP options is particularly scarce on the SYN, where
        AccECN consumes zero additional header space.</t>

        <t>When option space is under pressure from other options, <xref
        target="accecn_option_usage"/> provides guidance on how important it
        is to send an AccECN Option and whether it needs to be a full-length
        option.</t>

        <t>Implementers of TFO need to take careful note of the recommendation
        in <xref target="accecn_ACE_3rdACK"/>. That section recommends that,
        if the client has successfully negotiated AccECN, when acknowledging
        the SYN/ACK, even if it has data to send, it sends a pure ACK
        immediately before the data. Then it can reflect the IP-ECN field of
        the SYN/ACK on this pure ACK, which allows the server to detect ECN
        mangling.</t>
      </section>

      <section anchor="accecn_Integrity"
               title="Compatibility with Feedback Integrity Mechanisms">
        <t>Three alternative mechanisms are available to assure the integrity
        of ECN and/or loss signals. AccECN is compatible with any of these
        approaches:<list style="symbols">
            <t>The Data Sender can test the integrity of the receiver's ECN
            (or loss) feedback by occasionally setting the IP-ECN field to a
            value normally only set by the network (and/or deliberately
            leaving a sequence number gap). Then it can test whether the Data
            Receiver's feedback faithfully reports what it expects (similar to
            para 2 of Section 20.2 of <xref target="RFC3168"/>). Unlike the
            ECN Nonce <xref target="RFC3540"/>, this approach does not waste
            the ECT(1) codepoint in the IP header, it does not require
            standardization and it does not rely on misbehaving receivers
            volunteering to reveal feedback information that allows them to be
            detected. However, setting the CE mark by the sender might conceal
            actual congestion feedback from the network and should therefore
            only be done sparingly.</t>

            <t>Networks generate congestion signals when they are becoming
            congested, so networks are more likely than Data Senders to be
            concerned about the integrity of the receiver's feedback of these
            signals. A network can enforce a congestion response to its ECN
            markings (or packet losses) using congestion exposure (ConEx)
            audit <xref target="RFC7713"/>. Whether the receiver or a
            downstream network is suppressing congestion feedback or the
            sender is unresponsive to the feedback, or both, ConEx audit can
            neutralize any advantage that any of these three parties would
            otherwise gain. <vspace blankLines="1"/>ConEx is a change to the
            Data Sender that is most useful when combined with AccECN. Without
            AccECN, the ConEx behaviour of a Data Sender would have to be more
            conservative than would be necessary if it had the accurate
            feedback of AccECN.</t>

            <t>The TCP authentication option (TCP-AO <xref target="RFC5925"/>)
            can be used to detect any tampering with AccECN feedback between
            the Data Receiver and the Data Sender (whether malicious or
            accidental). The AccECN fields are immutable end-to-end, so they
            are amenable to TCP-AO protection, which covers TCP options by
            default. However, TCP-AO is often too brittle to use on many
            end-to-end paths, where middleboxes can make verification fail in
            their attempts to improve performance or security, e.g. by
            resegmentation or shifting the sequence space.</t>
          </list>Originally the ECN Nonce <xref target="RFC3540"/> was
        proposed to ensure integrity of congestion feedback. With minor
        changes AccECN could be optimized for the possibility that the ECT(1)
        codepoint might be used as an ECN Nonce. However, given that RFC 3540
        has been reclassified as Historic, the AccECN design has been
        generalized
        so that it ought to be able to support other possible uses of the
        ECT(1) codepoint, such as a lower severity or a more instant
        congestion signal than CE.</t>
      </section>
    </section>

    <!-- ================================================================ -->

    <section anchor="accecn_Properties" title="Protocol Properties">
      <t>This section is informative, not normative. It describes how well the
      protocol satisfies the agreed requirements for a more accurate ECN
      feedback protocol <xref target="RFC7560"/>.<list style="hanging">
          <t hangText="Accuracy:">From each ACK, the Data Sender can infer the
          number of new CE marked segments since the previous ACK. This
          provides better accuracy on CE feedback than classic ECN. In
          addition, if the AccECN Option is present (not blocked by the
          network path), the numbers of bytes marked with CE, ECT(1) and
          ECT(0) are provided.</t>

          <!-- <t hangText="Accuracy:">The Data Receiver can feed back to the Data
           Sender a list of the order of the IP-ECN markings covered by each
           delayed ACK.</t> -->

          <t hangText="Overhead:">The AccECN scheme is divided into two parts.
          The essential part reuses the 3 flags already assigned to ECN in the
          TCP header. The supplementary part adds an additional TCP option
          consuming up to 11 bytes. However, no TCP option is consumed in the
          SYN.</t>

          <t hangText="Ordering:">The order in which marks arrive at the Data
          Receiver is preserved in AccECN feedback, because the Data Receiver
          is expected to send an ACK immediately whenever a different mark
          arrives.</t>

          <!-- <t hangText="Overhead:">Two alternative locations for the
           supplementary protocol field are proposed:<list style="numbers">
           <t>In the 16-bit Urgent Pointer when URG=0. This specification
           reserves 15 bits of this space, but while the specification is
           only experimental it refrains from using this space in the main
           TCP header. If AccECN progresses to the standards track and uses
           these 15b, it will require zero additional overhead, because it
           will overload fields that already takes up space in every TCP
           header</t>
           
           <t>In a TCP option. This takes up 4B; the fifteen bits have to
           be rounded up to 2B, plus 2B for the TCP option Kind and
           Length.</t>
           </list></t> -->

          <t hangText="Timeliness:">While the same ECN markings are arriving
          continually at the Data Receiver, it can defer ACKs as TCP does
          normally, but it will immediately send an ACK as soon as a different
          ECN marking arrives.</t>

          <t hangText="Timeliness vs Overhead:">Change-Triggered ACKs are
          intended to enable latency-sensitive uses of ECN feedback by
          capturing the timing of transitions but not wasting resources while
          the state of the signalling system is stable. Within the constraints
          of the change-triggered ACK rules, the receiver can control how
          frequently it sends the AccECN TCP Option and therefore to some
          extent it can control the overhead induced by AccECN.</t>

          <!-- <t hangText="Timeliness:">{ToDo: Add improved timeliness if the
           Delayed ACK Control (DAC) feature is included.}</t> -->

          <t hangText="Resilience:">All information is provided based on
          counters. Therefore if ACKs are lost, the counters on the first ACK
          following the losses allow the Data Sender to immediately recover
          the number of ECN markings that it missed. And if data or ACKs
          are reordered, stale congestion information can be identified and
          ignored.</t>

          <t hangText="Resilience against Bias:">Because feedback is based on
          repetition of counters, random losses do not remove any information;
          they only delay it. Therefore, even though some ACKs are
          change-triggered, random losses will not alter the proportions of
          the different ECN markings in the feedback.</t>

          <t hangText="Resilience vs Overhead:">If space is limited in some
          segments (e.g. because more options are needed on some segments,
          such as the SACK option after loss), the Data Receiver can send
          AccECN Options less frequently or truncate fields that have not
          changed, usually down to as little as 5 bytes. However, it has to
          send a full-sized AccECN Option at least three times per RTT, which
          the Data Sender can rely on as a regular beacon or checkpoint.</t>

          <t hangText="Resilience vs Timeliness and Ordering:">Ordering
          information and the timing of transitions cannot be communicated in
          three cases: i) during ACK loss; ii) if something on the path strips
          the AccECN Option; or iii) if the Data Receiver is unable to support
          Change-Triggered ACKs. Following ACK reordering, the Data Sender can
          reconstruct the order in which feedback was sent, but not until all
          the missing feedback has arrived.</t>

          <!-- reworked end -->

          <!-- <t hangText="Resilience:">Subsequent ACKs will allow it to recover
           the number of other ECN markings that it missed.</t>
          
           <t hangText="Resilience against Bias:">Undetected ACK loss is as
           likely to decrease as increase congestion signals detected by the
           Data Sender.</t>
           
           <t hangText="Resilience against Bias:">However, if the supplementary
           part is unavailable, the required conservative decoding of feedback
           during ACK loss is more likely to increase perceived congestion
           signals, which would otherwise be more likely to be
           under-reported.</t> 
          
           <t hangText="Timeliness vs Overhead:">For efficiency, each delayed
           ACK only includes one of the counters at a time, therefore recovery
           of the count of the other signals might not be immediate if an ACK
           is lost that covers more than one signal. The receiver cannot
           predict which ACKs might get lost, if any. Therefore it repeats the
           count of each signal roughly in proportion to how often each signal
           changes.</t>
           
           <t hangText="Ordering:">The order of arriving ECN codepoints is
           communicated in a 10-bit field in the supplementary part;</t>
           
           <t hangText="Resilience vs. Ordering:">Following an ACK loss, only a
           count of the lost ECN signals is recovered, not their order of
           arrival over the sequence covered by the loss.</t>
           
           <t hangText="Ordering vs. Overhead:">The encoding is tailored for
           sequences of ECN codepoints expected to be typical. It can encode
           sequences of up to 15 segments but, if the pattern of arrivals
           becomes too complex, the protocol forces the Data Receiver to emit
           an ACK. The protocol can always encode any sequence of 3 segments in
           one delayed ACK;</t>
           
           <t hangText="Ordering, Timeliness and Resilience:">If one delayed
           ACK covers changes to more than one congestion counter the
           supplementary sequence information provides more timely congestion
           feedback than waiting for the other congestion counters on future
           ACKs, and it provides resilience against the possibility of those
           future ACKs going missing;</t> -->

          <!-- new -->

          <t hangText="Complexity:">An AccECN implementation solely involves
          simple counter increments, some modulo arithmetic to communicate the
          least significant bits and allow for wrap, and some heuristics for
          safety against fields cycling due to prolonged periods of ACK loss.
          Each host needs to maintain eight additional counters. The hosts
          have to apply some additional tests to detect tampering by
          middleboxes, but in general the protocol is simple to understand,
          simple to implement and requires few cycles per packet to
          execute.</t>

          <t hangText="Integrity:">AccECN is compatible with at least three
          approaches that can assure the integrity of ECN feedback. If the
          AccECN Option is stripped, the resolution of the feedback is
          degraded, but the integrity of this degraded feedback can still be
          assured.</t>

          <t hangText="Backward Compatibility:">If only one endpoint supports
          the AccECN scheme, it will fall back to the most advanced ECN
          feedback scheme supported by the other end.</t>

          <!-- <t hangText="Backward Compatibility:">Each endpoint can detect
           normalization of the Supplementary AccECN field by middleboxes at
           any time during a connection. It could then fall-back to the
           essential part using only the fewer but safer bits in the TCP
           header.</t> -->

          <!-- new -->

          <t hangText="Backward Compatibility:">If the AccECN Option is
          stripped by a middlebox, AccECN still provides basic congestion
          feedback in the ACE field. Further, AccECN can be used to detect
          mangling of the IP ECN field; mangling of the TCP ECN flags;
          blocking of ECT-marked segments; and blocking of segments carrying
          the AccECN Option. It can detect these conditions during TCP's 3WHS
          so that it can fall back to operation without ECN and/or operation
          without the AccECN Option.</t>

          <!-- new end -->

          <t hangText="Forward Compatibility:">The behaviour of endpoints and
          middleboxes is carefully defined for all reserved or currently
          unused codepoints in the scheme, so that the designers of security
          devices can understand which currently unused values might appear in
          future. Then, even if they choose to treat such values as anomalous
          while they are not widely used, any blocking will at least be under
          policy control, not hard-coded. If previously unused values
          start to appear on the Internet (or in standards), such policies
          could then be quickly reversed.</t>
        </list></t>
    </section>

    <!-- ================================================================ -->

    <section anchor="accecn_IANA_Considerations" title="IANA Considerations">
      <t>This document reassigns bit 7 of the TCP header flags to the AccECN
      experiment. This bit was previously called the Nonce Sum (NS) flag <xref
      target="RFC3540"/>, but RFC 3540 has been reclassified as historic <xref
      target="RFC8311"/>. The flag will now be defined as:</t>

      <texttable>
        <ttcol>Bit</ttcol>

        <ttcol>Name</ttcol>

        <ttcol>Reference</ttcol>

        <c>7</c>

        <c>AE (Accurate ECN)</c>

        <c>RFC XXXX</c>
      </texttable>

      <t>[TO BE REMOVED: IANA is requested to update the existing entry in the
      Transmission Control Protocol (TCP) Header Flags registration
      (https://www.iana.org/assignments/tcp-header-flags/tcp-header-flags.xhtml#tcp-header-flags-1)
      for Bit 7 to "AE (Accurate ECN), previously used as NS (Nonce Sum) by
      [RFC3540], which is now Historic [RFC8311]" and change the reference to
      this RFC-to-be instead of RFC8311.]</t>

      <t>This document also defines a new TCP option for AccECN, assigned a
      value of TBD1 (decimal) from the TCP option space. This value is defined
      as:</t>

      <texttable>
        <ttcol>Kind</ttcol>

        <ttcol>Length</ttcol>

        <ttcol>Meaning</ttcol>

        <ttcol>Reference</ttcol>

        <c>TBD1</c>

        <c>N</c>

        <c>Accurate ECN (AccECN)</c>

        <c>RFC XXXX</c>
      </texttable>

      <t>[TO BE REMOVED: This registration should take place at the following
      location:
      http://www.iana.org/assignments/tcp-parameters/tcp-parameters.xhtml#tcp-parameters-1
      ]</t>

      <t>Early implementation before the IANA allocation MUST follow <xref
      target="RFC6994"/> and use experimental option 254 and magic number
      0xACCE (16 bits), then migrate to the new option after the
      allocation.</t>
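
      <t>As an illustration only, and assuming the shared experimental
      option layout of <xref target="RFC6994"/> (Kind, Length, then a 16-bit
      Experiment Identifier in network byte order), the start of such an
      early-implementation option might look as sketched below. The name
      ExID is taken from RFC 6994 rather than from this document, and the
      Length shown is an inference: a full-sized AccECN Option of 11 bytes
      would grow by the 2 bytes of the identifier.</t>

      <figure>
        <artwork><![CDATA[   +----------+----------+----------+----------+-------...
   | Kind=254 |  Length  |   0xAC   |   0xCE   | AccECN fields
   +----------+----------+----------+----------+-------...
      1 byte     1 byte      ExID (2 bytes)

   Length = 11 + 2 = 13 for a full-sized option (assumed)
]]></artwork>
      </figure>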
    </section>

    <!-- ================================================================ -->

    <section anchor="accecn_Security_Considerations"
             title="Security Considerations">
      <t>If the supplementary part of AccECN, which is based on the new
      AccECN TCP Option, is ever unusable (for example, due to middlebox
      interference), the essential part of AccECN's congestion feedback
      offers only limited resilience to long runs of ACK loss (see <xref
      target="accecn_ACE_Safety"/>). These problems are unlikely to be due to
      malicious intervention (because if an attacker could strip a TCP option
      or discard a long run of ACKs it could wreak other arbitrary havoc).
      However, it would be of concern if AccECN's resilience could be
      indirectly compromised during a flooding attack. AccECN is still
      considered safe though, because if the option is not present, the
      AccECN Data Sender is then required to switch to more conservative
      assumptions about wrap of congestion indication counters (see <xref
      target="accecn_ACE_Safety"/> and <xref
      target="accecn_Algo_ACE_Wrap"/>).</t>

      <t><xref target="accecn_Interaction_SYN_Cookies"/> describes how a TCP
      server can negotiate AccECN and use the SYN cookie method for mitigating
      SYN flooding attacks.</t>

      <t>There is concern that ECN markings could be altered or suppressed,
      particularly because a misbehaving Data Receiver could increase its own
      throughput at the expense of others. AccECN is compatible with the three
      schemes known to assure the integrity of ECN feedback (see <xref
      target="accecn_Integrity"/> for details). If the AccECN Option is
      stripped by an incorrectly implemented middlebox, the resolution of the
      feedback will be degraded, but the integrity of this degraded
      information can still be assured.</t>

      <!--Bob adds: I removed the following 3 sentences, which I felt were weak. I think it is better to admit there is a security concern, than try to claim it is not a problem (when it is). 
If a receiver has driven a network from marking into loss, it has already probably harmed other flows and gained a large share of resources for itself. 
Anyway, a receiver can regulate concealment of ECN marks to give itself more resources without driving a link into loss.-->

      <!--The motivation for concealing ECN marks is generally considered to be self-interest. Causing congestion collapse would not be in the interest of a receiver, 
and it has not been identified as a realistic motivation for attacks that conceal ECN marks.-->

      <!--
-->

      <!--"However, if congestion is persistent but no congestion notification is provided to the Data Sender, the congestion will lead to packet loss which cannot easily be concealed by a reliable TCP connection. 
Therefore the absence of ECN-based packet feedback will not lead to  congestion collapse. Further note that classic ECN also do not have an integrity check. 
ECN Nonce was specified separately therefore a end point that wants to conceal ECN feedback can simply present to not support ECN Nonce."-->

      <t>There is a potential concern that a receiver could deliberately omit
      the AccECN Option, pretending that it had been stripped by a middlebox.
      No way to take advantage of this downgrade attack is yet known, but it
      is mentioned here in case someone else can contrive one.</t>

      <t>The AccECN protocol is not believed to introduce any new privacy
      concerns, because it merely counts and feeds back signals at the
      transport layer that are already visible at the IP layer.</t>
    </section>

    <!-- ================================================================ -->

    <section anchor="accecn_Acknowledgements" title="Acknowledgements">
      <t>We want to thank Koen De Schepper, Praveen Balasubramanian, Michael
      Welzl, Gorry Fairhurst, David Black, Spencer Dawkins, Michael Scharf,
      Michael Tuexen, Yuchung Cheng, Kenjiro Cho, Olivier Tilmans and Ilpo
      J&auml;rvinen for their input and discussion. The idea of using the
      three ECN-related TCP flags as one field for more accurate TCP-ECN
      feedback was first introduced in the re-ECN protocol that was the
      ancestor of ConEx.</t>

      <t>Bob Briscoe was part-funded by the Comcast Innovation Fund, the
      European Community under its Seventh Framework Programme through the
      Reducing Internet Transport Latency (RITE) project (ICT-317700) and
      through the Trilogy 2 project (ICT-317756), and the Research Council of
      Norway through the TimeIn project. The views expressed here are solely
      those of the authors.</t>

      <t>Mirja Kuehlewind was partly supported by the European Commission
      under Horizon 2020 grant agreement no. 688421 Measurement and
      Architecture for a Middleboxed Internet (MAMI), and by the Swiss State
      Secretariat for Education, Research, and Innovation under contract no.
      15.0268. This support does not imply endorsement.</t>
    </section>

    <!-- ================================================================ -->

    <section anchor="accecn_Comments_Solicited" title="Comments Solicited">
      <t>Comments and questions are encouraged and very welcome. They can be
      addressed to the IETF TCP maintenance and minor modifications working
      group mailing list &lt;tcpm@ietf.org&gt;, and/or to the authors.</t>
    </section>
  </middle>

  <back>
    <!-- ================================================================ -->

    <references title="Normative References">
      <?rfc include="reference.RFC.0793" ?>

      <?rfc include="reference.RFC.2119" ?>

      <?rfc include="reference.RFC.3168" ?>

      <?rfc include="reference.RFC.5681" ?>

      <?rfc include="reference.RFC.8174" ?>
    </references>

    <references title="Informative References">
      <?rfc include="reference.RFC.2018" ?>

      <?rfc include="reference.RFC.3540" ?>

      <?rfc include="reference.RFC.4987" ?>

      <?rfc include="reference.RFC.5562" ?>

      <?rfc include="reference.RFC.5925" ?>

      <?rfc include="reference.RFC.5961" ?>

      <?rfc include="reference.RFC.6824" ?>

      <?rfc include="reference.RFC.6994" ?>

      <?rfc include="reference.RFC.7560" ?>

      <?rfc include="reference.RFC.7413" ?>

      <?rfc include="reference.RFC.7713" ?>

      <?rfc include="reference.RFC.8257" ?>

      <?rfc include="reference.RFC.8311" ?>

      <?rfc include="reference.I-D.kuehlewind-tcpm-ecn-fallback" ?>

      <?rfc include="reference.I-D.ietf-tcpm-generalized-ecn" ?>

      <?rfc include="reference.I-D.ietf-tcpm-2140bis" ?>

      <?rfc include="reference.RFC.8511" ?>

      <?rfc include="reference.I-D.ietf-tsvwg-l4s-arch" ?>

      <reference anchor="Mandalari18">
        <front>
          <title>Measuring ECN++: Good News for ++, Bad News for ECN over
          Mobile</title>

          <author fullname="Anna Mandalari" initials="A." surname="Mandalari">
            <organization>UC3M</organization>
          </author>

          <author fullname="Andra Lutu" initials="A." surname="Lutu">
            <organization>Simula</organization>

            <address>
              <postal>
                <street/>

                <city/>

                <region/>

                <code/>

                <country/>
              </postal>

              <phone/>

              <facsimile/>

              <email/>

              <uri/>
            </address>
          </author>

          <author fullname="Bob Briscoe" initials="B." surname="Briscoe">
            <organization>Simula</organization>

            <address>
              <postal>
                <street/>

                <city/>

                <region/>

                <code/>

                <country/>
              </postal>

              <phone/>

              <facsimile/>

              <email/>

              <uri/>
            </address>
          </author>

          <author fullname="Marcelo Bagnulo" initials="M." surname="Bagnulo">
            <organization>UC3M</organization>

            <address>
              <postal>
                <street/>

                <city/>

                <region/>

                <code/>

                <country/>
              </postal>

              <phone/>

              <facsimile/>

              <email/>

              <uri/>
            </address>
          </author>

          <author fullname="&Ouml;zg&uuml; Alay" initials="&Ouml;."
                  surname="Alay">
            <organization>Simula</organization>

            <address>
              <postal>
                <street/>

                <city/>

                <region/>

                <code/>

                <country/>
              </postal>

              <phone/>

              <facsimile/>

              <email/>

              <uri/>
            </address>
          </author>

          <date month="March" year="2018"/>
        </front>

        <seriesInfo name="IEEE Communications Magazine" value=""/>

        <format target="http://www.it.uc3m.es/amandala/ecn++/ecn_commag_2018.html"
                type="PDF"/>
      </reference>
    </references>

    <!-- <section anchor="accecn_Algo_Examples" title="Example Algorithms">
      <t>This appendix is informative, not normative. It gives examples in
      pseudocode for the various algorithms used by AccECN.</t> -->

    <section anchor="accecn_Algo_Examples" title="Example Algorithms">
      <t>This appendix is informative, not normative. It gives example
      algorithms that would satisfy the normative requirements of the AccECN
      protocol. However, implementers are free to choose other ways to
      implement the requirements.</t>

      <section anchor="accecn_Algo_Option_Coding"
               title="Example Algorithm to Encode/Decode the AccECN Option">
        <t><!--ToDo: Example code to check the AccECN Option fields are consistent with the ACE field.-->The
        example algorithms below show how a Data Receiver in AccECN mode could
        encode its CE byte counter r.ceb into the ECEB field within the AccECN
        TCP Option, and how a Data Sender in AccECN mode could decode the ECEB
        field into its byte counter s.ceb. The other counters for bytes marked
        ECT(0) and ECT(1) in the AccECN Option would be similarly encoded and
        decoded.</t>

        <t>It is assumed that each local byte counter is an unsigned integer
        wider than 24b (probably 32b), and that the following constant has
        been assigned:<list style="empty">
            <t>DIVOPT = 2^24</t>
          </list></t>

        <t>Every time a CE-marked data segment arrives, the Data Receiver
        increments its local value of r.ceb by the size of the TCP Data.
        Whenever it sends an ACK with the AccECN Option, the value it writes
        into the ECEB field is <list style="empty">
            <t>ECEB = r.ceb % DIVOPT</t>
          </list></t>

        <t>where '%' is the remainder operator.</t>
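
        <t>As a worked illustration (the numbers are chosen arbitrarily),
        suppose the Data Receiver's counter has passed 2^24 once. Because
        DIVOPT is a power of two, taking the remainder is the same as keeping
        the 24 least significant bits of r.ceb:</t>

        <figure>
          <artwork><![CDATA[   DIVOPT = 2^24 = 16,777,216
    r.ceb = 16,778,676        (= 2^24 + 1460)
     ECEB = r.ceb % DIVOPT
          = 1460              (= r.ceb & 0xFFFFFF)
]]></artwork>
        </figure>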

        <t>On the arrival of an AccECN Option, the Data Sender first makes
        sure the ACK has not been superseded in order to avoid winding the
        s.ceb counter backwards. It uses the TCP acknowledgement number and
        any SACK options to calculate newlyAckedB, the amount of new data that
        the ACK acknowledges in bytes (newlyAckedB can be zero but not
        negative). If newlyAckedB is zero, either the ACK has been superseded
        or CE-marked packet(s) without data could have arrived. To break the
        tie for the latter case, the Data Sender could use timestamps (if
        present) to work out newlyAckedT, the amount of new time that the ACK
        acknowledges. If the Data Sender determines that the ACK has been
        superseded it ignores the AccECN Option. Otherwise, the Data Sender
        calculates the minimum non-negative difference d.ceb between the ECEB
        field and its local s.ceb counter, using modulo arithmetic as
        follows:</t>

        <figure>
          <artwork><![CDATA[   if ((newlyAckedB > 0) || (newlyAckedT > 0)) {
       d.ceb = (ECEB + DIVOPT - (s.ceb % DIVOPT)) % DIVOPT
       s.ceb += d.ceb
   }
]]></artwork>
        </figure>

        <t>For example, if s.ceb is 33,554,433 and ECEB is 1461 (both
        decimal), then</t>

        <figure>
          <artwork><![CDATA[   s.ceb % DIVOPT = 1
         d.ceb = (1461 + 2^24 - 1) % 2^24
               = 1460
         s.ceb = 33,554,433 + 1460
               = 33,555,893
]]></artwork>
        </figure>
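
        <t>The same calculation also copes with the ECEB field wrapping. As a
        further illustration with arbitrarily chosen numbers, if s.ceb is
        16,777,000 and ECEB is 244 (both decimal), the field has wrapped past
        2^24 since the previous feedback, and</t>

        <figure>
          <artwork><![CDATA[   s.ceb % DIVOPT = 16,777,000
            d.ceb = (244 + 16,777,216 - 16,777,000) % 2^24
                  = 460
            s.ceb = 16,777,000 + 460
                  = 16,777,460
]]></artwork>
        </figure>

        <t>Because d.ceb always lies between 0 and DIVOPT - 1, the result is
        only unambiguous if fewer than DIVOPT bytes were CE-marked between
        the two AccECN Options that the Data Sender processed.</t>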
      </section>

      <section anchor="accecn_Algo_ACE_Wrap"
               title="Example Algorithm for Safety Against Long Sequences of ACK Loss">
        <t>The example algorithms below show how a Data Receiver in AccECN
        mode could encode its CE packet counter r.cep into the ACE field, and
        how the Data Sender in AccECN mode could decode the ACE field into its
        s.cep counter. The Data Sender's algorithm includes code to
        heuristically detect a long enough unbroken string of ACK losses that
        could have concealed a cycle of the congestion counter in the ACE
        field of the next ACK to arrive.</t>

        <t>Two variants of the algorithm are given: i) a more conservative
        variant for a Data Sender to use if it detects that the AccECN Option
        is not available (see <xref target="accecn_ACE_Safety"/> and <xref
        target="accecn_Mbox_Interference"/>); and ii) a less conservative
        variant that is feasible when complementary information is available
        from the AccECN Option.</t>

        <section title="Safety Algorithm without the AccECN Option">
          <t>It is assumed that each local packet counter is a sufficiently
          sized unsigned integer (probably 32b) and that the following
          constant has been assigned:<list style="empty">
              <t>DIVACE = 2^3</t>
            </list></t>

          <t>Every time an Acceptable CE-marked packet arrives (<xref
          target="accecn_sec_ACE_feedback"/>), the Data Receiver increments
          its local value of r.cep by 1. It repeats the same value of ACE in
          every subsequent ACK until the next CE marking arrives, where<list
              style="empty">
              <t>ACE = r.cep % DIVACE.</t>
            </list></t>
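
          <t>For illustration, the following arbitrarily chosen values of
          r.cep show the 3-bit ACE field cycling back to zero after every 8
          CE marks:</t>

          <figure>
            <artwork><![CDATA[   r.cep :  5   6   7   8   9  ...  13  14  15  16
   ACE   :  5   6   7   0   1  ...   5   6   7   0
]]></artwork>
          </figure>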

          <t>If the Data Sender received an earlier value of the counter that
          had been delayed due to ACK reordering, it might incorrectly
          calculate that the ACE field had wrapped. Therefore, on the arrival
          of every ACK, the Data Sender ensures the ACK has not been
          superseded, using the TCP acknowledgement number and any SACK
          options to calculate newlyAckedB, and timestamps (if available) to
          calculate newlyAckedT, as in <xref
          target="accecn_Algo_Option_Coding"/>. If the ACK has not been
          superseded, the Data Sender calculates the minimum difference d.cep
          between the ACE field and its local s.cep counter, using modulo
          arithmetic as follows:</t>

          <figure>
            <artwork><![CDATA[   if ((newlyAckedB > 0) || (newlyAckedT > 0))
       d.cep = (ACE + DIVACE - (s.cep % DIVACE)) % DIVACE
]]></artwork>
          </figure>
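
          <t>As a worked illustration with arbitrarily chosen numbers, if
          s.cep is 13 and an ACK that has not been superseded arrives with
          ACE = 1, then</t>

          <figure>
            <artwork><![CDATA[   s.cep % DIVACE = 13 % 8 = 5
            d.cep = (1 + 8 - 5) % 8
                  = 4
]]></artwork>
          </figure>

          <t>This is only the minimum increase consistent with the ACE field.
          Assuming the Data Sender's counter was previously in step with the
          Data Receiver's, r.cep could equally have reached 17, 25, 33 and so
          on, so d.cep on its own cannot distinguish 4 new CE marks from 12
          or 20.</t>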

          <t><xref target="accecn_ACE_Safety"/> expects the Data Sender to
          assume that the ACE field cycled if it is the safest likely case
          under prevailing conditions. The 3-bit ACE field in an arriving ACK
          could have cycled and become ambiguous to the Data Sender if a run
          of ACKs goes missing that covers a stream of data long enough to
          contain 8 or more CE marks. We use the word `missing' rather than
          `lost', because some or all of the missing ACKs might arrive
          eventually, but out of order. Even if some of the missing ACKs were
          piggy-backed on data (i.e. not pure ACKs), retransmissions will not
          repair the lost AccECN information, because AccECN requires
          retransmissions to carry the latest AccECN counters, not the
          original ones.</t>

          <t>The phrase `under prevailing conditions' allows for
          implementation-dependent interpretation. A Data Sender might take
          account of the prevailing size of data segments and the prevailing
          CE marking rate just before the sequence of missing ACKs. However,
          we shall start with the simplest algorithm, which assumes that
          segments are all full-sized and, ultra-conservatively, that ECN
          marking was 100% on the forward path when ACKs on the reverse path
          started to all be dropped. Specifically, if newlyAckedB is the
          amount of data that an ACK acknowledges since the previous ACK, then
          the Data Sender could assume that this acknowledges newlyAckedPkt
          full-sized segments, where newlyAckedPkt = newlyAckedB/MSS. Then it
          could assume that the ACE field incremented by</t>

          <figure>
            <artwork><![CDATA[    dSafer.cep = newlyAckedPkt - ((newlyAckedPkt - d.cep) % DIVACE)]]></artwork>
          </figure>

          <t>For example, imagine an ACK acknowledges newlyAckedPkt=9 more
          full-size segments than any previous ACK, and that ACE increments by
          a minimum of 2 CE marks (d.cep=2). The above formula works out that
          it would still be safe to assume 2 CE marks (because 9 - ((9-2) % 8)
          = 2). However, if ACE increases by a minimum of 2 but acknowledges
          10 full-sized segments, then it would be necessary to assume that
          there could have been 10 CE marks (because 10 - ((10-2) % 8) =
          10).</t>
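
          <t>To illustrate further, the table below evaluates the formula
          for d.cep=2 over a few arbitrarily chosen values of newlyAckedPkt.
          It shows that dSafer.cep is the largest value no greater than
          newlyAckedPkt that leaves the same remainder as d.cep when divided
          by DIVACE. The table starts at newlyAckedPkt = d.cep, because the
          result for smaller values would depend on how an implementation's
          remainder operator treats a negative operand.</t>

          <figure>
            <artwork><![CDATA[   newlyAckedPkt | (newlyAckedPkt - d.cep) % 8 | dSafer.cep
   --------------+-----------------------------+-----------
          2      |              0              |      2
          5      |              3              |      2
          9      |              7              |      2
         10      |              0              |     10
         17      |              7              |     10
         18      |              0              |     18
]]></artwork>
          </figure>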

          <t>ACKs that acknowledge a large stretch of packets might be common
          in data centres to achieve a high packet rate or might be due to ACK
          thinning by a middlebox. In these cases, cycling of the ACE field
          would often appear to have been possible, so the above algorithm
          would be over-conservative, leading to a false high marking rate and
          poor performance. Therefore it would be reasonable to only use
          dSafer.cep rather than d.cep if the moving average of newlyAckedPkt
          was well below 8.</t>
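
          <t>One way this test might be arranged is sketched below. The
          names avgAckedPkt, AVG_GAIN, AVG_LIMIT and d.use are hypothetical,
          as are the values chosen; the text above only asks for a moving
          average that is well below 8, and an exponentially weighted
          average is merely one possible choice.</t>

          <figure>
            <artwork><![CDATA[   AVG_GAIN  = 1/8      % assumed smoothing gain
   AVG_LIMIT = 4        % assumed meaning of `well below 8'

   avgAckedPkt += AVG_GAIN * (newlyAckedPkt - avgAckedPkt)
   if (avgAckedPkt < AVG_LIMIT)
       d.use = dSafer.cep   % long stretches are rare; be cautious
   else
       d.use = d.cep        % stretch ACKs are normal on this path
]]></artwork>
          </figure>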

          <t>Implementers could build in more heuristics to estimate
          prevailing average segment size and prevailing ECN marking. For
          instance, newlyAckedPkt in the above formula could be replaced with
          newlyAckedPktHeur = newlyAckedPkt*p*MSS/s, where s is the prevailing
          segment size and p is the prevailing ECN marking probability.
          However, ultimately, if TCP's ECN feedback becomes inaccurate it
          still has loss detection to fall back on. Therefore, it would seem
          safe to implement a simple algorithm, rather than a perfect one.</t>
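
          <t>As a worked illustration of the difference such heuristics
          could make, the numbers below are chosen arbitrarily (and so that
          newlyAckedPktHeur comes out as a whole number; how to round it
          otherwise is left open here). With d.cep=2, the simple formula
          would assume 18 CE marks, whereas the heuristic version would
          accept d.cep as it stands:</t>

          <figure>
            <artwork><![CDATA[   newlyAckedPkt     = 20
   p                 = 0.1
   s                 = 730
   MSS               = 1460
   newlyAckedPktHeur = 20 * 0.1 * 1460 / 730 = 4

   simple:     dSafer.cep = 20 - ((20 - 2) % 8) = 20 - 2 = 18
   heuristic:  dSafer.cep =  4 - (( 4 - 2) % 8) =  4 - 2 =  2
]]></artwork>
          </figure>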

          <t>The simple algorithm for dSafer.cep above requires no monitoring
          of prevailing conditions and it would still be safe if, for example,
          segments were on average at least 5% of full size as long as ECN
          marking was 5% or less (in terms of the heuristic above, p*MSS/s
          would then be no greater than 1). Assuming it was used, the Data
          Sender would increment its packet counter as follows:<list style="empty">
              <t>s.cep += dSafer.cep</t>
            </list></t>
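
          <t>Pulling the steps of this subsection together, the Data
          Sender's processing of each arriving ACK could be sketched as
          below. This is only one possible arrangement. In particular, the
          test that newlyAckedPkt exceeds d.cep is an assumption added here
          so that the remainder operator is never applied to a negative
          number; it is not a requirement stated above.</t>

          <figure>
            <artwork><![CDATA[   % Executed for every arriving ACK while the AccECN Option
   % is unavailable
   if ((newlyAckedB > 0) || (newlyAckedT > 0)) {
       d.cep = (ACE + DIVACE - (s.cep % DIVACE)) % DIVACE
       newlyAckedPkt = newlyAckedB/MSS
       if (newlyAckedPkt > d.cep) {
           % Take the largest count consistent with ACE
           dSafer.cep = newlyAckedPkt
                        - ((newlyAckedPkt - d.cep) % DIVACE)
       } else {
           % Assumed guard: too few packets to hide a cycle
           dSafer.cep = d.cep
       }
       s.cep += dSafer.cep
   }
]]></artwork>
          </figure>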

          <t>If missing acknowledgement numbers arrive later (due to
          reordering), <xref target="accecn_ACE_Safety"/> says "the Data
          Sender MAY attempt to neutralize the effect of any action it took
          based on a conservative assumption that it later found to be
          incorrect". To do this, the Data Sender would have to store the
          values of all the relevant variables whenever it made assumptions,
          so that it could re-evaluate them later. Given this could become
          complex and it is not required, we do not attempt to provide an
          example of how to do this.</t>
        </section>

        <section title="Safety Algorithm with the AccECN Option">
          <!--ToDo: Ilpo says this algo is useless, 'cos (I think) you don't have the state of d.ceb and d.cep at the same time.
See emails 3/1/20.-->

          <t>When the AccECN Option is available on the ACKs before and after
          the possible sequence of ACK losses, if the Data Sender only needs
          CE-marked bytes, it will have sufficient information in the AccECN
          Option without needing to process the ACE field. If for some reason
          it needs CE-marked packets and dSafer.cep is different from d.cep,
          it can determine whether d.cep is likely to be a safe enough
          estimate by checking whether the average marked segment size (s =
          d.ceb/d.cep) is no greater than the MSS (where d.ceb is the amount
          of newly CE-marked bytes - see <xref
          target="accecn_Algo_Option_Coding"/>). Specifically, it could use
          the following algorithm:</t>

          <figure>
            <artwork><![CDATA[   SAFETY_FACTOR = 2
   if (dSafer.cep > d.cep) {
       % Same test as (s <= MSS), but avoids division by zero
       if (d.ceb <= MSS * d.cep) {
          sSafer = d.ceb/dSafer.cep
          if (sSafer < MSS/SAFETY_FACTOR)
              dSafer.cep = d.cep    % d.cep is a safe enough estimate
       }
       % No else branch is needed: dSafer.cep is already correct,
       % because d.cep must have been too small
   }
]]></artwork>
          </figure>

          <t>The chart below shows when the above algorithm will consider
          d.cep can replace dSafer.cep as a safe enough estimate of the number
          of CE-marked packets:</t>

          <figure>
            <artwork><![CDATA[                 ^
           sSafer|
                 |
              MSS+
                 |
                 |         dSafer.cep
                 |                  is
MSS/SAFETY_FACTOR+--------------+    safest
                 |              |
                 | d.cep is safe|
                 |    enough    |
                 +-------------------->
                               MSS     s

]]></artwork>
          </figure>

          <t>The following examples give the reasoning behind the algorithm,
          assuming MSS=1460 [B]:<list style="symbols">
              <t>if d.cep=0, dSafer.cep=8 and d.ceb=1460, then s=infinity and
              sSafer=182.5.<vspace blankLines="0"/>Therefore even though the
              average size of 8 data segments is unlikely to have been as
              small as MSS/8, d.cep cannot have been correct, because it would
              imply an average segment size greater than the MSS.</t>

              <t>if d.cep=2, dSafer.cep=10 and d.ceb=1460, then s=730 and
              sSafer=146.<vspace blankLines="0"/>Therefore d.cep is safe
              enough, because the average size of 10 data segments is unlikely
              to have been as small as MSS/10.</t>

              <t>if d.cep=7, dSafer.cep=15 and d.ceb=10200, then s=1457 and
              sSafer=680.<vspace blankLines="0"/>Therefore d.cep is safe
              enough, because the average data segment size is more likely to
              have been just less than one MSS, rather than below MSS/2.</t>
            </list></t>
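
          <t>As an illustration only, the algorithm and the three examples
          above could be expressed as the following Python sketch. The
          function name choose_cep and its argument names are not part of
          this specification; they are assumed purely for the example. The
          values passed in are those of the three examples above.</t>

          <figure>
            <artwork><![CDATA[
```python
SAFETY_FACTOR = 2


def choose_cep(d_cep, d_safer_cep, d_ceb, mss):
    """Return the estimate of newly CE-marked packets to use.

    d_cep       -- d.cep, the less conservative estimate
    d_safer_cep -- dSafer.cep, the more conservative estimate
    d_ceb       -- d.ceb, newly CE-marked bytes
    mss         -- maximum segment size in bytes
    """
    if d_safer_cep > d_cep:
        # Same test as (d_ceb / d_cep <= mss), but written as a
        # multiplication so that d_cep == 0 cannot divide by zero.
        if d_ceb <= mss * d_cep:
            s_safer = d_ceb / d_safer_cep
            if s_safer < mss / SAFETY_FACTOR:
                return d_cep   # d.cep is a safe enough estimate
        # Otherwise keep dSafer.cep: either d.cep would imply
        # segments larger than the MSS, or sSafer is plausible.
    return d_safer_cep


MSS = 1460
print(choose_cep(0, 8, 1460, MSS))     # -> 8
print(choose_cep(2, 10, 1460, MSS))    # -> 2
print(choose_cep(7, 15, 10200, MSS))   # -> 7
```
]]></artwork>
          </figure>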

          <t>If pure ACKs were allowed to be ECN-capable, missing ACKs would
          be far less likely. However, because <xref target="RFC3168"/>
          currently precludes this, the above algorithm assumes that pure ACKs
          are not ECN-capable.</t>
        </section>
      </section>

      <section anchor="accecn_Algo_ACE_Bytes"
               title="Example Algorithm to Estimate Marked Bytes from Marked Packets">
        <t>If the AccECN Option is not available, the Data Sender can only
        decode CE-marking from the ACE field in packets. Every time an ACK
        arrives, to convert this into an estimate of CE-marked bytes, it needs
        an average of the segment size, s_ave. Then it can add s_ave to the
        value of d.ceb, or subtract s_ave from it, as the value of d.cep
        increments or decrements. Some possible ways to calculate s_ave are
        outlined below.
        The precise details will depend on why an estimate of marked bytes is
        needed.</t>

        <t>The implementation could keep a record of the byte numbers of all
        the boundaries between packets in flight (including control packets),
        and recalculate s_ave on every ACK. However, it would be simpler to
        merely maintain a counter packets_in_flight for the number of packets
        in flight (including control packets), which is reset once per RTT.
        Either way, it would estimate s_ave as:<list style="empty">
            <t>s_ave ~= flightsize / packets_in_flight,</t>
          </list>where flightsize is the variable that TCP already maintains
        for the number of bytes in flight. To avoid floating point arithmetic,
        it could approximate the division by right-bit-shifting flightsize by
        lg(packets_in_flight) rounded to an integer, where lg() means log
        base 2.</t>
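
        <t>The following Python sketch contrasts an exact integer division
        with the bit-shift approximation. It is only one possible reading of
        the text above: the function names are hypothetical, and rounding
        lg(packets_in_flight) down to an integer is an assumption of this
        sketch.</t>

        <figure>
          <artwork><![CDATA[
```python
def s_ave_exact(flightsize, packets_in_flight):
    """Integer average segment size; 0 if nothing is in flight."""
    if packets_in_flight <= 0:
        return 0
    return flightsize // packets_in_flight


def s_ave_shift(flightsize, packets_in_flight):
    """Approximate the division with a right bit-shift.

    The shift count is floor(lg(packets_in_flight)), so the
    result is exact only when packets_in_flight is a power of
    two.  Otherwise it over-estimates s_ave, by less than a
    factor of two.
    """
    if packets_in_flight <= 0:
        return 0
    shift = packets_in_flight.bit_length() - 1   # floor(lg(n))
    return flightsize >> shift


print(s_ave_exact(11680, 8), s_ave_shift(11680, 8))
# -> 1460 1460
print(s_ave_exact(14600, 10), s_ave_shift(14600, 10))
# -> 1460 1825
```
]]></artwork>
        </figure>

        <t>The second call shows the cost of the shortcut: with 10 packets
        in flight the shift divides by 8 rather than 10, so an implementation
        would have to decide whether that over-estimate is acceptable for its
        purpose.</t>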

        <t>An alternative would be to maintain an exponentially weighted
        moving average (EWMA) of the segment size:<list style="empty">
            <t>s_ave = a * s + (1-a) * s_ave,</t>
          </list>where a is the decay constant for the EWMA. However, then it
        is necessary to choose a good value for this constant, which ought to
        depend on the number of packets in flight. Also, the decay constant
        needs to be a power of two to avoid floating point arithmetic.</t>
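
        <t>As a minimal sketch of such an EWMA using only integer
        arithmetic, the update can be rearranged as s_ave + a * (s - s_ave)
        with a = 1/2^shift. The function name ewma_update and the parameter
        shift are hypothetical, and the value shift = 3 in the usage lines is
        only an example, not a recommendation.</t>

        <figure>
          <artwork><![CDATA[
```python
def ewma_update(s_ave, s, shift):
    """Return a*s + (1-a)*s_ave, with a = 1 / 2**shift.

    Rearranged as s_ave + a*(s - s_ave), so that the only
    scaling operation is a right bit-shift.  Python's >> on a
    negative int rounds towards minus infinity, as an
    arithmetic shift does.
    """
    return s_ave + ((s - s_ave) >> shift)


print(ewma_update(1000, 1800, 3))   # -> 1100
print(ewma_update(1000, 200, 3))    # -> 900
```
]]></artwork>
        </figure>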
      </section>

      <section anchor="accecn_Algo_Beacon"
               title="Example Algorithm to Beacon AccECN Options">
        <t><xref target="accecn_option_usage"/> requires a Data Receiver to
        beacon a full-length AccECN Option at least 3 times per RTT. This
        could be implemented by maintaining a variable to store the number of
        ACKs (pure and data ACKs) since a full AccECN Option was last sent and
        another for the approximate number of ACKs sent in the last round trip
        time:</t>

        <figure>
          <artwork><![CDATA[   if (acks_since_full_last_sent > acks_in_round / BEACON_FREQ)
       send_full_AccECN_Option()]]></artwork>
        </figure>

        <t>For optimized integer arithmetic, BEACON_FREQ = 4 could be used,
        rather than 3, so that the division could be implemented as an integer
        right bit-shift by lg(BEACON_FREQ).</t>
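
        <t>A minimal Python sketch of the ACK-counting approach with
        BEACON_FREQ = 4 follows. The class name BeaconCounter, the constant
        BEACON_SHIFT and the choice to count the current ACK before the
        comparison are all assumptions made for illustration; how
        acks_in_round is obtained is left to the caller.</t>

        <figure>
          <artwork><![CDATA[
```python
BEACON_SHIFT = 2   # lg(BEACON_FREQ) for BEACON_FREQ = 4


class BeaconCounter:
    def __init__(self):
        self.acks_since_full_last_sent = 0

    def on_ack_sent(self, acks_in_round):
        """Return True if this ACK should carry a full option."""
        # Assumption: the ACK being sent is counted first.
        self.acks_since_full_last_sent += 1
        limit = acks_in_round >> BEACON_SHIFT
        if self.acks_since_full_last_sent > limit:
            self.acks_since_full_last_sent = 0
            return True
        return False


counter = BeaconCounter()
flags = [counter.on_ack_sent(16) for _ in range(16)]
print(sum(flags))   # -> 3 (the 5th, 10th and 15th ACKs)
```
]]></artwork>
        </figure>

        <t>With 16 ACKs per round and the strict comparison, this sketch
        attaches a full option to every fifth ACK.</t>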

        <t>In certain operating systems, it might be too complex to maintain
        acks_in_round. In others it might be possible by tagging each data
        segment in the retransmit buffer with the number of ACKs sent at the
        point that segment was sent. This would not work well if the Data
        Receiver was not sending data itself, in which case it might be
        necessary to beacon based on time instead, as follows:</t>

        <figure>
          <artwork><![CDATA[   if ( time_now > time_last_option_sent + (RTT / BEACON_FREQ) )
       send_full_AccECN_Option()]]></artwork>
        </figure>

        <t>This time-based approach does not work well when all the ACKs are
        sent early in each round trip, as is the case during slow-start. In
        this case few options will be sent (possibly even fewer than 3 per
        RTT). However, when continuously sending data, data packets as well
        as ACKs will spread out evenly over the RTT and sufficient ACKs with
        the AccECN Option will be sent.</t>
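
        <t>The difference between the two cases can be illustrated with a
        small Python simulation of the time-based test. The function name,
        the millisecond timestamps and the two ACK patterns are invented for
        the example, and it is assumed that the first ACK always carries a
        full option.</t>

        <figure>
          <artwork><![CDATA[
```python
def count_time_beacons(ack_times_ms, rtt_ms, beacon_shift=2):
    """Count the ACKs that would carry a full option under the
    time-based test, with BEACON_FREQ = 2**beacon_shift."""
    interval = rtt_ms >> beacon_shift
    last_sent = None        # assumption: no full option sent yet
    sent = 0
    for now in ack_times_ms:
        if last_sent is None or now > last_sent + interval:
            last_sent = now
            sent += 1
    return sent


RTT_MS = 100
bunched = list(range(0, 10))        # 10 ACKs in the first 10 ms
spread = list(range(0, 100, 10))    # 10 ACKs, one every 10 ms
print(count_time_beacons(bunched, RTT_MS))   # -> 1
print(count_time_beacons(spread, RTT_MS))    # -> 4
```
]]></artwork>
        </figure>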
      </section>

      <section anchor="accecn_Algo_Not-ECT"
               title="Example Algorithm to Count Not-ECT Bytes">
        <t>A Data Sender in AccECN mode can infer the amount of TCP payload
        data arriving at the receiver marked Not-ECT from the difference
        between the amount of newly ACKed data and the sum of the bytes with
        the other three markings, d.ceb, d.e0b and d.e1b. Note that, because
        r.e0b is initialized to 1 and the other two counters are initialized
        to 0, the initial sum will be 1, which matches the initial offset of
        the TCP sequence number on completion of the 3WHS.</t>

        <!--ToDo: write-up pseudocode, rather than just describe it.-->

        <t>For this approach to be precise, it has to be assumed that spurious
        (unnecessary) retransmissions do not lead to double counting. This
        assumption is currently correct, given that RFC 3168 requires that the
        Data Sender marks retransmitted segments as Not-ECT. However, the
        converse is not true; necessary retransmissions will result in
        under-counting.</t>

        <t>However, such precision is unlikely to be necessary. The only known
        use of a count of Not-ECT marked bytes is to test whether equipment on
        the path is clearing the ECN field (perhaps due to an outdated
        attempt to clear, or bleach, what used to be the ToS field). To detect
        bleaching it will be sufficient to detect whether nearly all bytes
        arrive marked as Not-ECT. Therefore there should be no need to keep
        track of the details of retransmissions.</t>
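
        <t>The inference and the bleaching test described in this section
        could be sketched in Python as follows. The function names, the
        clamping at zero and, in particular, the threshold used for 'nearly
        all' are assumptions of this sketch rather than anything specified
        here. The test is only meaningful if the Data Sender sent the data
        as ECN-capable.</t>

        <figure>
          <artwork><![CDATA[
```python
def not_ect_bytes(newly_acked, d_ceb, d_e0b, d_e1b):
    """Bytes inferred to have arrived marked Not-ECT."""
    # Clamping at zero is an assumption of this sketch, so that
    # any miscounting cannot yield a negative byte count.
    return max(0, newly_acked - (d_ceb + d_e0b + d_e1b))


def looks_bleached(newly_acked, d_ceb, d_e0b, d_e1b, shift=4):
    """True if nearly all newly ACKed bytes were Not-ECT.

    'Nearly all' is taken to mean that no more than
    1/2**shift of the bytes (1/16 by default) carried any
    ECN marking.  The threshold is purely illustrative.
    """
    if newly_acked <= 0:
        return False
    marked = d_ceb + d_e0b + d_e1b
    return marked <= (newly_acked >> shift)


print(not_ect_bytes(14600, 1460, 11680, 0))   # -> 1460
print(looks_bleached(14600, 0, 0, 0))         # -> True
print(looks_bleached(14600, 0, 14600, 0))     # -> False
```
]]></artwork>
        </figure>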
      </section>
    </section>

    <section anchor="accecn_flags_rationale"
             title="Rationale for Usage of TCP Header Flags">
      <section title="Three TCP Header Flags in the SYN-SYN/ACK Handshake">
        <t>AccECN uses a rather unorthodox approach to negotiate the highest
        version of the TCP ECN feedback scheme that both ends support, as
        justified below. It follows from the original TCP ECN capability negotiation
        <xref target="RFC3168"/>, in which the client set the 2 least
        significant of the original reserved flags in the TCP header, and fell
        back to no ECN support if the server responded with the 2 flags
        cleared, which had previously been the default.</t>

        <t>ECN originally used header flags rather than a TCP option because
        it was considered more efficient to use a header flag for 1 bit of
        feedback per ACK, and this bit could be overloaded to indicate support
        for ECN during the handshake. During the development of ECN, 1 bit
        crept up to 2, in order to deliver the feedback reliably and to work
        round some broken hosts that reflected the reserved flags during the
        handshake.</t>

        <t>In order to be backward compatible with RFC 3168, AccECN continues
        this approach, using the 3rd least significant TCP header flag that
        had previously been allocated for the ECN nonce (now historic). Then,
        whatever form of server an AccECN client encounters, the connection
        can fall back to the highest version of feedback protocol that both
        ends support, as explained in <xref target="accecn_Negotiation"/>.</t>

        <t>If AccECN had used the more orthodox approach of a TCP option, it
        would still have had to set the two ECN flags in the main TCP header,
        in order to be able to fall back to Classic RFC 3168 ECN, or to
        disable ECN support, without another round of negotiation. Then AccECN
        would also have had to handle all the different ways that servers
        currently respond to settings of the ECN flags in the main TCP header,
        including all the conflicting cases where a server might have said it
        supported one approach in the flags and another approach in the new
        TCP option. And AccECN would have had to deal with all the additional
        possibilities where a middlebox might have mangled the ECN flags, or
        removed the TCP option. Thus, usage of the 3rd reserved TCP header
        flag simplified the protocol.</t>

        <t>The third flag was used in a way that could be distinguished from
        the ECN nonce, in case any nonce deployment was encountered. Previous
        usage of this flag for the ECN nonce was integrated into the original
        ECN negotiation. This further justified the 3rd flag's use for AccECN,
        because a non-ECN usage of this flag would have had to use it as a
        separate single bit, rather than in combination with the other 2 ECN
        flags.</t>

        <t>Indeed, having overloaded the original uses of these three flags
        for its handshake, AccECN overloads all three bits again as a 3-bit
        counter.</t>
      </section>

      <section title="Four Codepoints in the SYN/ACK">
        <t>Of the 8 possible codepoints that the 3 TCP header flags can
        indicate on the SYN/ACK, 4 already indicated earlier (or broken)
        versions of ECN support. In the early design of AccECN, an AccECN
        server could use only 2 of the 4 remaining codepoints. They both
        indicated AccECN support, but one fed back that the SYN had arrived
        marked as CE. Even though ECN support on a SYN is not yet on the
        standards track, the idea is for either end to act as a dumb
        reflector, so that future capabilities can be unilaterally deployed
        without requiring 2-ended deployment (justified in <xref
        target="accecn_demb_reflector"/>).</t>

        <t>During traversal testing it was discovered that the ECN field in
        the SYN was mangled on a non-negligible proportion of paths. Therefore
        it was necessary to allow the SYN/ACK to feed back to the client all
        four IP/ECN codepoints with which the SYN could arrive. Without
        this, the client could not know whether to disable ECN for the
        connection due to mangling of the IP/ECN field (also explained in
        <xref target="accecn_demb_reflector"/>). This development consumed the
        remaining 2 codepoints on the SYN/ACK that had been reserved for
        future use by AccECN in earlier versions.</t>
      </section>

      <section anchor="accecn_space_evolution"
               title="Space for Future Evolution">
        <t>Although usable TCP header space is extremely scarce, the AccECN
        protocol has taken all possible steps to ensure
        that there is space to negotiate possible future variants of the
        protocol, either if the experiment proves that a variant of AccECN is
        required, or if a completely different ECN feedback approach is
        needed:<list style="hanging">
            <t hangText="Future AccECN variants:">When the AccECN capability
            is negotiated during TCP's 3WHS, the rows in <xref
            target="accecn_Tab_Negotiation"/> tagged as 'Nonce' and 'Broken'
            in the column for the capability of node B are unused by any
            current protocol in the RFC series. These could be used by TCP
            servers in future to indicate a variant of the AccECN protocol. In
            recent measurement studies in which the response of large numbers
            of servers to an AccECN SYN has been tested, e.g. <xref
            target="Mandalari18"/>, a very small number of SYN/ACKs arrive
            with the pattern tagged as 'Nonce', and a small but more
            significant number arrive with the pattern tagged as 'Broken'. The
            'Nonce' pattern could be a sign that a few servers have
            implemented the ECN Nonce <xref target="RFC3540"/>, which has now
            been reclassified as historic <xref target="RFC8311"/>, or it
            could be the random result of some unknown middlebox behaviour.
            The greater prevalence of the 'Broken' pattern suggests that some
            instances still exist of the broken code that reflects the
            reserved flags on the SYN.<vspace blankLines="1"/>The requirement
            not to reject unexpected initial values of the ACE counter (in the
            main TCP header) in the last paragraph of <xref
            target="accecn_sec_ACE_init_invalid"/> ensures that 3 unused
            codepoints on the ACK of the SYN/ACK, 6 unused values on the first
            SYN=0 data packet from the client and 7 unused values on the first
            SYN=0 data packet from the server could be used to declare future
            variants of the AccECN protocol. The word 'declare' is used rather
            than 'negotiate' because, at this late stage in the 3WHS, it would
            be too late for a negotiation between the endpoints to be
            completed. A similar requirement not to reject unexpected initial
            values in the TCP option (<xref target="accecn_sec_zero_option"/>)
            is for the same purpose. If traversal of the TCP option were
            reliable, this would have enabled a far wider range of future
            variation of the whole AccECN protocol. Nonetheless, it could be
            used to reliably negotiate a wide range of variation in the
            semantics of the AccECN Option.</t>

            <t hangText="Future non-AccECN variants:">Five codepoints out of
            the 8 possible in the 3 TCP header flags used by AccECN are unused
            on the initial SYN (in the order AE,CWR,ECE): 001, 010, 100, 101,
            110. <xref target="accecn_sec_forward_compat"/> ensures that the
            installed base of AccECN servers will all assume these are
            equivalent to AccECN negotiation with 111 on the SYN. These
            codepoints would not allow fall-back to Classic ECN support for a
            server that did not understand them, but this approach ensures
            they are available in future, perhaps for uses other than ECN
            alongside the AccECN scheme. All possible combinations of SYN/ACK
            could be used in response except either 000 or reflection of the
            same values sent on the SYN.<vspace blankLines="1"/>Of course,
            AccECN or ECN could be extended in other ways in future,
            although their traversal properties are likely to be
            inferior. They include a new TCP option; using the remaining
            reserved flags in the main TCP header (preferably extending the
            3-bit combinations used by AccECN to 4-bit combinations, rather
            than burning one bit for just one state); a non-zero urgent
            pointer in combination with the URG flag cleared; or some other
            unexpected combination of fields yet to be invented.</t>
          </list></t>
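
        <t>The count of five unused codepoints on the initial SYN can be
        checked with a few lines of Python. The dictionary name and its
        descriptive strings are only for illustration; the three patterns
        listed are those that already have a meaning on a SYN, in the order
        (AE,CWR,ECE).</t>

        <figure>
          <artwork><![CDATA[
```python
# (AE, CWR, ECE) patterns that already have a meaning on a SYN
USED_ON_SYN = {
    0b000: "no ECN support",
    0b011: "Classic ECN (RFC 3168)",
    0b111: "AccECN",
}

unused = [cp for cp in range(8) if cp not in USED_ON_SYN]
print([format(cp, "03b") for cp in unused])
# -> ['001', '010', '100', '101', '110']
```
]]></artwork>
        </figure>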
      </section>
    </section>
  </back>
</rfc>
