<?xml version="1.0" encoding="US-ASCII"?>
<!DOCTYPE rfc SYSTEM "rfc2629.dtd">
<?rfc toc="yes"?>
<?rfc tocompact="yes"?>
<?rfc tocdepth="3"?>
<?rfc tocindent="yes"?>
<?rfc symrefs="yes"?>
<?rfc sortrefs="yes"?>
<?rfc comments="yes"?>
<?rfc inline="yes"?>
<?rfc compact="yes"?>
<?rfc subcompact="no"?>
<?rfc strict='yes'?>
<?rfc iprnotified='no'?>
<rfc category="std" docName="draft-templin-intarea-parcels-01"
     ipr="trust200902" updates="RFC2675">
  <front>
    <title abbrev="IP Parcels">IP Parcels</title>

    <author fullname="Fred L. Templin" initials="F. L." role="editor"
            surname="Templin">
      <organization>Boeing Research &amp; Technology</organization>

      <address>
        <postal>
          <street>P.O. Box 3707</street>

          <city>Seattle</city>

          <region>WA</region>

          <code>98124</code>

          <country>USA</country>
        </postal>

        <email>fltemplin@acm.org</email>
      </address>
    </author>

    <date day="18" month="December" year="2021"/>

    <keyword>I-D</keyword>

    <keyword>Internet-Draft</keyword>

    <abstract>
      <t>IP packets (both IPv4 and IPv6) are understood to contain a unit of
      data which becomes the retransmission unit in case of loss. Upper layer
      protocols such as the Transmission Control Protocol (TCP) prepare data
      units known as "segments", with traditional arrangements including a
      single segment per packet. This document presents a new construct known
      as the "IP Parcel" which permits a single packet to carry multiple
      segments, essentially creating a "packet-of-packets". The parcel can be
      opened at middleboxes on the path with the
      included segments broken out into individual packets, then rejoined into
      one or more repackaged parcels to be forwarded further toward the final
      destination. Reordering of segments within parcels is unimportant; what
      matters is that the number of parcels delivered to the final destination
      should be kept to a minimum, and that loss or receipt of individual
      segments (and not parcel size) determines the retransmission unit.</t>
    </abstract>
  </front>

  <middle>
    <section anchor="intro" title="Introduction">
      <t>IP packets (both IPv4 <xref target="RFC0791"/> and IPv6 <xref
      target="RFC8200"/>) are understood to contain a unit of data which
      becomes the retransmission unit in case of loss. Upper layer protocols
      such as the Transmission Control Protocol (TCP) <xref target="RFC0793"/>
      prepare data units known as "segments", with traditional arrangements
      including a single segment per packet. This document presents a new
      construct known as the "IP Parcel" which permits a single packet to
      carry multiple segments. This essentially creates a "packet-of-packets"
      with the IP layer headers appearing only once but with possibly
      multiple upper layer protocol segments.</t>

      <t>Parcels are formed when an upper layer protocol entity (identified by
      the "5-tuple" of source IP address, source port number, destination IP
      address, destination port number and protocol number) prepares a buffer
      of data containing the concatenation of up to 64 properly-formed
      segments, each of which could stand alone if broken out into an
      individual packet using a copy of the IP header. All segments except the
      final segment must be equal in size, while the final segment must not be
      larger than the others but may be smaller. Each non-final segment must be
      no larger than 65535 octets minus the combined length of the IP header
      and any extensions. The upper layer protocol entity then delivers the
      buffer and non-final segment size to the IP layer, which prepends the
      necessary IP headers to identify this as a parcel and not an ordinary
      packet.</t>

      <t>Each parcel can be opened at a first-hop middlebox on the path with
      the included segments broken out into individual packets, then rejoined
      into one or more parcels at a last-hop middlebox to be forwarded to the
      final destination. Reordering of segments within a parcel or even
      repackaging of parcels entirely is unimportant; what matters is that
      the number of parcels delivered to the final destination should be kept
      to a minimum, and that loss or receipt of individual segments (and not
      parcel size) determines the retransmission unit.</t>

      <t>The following sections discuss the rationale for creating and
      shipping parcels as well as the actual protocol constructs and
      procedures involved. It is expected that the parcel concept may drive
      future innovation in application, operating system, network equipment
      and data link design.</t>
    </section>

    <section anchor="aero-omni" title="Motivation">
      <t>Studies have shown that applications that send and receive large
      packets can realize greater performance due to reduced numbers of system
      calls and interrupts as well as larger atomic data copies between kernel
      and user space. Within the network, large packets also result in reduced
      numbers of device interrupts and better network utilization in
      comparison with smaller packet sizes.</t>

      <t>The issue with sending large packets is that they are often lost at
      links with smaller Maximum Transmission Units (MTUs), and the resulting
      Packet Too Big (PTB) message may be lost somewhere in the path back to
      the original source. This "Path MTU black hole" condition can cripple
      application performance unless supplemented with robust path probing
      techniques. However, the best-case performance always occurs when
      no packets are lost due to size restrictions.</t>

      <t>These considerations therefore motivate a design where the maximum
      segment size should be no larger than 65535 minus IP header sizes, while
      parcels that carry the segments may themselves be significantly larger.
      Then, even if a middlebox needs to open the parcels to deliver
      individual segments further toward final hops as separate IP packets,
      an important performance optimization for both the original source and
      final destination can be realized.</t>

      <t>An analogy: when an end user orders 50 small items from a major
      online retailer, the retailer does not ship the order in 50 separate
      small boxes. Instead, the retailer puts as many of the small boxes as
      possible into one or a few larger boxes (or parcels) then puts these
      parcels on a semi-truck or airplane. The parcels arrive at a regional
      distribution center where they may be further redistributed into slightly
      smaller parcels that get delivered to the end user. But most often, the
      end user will only find one or a few parcels at their doorstep and not
      50 individual boxes.</t>
    </section>

    <section anchor="parcels" title="IP Parcel Formation">
      <t>IP parcel formation is invoked by an upper layer protocol (identified
      by the 5-tuple as above) when it produces a data buffer containing the
      concatenation of up to 64 segments. All non-final segments MUST be equal
      in length while the final segment MUST NOT be larger but MAY be smaller.
      Each non-final segment MUST be no larger than 65535 octets minus the
      combined length of the IP header and any extensions. The upper layer
      protocol then presents the buffer and non-final segment size to the IP
      layer, which prepends a single IP header (plus any extension headers)
      before presenting the parcel to lower layers.</t>
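
      <t>The formation rules above can be illustrated with a minimal,
      non-normative Python sketch. The function name check_parcel_buffer and
      its parameters are hypothetical and are not part of this specification;
      header_len is assumed to be the combined length of the IP header and any
      extensions.</t>

      <figure>
        <artwork><![CDATA[
```python
MAX_SEGMENTS = 64       # most segments one parcel may carry
MAX_IP_LENGTH = 65535   # largest value of a 16-bit length field


def check_parcel_buffer(buffer_len, segment_size, header_len):
    """Return the per-segment lengths for a parcel buffer.

    Hypothetical helper: raises ValueError when the buffer
    would break one of the formation rules.
    """
    if buffer_len <= 0 or segment_size <= 0:
        raise ValueError("lengths must be positive")
    if segment_size > MAX_IP_LENGTH - header_len:
        raise ValueError("non-final segment too large")
    # Every segment but the last has segment_size octets; the
    # last one carries whatever remains (never more).
    full, rest = divmod(buffer_len, segment_size)
    lengths = [segment_size] * full
    if rest:
        lengths.append(rest)
    if len(lengths) > MAX_SEGMENTS:
        raise ValueError("more than 64 segments")
    return lengths
```
]]></artwork>
      </figure>

      <t>Under these assumptions, a 3500-octet buffer with a non-final segment
      size of 1000 octets yields three 1000-octet segments followed by a final
      500-octet segment.</t>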

      <t>For IPv4, the IP layer prepares the parcel by prepending an IPv4
      header with a Jumbo Payload option (identified by option code TBD)
      formed as follows:<figure>
          <artwork><![CDATA[+--------+--------+--------+--------+--------+--------+
|000(TBD)|00000110|       Jumbo Payload Length        |
+--------+--------+--------+--------+--------+--------+]]></artwork>
        </figure>where "Jumbo Payload Length" is a 32-bit unsigned integer
      value (in network byte order) set to the combined length of the IPv4
      header plus all concatenated segments. The IP layer next sets the IPv4
      header DF bit to 1, then sets the IPv4 header Total Length field to the
      length of the IPv4 header plus the length of the first segment only. Note
      that the IP layer can form true IPv4 jumbograms (as opposed to parcels)
      by instead setting the IPv4 header Total Length field to 0.</t>
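
      <t>As a non-normative illustration, the IPv4 option encoding and the two
      length computations can be sketched in Python as follows. The function
      name and parameters are hypothetical. Because the option code is still
      TBD, the caller supplies option_type; ipv4_header_len is assumed to
      already include the option and any padding.</t>

      <figure>
        <artwork><![CDATA[
```python
import struct

OPTION_LENGTH = 6  # type octet + length octet + 4 value octets


def build_ipv4_jumbo_option(option_type, ipv4_header_len,
                            segment_lengths):
    """Return (option_bytes, total_length) for an IPv4 parcel.

    Hypothetical helper; option_type stands in for the TBD
    option code.
    """
    # Jumbo Payload Length covers the header and every segment.
    jumbo_payload_length = ipv4_header_len + sum(segment_lengths)
    # "!" selects network byte order with no padding:
    # B = type, B = length, I = 32-bit Jumbo Payload Length.
    option = struct.pack("!BBI", option_type, OPTION_LENGTH,
                         jumbo_payload_length)
    # Total Length covers the header and the first segment only.
    total_length = ipv4_header_len + segment_lengths[0]
    return option, total_length
```
]]></artwork>
      </figure>

      <t>For a 28-octet IPv4 header carrying segments of 1000, 1000, 1000 and
      500 octets, this sketch produces a Jumbo Payload Length of 3528 and a
      Total Length of 1028. Any option_type value used to exercise the sketch
      is an arbitrary stand-in until a code is assigned.</t>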

      <t>For IPv6, the IP layer forms a parcel by prepending an IPv6 header
      with a Jumbo Payload option <xref target="RFC2675"/> the same as for
      IPv4 above, where "Jumbo Payload Length" is set to the combined length of
      the IPv6 Hop-by-Hop Options header and any other extension headers
      present plus all concatenated segments. The IP layer next sets the IPv6
      header Payload Length field to the combined length of the IPv6
      Hop-by-Hop Options header and any other extension headers present plus
      the length of the first segment only. As with IPv4, the IP layer can form
      true IPv6 jumbograms (as opposed to parcels) by instead setting the IPv6
      header Payload Length field to 0.</t>
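
      <t>The two IPv6 length values described above can be sketched
      non-normatively in Python. The function name is hypothetical, and
      ext_headers_len is assumed to be the combined length of the Hop-by-Hop
      Options header and any other extension headers present.</t>

      <figure>
        <artwork><![CDATA[
```python
def ipv6_parcel_lengths(ext_headers_len, segment_lengths):
    """Return (jumbo_payload_length, payload_length).

    Hypothetical helper for an IPv6 parcel; the fixed IPv6
    header itself is counted in neither value.
    """
    # Jumbo Payload Length: extension headers plus all segments.
    jumbo_payload_length = ext_headers_len + sum(segment_lengths)
    # Payload Length: extension headers plus first segment only.
    payload_length = ext_headers_len + segment_lengths[0]
    return jumbo_payload_length, payload_length
```
]]></artwork>
      </figure>

      <t>With 8 octets of extension headers and segments of 1000, 1000 and 500
      octets, the sketch returns a Jumbo Payload Length of 2508 and a Payload
      Length of 1008.</t>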
    </section>

    <section anchor="xmit" title="Transmission of IP Parcels">
      <t>The IP layer next presents the parcel to the next lower layer. If the
      lower layer is the OMNI Adaptation Layer (OAL) <xref
      target="I-D.templin-6man-omni"/>, the OAL source can open the parcel if
      necessary and forward each segment as an individual IP packet. These
      individual packets eventually arrive at the OAL destination which
      re-combines them into a new parcel or parcels to forward to the
      final destination. Details for OAL parcel forwarding are discussed
      in <xref target="I-D.templin-6man-omni"/>.</t>

      <t>If the lower layer is a true data link layer interface, however, the
      IP layer instead forwards the parcel according to the path MTU to either
      the first middlebox that configures an OAL or the final
      destination itself, whichever comes first. If the parcel is no larger
      than the path MTU, the IP layer simply forwards the parcel the same as
      it would an ordinary IP packet and processes any PTB messages that may
      be returned (but, see below for compatibility issues). If the parcel is
      larger than 65535 (minus encapsulation headers) and also larger than the
      path MTU, the IP layer instead discards the parcel and returns a packet
      size error to the upper layer protocol.</t>

      <t>If the parcel is no larger than 65535 (minus encapsulation headers)
      but larger than the path MTU, the IP layer instead performs IP
      encapsulation with the destination set to the IP address of the middlebox
      or final destination and the length field (IPv6 Payload Length or IPv4
      Total Length) set to the Jumbo Payload Length plus the encapsulation
      header length. The IP layer then performs source fragmentation on the
      encapsulated parcel the same as for an ordinary IP packet, generating IP
      fragments destined for the middlebox or final destination.</t>
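
      <t>The three cases above for a true data link layer interface can be
      summarized in a minimal, non-normative Python sketch. The function name
      and the returned labels are hypothetical, and "65535 (minus encapsulation
      headers)" is assumed to mean 65535 minus encaps_len octets.</t>

      <figure>
        <artwork><![CDATA[
```python
MAX_IP_LENGTH = 65535  # largest value of a 16-bit length field


def choose_action(parcel_len, path_mtu, encaps_len=0):
    """Pick how the IP layer handles a parcel on a data link.

    Hypothetical helper; the labels are illustrative only.
    """
    if parcel_len <= path_mtu:
        # Fits the path: forward like an ordinary IP packet.
        return "forward"
    if parcel_len > MAX_IP_LENGTH - encaps_len:
        # Too large to encapsulate and fragment: drop it and
        # report a packet size error to the upper layer.
        return "discard"
    # Otherwise encapsulate, then fragment at the source.
    return "encapsulate-and-fragment"
```
]]></artwork>
      </figure>

      <t>For instance, with a 1500-octet path MTU and 40 octets of
      encapsulation, a 9000-octet parcel is encapsulated and fragmented while
      a 100000-octet parcel is discarded.</t>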

      <t>When the middlebox or final destination receives the fragments or
      whole parcels, it reassembles and discards the encapsulation headers if
      necessary, then presents the parcel to the OAL in the middlebox case or
      to the upper layer protocol in the final destination case.</t>
    </section>

    <section anchor="integrity" title="Integrity">
      <t>Parcels can range in length from as small as the size of the IP
      headers plus a single octet to as large as the IP headers plus
      (64 * (65535 minus headers)) octets. Although link layer integrity
      checks provide sufficient protection for contiguous blocks of data
      up to approximately 9KB, reliance on the presence of link-layer
      integrity checks may not be possible over links such as tunnels.
      Moreover, the segments of a received parcel may arrive incomplete
      and/or in a different order with respect to their original
      packaging.</t>

      <t>For these reasons, upper layers should include an individual
      integrity check with each segment included in the parcel, with a strength
      compatible with the segment length. The receiver should then verify the
      integrity check on a per-segment basis, discard any corrupted segments
      and consider each discarded segment as a loss event.</t>
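
      <t>Per-segment verification can be illustrated with a minimal,
      non-normative Python sketch. This document does not select an integrity
      algorithm; CRC-32 from the Python zlib module is used here only as a
      stand-in, and the function names seal and open_segments are
      hypothetical.</t>

      <figure>
        <artwork><![CDATA[
```python
import struct
import zlib


def seal(segment):
    """Append a 4-octet CRC-32 trailer to one segment."""
    return segment + struct.pack("!I", zlib.crc32(segment))


def open_segments(sealed_segments):
    """Return (intact_segments, lost_count).

    Each segment is checked on its own; a segment that fails
    its check is dropped and counted as a loss event.
    """
    intact, lost = [], 0
    for sealed in sealed_segments:
        body, trailer = sealed[:-4], sealed[-4:]
        expected = struct.pack("!I", zlib.crc32(body))
        if len(sealed) >= 4 and trailer == expected:
            intact.append(body)
        else:
            lost += 1
    return intact, lost
```
]]></artwork>
      </figure>

      <t>A single corrupted segment therefore costs only that segment; the
      remaining segments of the same parcel are still delivered.</t>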
    </section>

    <section anchor="comapt" title="Compatibility">
      <t>Legacy networking gear that forwards parcels over ordinary data links
      may not recognize this new coding of the Jumbo Payload option
      and may act only on what is observed in the IPv4 Total Length or IPv6
      Payload Length field. In that case, the legacy gear would likely forward
      the first segment of the parcel only while truncating the remainder
      since only the length of the first segment is included in the IP
      header.</t>

      <t>In networks where compatibility is thought to be an issue, the
      original source can perform encapsulation on parcels uniformly, whether
      or not fragmentation is required, to ensure they are delivered to the OAL
      source or final destination (whichever comes first). In the same way, the
      OAL destination can uniformly perform encapsulation to ensure that
      parcels are delivered to the final destination.</t>
    </section>

    <section anchor="issues" title="RFC2675 Updates">
      <t>Section 3 of <xref target="RFC2675"/> provides a list of certain
      conditions to be considered as errors. In particular:<list style="empty">
          <t>error: IPv6 Payload Length != 0 and Jumbo Payload option
          present</t>

          <t>error: Jumbo Payload option present and Jumbo Payload Length &lt;
          65,536</t>
        </list></t>

      <t>Implementations that obey this specification ignore these conditions
      and do not consider them as errors.</t>
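
      <t>One possible reading of the relaxed checks, given only as a
      non-normative Python sketch, is a receiver that classifies a packet from
      its length field and Jumbo Payload option instead of rejecting it. The
      function name and returned labels are hypothetical;
      jumbo_payload_length is assumed to be None when no Jumbo Payload option
      is present.</t>

      <figure>
        <artwork><![CDATA[
```python
def classify(length_field, jumbo_payload_length):
    """Classify a received packet (hypothetical helper).

    length_field is the IPv6 Payload Length (or IPv4 Total
    Length) taken from the IP header.
    """
    if jumbo_payload_length is None:
        return "ordinary packet"
    if length_field == 0:
        # Length field of 0 plus the option: true jumbogram.
        return "jumbogram"
    # Non-zero length field plus the option: a parcel, even
    # when the Jumbo Payload Length is below 65,536.
    return "parcel"
```
]]></artwork>
      </figure>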
    </section>

    <section anchor="implement" title="Implementation Status">
      <t>TBD.</t>
    </section>

    <section anchor="iana" title="IANA Considerations">
      <t>The IANA is instructed to allocate a new IP option code in the 'ip
      option numbers' registry for the IPv4 Jumbo Payload option. The Copy and
      Class fields must both be set to 0, and the Number field must be set to
      'TBD'.</t>
    </section>

    <section anchor="secure" title="Security Considerations">
      <t>Communications networking security is necessary to preserve
      confidentiality, integrity and availability.</t>
    </section>

    <section anchor="ack" title="Acknowledgements">
      <t>This work was inspired by ongoing AERO/OMNI/DTN investigations. The
      concepts were further motivated through discussions on the intarea
      list.</t>

      <t>A considerable body of work over recent years has produced useful
      "segmentation offload" facilities available in widely-deployed
      implementations.</t>

    </section>
  </middle>

  <back>
    <references title="Normative References">
      <?rfc include="reference.RFC.0793"?>

      <?rfc include="reference.RFC.2675"?>

      <?rfc include="reference.RFC.0791"?>

      <?rfc include="reference.RFC.8200" ?>
    </references>

    <references title="Informative References">
      <?rfc include="reference.I-D.templin-6man-aero"?>

      <?rfc include="reference.I-D.templin-6man-omni"?>

    </references>
  </back>
</rfc>
