<?xml version="1.0" encoding="US-ASCII"?>
<!-- This template is for creating an Internet Draft using xml2rfc,
     which is available here: http://xml.resource.org. -->
<!DOCTYPE rfc SYSTEM "rfc2629.dtd" [
<!-- One method to get references from the online citation libraries.
     There has to be one entity for each item to be referenced. 
     An alternate method (rfc include) is described in the references. -->

<!ENTITY RFC2119 SYSTEM "http://xml.resource.org/public/rfc/bibxml/reference.RFC.2119.xml">
<!ENTITY RFC2629 SYSTEM "http://xml.resource.org/public/rfc/bibxml/reference.RFC.2629.xml">
<!ENTITY RFC3552 SYSTEM "http://xml.resource.org/public/rfc/bibxml/reference.RFC.3552.xml">
<!ENTITY I-D.narten-iana-considerations-rfc2434bis SYSTEM "http://xml.resource.org/public/rfc/bibxml3/reference.I-D.narten-iana-considerations-rfc2434bis.xml">
]>
<?xml-stylesheet type='text/xsl' href='rfc2629.xslt' ?>
<!-- used by XSLT processors -->
<!-- For a complete list and description of processing instructions (PIs), 
     please see http://xml.resource.org/authoring/README.html. -->
<!-- Below are generally applicable Processing Instructions (PIs) that most I-Ds might want to use.
     (Here they are set differently than their defaults in xml2rfc v1.32) -->
<?rfc strict="yes" ?>
<!-- give errors regarding ID-nits and DTD validation -->
<!-- control the table of contents (ToC) -->
<?rfc toc="yes"?>
<!-- generate a ToC -->
<?rfc tocdepth="4"?>
<!-- the number of levels of subsections in ToC. default: 3 -->
<!-- control references -->
<?rfc symrefs="yes"?>
<!-- use symbolic references tags, i.e, [RFC2119] instead of [1] -->
<?rfc sortrefs="yes" ?>
<!-- sort the reference entries alphabetically -->
<!-- control vertical white space 
     (using these PIs as follows is recommended by the RFC Editor) -->
<?rfc compact="yes" ?>
<!-- do not start each main section on a new page -->
<?rfc subcompact="no" ?>
<!-- keep one blank line between list items -->
<!-- end of list of popular I-D processing instructions -->
<rfc category="info" docName="draft-moiseenko-icnrg-flowclass-00" ipr="trust200902">
  <!-- category values: std, bcp, info, exp, and historic
     ipr values: full3667, noModification3667, noDerivatives3667
     you can add the attributes updates="NNNN" and obsoletes="NNNN" 
     they will automatically be output with "(if approved)" -->

  <!-- ***** FRONT MATTER ***** -->

  <front>
    <!-- The abbreviated title is used in the page header - it is only necessary if the 
         full title is longer than 39 characters -->

    <title abbrev="Flow Classification in ICN">Flow Classification in Information
        Centric Networking</title>

    <!-- add 'role="editor"' below for the editors if appropriate -->

    <author fullname="Ilya Moiseenko" initials="I." surname="Moiseenko">
      <organization>Cisco Systems</organization>

      <address>
        <postal>
          <street></street>

          <!-- Reorder these if your country does things differently -->

          <city></city>

          <region></region>

          <code></code>

          <country></country>
        </postal>

        <phone></phone>

        <email>ilmoisee@cisco.com</email>

        <!-- uri and facsimile elements may also be added -->
      </address>
    </author>

    <author fullname="Dave Oran" initials="D." surname="Oran">
        <organization>Cisco Systems</organization>

        <address>
            <postal>
                <street></street>

                <!-- Reorder these if your country does things differently -->

                <city></city>

                <region></region>

                <code></code>

                <country></country>
            </postal>

            <phone></phone>

            <email>oran@cisco.com</email>

        <!-- uri and facsimile elements may also be added -->
        </address>
    </author>

    <date month="July" year="2016" />

    <!-- If the month and year are both specified and are the current ones, xml2rfc will fill 
         in the current day for you. If only the current year is specified, xml2rfc will fill 
	 in the current day and month for you. If the year is not the current one, it is 
	 necessary to specify at least a month (xml2rfc assumes day="1" if not specified for the 
	 purpose of calculating the expiry date).  With drafts it is normally sufficient to 
	 specify just the year. -->

    <!-- Meta-data Declarations -->

    <area>General</area>

    <workgroup>icnrg</workgroup>

    <!-- WG name at the upperleft corner of the doc,
         IETF is fine for individual submissions.  
	 If this element is not present, the default is "Network Working Group",
         which is used by the RFC Editor as a nod to the history of the IETF. -->

    <keyword>template</keyword>

    <!-- Keywords will be incorporated into HTML output
         files in a meta tag but they have no effect on text or nroff
         output. If you submit your draft to the RFC Editor, the
         keywords will be used for the search engine. -->

    <abstract>
        <t>For the ubiquitous and highly important Internet protocols (TCP, UDP, IP), flows are conventionally identified by the "5-tuple" of source and destination IP addresses, source and destination port, and protocol type in an IP packet.
            Information Centric Networking (ICN) is a new paradigm where network
        communications are accomplished by requesting named content, instead
        of sending packets to destination addresses.  This document describes
        mechanisms allowing ICN forwarders, consumers, producers and other ICN nodes to encode, decode, and process equivalence class identifiers (flows) at any desired granularity of a routable name prefix and beyond the routable name prefix. This document is
        a product of the IRTF Information-Centric Networking Research Group (ICNRG).</t>
    </abstract>
  </front>

  <middle>
    <section anchor="intro" title="Introduction">
        <t>The problem of identifying groups of packets that receive consistent treatment in a network, and of keeping that treatment independent and isolated from the treatment of other groups of packets, is ubiquitous and long-standing. The purposes to which this identification can be put are highly varied and include providing differentiated quality of service, traffic engineering, and traffic filtering for security functions such as intrusion detection and firewalling.</t>

        <t>Providing the capability to apply different functions to groupings (formally, equivalence classes) of packets is generally known as the "flow identification problem", where the definition of what constitutes a "flow" is highly dependent on the particular protocol or protocols carrying the packets. Some of the above uses of flows also require that the flow identification technique provide not just equivalence classes, but also the ability to apply some useful notion of fairness among the instances of each equivalence class. Many possible flow identification techniques are either too granular (spatially or temporally) to establish fairness or, conversely, too coarse to separate traffic at a fine enough level for fairness to be useful.</t>

        <t>For the ubiquitous and highly important Internet protocols (TCP, UDP, IP), flows are conventionally identified by the "5-tuple" of source and destination IP addresses, source and destination port, and protocol type in an IP packet. Some systems augment this by further distinguishing equivalence classes by the TOS/DSCP field, but this is secondary to the 5-tuple method. Two-party flows are present where the source and destination addresses are unicast IP addresses. Multi-party flows can exist when the destination IP address is a multicast address. One key common characteristic is that the identification of flows depends heavily on the presence of source addresses in the packets, and the limited richness of IP addresses correspondingly constrains their use as a means to classify traffic in a semantically meaningful way.</t>

        <t> The purpose of this document is to devise a mechanism allowing ICN forwarders, consumers, producers and other ICN nodes to encode, decode, and process equivalence class identifiers (flows) at any desired granularity of a routable name prefix and beyond the routable name prefix.
            </t>

      <section title="Requirements Language">
        <t>The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
        "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
        document are to be interpreted as described in <xref
        target="RFC2119">RFC 2119</xref>.</t>
      </section>
    </section>

    <section anchor="challenges" title="Flow Identification Challenges and Opportunities in ICN">
        <t>
            ICN systems differ from IP-based designs in a number of ways, three of which are quite fundamental.
            <list style="numbers">
                <t>Packets are addressed using a rich, hierarchical namespace, and the names carry semantic information that can be useful for classification of flows.</t>

                <t>Conversely, the packets do not contain source addresses of any kind, which means that identifying flows as groups of packets between a single pair of endpoints (in the unicast case) is not possible for intermediate forwarders (other than possibly the first-hop forwarder if it serves a single consumer per interface).</t>

                <t>Instead of group-based multicast, ICN systems use multi-destination delivery semantics. This allows a different way to map packets to flows, and in fact in the IP world multicast has been difficult to use partly because there is no good way to make use of flow identification for multicast flows (for a variety of reasons).</t>
            </list>

            These differences create a need for a method of identifying flows that differs from the one used in the IP protocol suite. Ideally, the method would provide semantics that map well onto the expected uses of ICN to build applications. It would also use native capabilities in the ICN protocols rather than having to change the protocol architecture in ways that affect the semantics or utility of an ICN approach to networking.
        </t>

        <t>In NDN and CCN protocols, Interest and Data names are the only identifiers in the network; neither source addresses nor destination addresses are employed. Each Interest packet is answered by exactly one Data packet, producing a useful property known as "flow balance". This means that flow identification can be tied directly to the Interest/Data exchanges. The key to having useful flow identification is for the equivalence classes to be associated with the names in the corresponding Interest and Data packets, and to be stable over multiple exchanges using different names that share some common "handle" that can be used to separate the names into equivalence classes.

            As mentioned above, simply using the routing state that maps name prefixes to routes does not provide a useful set of equivalence classes, because:
            <list style="symbols">
                <t>in general, routing prefixes are too coarse; many equivalence classes of packets are generally covered by a single routing prefix because they are present at the same set of destinations from a routing perspective;</t>
                <t>practical, scalable routing needs to do route aggregation, which further blurs the discrimination of the equivalence classes.</t>
            </list>

            Therefore, NDN and CCN protocols need something that both relates to the name structure and provides finer granularity for flow classification purposes. This document describes two alternative mechanisms addressing these issues.
        </t>
    </section>

    <section anchor="techniques" title="Flow Encoding Schemes">

        <t>Flow encoding schemes described in this document allow ICN systems to perform flow identification at any desired granularity of a routable name prefix and beyond the routable name prefix. Techniques described herein permit both consumer nodes and forwarders to use equivalence classes to perform per-flow functions. The encoding to achieve the flow classification is lightweight and does not require changes to the protocol architecture in ways that affect the semantics or utility of an ICN approach to networking. Furthermore, equivalence classes can be specified by the data producer, in contrast to IP protocols in which the data producer can only control the destination port as an equivalence-class discriminator.</t>

        <t>No matter what method is used to identify equivalence classes that can be treated as flows, there is the independent but critically important issue of how to scale any state that is kept on a per-flow basis when the flow count is very high. For consumers and producers, this state scales naturally with the number of applications and application interactions that are going on simultaneously. Therefore, the scaling limit is not likely to be in the producers or consumers. For ICN forwarders that are operating at high speed and/or handling the traffic of many producers and consumers, however, this state can scale quadratically or worse. If the ICN forwarder cannot keep all the state due to memory or processing limitations, it faces the common problem of which flows to remember and which to forget. This document does not solve this problem, which is fundamental. The flow encoding schemes described in this document provide a method for identifying equivalence classes using protocol machinery that already has to scale (e.g., name parsing and lookup) and hence do not introduce a new class of problems that is not already inherently present.</t>

        <section title="Equivalence class component count (EC3)">
            <t>For this encoding scheme, a new field called equivalence class component count (EC3) is introduced into Data packets. It is set by the producer and counts the number of leading name components in the corresponding name that, taken together as a prefix, identify one equivalence class instance. This allows either finer or coarser granularity than that provided by routing prefixes. Because the EC3 is a separate field of the packet (<xref target="ec3-figure" />), producers can "regroup" equivalence classes dynamically by including more or fewer levels of the name hierarchy when they respond to Interests for the corresponding Data packets. Therefore, the behavior of the EC3 encoding scheme is somewhat different from that of the ECNCT encoding scheme described later in this document, and has both advantages and disadvantages. The advantage is flexibility in regrouping equivalence classes, especially in aggregating flows at different granularities. The disadvantage is that the binding of the equivalence class into the namespace is not explicit, and hence it is harder to enforce consistent interpretations.</t>

            <t>An additional consideration with the EC3 encoding scheme is whether the field is inside or outside the security envelope that provides cryptographic integrity for the name and data in the Data packet. Either approach is possible; however, having the field outside the security envelope would allow ICN forwarders to modify it, so that aggregation and disaggregation of flows could be performed by the forwarders as well as by the consumers. Conversely, leaving the field outside the security envelope may facilitate certain attacks against flow classification for quality of service or firewall filtering.</t>

            <figure align="center" anchor="ec3-figure">
                <preamble></preamble>

                <artwork align="left"><![CDATA[
+-------------------------------------------------------------------+
|  /youtube |  /<mediaID>  |  /video  OR |  <frameID>  | <segment#> |
|           |              |  /audio     |             |            |
+-----------+--------------+-------------+-------------+------------+
| Name      | Name         | Name        | Name        | Segment    |
| component | component    | component   | component   | component  |
| type      | type         | type        | type        | type       |
+-----------+--------------+-------------+-------------+------------+
|                                                                   |
| Equivalence Class Component Count = 2 (up to MediaID stream)      |
|                          OR                                       |
| Equivalence Class Component Count = 3 (video or audio substream)  |
+-------------------------------------------------------------------+
                ]]></artwork>

                <postamble>An example of EC3 encoding of flow information.</postamble>
            </figure>
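
            <t>The figure above can be read as a simple rule: the flow key is the prefix formed by the first EC3 components of the name. The following Python sketch illustrates that rule under stated assumptions. Names are treated as slash-separated strings, and the function name ec3_flow_key and the sample component values are hypothetical, chosen only for illustration. It is not a normative encoding.</t>

            <figure align="center">
                <artwork align="left"><![CDATA[
```python
def ec3_flow_key(name, ec3):
    """Return the prefix made of the first `ec3` name components.

    Assumption: `name` is a slash-separated string such as
    "/youtube/abc123/video/frame7/seg0".
    """
    components = [c for c in name.split("/") if c]
    if not 0 < ec3 <= len(components):
        raise ValueError("EC3 must be in 1..number of components")
    return "/" + "/".join(components[:ec3])


name = "/youtube/abc123/video/frame7/seg0"
print(ec3_flow_key(name, 2))  # /youtube/abc123
print(ec3_flow_key(name, 3))  # /youtube/abc123/video
```
                ]]></artwork>
            </figure>

            <t>With EC3 set to 2, audio and video packets of one media item fall into the same class; with EC3 set to 3, they are separated, without any change to the name itself.</t>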
        </section>

        <section title="Equivalence class name component type (ECNCT)">
            <t>For this scheme, the equivalence class information is encoded directly in the name, by adding a name component to the name of the Interest and Data packets. This new typed name component is called equivalence class name component type (ECNCT). It is set by the producer as part of constructing all Data packets in the desired equivalence class and is therefore immutable for the lifetime of the associated named data. A consequence of this is that the ECNCT is present in Interest packets as well, and hence may affect both PIT matching and FIB matching. The equivalence class name component both names the equivalence class explicitly and implicitly makes all Data packets named below it in the hierarchy part of that equivalence class. In other words, the name can carry multiple equivalence class markings (e.g., flow and subflows) using this scheme (<xref target="ecnct-figure" />). As in the EC3 encoding scheme, depending on where in the name component hierarchy the ECNCT is placed, one can have either finer or coarser granularity than that provided by routing prefixes.</t>

            <t>The exact details of how to encode the ECNCT name component may differ among ICN architectures. The CCN design has explicitly typed name components, so for that protocol an explicit name component type can be assigned straightforwardly. The NDN design eschews typed name components and instead uses textual naming conventions for name components. In that case an architectural constant string would be chosen to distinguish ECNCT from other name component semantics.</t>

            <figure align="center" anchor="ecnct-figure">
                <preamble></preamble>

                <artwork align="left"><![CDATA[
+------------------------------------------------------------+
| /youtube | /<mediaID> | /video OR | <frameID> | <segment#> |
|          |            | /audio    |           |            |
+----------+------------+-----------+-----------+------------+
| Name     | Flow       | Flow      | Name      | Segment    |
| component| component  | component | component | component  |
| type     | type       | type      | type      | type       |
+----------+------------+-----------+-----------+------------+
                ]]></artwork>

                <postamble>An example of ECNCT encoding of flow information.</postamble>
            </figure>

            <t>When an ICN forwarder receives a packet with a name carrying one or more ECNCTs, the name can be processed on a component-by-component basis, and substreams can be identified according to the name prefixes indicated by the equivalence class identifiers. The identification of substreams enables special treatment of selected substreams. For example, video substreams can be discriminated from other substreams, such as audio substreams.
                In the example in <xref target="ecnct-figure" />, two name components include equivalence class identifiers to define a hierarchy of flows (or substreams). Specifically, two flow components are encoded to define the following hierarchy of flows:</t>

            <t>First level name prefix: 	/youtube/<![CDATA[<mediaID>]]></t>
            <t>Second level name prefix:	/youtube/<![CDATA[<mediaID>]]>/video</t>
            <t>Second level name prefix:	/youtube/<![CDATA[<mediaID>]]>/audio</t>
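
            <t>The hierarchy listed above can be derived mechanically from a typed name. The following Python sketch assumes that a name is a list of (type, value) pairs and that the labels "name", "flow" and "segment" stand in for whatever component types or naming conventions a given ICN architecture assigns. These labels, the function flow_prefixes, and the sample values are hypothetical.</t>

            <figure align="center">
                <artwork align="left"><![CDATA[
```python
# Hypothetical component type labels, not assigned code points.
NAME, FLOW, SEGMENT = "name", "flow", "segment"


def flow_prefixes(components):
    """Return one name prefix per flow-typed component.

    Prefixes are returned outermost first, so the list describes
    the hierarchy of flows and subflows encoded in the name.
    """
    prefixes = []
    path = ""
    for ctype, value in components:
        path += "/" + value
        if ctype == FLOW:
            prefixes.append(path)
    return prefixes


name = [
    (NAME, "youtube"),
    (FLOW, "abc123"),
    (FLOW, "video"),
    (NAME, "frame7"),
    (SEGMENT, "0"),
]
print(flow_prefixes(name))
# ['/youtube/abc123', '/youtube/abc123/video']
```
                ]]></artwork>
            </figure>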
        </section>
    </section>

    <section anchor="producer" title="Producer operation">
        <t>In the ECNCT encoding scheme, an ICN producer receives an Interest packet carrying equivalence class identifiers in the name. A producer might use the equivalence class identifiers for demultiplexing, load sharding and other purposes, and reply with a Data packet matching the Interest name.</t>

        <t>In the EC3 encoding scheme, an ICN producer receives an Interest packet that might not carry an equivalence class identifier. In such a case, the producer may refer to the name schemas used in a particular application to dynamically determine the equivalence class identifier for Interest demultiplexing, load sharding and other purposes, and for replying with a Data packet carrying the equivalence class identifier in the EC3 field.</t>
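
        <t>As one possible illustration of the EC3 case, the following Python sketch shows a producer that looks up an application name schema to pick the EC3 value for its reply. The schema table, the default value, the Data class and the function make_data are all hypothetical; this document does not define how name schemas are represented.</t>

        <figure align="center">
            <artwork align="left"><![CDATA[
```python
from dataclasses import dataclass


@dataclass
class Data:
    name: str
    ec3: int        # equivalence class component count
    payload: bytes


# Hypothetical schema: first name component -> EC3 value to use.
SCHEMAS = {"youtube": 3, "sensors": 2}
DEFAULT_EC3 = 1


def make_data(interest_name, payload):
    """Build a Data reply whose EC3 follows the application schema."""
    comps = [c for c in interest_name.split("/") if c]
    if not comps:
        raise ValueError("empty name")
    wanted = SCHEMAS.get(comps[0], DEFAULT_EC3)
    # Never count more components than the name actually has.
    ec3 = min(wanted, len(comps))
    return Data(interest_name, ec3, payload)


reply = make_data("/youtube/abc123/video/frame7/seg0", b"...")
print(reply.ec3)  # 3
```
            ]]></artwork>
        </figure>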
    </section>

    <section anchor="consumer" title="Consumer operation">
        <t>An ICN consumer may also use the knowledge of equivalence classes of packets to take certain actions. For example, when a Data packet with a name specifying a particular equivalence class arrives at a consumer in response to a previously sent Interest packet, the consumer can associate the Data packet with the correct equivalence class. Consequently, the consumer can manage subsequent Interest/Data exchanges with the same name prefix and equivalence class identifier (e.g., EC3 or ECNCT) as one flow. Associated measurements such as round trip time (RTT) or marginal delay can be leveraged to perform flow and congestion management for the equivalence class as a whole.
            </t>
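
        <t>A minimal sketch of such per-class measurement is shown below in Python. It keeps one smoothed RTT per flow key rather than per name. The class FlowRttEstimator, its method names, and the use of an exponentially weighted moving average with a gain of 1/8 (a value borrowed from common TCP practice) are assumptions made for illustration, not part of the schemes described here. The flow key is assumed to have been derived already from the EC3 field or from the ECNCT components.</t>

        <figure align="center">
            <artwork align="left"><![CDATA[
```python
class FlowRttEstimator:
    """Track one smoothed RTT per equivalence class (flow key)."""

    def __init__(self, alpha=0.125):
        self.alpha = alpha   # EWMA gain; 1/8 is an assumption
        self.sent_at = {}    # full Interest name -> send time
        self.srtt = {}       # flow key -> smoothed RTT (seconds)

    def on_interest_sent(self, name, now):
        self.sent_at[name] = now

    def on_data(self, name, flow_key, now):
        """Fold the RTT sample for `name` into its flow's estimate."""
        sent = self.sent_at.pop(name, None)
        if sent is None:
            return None      # no matching pending Interest
        sample = now - sent
        old = self.srtt.get(flow_key)
        if old is None:
            new = sample     # first sample seeds the estimate
        else:
            new = (1 - self.alpha) * old + self.alpha * sample
        self.srtt[flow_key] = new
        return new


est = FlowRttEstimator()
flow = "/youtube/abc123/video"
est.on_interest_sent(flow + "/frame1/seg0", now=0.0)
est.on_data(flow + "/frame1/seg0", flow, now=0.1)
est.on_interest_sent(flow + "/frame2/seg0", now=1.0)
print(est.on_data(flow + "/frame2/seg0", flow, now=1.2))
```
            ]]></artwork>
        </figure>

        <t>Because the estimate is keyed by the equivalence class, samples from different names in the same class refine a single value that a congestion controller could use for the class as a whole.</t>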
    </section>

    <section anchor="forwarder" title="Forwarder operation">

        <t>A flow table may be provisioned in an ICN node to enable the node to make decisions about performing actions on Interest and/or Data packets based on one or more equivalence classes. The flow table can include name prefixes mapped to equivalence class identifiers obtained from previous Interest-Data exchanges. In the ECNCT encoding scheme, Interest packets carry the equivalence class identifier; therefore, the flow table need only include name prefixes. Typically, name prefixes in the flow table are more granular than prefixes in the FIB, but less granular than names in the PIT. The flow table could be separate from other elements of the ICN node or could be integrated with the FIB or PIT.
        </t>
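
        <t>One way to picture such a flow table for the EC3 scheme is sketched below in Python. Entries are learned from Data packets and later matched against Interest names by longest prefix. The class FlowTable, its methods, and the use of small integers as flow identifiers are hypothetical; a real forwarder would likely integrate this with its existing name lookup structures.</t>

        <figure align="center">
            <artwork align="left"><![CDATA[
```python
def split_name(name):
    return [c for c in name.split("/") if c]


class FlowTable:
    """Map name prefixes to locally assigned flow identifiers."""

    def __init__(self):
        self._entries = {}   # tuple of components -> flow id

    def learn(self, data_name, ec3):
        """Record the prefix announced by a Data packet's EC3."""
        key = tuple(split_name(data_name)[:ec3])
        return self._entries.setdefault(key, len(self._entries) + 1)

    def lookup(self, name):
        """Longest-prefix match of a name against learned prefixes."""
        comps = split_name(name)
        for n in range(len(comps), 0, -1):
            flow_id = self._entries.get(tuple(comps[:n]))
            if flow_id is not None:
                return flow_id
        return None


table = FlowTable()
table.learn("/youtube/abc123/video/frame1/seg0", ec3=3)
print(table.lookup("/youtube/abc123/video/frame2/seg0"))  # 1
print(table.lookup("/youtube/abc123/audio/frame2/seg0"))  # None
```
            ]]></artwork>
        </figure>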

        <t> Flow management logic can be configured to treat flows having the same equivalence class similarly. Actions taken that are related to flows or objects having a similar equivalence class can include, but are not limited to, dropping a packet, using a particular interface for a packet, security related actions (e.g., filtering traffic for security functions like intrusion detection and firewalling), quality of service (QoS) related actions (e.g., types of resources to allocate to the packets, moving a packet up in the queue for forwarding purposes, etc.), and/or traffic engineering (e.g., selecting one path over another path). Flow management logic can enable such actions to be taken on a particular flow based on the equivalence class associated with the flow or object and policies related to the equivalence class.
        </t>

        <t>Specific examples of how an ICN node can use the knowledge of equivalence classes of packets include, but are not limited to, the following:
            <list style="numbers">
                <t>Enforce rate control for the equivalence class as a whole (e.g., dropping packets, queuing packets, etc.);</t>
                <t>Estimate the number of simultaneous flows traversing a bottleneck link, which can improve the performance of many congestion control schemes; and</t>
                <t>Make more intelligent selections of which packets to cache at the ICN forwarder, for example, to prefer to cache many packets of the same equivalence class.</t>
            </list>
        </t>
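
        <t>As an illustration of the second example in the list above, the following Python sketch estimates the number of simultaneous flows by counting the flow keys observed within a recent time window. The class ActiveFlowCounter, the window length, and the idea of expiring idle flows by timestamp are assumptions made for this sketch; other estimators are possible.</t>

        <figure align="center">
            <artwork align="left"><![CDATA[
```python
class ActiveFlowCounter:
    """Count flow keys seen within the last `window` time units."""

    def __init__(self, window=10):
        self.window = window
        self.last_seen = {}   # flow key -> time of last packet

    def observe(self, flow_key, now):
        self.last_seen[flow_key] = now

    def count(self, now):
        """Drop flows idle for longer than the window, then count."""
        cutoff = now - self.window
        self.last_seen = {
            key: seen
            for key, seen in self.last_seen.items()
            if seen >= cutoff
        }
        return len(self.last_seen)


counter = ActiveFlowCounter(window=10)
counter.observe("/youtube/abc123/video", now=0)
counter.observe("/youtube/abc123/audio", now=5)
print(counter.count(now=10))  # 2
print(counter.count(now=12))  # 1
```
            ]]></artwork>
        </figure>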
    </section>

    <!-- This PI places the pagebreak correctly (before the section title) in the text output. -->

    <!--<?rfc needLines="8" ?>-->

    <!-- Possibly a 'Contributors' section ... -->

    <section anchor="IANA" title="IANA Considerations">
      <t>This memo includes no request to IANA.</t>


    </section>

    <section anchor="Security" title="Security Considerations">
        <t>Certain attack scenarios against flow classification for quality of service or firewall filtering may be prevented if the EC3 field is located inside the security envelope. In that case, ICN forwarders can read, but not change, the EC3 value, because the EC3 field is covered by a security signature and is not encrypted.</t>

        <t>If the EC3 field is outside of the security envelope, it can be placed in the hop-by-hop headers and, therefore, be modified by the transit ICN forwarders. This allows the transit ICN forwarders to override the flow definitions set by the producer applications, but opens the system to various attack scenarios.</t>

        <t>Modification of equivalence class identifiers in the ECNCT encoding scheme effectively modifies the packet name; therefore, ECNCT does not introduce any additional security threats beyond those that already apply to names.</t>
    </section>
  </middle>

  <!--  *****BACK MATTER ***** -->

  <back>
    <!-- References split into informative and normative -->

    <!-- There are 2 ways to insert reference entries from the citation libraries:
     1. define an ENTITY at the top, and use "ampersand character"RFC2629; here (as shown)
     2. simply use a PI "less than character"?rfc include="reference.RFC.2119.xml"?> here
        (for I-Ds: include="reference.I-D.narten-iana-considerations-rfc2434bis.xml")

     Both are cited textually in the same manner: by using xref elements.
     If you use the PI option, xml2rfc will, by default, try to find included files in the same
     directory as the including file. You can also define the XML_LIBRARY environment variable
     with a value containing a set of directories to search.  These can be either in the local
     filing system or remote ones accessed by http (http://domain/dir/... ).-->

    <references title="Normative References">
      <!--?rfc include="http://xml.resource.org/public/rfc/bibxml/reference.RFC.2119.xml"?-->
      &RFC2119;
    </references>

    <!-- Change Log

v00 2016-07-13  EBD   Initial version  -->
  </back>
</rfc>
