<?xml version="1.0" encoding="US-ASCII"?>
<!-- This template is for creating an Internet Draft using xml2rfc,
     which is available here: http://xml.resource.org. -->
<!DOCTYPE rfc SYSTEM "rfc2629.dtd" [
<!-- One method to get references from the online citation libraries.
     There has to be one entity for each item to be referenced.
     An alternate method (rfc include) is described in the references. -->

<!ENTITY RFC2119 SYSTEM "http://xml.resource.org/public/rfc/bibxml/reference.RFC.2119.xml">
<!ENTITY RFC6550 SYSTEM "http://xml.resource.org/public/rfc/bibxml/reference.RFC.6550.xml">
<!ENTITY RFC6206 SYSTEM "http://xml.resource.org/public/rfc/bibxml/reference.RFC.6206.xml">
<!ENTITY RFC6775 SYSTEM "http://xml.resource.org/public/rfc/bibxml/reference.RFC.6775.xml">
]>
<?xml-stylesheet type='text/xsl' href='rfc2629.xslt' ?>
<?rfc strict="yes" ?>
<?rfc toc="yes"?>
<?rfc tocdepth="4"?>
<?rfc symrefs="yes"?>
<?rfc sortrefs="yes" ?>
<?rfc compact="yes" ?>
<!-- do not start each main section on a new page -->
<?rfc subcompact="no" ?>
<!-- keep one blank line between list items -->
<!-- end of list of popular I-D processing instructions -->
<rfc category="std" docName="draft-ietf-roll-rpl-observations-06" ipr="trust200902">
    <!-- category values: std, bcp, info, exp, and historic
     ipr values: full3667, noModification3667, noDerivatives3667
     you can add the attributes updates="NNNN" and obsoletes="NNNN"
     they will automatically be output with "(if approved)" -->

  <!-- ***** FRONT MATTER ***** -->

  <front>
      <!-- The abbreviated title is used in the page header - it is only necessary if the
         full title is longer than 39 characters -->

    <title abbrev="RPL Observations">RPL Observations</title>

    <author fullname="Rahul Arvind Jadhav" initials="R.A." role="editor" surname="Jadhav">
        <address>
            <postal>
                <street>Marathahalli</street>
                <city>Bangalore</city>
                <region>Karnataka</region>
                <code>560037</code>
                <country>India</country>
            </postal>
            <email>rahul.ietf@gmail.com</email>
        </address>
    </author>

    <author fullname="Rabi Narayan Sahoo" initials="R.N." surname="Sahoo">
        <organization>Juniper</organization>
        <address>
            <postal>
                <street>Whitefield</street>
                <city>Bangalore</city>
                <region>Karnataka</region>
                <code>560037</code>
                <country>India</country>
            </postal>
            <email>rabinarayans0828@gmail.com</email>
        </address>
    </author>

    <author fullname="Yuefeng Wu" initials="Y" surname="Wu">
        <organization>Huawei</organization>
        <address>
            <postal>
                <street>No.101, Software Avenue, Yuhuatai District,</street>
                <city>Nanjing</city>
                <region>Jiangsu</region>
                <code>210012</code>
                <country>China</country>
            </postal>
            <phone>+86-15251896569</phone>
            <email>wuyuefeng@huawei.com</email>
        </address>
    </author>

    <date year="2021" />
    <!-- Meta-data Declarations -->
    <area>Routing</area>
    <workgroup>ROLL</workgroup>

    <!-- WG name at the upperleft corner of the doc,
         IETF is fine for individual submissions.
     If this element is not present, the default is "Network Working Group",
         which is used by the RFC Editor as a nod to the history of the IETF. -->

    <keyword>RPL, 6lo, metrics, constraints</keyword>

    <abstract>
        <t>
            This document describes RPL protocol design issues, various
            observations and possible consequences of the design and
            implementation choices.
        </t>
    </abstract>
</front>

<middle>
    <section title="Motivation">
        <t>
            The primary motivation for this draft is to list different issues
            with RPL operation and invoke a discussion within the working
            group. This draft by itself is not intended for the RFC track but
            for WG discussion. This draft may in turn result in other work
            items taken up by the WG which may improve on the issues
            mentioned here.
        </t>
    </section>
    <section title="Introduction">
        <t>
            RPL <xref target="RFC6550"/> specifies a proactive distance-vector
            routing scheme designed for LLNs (Low-Power and Lossy Networks).
            RPL enables the network to be formed as a DODAG and
            supports storing and non-storing modes of operation.
            Non-storing mode reduces memory usage on the nodes
            by allowing non-BR nodes to operate without maintaining a routing
            table; the Root uses source routing to direct
            traffic along a specific path. In storing mode,
            intermediate routers maintain routing tables.
        </t>
        <t>
            This work aims to highlight various issues with RPL which
            make it difficult to handle certain scenarios. It
            discusses these issues in the context of RPL's modes of operation
            (storing versus non-storing). There are cases where RPL does not
            provide clear rules and implementations have to make their own
            choices, hindering interoperability and performance.
        </t>
        <t>
            <xref target="I-D.clausen-lln-rpl-experiences"/> provides some
            interesting points. Some sections in this draft may overlap with
            observations made there, but this has been done to further
            extend some scenarios or observations. Readers are encouraged
            to also consult <xref
                target="I-D.clausen-lln-rpl-experiences"/> for other insights.
            Regardless, this draft is self-contained: it does not assume
            that the reader has read <xref
                target="I-D.clausen-lln-rpl-experiences"/>.
        </t>

        <section title="Requirements Language and Terminology">
            <t>
                The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL
                NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and
                "OPTIONAL" in this document are to be interpreted as described
                in <xref target="RFC2119">RFC 2119</xref>.
            </t>
            <t>
                NS-MOP = RPL Non-storing Mode of Operation
            </t>
            <t>
                S-MOP = RPL Storing Mode of Operation
            </t>
            <t>
                This document uses terminology described in <xref
                    target="RFC6550"/> and <xref target="RFC6775"/>.
            </t>
        </section>
    </section>

    <section title="DTSN increment in storing MOP" anchor="DTSNincr">
        <t>
            The DTSN increment has a major impact on the overall RPL control
            traffic and on the efficiency of downstream route updates. The
            DTSN is sent as part of the DIO message and signals the downstream
            nodes to trigger their target advertisements. The 6LR needs to
            decide when to update the DTSN, and usually it should do so
            conservatively. The DTSN update mechanism determines how soon the
            downward routes are established along the new path. The RPL
            specification does not provide any clear mechanism for how the
            DTSN update should happen in storing mode.
        </t>
        <t>
            <figure align="center" anchor="sample_top" title="Sample
                topology"> <artwork align="center"><![CDATA[
     (6LBR)
        |
        |
        |
       (A)
       / \
      /   \
     /     \
   (B)    -(C)
    |    /  |
    |   /   |
    |  /    |
   (D)-    (E)
     \      ;
      \    ;
       \  ;
        (F)
        / \
       /   \
      /     \
    (G)     (H)
                    ]]></artwork> </figure>
        </t>
        <t>
            Consider the example topology shown in <xref target="sample_top"/>,
            and assume that node D switches its parent from node B to C.
            Ideally, node D and its descendants should send their target
            advertisements along the new path via node C. Achieving this
            efficiently is a challenge. Incrementing the DTSN is the only
            way to trigger DAOs from downstream nodes, but this trigger has
            to reach not only the first hop but all the descendant nodes.
            The DTSN therefore has to be incremented throughout the sub-DODAG
            rooted at node D, resulting in a DIO/DAO storm along the
            sub-DODAG. This is a significant issue in high-density networks,
            where the metric may deteriorate transiently even though the
            signal strength is good.
        </t>
        <t>
            The primary implementation issue is whether a child node should
            increment its own DTSN when it receives a DTSN update from its
            parent node. Doing so results in DAO updates throughout the
            sub-DODAG, so the cost could be very high. Not doing so may result
            in serious loss of connectivity for nodes in the sub-DODAG.
        </t>
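        <t>
            The cost side of this trade-off can be illustrated with a small
            sketch. The Python fragment below is illustrative only and is not
            taken from any implementation: the names CHILDREN and
            daos_triggered are hypothetical, and the child lists assume the
            preferred-parent links D-F, F-G and F-H from <xref
                target="sample_top"/>. It lists the nodes that re-send a DAO
            after node D increments its DTSN, with and without each child
            propagating the increment.
        </t>
        <t>
            <figure align="left">
                <artwork align="left"><![CDATA[
```python
# Preferred-parent relationships below node D in the sample topology
# (assumption: F prefers D; G and H prefer F).
CHILDREN = {"D": ["F"], "F": ["G", "H"], "G": [], "H": []}


def daos_triggered(node, propagate):
    """Return the nodes that re-send a DAO after `node` bumps DTSN.

    propagate=True models a child that increments its own DTSN when
    it sees its parent's DTSN change; propagate=False models a child
    that only re-sends its own DAO.
    """
    triggered = []
    for child in CHILDREN[node]:
        triggered.append(child)  # child sees the new DTSN in the DIO
        if propagate:
            triggered.extend(daos_triggered(child, propagate))
    return triggered


print(daos_triggered("D", propagate=False))  # ['F']
print(daos_triggered("D", propagate=True))   # ['F', 'G', 'H']
```
                ]]></artwork> </figure>
        </t>
        <t>
            With propagation, every node in the sub-DODAG re-sends a DAO;
            without it, only the first hop does.
        </t>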
        <section title="Deliberations">
            <t>
                <list style="format (%d)">
                    <t>
                        In S-MOP, should the child node increment its DTSN on
                        seeing that its preferred parent has updated its DTSN?
                    </t>
                    <t>
                        What are the rules for DTSN increment in S-MOP that
                        multiple implementations can follow, thus allowing
                        consistent performance across different
                        implementations?
                    </t>
                </list>
            </t>
        </section>
    </section>

    <section title="DAO retransmission and use of DAO-ACK in storing MOP">
        <t>
            <xref target="RFC6550"/> has an optional DAO-ACK mechanism by
            which an upstream parent confirms the reception of a DAO from a
            downstream child. In storing mode, the DAO is addressed to
            the immediate upstream parent, which responds with the DAO-ACK.
            Two implementations are possible:
            <list style="format (%d)">
                <t>
                    Hop-by-hop ACK: A parent responds with a DAO-ACK
                    immediately after receiving the DAO.
                </t>
                <t>
                    End-to-End ACK: A node waits for its upstream parent to
                    send a DAO-ACK before responding with a DAO-ACK
                    downstream. The node may make as many attempts as needed
                    to successfully send the DAO upstream. In other words,
                    the parent node accepts the responsibility of sending
                    the DAO upstream until it is ACKed, and only then
                    responds with its own ACK to the child.
                </t>
            </list>
        </t>
        <t>
            <figure align="center" anchor="hbh_ack" title="Hop-by-hop DAO-ACK">
                <artwork align="center"><![CDATA[
           1->          3->
           DAO          DAO
(TgtNode)--------(6LR)-------(root)
           ACK          ACK
           <-2          <-4
                    ]]></artwork> </figure>
        </t>
        <t>
            <figure align="center" anchor="e2e_ack" title="End-to-End DAO-ACK">
                <artwork align="center"><![CDATA[
           1->          2->
           DAO          DAO
(TgtNode)--------(6LR)-------(root)
           ACK          ACK
           <-4          <-3
                    ]]></artwork> </figure>
        </t>
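        <t>
            The message ordering in <xref target="hbh_ack"/> and <xref
                target="e2e_ack"/> can be reproduced with a short sketch. The
            Python fragment below is illustrative only; the function name
            dao_ack_events is hypothetical, and retransmissions, timers and
            failures are deliberately left out. It returns the ordered
            messages exchanged for one DAO along a path from the target node
            to the root in each of the two modes.
        </t>
        <t>
            <figure align="left">
                <artwork align="left"><![CDATA[
```python
def dao_ack_events(path, end_to_end):
    """Return ordered (message, sender, receiver) events for one DAO.

    `path` lists the nodes from the target node up to the root.
    """
    hops = list(zip(path, path[1:]))  # (child, parent), bottom-up
    events = []
    if end_to_end:
        # The DAO travels all the way up before any ACK comes down.
        for child, parent in hops:
            events.append(("DAO", child, parent))
        for child, parent in reversed(hops):
            events.append(("ACK", parent, child))
    else:
        # Each parent ACKs at once, then forwards the DAO upstream.
        for child, parent in hops:
            events.append(("DAO", child, parent))
            events.append(("ACK", parent, child))
    return events


PATH = ["TgtNode", "6LR", "root"]
for end_to_end in (False, True):
    events = dao_ack_events(PATH, end_to_end)
    for step, event in enumerate(events, 1):
        print(step, *event)
```
                ]]></artwork> </figure>
        </t>
        <t>
            In the hop-by-hop case the target node receives its ACK as the
            second message, before the DAO has reached the root; in the
            end-to-end case it receives the ACK last.
        </t>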
        <section title="Significance of bidirectional Path establishment
            indication and relevance of DAO-ACK">
            <t>
                Many application traffic patterns require that a
                bidirectional path be established between the target node and
                the root. A typical example is a CoAP Confirmable request,
                which requires an acknowledgement from the end receiver and
                thus warrants bidirectional path establishment. It is
                imperative that the target node first ascertains whether such a
                bidirectional path is established before initiating such
                application traffic. In non-storing MOP, the DAO-ACK
                is sufficient to ascertain such bidirectional
                connectivity, since it indicates that the root, which
                is usually the direct destination of the DAO, has received the
                DAO. In storing MOP, things are more complicated,
                since the DAO is sent hop-by-hop and the DAO-ACK semantics are
                not clear enough in the current specification. As described
                above, an implementation can choose to implement
                hop-by-hop ACK or end-to-end ACK.
            </t>
        </section>
        <section title="Problems with hop-by-hop DAO-ACK">
            <t>
                The primary issue with this mode is that the target node cannot
                ascertain bidirectional path connectivity on reception of the
                DAO-ACK.
            </t>
        </section>
        <section title="Problems with end-to-end DAO-ACK">
            <t>
                In this case, the target node can ascertain
                that the DAO has indeed reached the root, since the reception
                of the DAO-ACK on the target node confirms this. However, extra
                state information needs to be maintained on the 6LRs on
                behalf of all the child nodes. It is also very difficult for
                the target node to choose a timeout value for deciding that
                the DAO transmission has failed to reach the root.
            </t>
        </section>
        <section title="Deliberations">
            <t>
                <list style="format (%d)">
                    <t>
                        How should an implementation interpret the DAO-ACK
                        semantics?
                    </t>
                    <t>
                        What is the best way for the target node to know that
                        the end to end bidirectional path is successfully
                        installed or updated? In NS-MOP, the DAO-ACK provides a
                        clear way to do this. Can the same be achieved for
                        storing-MOP?
                    </t>
                    <t>
                        What should happen if an ancestor node responds with
                        a DAO-ACK with Status != 0?
                    </t>
                    <t>
                        How can a subset of targets be selectively NACKed
                        when target options are aggregated?
                    </t>
                </list>
            </t>
        </section>
        <section title="Implementation Notes">
            <t>
                Current open-source RPL implementations include both types of
                DAO-ACK handling. For example, RIOT supports hop-by-hop
                DAO-ACK. Older versions of Contiki supported hop-by-hop ACK,
                but recent versions have changed to an end-to-end ACK
                implementation.
            </t>
            <t>
                The order in which the no-path DAO and the DAO are sent
                matters when updating the routing adjacencies on a parent
                switch. If an implementation sends the no-path DAO before the
                DAO, route invalidation incurs significantly more overhead.
                This is because the no-path DAO traverses all the way up to
                the BR, clearing the routes on the way. If there is a
                common ancestor beyond which the old and new paths are the
                same, it is better to send the regular DAO first, thus limiting
                the propagation of the subsequent no-path DAO to this common
                ancestor.
            </t>
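            <t>
                The effect of the ordering can be sketched as follows. The
                Python fragment below is illustrative only; the function name
                npdao_hops is hypothetical, and it assumes that a common
                ancestor which already holds the new route does not forward
                the no-path DAO any further. The paths are those of node D
                switching from parent B to parent C in <xref
                    target="sample_top"/>.
            </t>
            <t>
                <figure align="left">
                    <artwork align="left"><![CDATA[
```python
def npdao_hops(old_path, new_path, dao_first):
    """Return the nodes a no-path DAO visits on the old path.

    Both paths list the ancestors of the switching node, nearest
    first, ending at the root.  If the regular DAO was sent first,
    the first ancestor common to both paths already holds the new
    route, so the no-path DAO is assumed to stop there.
    """
    visited = []
    for node in old_path:
        visited.append(node)
        if dao_first and node in new_path:
            break  # common ancestor: new route already installed
    return visited


OLD = ["B", "A", "6LBR"]  # D's ancestors via the old parent B
NEW = ["C", "A", "6LBR"]  # D's ancestors via the new parent C
print(npdao_hops(OLD, NEW, dao_first=False))  # ['B', 'A', '6LBR']
print(npdao_hops(OLD, NEW, dao_first=True))   # ['B', 'A']
```
                    ]]></artwork> </figure>
            </t>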
        </section>
    </section>

    <section title="Interpreting Trickle Timer">
        <t>
            The Trickle algorithm defines a mechanism to reset the timer. A
            Trickle timer reset is unlike that of a regular periodic timer,
            where the timer is simply restarted. Resetting the Trickle timer
            means setting the interval back to Imin and starting a new
            interval, as described in Section 4.2 of <xref target="RFC6206"/>.
        </t>
        <t>
            <figure align="center" anchor="trickle" title="Trickle Timer Operation">
                <artwork align="center"><![CDATA[
|----|--------|----------------|------------------------------| . . . .
 Imin   I2             I3                     I4                    I5
                    ]]></artwork> </figure>
        </t>
        <t>
            The above figure shows an example of Trickle intervals. Each
            interval is double the size of the previous one. Section 4.2 of
            <xref target="RFC6206"/> states that,
        </t>
        <t>
            "If Trickle hears a transmission that is "inconsistent" and I is
            greater than Imin, it resets the Trickle timer.  To reset the
            timer, Trickle sets I to Imin and starts a new interval as in step
            2.  If I is equal to Imin when Trickle hears an "inconsistent"
            transmission, Trickle does nothing.  Trickle can also reset its
            timer in response to external "events"."
        </t>
        <t>
            Thus, if the Trickle timer has advanced to a subsequent interval,
            i.e., >= I2, then a reset of the Trickle timer implies going back
            to Imin. However, if the Trickle timer is currently in Imin and it
            hears an inconsistent transmission, then it does nothing.
        </t>
        <t>
            In the context of multicast DIS/DIO operation, this implies that
            if the DIO Trickle timer is already at Imin and the node hears a
            multicast DIS, then the node does nothing. It MUST NOT reset the
            timer again in this case.
        </t>
        <t>
            An implementation MUST NOT restart the timer within an interval.
            For example, in the above figure, if the timer is in interval I2,
            the implementation MUST NOT restart the timer at the beginning of
            the current interval, i.e., I2. If the timer is in interval I2 and
            a reset is to be done, then the interval is set back to Imin. If
            the timer is already in Imin, then the reset does nothing.
        </t>
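        <t>
            The reset rule can be sketched as follows. The Python fragment
            below is a minimal illustration, not a complete Trickle
            implementation: the class name TrickleInterval and its attributes
            are hypothetical, only the interval size I is tracked, and the
            transmission time t and counter c of <xref target="RFC6206"/> are
            left out. Interval sizes are in arbitrary time units.
        </t>
        <t>
            <figure align="left">
                <artwork align="left"><![CDATA[
```python
class TrickleInterval:
    """Tracks only the interval size I of a Trickle timer."""

    def __init__(self, imin, doublings):
        self.imin = imin
        self.imax = imin * (2 ** doublings)  # largest interval size
        self.interval = imin
        self.restarts = 0  # how often a reset started a new interval

    def expire(self):
        """Interval ended: double I, capped at the maximum size."""
        self.interval = min(self.interval * 2, self.imax)

    def reset(self):
        """Handle an inconsistency or an external reset event."""
        if self.interval > self.imin:
            self.interval = self.imin  # back to Imin, new interval
            self.restarts += 1
        # Already at Imin: do nothing, the interval keeps running.


timer = TrickleInterval(imin=8, doublings=3)
timer.expire()  # I2 = 16
timer.expire()  # I3 = 32
timer.reset()   # back to Imin = 8, one restart
timer.reset()   # already at Imin, ignored
print(timer.interval, timer.restarts)  # 8 1
```
                ]]></artwork> </figure>
        </t>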
    </section>

    <section title="Handling resource unavailability">
        <t>
            Nodes in constrained networks have to maintain various
            records, such as neighbor cache entries and routing entries, on
            behalf of other targets to facilitate packet forwarding. Because of
            the constrained nature of the devices, the available memory may be
            very limited, and the path selection algorithm may therefore have
            to take such resource constraints into consideration as well.
        </t>
        <t>
            RPL currently does not have any mechanism to advertise such
            resource indicator metrics. The primary tables associated with RPL
            are the routing table and the neighbor cache. Even though the
            neighbor cache is not directly linked with the RPL protocol, the
            maintenance of routing adjacencies results in updates to the
            neighbor cache.
        </t>
        <section title="Deliberations">
            <t>
                <list>
                    <t>
                        Is it possible to know that an upstream parent/ancestor
                        cannot hold enough routing entries and thus this path
                        should not be used?
                    </t>
                    <t>
                        Is it possible to know that an upstream parent cannot
                        hold any more neighbor cache entry and thus this
                        upstream parent should not be used?
                    </t>
                </list>
            </t>
        </section>
    </section>

    <section title="Handling aggregated targets">
        <t>
            RPL allows target aggregation in the DAO and defines specific
            procedures to aid it. However, the specification does not
            mandate the use of aggregated targets, nor does it say
            whether a receiving node needs to handle them. Target aggregation
            is a useful tool and especially helps with link-layer technologies
            that do not suffer from low MTUs, such as PLC. Even if an
            implementation does not support aggregating targets, the
            specification should at least mandate reception of aggregated
            targets in the DAO.
        </t>
        <t>
            RPL currently has a mechanism to ACK the DAO, but it does not have
            a mechanism to ACK an individual target option. Thus, with
            aggregated targets in the DAO, if a subset of the targets fails,
            it is impossible for the DAO-ACK to signal this to the DAO sender.
        </t>
        <section title="Deliberations">
            <t>
                <list>
                    <t>
                        Even if an implementation does not support aggregating
                        targets, should it at least be required to receive and
                        handle aggregated targets in the DAO?
                    </t>
                    <t>
                        There is good scope for compressing aggregated
                        targets, which can significantly reduce the RPL control
                        overhead.
                    </t>
                    <t>
                        How can a subset of targets be selectively NACKed
                        when target options are aggregated?
                    </t>
                    <t>
                        The DEFAULT_DAO_DELAY of 1 second does not help much
                        with aggregation. The upstream parent nodes should wait
                        longer than the child nodes so as to aggregate
                        effectively. Can DEFAULT_DAO_DELAY be made a function
                        of the level/rank the node is at?
                    </t>
                </list>
            </t>
        </section>
    </section>

    <!--
    <section title="Network density and DIO Trickle Timer">
        <t>
            DIORedundancy with high node density.
        </t>
    </section>

    <section title="Operating in router mode">
        <t>
            RPL allows a node to be either in a 6LBR, 6LR or 6LN. Nodes are
            provisioned to operate in 6LR mode. This decision should be
            dynamic. RPL does not provide any mechanism for detection and
            converting to 6LR mode.
        </t>
    </section>
    -->

    <section title="RPL Transit Information in DAO">
        <t>
            RPL allows associating a target or set of targets with a Transit
            Information Option, which contains attributes for a path to one
            or more destinations identified by the set of targets. In
            NS-MOP, the Transit Information contains the all-important
            Parent Address, which allows the common ancestor, usually the
            root, to determine the source routing header for the target node.
            The Transit Information also contains other information, such as
            Path Sequence and Path Lifetime, which is critical for maintaining
            route adjacencies.
        </t>
        <t>
            RPL however does not mandate the use of Transit Information
            Option for targets.
        </t>
        <section title="Deliberations">
            <t>
                <list>
                    <t>
                        Is it ok to let implementations decide on the inclusion
                        of Transit Information Option?
                    </t>
                    <t>
                        Is it possible to achieve interop without mandating use
                        of Transit Information Option?
                    </t>
                    <t>
                        If the Transit Information Option is sent, should
                        the handling of PathSequence be mandated?
                    </t>
                </list>
            </t>
        </section>
    </section>

    <section title="Upgrades or Extensions to RPL protocol">
        <t>
            RPL extensibility is highly desirable and is controlled by protocol
            elements within the messaging framework. In the pursuit of low
            signalling overhead, the RPL specification has been restrictive in
            sizing its fields, thus in some cases putting
            extensibility at stake. Consider, for example, the Mode of
            Operation field, which is three bits wide in the RPL specification.
            These bits are already saturated and it may be difficult to add
            major upgrades without extending them.
        </t>
        <t>
            Addition of new Control Options or new RPL Codes almost certainly
            results in backward compatibility issues. RFC6550 clearly states
            that a message with an unknown RPL Code MUST be silently discarded.
            However, no explicit handling is suggested for unknown RPL control
            option types. Some implementations simply copy-forward an
            unknown option as is, while others strip the unknown option
            before forwarding the message.
        </t>
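        <t>
            The two behaviours can be sketched as follows. The Python fragment
            below is illustrative only and does not reflect any particular
            implementation: the names KNOWN_TYPES, parse_options and
            forward_options are hypothetical, the node is assumed to
            understand only the option types defined in RFC6550 itself, and
            validation of malformed input is omitted. Options are encoded as
            a one-octet type, a one-octet length and the option body, except
            for Pad1, which is a single octet.
        </t>
        <t>
            <figure align="left">
                <artwork align="left"><![CDATA[
```python
# Option types this hypothetical node understands: those defined in
# RFC 6550 itself, Pad1 (0x00) through Target Descriptor (0x09).
KNOWN_TYPES = set(range(0x00, 0x0A))


def parse_options(data):
    """Split a byte string of RPL options into (type, body) tuples."""
    options, i = [], 0
    while i < len(data):
        opt_type = data[i]
        if opt_type == 0x00:  # Pad1 has no length field
            options.append((opt_type, b""))
            i += 1
            continue
        length = data[i + 1]
        options.append((opt_type, data[i + 2:i + 2 + length]))
        i += 2 + length
    return options


def forward_options(data, strip_unknown):
    """Re-encode options, optionally dropping unknown option types."""
    out = bytearray()
    for opt_type, body in parse_options(data):
        if strip_unknown and opt_type not in KNOWN_TYPES:
            continue
        out.append(opt_type)
        if opt_type != 0x00:
            out.append(len(body))
            out += body
    return bytes(out)


# One known option (type 0x05) followed by an option of a type the
# node does not know (0x7F, chosen arbitrarily); bodies are dummies.
msg = bytes([0x05, 0x02, 0x80, 0x00, 0x7F, 0x01, 0xAA])
print(forward_options(msg, strip_unknown=False).hex())
print(forward_options(msg, strip_unknown=True).hex())
```
                ]]></artwork> </figure>
        </t>
        <t>
            A node that strips unknown options prevents nodes further along
            the path from ever seeing an extension, whereas a node that
            copy-forwards them leaves that decision to the receiver.
        </t>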
        <t>
            Deliberations:
            <list style="format (%d)">
                <t>
                    What are the extensibility options RPL could implement? How
                    much overhead would it incur?
                </t>
                <t>
                    Most of the extensions are in the form of new control
                    options. Should RPL have a generic mechanism to handle
                    such extensions in a backward-compatible manner?
                </t>
            </list>
        </t>
    </section>

    <section title="Path Control bits handling">
        <t>
            RPL uses Path Control bits in the DAO's Transit Information Option
            for installing multiple downward routes to the nodes. These multiple
            routes could be used for reliability, latency or traffic
            load-balancing within a DAG. The path control bits are usable both
            in storing and non-storing mode of operation.
        </t>
        <t>
            RFC6550 Section 9.9 bullet point 9 mandates setting the
            Path Control bits in all the unicast DAOs sent by the target node.
            However, no existing implementation of RPL supports this. There is
            no reason to require a network which only needs a single path to
            the root to support Path Control bits.
        </t>
        <t>
            Deliberations:
            <list style="format (%d)">
                <t>
                    Should the mandatory clause for supporting Path Control Bits
                    in RFC6550 Section 9.9 point 9 be removed?
                </t>
                <t>
                    Handling Path Control Bits may be complex. An implementation
                    guideline explaining the use cases and resource (e.g.,
                    memory) assumptions would help implementors judge the
                    utility of this technique.
                </t>
            </list>
        </t>
    </section>

    <section title="Asymmetric Links and RPL">
        <t>
            Section 3.1 of <xref target="I-D.ietf-intarea-adhoc-wireless-com"/>
            explains asymmetric link characteristics and what it takes for a
            protocol to support asymmetric links. RPL depends on bidirectional
            links for control, even though near-perfect symmetry is not
            expected. The implication is that the upstream and
            downstream paths remain the same within a given RPL instance for
            any pair of nodes. This design raises the following questions:

            <list style="format (%d)">
                <t>
                    Is it possible to detect asymmetric links?
                </t>
                <t>
                    In the presence of asymmetric links what is the impact on
                    the control overhead and is there a way to possibly
                    mitigate or alleviate any negative impact?
                </t>
            </list>
        </t>
        <t>
            <xref target="I-D.ietf-roll-aodv-rpl"/> defines a mechanism to use
            a pair of coupled instances. This allows disjoint
            upstream and downstream paths between a pair of nodes, assuming
            that the link asymmetry is detected using some external technique.
            That mechanism assumes that the link asymmetry is already known to
            the nodes in the form of static configuration. In 6TiSCH
            networks, the availability of transmission slot information can be
            used to identify link asymmetry. The challenge in
            detecting link asymmetry arises from scenarios where, for
            example, the nodes transmit with unequal power levels.
        </t>
    </section>

    <section title="Adjacencies probing with RPL">
        <t>
            Unlike other distance-vector protocols, RPL avoids periodic hello
            messaging. It uses a Trickle-timer-based mechanism to
            update configuration parameters. This significantly reduces the RPL
            control overhead. One consequence of this design choice is that,
            in the absence of regular traffic, the adjacencies cannot be
            tested and repaired if broken.
        </t>
        <t>
            RPL provides a mechanism in the form of unicast DIS to query a
            particular node for its DIO. A node receiving a unicast DIS MUST
            respond with a unicast DIO with a Configuration Option. This
            mechanism can also be used for probing adjacencies, and
            certain implementations such as Contiki use it this way. The
            periodicity of the probing is implementation dependent, but the
            node is expected to invoke probing only when:
            <list style="format (%d)">
                <t>
                    There is no data traffic based on which the links could be
                    tested.
                </t>
                <t>
                    There is no L2 feedback. In some cases, L2 might provide
                    periodic beacons at the link layer, and the absence of
                    beacons could be used for link tests.
                </t>
            </list>
        </t>
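        <t>
            The two conditions above can be read as a simple gating check
            before a unicast DIS probe is sent. The following is a minimal
            sketch, not taken from any implementation: the names
            NeighborState, should_probe and idle_threshold are hypothetical,
            and a single idle threshold is assumed for both data traffic
            and L2 feedback.
        </t>
        <figure>
            <artwork><![CDATA[
```python
from dataclasses import dataclass


@dataclass
class NeighborState:
    """Hypothetical per-neighbor bookkeeping (times in seconds)."""
    last_data_rx: float      # last time data traffic exercised the link
    last_l2_feedback: float  # last time L2 (e.g. a beacon) confirmed it


def should_probe(nbr, now, idle_threshold):
    """Probe only if neither data traffic nor L2 feedback is recent."""
    no_recent_data = (now - nbr.last_data_rx) >= idle_threshold
    no_recent_l2 = (now - nbr.last_l2_feedback) >= idle_threshold
    return no_recent_data and no_recent_l2
```
]]></artwork>
        </figure>
        <t>
            With such a check, a link that carries regular data traffic or
            receives link-layer beacons never triggers a probe, which keeps
            the control overhead low.
        </t>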
        <section title="Deliberations">
            <t>
                <list style="format (%d)">
                    <t>
                        Should the probing scheme be standardized?
                    </t>
                    <t>
                        In some cases using multicast based probing may prove
                        advantageous. Currently RPL does not have multicast
                        based probing. Multicast DIS/DIO may not be suitable
                        for probing because they could lead to state
                        changes.
                    </t>
                </list>
            </t>
        </section>
    </section>

    <section title="Control Options eliding mechanism in RPL">
        <t>
            RPL configuration changes are rare and thus various configuration
            options may not change over a long period of time. RPL provides a way
            for the configuration options to be elided but there are no clear
            guidelines on how the eliding should be handled. In the absence of
            such guidelines, it is possible that certain nodes may end up using
            stale configuration in the event of transient link failures.
        </t>
    </section>

<!--
    <section title="Nodes energy level">
        <t>
            Control plane to signal memory level, energy level,. Can we use existing routing metrics in DAO to signal such info?
            In case of non-storing mode, the root can decide to balance paths based on such information.
        </t>
    </section>

    <section title="Management interfaces">
        <t>
            Every time a new route is added/updated/deleted then we need a way to inform above layer.
        </t>
    </section>
    -->

    <section title="Managing persistent variables across node reboots">
        <section title="Persistent storage and RPL state information">
            <t>
                Devices are required to be functional for several years without
                manual maintenance. Usually battery power consumption is
                considered key for operating the devices for several (tens of)
                years. But apart from battery, flash memory endurance may prove
                to be a lifetime bottleneck in constrained networks. Endurance
                is defined as the maximum number of erase-write cycles that a
                NAND/NOR cell can undergo before losing its 'guaranteed' write
                operation. In some cases (cheaper NAND-MLC/TLC), the endurance
                can be as low as 2K cycles. For example, if a given cell is
                written 5 times a day, a NAND flash cell with an endurance of
                10K cycles may last for less than 6 years.
            </t>
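            <t>
                The lifetime estimate above is simple arithmetic and can be
                reproduced as follows. This is an illustrative sketch only:
                the function name is hypothetical, and it assumes that every
                write costs one erase-write cycle on the same cell, with no
                wear leveling.
            </t>
            <figure>
                <artwork><![CDATA[
```python
def cell_lifetime_years(endurance_cycles, writes_per_day):
    """Years until a cell exhausts its rated erase-write cycles."""
    days = endurance_cycles / writes_per_day
    return days / 365.0


# 10K-cycle cell written 5 times a day: 2000 days of operation.
print(round(cell_lifetime_years(10_000, 5), 2))  # prints 5.48
# 2K-cycle cell (cheaper MLC/TLC) at the same write rate: 400 days.
print(round(cell_lifetime_years(2_000, 5), 2))   # prints 1.1
```
]]></artwork>
            </figure>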
            <t>
                Wear leveling is a popular technique used in flash memory to
                minimize the impact of limited cell endurance. Wear leveling
                works by arranging data so that erasures and re-writes are
                distributed evenly across the medium. The memory sectors are
                over-provisioned so that the writes are distributed across
                multiple sectors. Many IoT platforms do not necessarily
                consider this over-provisioning and usually provision the
                memory only to what is required. Some scenarios such as
                street-lighting may not require the application layer to write
                any information to the persistent storage and thus the
                over-provisioning is often ignored. In such cases if the
                network stack ends up using persistent storage for maintaining
                its state information then it becomes counter-productive.
            </t>
            <t>
                In a star topology, the amount of persistent data written by
                network protocols is very limited. But ad-hoc networks
                employing routing protocols such as RPL assume certain state
                information to be retained across node reboots. In the case of
                IoT devices, this storage is mostly floating-gate NAND/NOR
                flash memory. The impact of losing this state information
                differs depending upon the type (6LN/6LR/6LBR) of the node.
            </t>
        </section>
        <section title="Lollipop Counters">
            <t>
                <xref target="RFC6550"/> Section 7.2 explains sequence counter
                operation, defining lollipop <xref target="Perlman83"/> style
                counters. Lollipop counters specify a mechanism in which, even
                if the counter value wraps, the algorithm is able to tell
                whether the received value is the latest or not. This mechanism
                also helps in "some cases" to recover from a node reboot, but
                is not foolproof.
            </t>
            <t>
                Consider an example where node A boots up and initializes the
                seqcnt to 240 as recommended in <xref target="RFC6550"/>. Node
                A communicates with node B using this seqcnt, and node B uses
                it to determine whether the information node A sent in the
                packet is the latest. Now assume that the counter value reaches
                250 after some operations on node A, and node B keeps receiving
                the updated seqcnt from node A. Now consider that node A
                reboots, reinitializes the seqcnt value to 240, and sends the
                information to node B (which has a seqcnt of 250 stored on
                behalf of node A). As per Section 7.2 of <xref
                    target="RFC6550"/>, when node B receives this packet it
                will consider the information to be old (since 240 &lt; 250).
            </t>
            <texttable anchor="lollipop" title="Example lollipop counter operation">
                <ttcol align='center'>A</ttcol>
                <ttcol align='center'>B</ttcol>
                <ttcol align='center'>Output</ttcol>
                <c>240</c>  <c>240</c>   <c>A=B, old</c>
                <c>240</c>  <c>241</c>   <c>A&lt;B, old</c>
                <c>240</c>  <c>::</c>    <c>A&lt;B, old</c>
                <c>240</c>  <c>255</c>   <c>A&lt;B, old</c>
                <c>240</c>  <c>0</c>     <c>A&lt;B, old</c>
                <c>240</c>  <c>1</c>     <c>A&gt;B, new</c>
                <c>240</c>  <c>::</c>    <c>A&gt;B, new</c>
                <c>240</c>  <c>127</c>   <c>A&gt;B, new</c>
                <postamble>Default values for lollipop counters considered from
                    <xref target="RFC6550"/> Section 7.2.</postamble>
            </texttable>
            <t>
                Based on this table, there is a dead zone (240 to 0): if node
                B's stored value lies in this zone when A restarts from 240
                after a reboot, A's seqcnt will never be considered newer. Thus
                node A needs to maintain the seqcnt in persistent storage and
                reuse it on reboot.
            </t>
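            <t>
                The comparison behind this dead zone can be sketched as
                follows. This is an illustrative reading of the comparison
                rules in Section 7.2 of RFC 6550, assuming the default
                SEQUENCE_WINDOW of 16; the function name and the return
                convention are hypothetical and not part of the
                specification.
            </t>
            <figure>
                <artwork><![CDATA[
```python
SEQUENCE_WINDOW = 16  # default value assumed from RFC 6550


def lollipop_compare(a, b):
    """Compare two 8-bit lollipop counters.

    Returns 1 if a is greater, -1 if a is less, 0 if equal,
    and None if the two values are not comparable.
    """
    if a == b:
        return 0
    a_linear, b_linear = a >= 128, b >= 128
    if a_linear != b_linear:
        # One value is in the linear region [128..255], the other
        # in the circular region [0..127].
        linear, circular = (a, b) if a_linear else (b, a)
        circular_is_greater = (256 + circular - linear) <= SEQUENCE_WINDOW
        a_is_greater = circular_is_greater != a_linear
        return 1 if a_is_greater else -1
    # Both values are in the same region: comparable only when they
    # are within SEQUENCE_WINDOW of each other.
    if abs(a - b) <= SEQUENCE_WINDOW:
        return 1 if a > b else -1
    return None  # desynchronized, result undefined


# Node A restarts at 240 while node B holds various stored values.
for stored in (250, 255, 0, 1, 127):
    print(stored, lollipop_compare(240, stored))
```
]]></artwork>
            </figure>
            <t>
                Under these assumptions the restarted value 240 compares as
                less than the stored values 250, 255 and 0, and as greater
                than the stored values 1 and 127.
            </t>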
        </section>
        <section title="RPL State variables">
            <t>
                The impact of losing RPL state information differs depending
                upon the node type (6LN/6LR/6LBR). The following sections
                explain the different state variables and the impact if this
                information is lost on reboot.
            </t>
            <section title="DODAG Version">
                <t>
                    The tuple (RPLInstanceID, DODAGID, DODAGVersionNumber)
                    uniquely identifies a DODAG Version. DODAGVersionNumber is
                    incremented every time a global repair is initiated for the
                    instance (global or local). A node receiving an older
                    DODAGVersionNumber will ignore the DIO message, assuming it
                    to be from an old DODAG Version. Thus a 6LBR node (and a
                    6LR node in case of a local DODAG) needs to maintain the
                    DODAGVersionNumber in persistent storage, so that it is
                    available on reboot. If the 6LBR cannot use the latest
                    DODAGVersionNumber, the implication is that it will not be
                    able to recover/re-establish the routing table.
                </t>
            </section>
            <section title="DTSN field in DIO">
                <t>
                    DTSN (Destination Advertisement Trigger Sequence Number) is
                    a DIO message field used as part of the procedure to
                    maintain Downward routes. A 6LBR/6LR node may increment the
                    DTSN when it requires the downstream nodes to send a DAO
                    and thus update the downward routes on the 6LBR/6LR node.
                    In the case of RPL NS-MOP, only the 6LBR maintains the
                    downward routes and thus controls this field update. In the
                    case of S-MOP, 6LRs additionally keep downward routes and
                    thus also control this field update.
                </t>
                <t>
                    In S-MOP, when a 6LR node switches parent it may have to
                    issue a DIO with an incremented DTSN to trigger downstream
                    child nodes to send a DAO, so that the downward routes are
                    established across the whole parent/ancestor set. Thus in
                    S-MOP, the frequency of DTSN updates might be relatively
                    high (depending on the node density and the hysteresis set
                    by the objective function for switching parent).
                </t>
            </section>
            <section title="PathSequence">
                <t>
                    PathSequence is part of the RPL Transit Information Option,
                    and is associated with the RPL Target Option. A node which
                    owns a target address can associate a PathSequence in the
                    DAO message to denote freshness of the target information.
                    This is especially useful when a node uses multiple paths
                    or multiple parents to advertise its reachability.
                </t>
                <t>
                    Loss of the PathSequence information maintained on the
                    target node can result in routing adjacencies being lost on
                    6LRs/6LBR/6BBR.
                </t>
            </section>
        </section>
        <section title="State variables update frequency">
            <!--
                TODO: Show contiki data as in how many number of times does the
                DTSN, DAOSequence change?  REF for TI-CC2538
                [http://www.ti.com/lit/ug/swru319c/swru319c.pdf]
                [http://www.ti.com/lit/wp/spry164/spry164.pdf] CC2538 which
                uses MLC-NAND for flash storage has 3000-5000 endurance rating.
            -->
            <texttable anchor="rpl_state" title="RPL State variables">
                <ttcol align='center'>State variable</ttcol>
                <ttcol align='center'>Update frequency</ttcol>
                <ttcol align='center'>Impacts node type</ttcol>
                <c>DODAGVersionNumber</c> <c>Low</c>               <c>6LBR, 6LR(local DODAG)</c>
                <c>DTSN</c>               <c>High(SM),Low(NSM)</c> <c>6LBR, 6LR</c>
                <c>PathSequence</c>       <c>High(SM),Low(NSM)</c> <c>6LR, 6LN</c>
                <postamble>Low: fewer than 5 updates per day; High: more than 5 updates per day; SM=Storing MOP, NSM=Non-Storing MOP</postamble>
            </texttable>
        </section>
        <section title="Deliberations">
            <t>
                <list style="format (%d)">
                    <t>
                        Is it possible that RPL removes the use of persistent
                        storage for maintaining state information?
                    </t>
                    <t>
                        In most cases, node reboots will happen very rarely. Thus
                        doing persistent storage book-keeping for handling node
                        reboots might not make sense. Is it possible to consider
                        signaling (especially after the node reboots) so as to avoid
                        maintaining this persistent state? Is it possible to use
                        one-time on-reboot signaling to recover some state
                        information?
                    </t>
                    <t>
                        It is necessary that RPL avoids using persistent
                        storage as far as possible. Ideally, extensions to RPL
                        should consider this as a design requirement especially
                        for 6LR and 6LN nodes. DTSN and PathSequence are the
                        primary state variables which have major impact.
                    </t>
                </list>
            </t>
        </section>
        <section title="Implementation Notes">
            <t>
                An implementation should use a random DAOSequence number on
                reboot so as to avoid the risk of reusing the same DAOSequence
                on reboot. Regardless, the sequence counter size of 8 bits does
                not provide much of a guarantee when choosing a random number.
                A parent node will not respond with a DAO-ACK if it sees a DAO
                with the same DAOSequence as the previous one.
            </t>
            <t>
                Write-Before-Use: The state information should be written
                to the flash before using it in the messaging. If it is
                done the other way around, there is a chance that the node
                powers down before writing to the persistent storage.
            </t>
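            <t>
                The Write-Before-Use ordering can be sketched as follows.
                This is a minimal illustration, not taken from any
                implementation: the class and function names are
                hypothetical, a dict-like object stands in for flash
                storage, and the initial value of 240 and the wrap points
                (127 and 255) are assumed from the lollipop counter
                description in RFC 6550.
            </t>
            <figure>
                <artwork><![CDATA[
```python
def lollipop_increment(value):
    """Advance an 8-bit lollipop counter (wraps to 0 at 127 and 255)."""
    if value in (127, 255):
        return 0
    return value + 1


class PersistentCounter:
    """Sequence counter that is persisted before each use."""

    def __init__(self, store, key, initial=240):
        self._store = store  # dict-like stand-in for flash storage
        self._key = key
        # After a reboot, resume from the persisted value if present.
        self._value = store.get(key, initial)

    def next_value(self):
        candidate = lollipop_increment(self._value)
        # 1. Persist first: a power loss after this line cannot lead
        #    to a value being sent that was never stored.
        self._store[self._key] = candidate
        # 2. Only then hand the value out for use in a message.
        self._value = candidate
        return candidate


flash = {}
dtsn = PersistentCounter(flash, "dtsn")
first = dtsn.next_value()                    # stored, then used
rebooted = PersistentCounter(flash, "dtsn")  # simulated reboot
second = rebooted.next_value()               # continues after 'first'
```
]]></artwork>
            </figure>
            <t>
                Note that in this sketch every increment costs one write to
                persistent storage, which is exactly the endurance cost
                discussed earlier in this section.
            </t>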
        </section>
    </section>

    <section title="Capabilities and its role in RPL">
        <t>
            RPL is a distributed protocol and it requires that the
            participating nodes agree on a basic set of primitives to follow.
            RPL currently handles this using MOP (Mode of Operation) bits
            in the DIO. MOP bits inform the nodes of the basic mode of
            operation a node MUST support to join the Instance as a 6LR. The
            MOP is decided and advertised by the root of the RPL Instance. A
            node not supporting the given MOP may still join the Instance as a
            leaf node or 6LN.
        </t>
        <t>
            RPL further uses the DIO Configuration Option to advertise the
            configuration each node needs to use (e.g., for the Trickle timer).
        </t>
        <section title="Handshaking node capabilities">
            <t>
                Currently there exists no mechanism to handshake capabilities
                of the root, 6LRs or 6LNs. If a feature is optional and is
                supported by 6LRs/6LNs, there is currently no mechanism to
                signal it. There are several RPL extension proposals which
                are possibly optional features. The root needs to know if the
                6LR/6LN supports these optional features in order to enable
                the extension in that path context. Similarly, 6LRs and 6LNs
                need to know whether the root supports certain extensions that
                they can make use of.
            </t>
        </section>
        <section title="How do Capabilities differ from MOP and Configuration Option?">
            <t>
                Unlike MOP and Configuration Option which are issued by the
                root of the Instance, Capabilities can be issued by any node. A
                6LN/6LR node can advertise its capabilities such that those can be
                seen by intermediate 6LRs and the root of the Instance.
            </t>
        </section>
        <section title="Deliberations">
            <t>
                <list style="format (%d)">
                    <t>
                        Is it possible for leaf nodes to advertise their set of
                        capabilities, which can be used by root and/or
                        intermediate 6LRs to make run time decisions?
                    </t>
                    <t>
                        How should these capabilities be carried? Should they
                        be carried in DAO/DIO/DAO-ACK?
                    </t>
                    <t>
                        Should the definition of capabilities be the same in
                        both directions (upstream/downstream)?
                    </t>
                </list>
            </t>
        </section>
    </section>

    <section title="Backward Compatibility issues with RPL Options">
        <t>
            Most of the new work in ROLL requires the addition of new control
            options. Every time a new control option is added, all the nodes
            are required to upgrade to support this option. In many cases,
            the new specification declares a flag day to switch to the
            new functionality.
        </t>
        <t>
            New control options may not require mandatory handling on every
            node, but they require at least some processing. For example,
            assume that a new control option is added to the DIO message. The
            option does not require any handling on the nodes not supporting
            it, but these nodes are at least required to forward the new
            control option downstream. Currently the new control option may be
            stripped off.
        </t>
        <t>
            It should be possible for unknown control options to be copied
            as-is to the downstream/upstream node(s). The specification
            defining the new control option will decide whether a node should
            strip off or copy the unknown control option.
        </t>
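        <t>
            The copy-as-is behavior suggested above can be sketched as
            follows. This is an illustration of the proposal, not of current
            RPL behavior: the function name is hypothetical, it assumes the
            generic Type/Length/Data layout of RPL control message options
            with Pad1 (type 0x00) being a single octet, and the option type
            0x7F in the example is an arbitrary stand-in for a type unknown
            to the node.
        </t>
        <figure>
            <artwork><![CDATA[
```python
PAD1 = 0x00  # single-octet padding option, carries no length field


def split_options(buf, known_types):
    """Split raw options into (known, unknown) lists of raw bytes.

    Unknown options are kept byte-for-byte so that they can be
    copied unchanged into the message sent onward.
    """
    known, unknown = [], []
    i = 0
    while i < len(buf):
        opt_type = buf[i]
        if opt_type == PAD1:
            i += 1
            continue
        if i + 2 > len(buf):
            raise ValueError("truncated option header")
        end = i + 2 + buf[i + 1]  # type + length octets + data
        if end > len(buf):
            raise ValueError("truncated option data")
        raw = bytes(buf[i:end])
        (known if opt_type in known_types else unknown).append(raw)
        i = end
    return known, unknown


# One Pad1, one known option (type 0x04) and one unknown option (0x7F).
options = bytes([0x00, 0x04, 0x02, 0xAA, 0xBB, 0x7F, 0x01, 0xCC])
known, unknown = split_options(options, known_types={0x04})
forwarded_unknown = b"".join(unknown)  # copied as-is, not interpreted
```
]]></artwork>
        </figure>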
    </section>
    <section title="RPL under-specification">
        <t>
            <list style="format (%c)">
                <t>
                    PathSequence: Is it mandatory to use PathSequence in the
                    DAO Transit Information Option? RPL mentions that a
                    6LR/6LBR hosting the routing entry on behalf of a
                    target node should refresh the lifetime
                    on reception of a new Path Sequence. But RPL does
                    not necessarily mandate the use of Path Sequence. Most of
                    the open source implementations [RIOT] [CONTIKI] currently
                    do not issue Path Sequence in the DAO message.
                </t>
                <t>
                    Target Option aggregation in DAO: RPL allows multiple
                    targets to be aggregated in a single DAO message and has
                    introduced a notion of DelayDAO using which a 6LR node
                    could delay its DAO to enable such aggregation. But RPL
                    does not have clear text on handling of aggregated DAOs and
                    thus it hinders interoperability.
                </t>
                <t>
                    DTSN Update: RPL does not clearly define in which cases
                    DTSN should be updated in case of storing mode of
                    operation. More details for this are presented in <xref
                        target="DTSNincr"/>.
                </t>
            </list>
        </t>
    </section>

    <section anchor="Acknowledgements" title="Acknowledgements">
        <t>
            Many thanks to Pascal Thubert for hallway chats and for helping
            understand the existing design rationales. Thanks to Michael
            Richardson for Unstrung RPL implementation rationale. Thanks to ML
            discussions, in particular
            (https://www.ietf.org/mail-archive/web/roll/current/msg09443.html).
        </t>
    </section>

<!-- Possibly a 'Contributors' section ... -->

    <section anchor="IANA" title="IANA Considerations">
        <t>This memo includes no request to IANA.</t>
    </section>

    <section anchor="Security" title="Security Considerations">
        <t>
            This is an informational draft and does not add any changes to the
            existing specifications.
        </t>
    </section>
</middle>

<back>
    <!-- References split into informative and normative -->
    <references title="Normative References">
        <!--?rfc include="http://xml.resource.org/public/rfc/bibxml/reference.RFC.2119.xml"?-->
        &RFC2119;
        &RFC6550;
        &RFC6206;
        &RFC6775;

    </references>

    <references title="Informative References">
        <reference anchor="Perlman83">
            <front>
                <title>Fault-Tolerant Broadcast of Routing Information</title>
                <author initials="R" surname="Perlman">
                    <organization></organization>
                </author>
                <date month="December" year="1983" />
            </front>
            <seriesInfo name="North-Holland Computer Networks," value="Vol.7"/>
        </reference>

        <!-- Here we use entities that we defined at the beginning. -->
        <?rfc include="reference.I-D.clausen-lln-rpl-experiences.xml"?>
        <?rfc include='reference.I-D.ietf-intarea-adhoc-wireless-com.xml'?>
        <?rfc include='reference.I-D.ietf-roll-aodv-rpl.xml'?>
    </references>


</back>
</rfc>
