<?xml version="1.0" encoding="US-ASCII"?>
<!-- This template is for creating an Internet Draft using xml2rfc,
     which is available here: http://xml.resource.org. -->
<!DOCTYPE rfc SYSTEM "rfc2629.dtd" [
<!-- One method to get references from the online citation libraries.
     There has to be one entity for each item to be referenced. 
     An alternate method (rfc include) is described in the references. -->

<!ENTITY I-D.narten-iana-considerations-rfc2434bis SYSTEM "http://xml.resource.org/public/rfc/bibxml3/reference.I-D.narten-iana-considerations-rfc2434bis.xml">
]>
<?xml-stylesheet type='text/xsl' href='rfc2629.xslt' ?>
<!-- used by XSLT processors -->
<!-- For a complete list and description of processing instructions (PIs), 
     please see http://xml.resource.org/authoring/README.html. -->
<!-- Below are generally applicable Processing Instructions (PIs) that most I-Ds might want to use.
     (Here they are set differently than their defaults in xml2rfc v1.32) -->
<?rfc strict="yes" ?>
<!-- give errors regarding ID-nits and DTD validation -->
<!-- control the table of contents (ToC) -->
<?rfc toc="yes"?>
<!-- generate a ToC -->
<?rfc tocdepth="4"?>
<!-- the number of levels of subsections in ToC. default: 3 -->
<!-- control references -->
<?rfc symrefs="yes"?>
<!-- use symbolic reference tags, i.e., [RFC2119] instead of [1] -->
<?rfc sortrefs="yes" ?>
<!-- sort the reference entries alphabetically -->
<!-- control vertical white space 
     (using these PIs as follows is recommended by the RFC Editor) -->
<?rfc compact="yes" ?>
<!-- do not start each main section on a new page -->
<?rfc subcompact="no" ?>
<!-- keep one blank line between list items -->
<!-- end of list of popular I-D processing instructions -->



<rfc category="std"
     docName="draft-ietf-nfsv4-mv1-msns-update-02"
     updates="5661"
     ipr="trust200902">
  <front>
    <title abbrev="nfsv4.1-msns-update">
      NFS Version 4.1 Update for Multi-Server Namespace
    </title>

    <author initials='D.' surname='Noveck'
            fullname = 'David Noveck'
            role='editor'>
     <organization>NetApp</organization>
     <address>
       <postal>
         <street>1601 Trapelo Road</street>
         <city>Waltham</city> 
         <region>MA</region>
         <code>02451</code>
         <country>United States of America</country>
       </postal>

       <phone>+1 781 572 8038</phone>
       <email>davenoveck@gmail.com</email>
     </address>
    </author>

     <author initials='C.' surname='Lever'
            fullname = 'Charles Lever'>
      <organization abbrev='ORACLE'>
        Oracle Corporation
      </organization>
      <address>
        <postal>
          <street>1015 Granger Avenue</street>
          <city>Ann Arbor</city>
          <region>MI</region>
          <code>48104</code>
          <country>United States of America</country>
        </postal>

        <phone>+1 248 614 5091</phone>
        <email>chuck.lever@oracle.com</email>
      </address>
     </author>



   <date year="2018"/>

   <area>Transport</area>
   <workgroup>NFSv4</workgroup>

    <abstract>
      <t>
        This document presents necessary clarifications and corrections
        concerning features related to the use of location-related
        attributes in NFSv4.1.  These features include migration, which
        transfers responsibility for a file system from one server to
        another, and facilities to support trunking by allowing
        discovery of the set of network addresses to use to access a
        file system.  This document updates RFC 5661.
      </t>
    </abstract>
  </front>

  <middle>
        
    <section title="Introduction"
	     anchor="INTRO">
      <t>
        This document defines the proper handling, within NFSv4.1, of the 
        location-related attributes 
        fs_locations and fs_locations_info and
        how necessary changes in those attributes are to be dealt with.
	The necessary corrections and clarifications parallel those
	done for NFSv4.0 in <xref target="RFC7931"/> and
	<xref target="I-D.cel-nfsv4-mv0-trunking-update"/>.
      </t>
      <t>
        Many of the changes to be made are necessary to clarify
	the handling of Transparent State Migration in NFSv4.1, which was
	omitted in <xref target="RFC5661"/>.  Many of the issues 
        dealt with in <xref target="RFC7931"/> need to be addressed in
	the context of NFSv4.1.
      </t>
      <t>
        Another important issue to be dealt with concerns the handling 
        of multiple entries
        within location-related attributes that represent different ways
        to access the same file system.  Unfortunately
        <xref target="RFC5661"/>, while recognizing that these entries
        can represent different ways to access the same file system,
	confuses the matter by treating each network access path
	as a "replica".  This makes it difficult
        for these attributes to be used to obtain information
	about the network addresses to be used to access particular
	file system instances, and it engenders confusion between two
        different sorts of transition: those involving a change of
	network access path to the same file system instance and those
	in which there is a shift between two distinct replicas.
      </t>
      <t>
	When location information is used to determine the set of
	network addresses to access a particular file system instance
	(i.e. to perform
	trunking discovery), clarification is needed regarding the
	interaction of trunking and transitions between file system replicas, 
        including migration.  Unfortunately <xref target="RFC5661"/>, while
	it provided a method of determining whether two network addresses
	were connected to the same server, did not address trunking
	discovery, making it necessary to do so in this document.
      </t>
    </section>
    <section title="Requirements Language"
	     anchor="REQL">
      <t>
        The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
        "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL"
        in this document are to be interpreted 
        as described in <xref target="RFC2119" />.
      </t>
    </section>
    <section title="Preliminaries"
	     anchor="PRELIM">
      <section title="Terminology"
  	     anchor="PRELIM-term">
        <t>
  	  While most of the terms related to multi-server namespace issues
  	  are appropriately defined in the replacement for Section 11 in
  	  <xref target="RFC5661"/> and appear in
  	  <xref target="SEC11-loc-term"/> below, there are
  	  a number of terms used outside that context that are explained
  	  here.
        </t>
        <t>
          In this document, the phrase "client ID" always refers to the
  	  64-bit shorthand identifier assigned by the server (a clientid4)
  	  and never to the structure which the client uses to identify itself
  	  to the server (called an nfs_client_id4 or client_owner in NFSv4.0
  	  and NFSv4.1 respectively).  The opaque identifier within those
  	  structures is referred to as a "client id string".
        </t>
        <t>
          It is particularly important to clarify the distinction 
	  between trunking detection and
	  trunking discovery.  The definitions we present will be 
          applicable to all
	  minor versions of NFSv4, but we will put particular emphasis 
          on how these
	  terms apply to NFS version 4.1.
        <list style ='symbols'>
          <t>
            Trunking detection refers to ways of deciding whether two 
            specific network
            addresses are connected to the same NFSv4 server.  The
            means available to make this determination depend on the protocol
            version, and, in some cases, on the client implementation.
	  <vspace blankLines="1"/>
	    In the case of NFS version 4.1 and later minor versions, the
	    means of
	    trunking detection are as described by <xref target="RFC5661"/>
	    and are available
	    to every client.  Two network addresses 
            connected to the same server are
	    always server-trunkable but are not necessarily session-trunkable.
          </t>
          <t>
            Trunking discovery is a process by which a client using one
            network address can obtain other addresses that are connected 
            to the
	    same server.
            Typically it builds on a trunking detection facility by providing
	    one or more methods by which candidate addresses are made 
            available to the client,
	    which can then use trunking detection to filter them appropriately.
	  <vspace blankLines="1"/>
	    Despite the support for trunking detection, there was no
	    description of trunking discovery provided in 
            <xref target="RFC5661"/>.
        </t>
      </list>
      </t>    
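        <t>
          As an illustrative sketch (hypothetical code, not part of the
          protocol), trunking discovery can be viewed as producing candidate
          addresses that trunking detection then filters.  In NFSv4.1 the
          detection test is based on values returned by EXCHANGE_ID; here it
          is abstracted as a callback, and the toy predicate is purely for
          illustration:
        </t>
        <figure><artwork>
```python
# Sketch only: discovery yields candidate addresses; detection decides
# whether each candidate is connected to the same server as a known
# address.  The predicate is a stand-in for the EXCHANGE_ID-based test
# of RFC 5661, and all names here are hypothetical.

def filter_trunkable(known_addr, candidates, same_server):
    """Keep the candidates that trunking detection confirms are
    connected to the same server as known_addr."""
    return [a for a in candidates if same_server(known_addr, a)]

# Toy detection predicate: pretend server identity is keyed by the
# first three octets of the address (illustration only).
def toy_same_server(a, b):
    return a.rsplit(".", 1)[0] == b.rsplit(".", 1)[0]

addrs = filter_trunkable(
    "203.0.113.1",
    ["203.0.113.2", "198.51.100.7", "203.0.113.9"],
    toy_same_server,
)
# addrs now holds only the candidates detection confirmed.
```
        </artwork></figure>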
	
        <t>
          Regarding network addresses and the handling of trunking, we use the 
          following terminology:
        <list style ='symbols'>
          <t>
  	    Each NFSv4 server is assumed to have a set of IP addresses
  	    to which NFSv4 requests may be sent by clients.   These are referred
            to as the server's network addresses.  Access to a specific server
	    network address may involve the use of multiple ports, since
	    different types of connections might be required to use
	    different ports.
          </t>
  	
          <t>
  	    Each network address, when combined with a pathname providing the
  	    location of a file system root directory relative to the
  	    associated server root file handle, defines a file system network
  	    access path.
          </t>
	  <t>
	    Server network addresses are used to establish connections to
	    servers which may be of a number of connection types.  Separate
	    connection types are used to support NFSv4 layered on top of the
	    RPC stream transport as described in
	    <xref target="RFC5531"/> and on top
	    of RPC-over-RDMA as described in <xref target="RFC8166"/>.
	  </t>
	  <t>
	    The combination of a server network address and a particular
	    connection type to be used by a connection
	    is referred to as a "server endpoint".   Although using different
	    connection types may result in different ports being used, it is
	    the connection type, rather than the port, that distinguishes
	    two endpoints at the same network address.
	  </t>

          <t>
            Two network addresses connected to the same server are said to
            be server-trunkable.
          </t>
          <t>
            Two network addresses connected to the same server such that
    	    those addresses can be used to support a single common session
            are referred to as session-trunkable.  Note that two addresses
    	    may be server-trunkable without being session-trunkable.  Note
	    also that, as specified by <xref target="RFC5661"/>, when two
	    connections of different connection types are made to the same
	    network address and are based on a single location entry, they
	    are always session-trunkable, independent of the connection type,
	    since their derivation from the same location entry assures that
	    both connections are to the same server.
          </t>

        </list>
        </t>    
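        <t>
          The "server endpoint" term above can be sketched as a simple data
          model (hypothetical code; the protocol defines no such structure):
        </t>
        <figure><artwork>
```python
# Hypothetical model of a "server endpoint": a server network address
# paired with a connection type.  Names here are illustrative only.
from dataclasses import dataclass

@dataclass(frozen=True)
class ServerEndpoint:
    network_address: str   # server network address (an IP address)
    connection_type: str   # e.g. "tcp" (RFC 5531) or "rdma" (RFC 8166)

tcp_ep = ServerEndpoint("192.0.2.5", "tcp")
rdma_ep = ServerEndpoint("192.0.2.5", "rdma")

# Distinct endpoints at the same network address: it is the connection
# type, not any difference in port, that distinguishes them.
distinct = tcp_ep != rdma_ep
same_address = tcp_ep.network_address == rdma_ep.network_address
```
        </artwork></figure>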
        <t>
          Discussion of the term "replica" is complicated for a number
  	  of reasons:
        <list style ='symbols'>
          <t>
  	    Even though the term is used in explaining the issues in
  	    <xref target="RFC5661"/> that need to be addressed in this
  	    document, a full explanation of this term requires explanation of
  	    related terms connected to the location attributes which are
  	    provided in <xref target="SEC11-loc-term"/> of the current
  	    document.
          </t>
          <t>
  	    The term is also used in <xref target="RFC5661"/>, with a meaning
  	    different from that in the current document.  In short,
  	    in <xref target="RFC5661"/> each replica is identified by a
  	    single network access path, while in the current document a set
  	    of network access paths which have server-trunkable network
  	    addresses and the same root-relative file system pathname is
  	    considered to be a single replica with multiple network access
	    paths.
          </t>
        </list>
        </t>    
      </section>
      <section title="Summary of Issues"
               anchor="PRELIM-sum">
        <t>
  	  This document explains how clients and servers are to determine
  	  the particular network access paths to be used to access a
  	  file system.  This includes describing 
  	  how changes to the specific replica or to
  	  the set of addresses to be used are to be handled, and how
  	  transfers of responsibility that need to be made can be
  	  dealt with transparently.  This includes cases in which
  	  there is a shift between one replica and another and those in
  	  which different network access paths are used to access the
  	  same replica.
        </t>
        <t>
  	  As a result of the following problems in <xref target="RFC5661"/>, it
  	  is necessary to provide the updates described later in this
          document.
        <list style="symbols">
          <t>
  	    <xref target="RFC5661"/>, while it dealt with situations in
  	    which various forms of clustering allowed co-ordination
  	    of the state assigned by co-operating servers to be used,
  	    made no provisions for Transparent State Migration, as
  	    introduced by <xref target="RFC7530"/> and corrected and
  	    clarified by <xref target="RFC7931"/>.
          </t>
          <t>
  	    Although NFSv4.1 was defined with a clear definition of how
  	    trunking detection was to be done, there was no clear specification
  	    of how trunking discovery was to be done, despite the fact that 
            the specification clearly indicated that this information
            could be made available via the location attributes.
          </t>
          <t>
            Because the existence of multiple network access paths to the same
  	    file system was
            dealt with as if there were multiple replicas, issues relating to
            transitions between replicas could never be clearly distinguished
            from trunking-related transitions between the addresses used to 
            access a particular file system instance.
  	    As a result, in situations in
            which both migration and trunking configuration changes 
            were involved, neither of these
            could be clearly dealt with and the relationship between 
            these two features was not seriously addressed.
          </t>
          <t>
  	    Because use of two network access paths to the same file system
  	    instance
  	    (i.e. trunking) was often treated as if two replicas were
	    involved, it was considered that
  	    two replicas were being used simultaneously.  As a
  	    result, the treatment in <xref target="RFC5661" /> of replicas
  	    being used simultaneously was not clear, since it covered
  	    two distinct cases: a
  	    single file system instance being accessed by
  	    two different network access
  	    paths, and two
  	    replicas being accessed simultaneously, with the limitations
  	    of the latter case not being clearly laid out. 
            </t>
          </list>
          </t>
          <t>
  	    The majority of the consequences of these issues are dealt with
            via the updates in various subsections of 
            <xref target="SEC11"/> and the whole of
	    <xref target="SEC11-locations-info"/> within the current document,
  	    which deal with problems within Section 11
            of <xref target="RFC5661"/>.  These changes include:
          <list style="symbols">
            <t>
              Reorganization made necessary by the fact that two network
  	      access paths to 
              the same file system instance
              need to be distinguished clearly from two different replicas, 
              since the
              former share locking state and can share session state.
            </t>
            <t>
              The need for a clear statement regarding the desirability of 
              transparent transfer of state, together with a recommendation 
              that either such transfer or a single-fs grace period be provided. 
            </t>
            <t>
              Specifically delineating how such transfers are to be dealt
	      with by
              the client, taking into account the differences from the treatment
              in <xref target="RFC7931"/> made necessary by the major protocol
              changes made in NFSv4.1. 
            </t>
            <t>
              Discussion of the relationship between transparent
              state transfer and Parallel NFS (pNFS). 
            </t>
            <t>
	      A clarification of the fs_locations_info attribute to specify
	      which portions of the information provided apply to a specific
	      network access path and which to the replica which that path
	      is used to access.
            </t>
	    
          </list>
          </t>
          <t>
            In addition, there are also updates to other sections of
  	    <xref target="RFC5661"/>, where the consequences of the
            incorrect assumptions
            underlying the current treatment of multi-server namespace
            issues also need to be corrected.  These are to be dealt with as 
            described in Sections 
  	    <xref target="OTH" format="counter"/> through
	    <xref target="RC" format="counter"/> of the current document.
          <list style="symbols">
            <t>
              A revised introductory section regarding multi-server namespace
              facilities is provided.  
            </t>
            <t>
              A more realistic treatment of server scope is provided, which
              reflects the more limited co-ordination of locking state
              adopted by servers actually sharing a common server scope.
            </t>
            <t>
              Some confusing text regarding changes in server_owner needs to
              be clarified.
          </t>
          <t>
            The description of NFS4ERR_MOVED needs to be updated since two
            different network access paths to the same file system are
  	    no longer considered to be
            two instances of the same file system.
          </t>
          <t>
            A new treatment of EXCHANGE_ID is needed, replacing that
            which appeared in Section 18.35 of <xref target="RFC5661"/>.
	    This is necessary since the existing treatment of client
	    id confirmation does not make sense in the context of
	    transparent state migration, in which client ids are transferred
	    between source and destination servers.
          </t>
          <t>
            A new treatment of RECLAIM_COMPLETE is needed, replacing that
            which appeared in Section 18.51 of <xref target="RFC5661"/>.
	    This is necessary to clarify the function of the one-fs flag
	    and to specify how existing clients that might not properly use
	    this flag are to be dealt with.
          </t>
        </list>
        </t>
      </section>  
        
      <section title="Relationship of this Document to RFC5661"
               anchor="PRELIM-rel">
        <t>
          The role of this document is to explain and specify a set of
          needed changes to <xref target="RFC5661"/>.  All of these changes 
          are related to the multi-server namespace features of NFSv4.1.
        </t>
        <t>
          This document contains sections that propose additions to and 
          other modifications of 
          <xref target="RFC5661"/> as well as others that explain the reasons
          for modifications but do not directly affect existing specifications.
        </t>
	<t>
          In consequence, the sections of this document can be divided
	  into four groups 
          based on how they relate to the eventual updating of the
	  NFSv4.1 specification.  Once the update is published, NFSv4.1
	  will be specified by two documents that need to be read together,
	  until such time as a consolidated specification is produced.
	
        <list style="symbols">
          <t>
            Explanatory sections do not contain any material that is meant
    	    to update the specification of NFSv4.1.  Such sections may
  	    contain explanations 
  	    about why and how changes are to be done, without including
  	    any text that is to update <xref target="RFC5661"/> or appear
  	    in an eventual consolidated document.
          </t>
          <t>
            Replacement sections contain text that is to replace and thus
  	    supersede text within <xref target="RFC5661"/> and then
  	    appear in an eventual consolidated document.  Replacement
            sections have the phrase "(as updated)" appended to the section 
            title.
          </t>
          <t>
            Additional sections contain text which, although not replacing
  	    anything in <xref target="RFC5661"/>, will be part of the
  	    specification of NFSv4.1 and will be expected to be part of
  	    an eventual consolidated document. Additional 
            sections have the phrase "(to be added)" appended to the section 
            title.
          </t>
          <t>
            Editing sections contain some text that replaces text within
  	    <xref target="RFC5661"/>, although the entire section will not
  	    consist of such text and will include other text as well.
  	    Such sections make relatively
  	    minor adjustments in the existing NFSv4.1 specification which are
  	    expected to be reflected in an eventual consolidated document.
  	    Generally such replacement text appears as a quotation, which may
  	    take the form of an indented set of paragraphs. 
          </t>
        </list>
        </t>

        <t>
	  See <xref target="CLASS"/> for a classification of the sections
	  of this document according to the categories above.
	</t>
	<t>  
          When this document is approved and published, 
          <xref target="RFC5661"/> will be significantly updated, with most
          of the changed sections falling within the current Section 11 of that
          document. A detailed discussion of the necessary updates 
          can be found in <xref target="UPD"/>.
        </t>
      </section>  
    </section>  
    <section title="Changes to Section 11 of RFC5661"
             anchor="SEC11">
      <t>
	A number of sections need to be revised, replacing existing 
        sub-sections
	within Section 11 of <xref target="RFC5661"/>:
      <list style="symbols">
	<t>
	  New introductory material, including a terminology section,
	  replaces the existing material
	  in <xref target="RFC5661"/>
	  ranging from the start of the existing Section 11
	  up to and including the existing Section
	  11.1.  The new material appears in Sections
	  <xref target="SEC11-msns-oview" format="counter"/>
	  through <xref target="SEC11-loc-attr" format="counter"/>
	  below.
	</t>
	<t>
	  A significant reorganization of the material in the
          existing Sections 11.4 and 11.5 (of <xref target="RFC5661"/>)
	  is necessary.
	  The reasons for the reorganization of 
	  these sections into a single section with multiple subsections
          are discussed in
	  <xref target="SEC11-uses-reorg"/> below.
	  This replacement appears as <xref target="SEC11-USES"/>
	  below.
        <vspace blankLines='1' />
	  New material relating to the handling of the location 
          attributes is contained
	  in Sections <xref target="SEC11-USES-mult" format="counter"/> and
	  <xref target="SEC11-USES-changes" format="counter"/> below.
        </t>
	<t>
	  A major replacement for the existing Section 11.7 of
	  <xref target="RFC5661" /> entitled
	  "Effecting File System Transitions", will appear as Sections
	  <xref target="SEC11-trans-oview" format="counter"/>
	  through <xref target="SEC11-trans-server" format="counter"/>
	  of the current document.
	  The reasons for the reorganization of 
	  this section into multiple sections are discussed below in
	  <xref target="SEC11-trans-reorg"/> of the current document.
        </t>
        <t>
	  A replacement for the existing Section 11.10 of
	  <xref target="RFC5661" /> entitled
	  "The Attribute fs_locations_info", will appear as 
	  <xref target="SEC11-li-new"/> of the current document, with
	  <xref target="SEC11-li-changes"/> describing the differences
	  between the new section and the treatment within
	  <xref target="RFC5661" />. 
	  A revised treatment is necessary because the existing treatment
	  did not make clear how the added attribute information relates
	  to the case of trunked paths to the same replica.  These issues
	  were not addressed in <xref target="RFC5661" /> where the
	  concepts of a replica and a network path used to access a replica
	  were not clearly distinguished.
        </t>
      </list>

      </t>
      <section title="Multi-Server Namespace (as updated)"
  	         anchor="SEC11-msns-oview">
         <t>
           NFSv4.1 supports attributes that allow a namespace to extend
           beyond the boundaries of a single server.  It is desirable
           that clients and servers support construction of such
           multi-server namespaces.  Use of such multi-server namespaces 
           is OPTIONAL, however, and for many purposes,
           single-server namespaces are perfectly acceptable.  Use of
           multi-server namespaces can provide many advantages, by
           separating a file system's logical position in a namespace from
           the (possibly changing) logistical and administrative
           considerations that result in particular file systems being
           located on particular servers.
        </t>
      </section>
      <section title="Location-related Terminology (to be added)"
  	         anchor="SEC11-loc-term">
        <t>
          Regarding terminology relating to the construction of multi-server
	  namespaces out of a set of local per-server namespaces:
        <list style ='symbols'>
          <t>
	    Each server has a set of exported file systems which may be accessed
	    by NFSv4 clients.  Typically, this is done by assigning each
	    file system a name within the pseudo-fs associated with the
	    server, although the pseudo-fs may be dispensed with if there
	    is only a single exported file system.  Each such file system
	    is part of the server's local namespace, and can be considered
	    as a file system instance within a larger multi-server
	    namespace.
          </t>
          <t>
	    The set of all exported file systems for a given server
	    constitutes that server's local namespace.
          </t>
          <t>
	    In some cases, a server will have a namespace more extensive
	    than its local namespace, by using features associated with
	    attributes that provide location information.  These features,
	    which allow construction of a multi-server namespace
	    are all described in individual sections below and include
	    referrals (described in <xref target="SEC11-USES-ref"/>),
	    migration (described in <xref target="SEC11-USES-migr"/>), and
            replication (described in <xref target="SEC11-USES-repl"/>).
          </t>
	  <t>
	    A file system present in a server's pseudo-fs may have multiple
	    file system instances on different servers associated with it.
	    All such instances are considered replicas of one another.
	  </t>
	  <t>
	    When a file system is present in a server's pseudo-fs, but
	    there is no corresponding local file system, it is said to
	    be "absent".  In such cases, all associated instances will
	    be accessed on other servers.
	  </t>
        </list>
        </t>
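        <t>
          As a toy illustration of the terms above (hypothetical code, not
          protocol data), a server's pseudo-fs assigns names to file systems,
          and those with no corresponding local file system are "absent":
        </t>
        <figure><artwork>
```python
# Illustration only: the pseudo-fs names file systems; those without a
# corresponding local file system are "absent".  Paths are made up.

local_namespace = {"/export/home", "/export/src"}   # exported file systems
pseudo_fs_names = {"/export/home", "/export/src", "/export/archive"}

# File systems named in the pseudo-fs but not present locally are
# absent; their instances must be accessed on other servers.
absent = pseudo_fs_names - local_namespace
```
        </artwork></figure>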
        <t>
          Regarding terminology relating to attributes used in trunking
          discovery and other multi-server namespace features:
        <list style ='symbols'>
          <t>
            Location attributes include the fs_locations and
            fs_locations_info attributes.
          </t>
          <t>
            Location entries are the individual file system locations
            in the location attributes.  Each such entry specifies a
            server, in the form of a host name or IP address, and an fs name,
	    which designates the location of the file system within
	    the server's pseudo-fs.  A location entry designates a set
	    of server endpoints to which the client may establish connections.
	    There may be multiple endpoints because a host name may map to
	    multiple network addresses and because multiple connection types
	    may be
	    used to communicate with a single network address.  However, all
	    such endpoints MUST provide a way of connecting to a single server. 
            The exact form of the location entry varies with the 
            particular location attribute used, as described in 
            <xref target="SEC11-loc-attr"/>.  
          </t>
          <t>
            Location elements are derived from location entries and each
            describes a particular network access path, consisting of a network
	    address and a location within the server's pseudo-fs.
	    Location elements need not appear 
            within a location attribute, but the
            existence of each location element derives from a corresponding
            location entry.  When a
            location entry specifies an IP address, there is only a single
            corresponding location element.  Location entries that 
            contain a host name are resolved using DNS and may result
            in one or more location elements.  All location elements
            consist of a location address, which is the IP address of
            an interface to a server, and an fs name, which is the location 
            of the file system within the server's pseudo-fs.  The fs name
            is empty if the server has no pseudo-fs and only a single exported
	    file system at the root filehandle.
          </t>
          <t>
            Two location elements are said to be server-trunkable if they 
            specify the same fs name and their location addresses are 
            server-trunkable.
          </t>
          <t>
            Two location elements are said to be session-trunkable 
            if they 
            specify the same fs name and their location addresses are 
            session-trunkable.
          </t>
        </list>
        </t>
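        <t>
          The derivation of location elements from a location entry can be
          sketched as follows (hypothetical code; the resolver mapping stands
          in for a real DNS lookup, and the IP-address test is deliberately
          crude):
        </t>
        <figure><artwork>
```python
# Sketch only: expanding a location entry into location elements.
# All names and structures here are hypothetical.

def expand_entry(server, fs_name, resolve):
    """Return the location elements (address, fs-name pairs) derived
    from one location entry.  `server` is a host name or an IP
    address; `resolve` maps a host name to its list of addresses."""
    if server[0].isdigit():            # crude "already an IP" check
        addresses = [server]           # one element per entry
    else:
        addresses = resolve(server)    # DNS may yield several elements
    return [(addr, fs_name) for addr in addresses]

dns = {"nfs.example.net": ["192.0.2.10", "192.0.2.11"]}
elements = expand_entry("nfs.example.net", "/export/data", dns.get)
# Each element pairs one network address with the fs name.
```
        </artwork></figure>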
        <t>
          Each set of server-trunkable location elements defines a set of 
          available network access paths to a particular file system.
	  When there
          are multiple such file systems, each of which contains the
          same data, these file systems are considered replicas
          of one another.  Logically, such replication
          is symmetric, since the fs currently in use and an alternate fs
          are replicas of each other.  Often, in other documents, the term
          "replica" is not applied to the fs currently in use, despite the
          fact that the replication relation is inherently symmetric.
        </t>
      </section>
      <section title="Location Attributes (as updated)"
  	       anchor="SEC11-loc-attr">
        <t>
          NFSv4.1 contains RECOMMENDED attributes that provide information
	  about how (i.e., at what network address and namespace position)
	  a given file system may be accessed.  As a result, file systems
	  in the namespace of one server can be
          associated with one or more instances of that
          file system on other servers.  These attributes contain location
          entries specifying a server address
          target (either as a DNS name representing one or more IP
          addresses or as a specific IP address) together with the pathname 
          of that file system within the associated single-server namespace.
        </t>
        <t>
          The fs_locations_info RECOMMENDED attribute
          allows specification of one or more file system instance locations
          where the data corresponding to a given file
          system may be found.  This attribute provides to the client,
          in addition to specification of file system instance locations,
	  other helpful
	  information such as:
	<list style='symbols'>
	  <t>
	    Information guiding choices among the various file system instances
	    provided (e.g., priority for use, writability, currency, etc.).
	  </t>
	  <t>
	    Information to help the client efficiently effect as seamless
	    a transition
            as possible among multiple file system instances, when and if
            that should be necessary.
          </t>
          <t>
	    Information helping to guide the selection of the appropriate
	    connection type to be used when establishing a connection.
          </t>
	</list>
        </t>
        <t>
          Within the fs_locations_info attribute, each
          fs_locations_server4 entry corresponds to a location entry with the
          fls_server field designating the server, with the location pathname
	  within
          the server's pseudo-fs given by the fl_rootpath field of the
          encompassing fs_locations_item4.
        </t>
        <t>
          The fs_locations attribute defined in NFSv4.0 is also a part of
          NFSv4.1.  This attribute
	  only allows specification 
          of the file system
          locations where the data corresponding to a given file
          system may be found.  Servers should  make this attribute available
          whenever fs_locations_info is supported, but client use of 
          fs_locations_info is preferable, as it provides more information.
        </t>
        <t>
          Within the fs_locations attribute, each fs_location4 contains a
          location entry with the server field designating the server and
          the rootpath field giving the location pathname within the server's 
          pseudo-fs.
        </t>
      </section>
      <section title="Re-organization of Sections 11.4 and 11.5 of RFC5661"
	       anchor="SEC11-uses-reorg">
        <t>
          Previously, issues related to the fact that multiple location
          entries directed the client to the same file system instance
	  were dealt with
          in a separate Section 11.5 of <xref target="RFC5661"/>. 
          Because of the new treatment of
          trunking, these issues now belong within <xref target="SEC11-USES"/>
          below.
        </t>
        <t>
          In this new section of the current document,
	  trunking is dealt with in 
          <xref target="SEC11-USES-trunk"/> together with the other uses
          of location information described in Sections
          <xref target="SEC11-USES-repl" format="counter"/>,
          <xref target="SEC11-USES-migr" format="counter"/>, and
          <xref target="SEC11-USES-ref" format="counter"/>.
        </t>
      </section>
      <section title="Uses of Location Information (as updated)"
	       anchor="SEC11-USES">
        <t>
          The location attributes (i.e., fs_locations and fs_locations_info),
          together with the possibility of absent file systems, provide
          a number of important facilities for reliable, manageable,
          and scalable data access.
        </t>
        <t>
          When a file system is present, these attributes can provide 
        <list style="symbols">
          <t>
            The locations of alternative replicas, to be used to access the 
            same data in the event of server failures, 
            communications problems, 
            or other difficulties that make continued access to the current
            replica impossible or otherwise impractical.  Provision and
            use of
            such alternate replicas is referred to as "replication" 
            and is discussed in 
            <xref target="SEC11-USES-repl"/> below.
          </t>
          <t>
            The network address(es) to be used to access the current file
	    system instance or replicas of it.
            Client use of this information is
            discussed in 
            <xref target="SEC11-USES-trunk"/> below.
          </t>
        </list>
        </t>
        <t>
          Under some circumstances, multiple replicas
          may be used simultaneously to provide higher-performance 
          access to the file system in question, although the lack of state
          sharing between servers may be an impediment to such use.  
        </t>
        <t>
          When a file system is present and becomes absent, clients can be
          given the opportunity to have continued access to their data,
          using a different replica.  In this case, a continued attempt
          to use the data in the now-absent file system will result 
          in an NFS4ERR_MOVED error and, at that point, the successor 
          replica or set of possible replica choices
          can be fetched and used to continue access.  Transfer of access
          to the new replica location is referred to as 
          "migration", and is discussed in 
          <xref target="SEC11-USES-migr"/> below.

        </t>
        <t>
          Where a file system was previously absent, specification
          of file system location provides a means by which file systems
          located on one server can be associated with a namespace 
          defined by another server, thus allowing a general multi-server
          namespace facility.  A designation of such a remote instance, in
          place of a file system never previously present, is called
          a "pure referral" and is discussed in 
          <xref target="SEC11-USES-ref"/> below.
        </t>
        <t>
          Because client support for location-related attributes is 
          OPTIONAL, a server may (but is not required to) take action
          to hide migration and referral events from such clients, by
          acting as a proxy, for example.  The server can determine
          the presence of client support from the arguments of the 
          EXCHANGE_ID operation (see 
          <xref target="EXID-desc" /> in the current document).
        </t>
      <section title="Combining Multiple Uses in a Single Attribute (to be added)"
               anchor="SEC11-USES-mult">
        <t>
          A location attribute will sometimes contain information
          relating to the location of multiple replicas which may
          be used in different ways.
        <list style="symbols">
          <t>
            Location entries that relate to the file system instance
	    currently in
            use provide trunking information, allowing the client to
            find additional network addresses by which the instance may be
            accessed.
          </t>
          <t>
            Location entries that provide information about
            replicas to which access is to 
            be transferred.
          </t>
          <t>
            Other location entries that relate to replicas that are available to
            use in the event that access to the current replica becomes
            unsatisfactory.
          </t>
        </list>
        </t>
        <t>
          In order to simplify client handling and allow the best choice
          of replicas to access, the server should adhere to the following
          guidelines. 
        <list style="symbols">
          <t>
            All location entries that relate to a single file system instance
	    should be
            adjacent.
          </t>
          <t>
            Location entries that relate to the instance currently in use 
            should appear first.
          </t>
          <t>
            Location entries that relate to replica(s) to which migration
            is occurring should appear before replicas which are available
            for later use if the current replica should become inaccessible.
            
          </t>
        </list>
        </t>
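A minimal sketch of these ordering guidelines follows; the role labels are hypothetical, used only to drive the sort, and are not part of the attribute encoding.

```python
# Order location entries per the guidelines: entries for the instance
# currently in use first, then migration targets, then replicas held in
# reserve, keeping entries for a single instance adjacent.
ROLE_ORDER = {"current": 0, "migration-target": 1, "available": 2}

def order_entries(entries):
    """entries: list of (instance_id, role, entry) tuples.  Sorting on
    (role rank, instance_id) keeps same-instance entries adjacent while
    honoring the preference ordering described in the guidelines."""
    return sorted(entries, key=lambda e: (ROLE_ORDER[e[1]], e[0]))
```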
      </section>    
        <section title="Location Attributes and Trunking (to be added)"
                 anchor="SEC11-USES-trunk">

          <t>
            Trunking is the use of multiple connections between a client and
            server in order to increase the speed of data transfer.
            A client may determine the set of network addresses to use to
            access a given file system in a number of ways: 
          <list style="symbols">
            <t>
	      When the name of the server is known to the client, it may use
	      DNS to obtain a set of network addresses to use in
	      accessing the server.
            </t>
            <t>
	      It may fetch the location attribute for the file system,
	      which will provide either the name of the server (which can
	      be turned into a set of network addresses using DNS) or
	      a set of server-trunkable location entries that provide
	      the addresses the server has specified as desirable to use
	      to access the file system in question.
            </t>
          </list>
          </t>
          <t>
            The server can provide location entries that include either
            names or network addresses.  It might use the latter form 
            because of DNS-related security concerns or because the set
            of addresses
            to be used might require active management by the server.  
          </t>
          <t>
            Location entries used to discover candidate addresses for
            use in trunking
            are subject to change, as discussed in 
            <xref target="SEC11-USES-changes"/> below.  
            The client may respond to 
            such changes by using additional addresses once they are 
            verified or by ceasing to use 
            existing ones.  The server can force the client to cease using 
            an address by returning NFS4ERR_MOVED when that address is used to
            access a file system.  This allows a transfer of client access
	    which is similar to
            migration, although the same file system instance
	    is accessed throughout.
          </t>
        </section>    
        <section title="Location Attributes and Connection Type Selection (to be added)"
                 anchor="SEC11-USES-types">

          <t>
	    Because of the need to support multiple connections, clients face
	    the issue of determining the proper connection type to use
	    when establishing
	    a connection to a given server network address.  In some cases,
	    this issue can be addressed through the use of the connection
	    "step-up" facility described in Section 18.16 of
	    <xref target="RFC5661"/>.  However,
	    because there are cases in which that facility is not available,
	    the client may have to choose a connection type with no
	    possibility of changing it within the scope of a single connection.
	  </t>
	  <t>
	    The two location attributes differ as to the information made
	    available in this regard.   Fs_locations provides no information
	    to support connection type selection.  As a result, clients
	    supporting multiple connection types would need to attempt to
	    establish connections using multiple connection types until
	    the one preferred
	    by the client is successfully established.
	  </t>
	  <t>
	    Fs_locations_info provides a flag, FSLI4TF_RDMA,
	    indicating that RPC-over-RDMA support is available using
	    the specified location entry.  This flag makes it convenient
	    for a client wishing to use RDMA to establish a TCP connection
	    and then convert to use of RDMA.  After establishing the TCP
	    connection, the step-up facility can be used, if available,
	    to convert that connection to RDMA mode.  Otherwise,
	    if RDMA availability is indicated, a new RDMA
	    connection can be established and bound to
	    the session already established by the
	    TCP connection, allowing the TCP connection to be dropped
	    and the session converted to further use in RDMA mode.
          </t>
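The sequence described above might look as follows on the client side. This is a sketch under assumed helper APIs (connect_tcp, supports_step_up, step_up, connect_rdma, bind_conn_to_session are hypothetical), and the numeric value given for FSLI4TF_RDMA is a placeholder for illustration only.

```python
FSLI4TF_RDMA = 0x1   # placeholder bit: RDMA support advertised for entry

def connect_for_rdma(entry, session):
    """Sketch of connection-type selection for a client wishing to use
    RDMA with a given location entry."""
    tcp = session.connect_tcp(entry.address)
    if session.supports_step_up(tcp):
        session.step_up(tcp)        # convert the TCP connection to RDMA mode
        return tcp
    if entry.flags & FSLI4TF_RDMA:  # RDMA availability indicated
        rdma = session.connect_rdma(entry.address)
        session.bind_conn_to_session(rdma)   # bind to the existing session
        tcp.close()                 # TCP connection can now be dropped
        return rdma
    return tcp                      # no RDMA path; stay on TCP
```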
        </section>    
        <section title="File System Replication (as updated)"
                 anchor="SEC11-USES-repl">

          <t>
            The fs_locations and fs_locations_info attributes provide
            alternative locations, to be used to access data in place
            of or in addition to 
            the current file system instance.  On first access to a
            file system, the client should obtain the set
            of alternate locations by interrogating the fs_locations or
            fs_locations_info attribute, with the latter being preferred.
          </t>
          <t>
            In the event that server failures, communications problems, 
            or other difficulties make continued access to the current
            file system impossible or otherwise impractical, the client
            can use the alternate locations as a way to get continued 
            access to its data.  
          </t>
          <t>
            The alternate locations may be physical replicas of the
            (typically read-only) file system data, or they may
            provide 
            for the use of various forms of server
            clustering in which multiple servers provide alternate 
            ways of accessing the same physical file system.  How these
            different modes of file system transition are represented 
            within the fs_locations and fs_locations_info attributes 
            and how the client deals with
            file system transition issues will be discussed in detail
            below.
          </t>
        </section>    
        <section title="File System Migration (as updated)"
                 anchor="SEC11-USES-migr">
          <t>
            When a file system is present and becomes absent, clients can be
            given the opportunity to have continued access to their data,
            at an alternate location, as specified by a location attribute.
            This migration of access to another replica includes the ability
            to retain locks across the transition, either by using lock
	    reclaim or by taking advantage
            of Transparent State Migration.  
          </t>
          <t>
            Typically, a client will be 
            accessing the file system in question, get an NFS4ERR_MOVED
            error, and then use a location attribute
            to determine the new location of the data.  When
            fs_locations_info is used, additional information will be
            available that will define the nature of the client's 
            handling of the transition to a new server.   
          </t>
          <t>
            Such migration can be helpful in providing 
            load balancing or general resource reallocation.  The protocol 
            does not specify how the file system will be moved between 
            servers.  It is anticipated that a number of different 
            server-to-server transfer mechanisms might be used with the
            choice left to the server implementer.  The NFSv4.1 protocol
            specifies the method used to communicate the migration
            event between client and server.
          </t>
          <t>
            The new location may be, in the case of
            various forms of server
            clustering, another server providing
            access to the same physical file system.  The client's 
            responsibilities in dealing with this transition will depend
            on whether migration has occurred and the means the server
            has chosen to provide continuity of locking state.
            These issues will be discussed in
            detail below.
          </t>
          <t>
            Although a single successor location is typical, multiple 
            locations may be provided.  When multiple locations are 
            provided, the client will typically use the first one provided.
	    If that is
            inaccessible for some reason, later ones can be used.  In such
            cases, the client might treat the transition to the new
            replica as a migration event, even though some of the servers
	    involved might not be aware of the use of the server that was
	    inaccessible.  In such a case, a client might lose access to
            locking state as a result of the access transfer.
          </t>
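The first-listed-first behavior described above amounts to a simple scan of the provided locations; a sketch, assuming a hypothetical `accessible` probe:

```python
def select_successor(locations, accessible):
    """Try migration-target locations in the order provided, returning
    the first accessible one, or None if none can be reached.
    `accessible` is an assumed reachability probe, not a protocol op."""
    for loc in locations:
        if accessible(loc):
            return loc
    return None
```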
          <t>
            When an alternate location is designated as the target for
            migration, it must designate the same data
            (with metadata being the same to the degree indicated by the
            fs_locations_info attribute).  Where file systems are writable,
            a change made on the original file system must be visible on
            all migration targets. Where a file system is not writable
            but represents a read-only copy (possibly periodically 
            updated) of
            a writable file system, similar requirements apply to the 
            propagation of updates.  Any change visible in the original
            file system must already be effected on all migration targets,
            to avoid any possibility that a client, 
            in effecting a transition to 
            the migration target, will see any reversion 
            in file system state.  
          </t>

        </section>    
        <section title="Referrals (as updated)"
                 anchor="SEC11-USES-ref">
          <t>
            Referrals allow the server to associate a file system namespace
	    entry located on
            one server with a file system located on another server.  
            When this includes
            the use of pure referrals, servers are provided a way of 
            placing a file system in a location
            within the namespace
            essentially without respect to its physical location on a
            particular server.  
            This allows a single server or a set of servers
            to present a multi-server namespace that encompasses file systems
            located on a wider range of
            servers.  Some likely uses of this facility include
            establishment of site-wide or organization-wide namespaces,
            with the eventual possibility of combining such 
            together into a truly global namespace.
          </t>
          <t>
            Referrals occur when a client determines, upon first referencing
            a position in the current namespace, that it is part of a new 
            file system and that the file system is absent.  When this 
            occurs, typically upon receiving the error NFS4ERR_MOVED, the
            actual location or locations of the file system can be
            determined by fetching a locations attribute.
          </t>
          <t>
            The locations attribute may designate a single 
            file system location or multiple file system locations, to
            be selected based on the needs of the client.  The server,
            in the fs_locations_info attribute, may specify priorities to 
            be associated with various file system location choices.
            The server may assign different priorities to different
            locations as reported to individual clients, in order to
            adapt to client physical location or to effect load balancing.
            When both read-only and read-write file systems are present,
            some of the read-only locations might not be absolutely up-to-date
            (as they would have to be in the case of replication and
            migration).  Servers may also specify file system locations
            that include client-substituted variables so that different
            clients are referred to different file systems (with different
            data contents) based on client attributes such as CPU 
            architecture.
          </t>
          <t>
            When the fs_locations_info attribute is such that there are
            multiple possible targets listed, the relationships among them
            may be important to the client in selecting which one to use.
            The same rules specified in <xref target="SEC11-USES-migr"/>
            above regarding multiple migration targets
            apply to these multiple replicas as well.  For example, the
            client might prefer a writable target on a server that has
     	    additional writable
            replicas to which it subsequently might switch.  Note that,
            as distinguished from the case of replication, there is no
            need to deal with the case of propagation of updates made by
            the current client, since the current client has not accessed
            the file system in question.
          </t>
          <t>
            Use of multi-server namespaces is enabled by NFSv4.1 but is not
            required.  The use of multi-server namespaces and their scope
            will depend on the applications used and system administration
            preferences. 
          </t>
          <t>
            Multi-server namespaces can be established by a single 
            server providing a large set of pure referrals to all of the
            included file systems.  Alternatively, a single multi-server
            namespace may be administratively segmented with separate
            referral file systems (on separate servers) for each
            separately administered portion of the namespace. The
            top-level referral file system or any segment may use
            replicated referral file systems for higher availability.  
          </t>
          <t>
            Multi-server namespaces are generally
            uniform, in that the same data made available to one client
            at a given location in the namespace is made available to
            all clients at that location.  However, as described above,
	    there are facilities
            provided that allow different clients to be directed to
            different sets of data, to enable
            adaptation to such client
            characteristics as CPU architecture.
          </t>

        </section>    
        <section title="Changes in a Location Attribute (to be added)"
                 anchor="SEC11-USES-changes">
          <t>
            Although clients will typically fetch a location attribute
            when first accessing a file system and when NFS4ERR_MOVED
            is returned, a client can choose to fetch the attribute
            periodically, in which case the value fetched may change over
            time.  
          </t>
          <t>
            For clients not prepared to access multiple
            replicas simultaneously (see
            <xref target="SEC11-EFF-simul"/> of the current document),
            the handling of the various cases of change is as follows: 
          <list style="symbols">
            <t>
	      Changes in the list of replicas or in the network addresses
	      associated with replicas do not require immediate action.
	      The client will typically update its list of replicas to
	      reflect the new information.
            </t>
            <t>
	      Additions to the list of network addresses for the
	      current file system instance need not be acted
	      on promptly.  However, the client can choose to use the new
	      address whenever it needs to switch access to a new
              replica.
            </t>
            <t>
	      Deletions from the list of network addresses for the
	      current file system instance need not be acted on immediately, 
              although the client might
	      need to be prepared for a shift in access whenever the
	      server indicates that a network access path is
              not usable to access
	      the current file system,
	      by returning NFS4ERR_MOVED.
            </t>
          </list>
          </t>
          <t>
            For clients that are prepared to access several replicas 
            simultaneously, 
            the following additional cases need to be addressed.  As in
            the cases discussed above, changes in the set of replicas
            need not be acted upon promptly, although the client has
            the option of adjusting its access even in the absence of 
            difficulties that would lead to a new replica being selected.
          <list style="symbols">
            <t>
              When a new replica is added which may be accessed
              simultaneously with one currently in use, the client is free
              to use the new replica immediately.
            </t>
            <t>
              When a replica currently in use is deleted from the list, the
              client need not cease using it immediately.  However, since
              the server may subsequently force such use to cease (by
              returning NFS4ERR_MOVED), clients might decide to limit the 
              need for later state transfer.  For example, new opens might
              be done on other replicas, rather than on one not present in
              the list.
            </t>
          </list>
          </t>
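For the non-simultaneous cases above, the client's bookkeeping can be sketched as follows. This models hypothetical client-side state only, not protocol behavior.

```python
def apply_location_change(known_addrs, fetched_addrs, in_use):
    """Reconcile a newly fetched address list for the current file system
    instance with the client's view.  Additions and deletions need no
    immediate action; a deleted address still in use is merely noted so
    the client is prepared for NFS4ERR_MOVED on that path."""
    added = fetched_addrs - known_addrs
    removed = known_addrs - fetched_addrs
    at_risk = removed & in_use    # usable until the server says otherwise
    return {
        "usable": fetched_addrs | at_risk,
        "candidates": added,      # may be used when switching access
        "at_risk": at_risk,
    }
```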
        </section>    
      </section>    
      </section>    
  
      <section title="Re-organization of Section 11.7 of RFC5661"
	       anchor="SEC11-trans-reorg">
	<t>
	  The material in Section 11.7 of <xref target="RFC5661"/> has
	  been reorganized and augmented as specified below:
	<list style="symbols">
 	  <t>
	    Because there can be a shift of the network access paths used to
	    access a file system instance without any shift between replicas,
	    a new <xref target="SEC11-trans-oview"/> in the current
	    document distinguishes
	    between those cases in which there is a shift between
            distinct replicas and those involving a shift in network
	    access paths
            with no shift between replicas.
          <vspace blankLines="1"/>
            As a result, a new <xref target="SEC11-nwa"/> in the current
	    document deals with network
            address transitions while the bulk of the former Section 11.7
	    (in <xref target="RFC5661"/>)
            is replaced by
	    <xref target="SEC11-EFF"/> in the current document
	    which is now limited to cases
	    in which there is a shift between two different sets of replicas. 
 	  </t>
 	  <t>
	    The additional <xref target="SEC11-trans-locking"/> in the
	    current document discusses the
	    case in which a shift to a different replica is made and state
	    is transferred to allow the client continued
	    access to the accumulated locking state on the new server.
 	  </t>
 	  <t>
	    The additional <xref target="SEC11-trans-client"/> in the
	    current document discusses
	    the client's response to access transitions and how it determines
	    whether migration has occurred, and how it gets access to any
            transferred
	    locking and session state.
 	  </t>
 	  <t>
	    The additional <xref target="SEC11-trans-server"/> in the
	    current document discusses the
	    responsibilities of the source and destination servers when
	    transferring locking and session state.
 	  </t>
 	</list>
 	</t>
      </section>    
      <section title="Overview of File Access Transitions (to be added)"
	       anchor="SEC11-trans-oview">
	<t>
	  File access transitions are of two types:
        <list style="symbols">
  	  <t>
            Those that involve a transition from accessing the current
            replica to another one in connection with either
            replication or migration.
            How these are dealt with is discussed in 
            <xref target="SEC11-EFF"/> of the current document.
  	  </t>
  	  <t>
            Those in which access to the current file system instance
	    is retained, while
            the network path used to access that instance is changed.
	    This case is
            discussed in <xref target="SEC11-nwa"/> of the current document.
  	  </t>
        </list>
	</t>
      
      </section>    
      <section title="Effecting Network Endpoint Transitions (to be added)"
	       anchor="SEC11-nwa">
	<t>
	  The endpoints used to access a particular file system instance
	  may change in a number of ways, as listed below.  In each of these
	  cases, the same filehandles, stateids, and client IDs are
	  used to continue access, with a continuity of lock state. 
        <list style="symbols">
  	  <t>
            When use of a particular address is to cease and there is 
            also one
            currently in use which is server-trunkable with it, requests
            that would have been issued on the address whose use is to be
	    discontinued can be issued on the remaining address(es).  When an
	    address is not a session-trunkable one, the request might need
	    to be modified to reflect the fact that a different session will
	    be used.
  	  </t>
  	  <t>
	    When use of a particular connection is to cease, as indicated
	    by receiving NFS4ERR_MOVED when using that connection but
	    that address is
	    still indicated as accessible according to the appropriate location
	    entries, it is likely that requests can be issued on a new
	    connection of a different connection type, once that connection
	    is established. Since any two 
	    server endpoints that share a network address are inherently
	    session-trunkable, the client can use BIND_CONN_TO_SESSION
            to access the existing session using the new connection and
	    proceed to access the file system using the new connection.
  	  </t>
  	  <t>
            When there are no potential replacement addresses in use but there
            are valid addresses session-trunkable with the one whose use is
            to be discontinued, the client can use BIND_CONN_TO_SESSION
            to access the existing session using the new address.  Although
            the target session will generally be accessible, there may be
            cases in which that session is no longer accessible, in which
            case a new session can be created to provide the client continued
            access to the existing instance.
          </t>
  	  <t>
            When there is no potential replacement address in use and there
            are no
            valid addresses session-trunkable with the one whose use is
            to be discontinued, other server-trunkable addresses may be
            used to provide continued access.  Although use of CREATE_SESSION
            is available to provide continued access to the existing instance,
            servers have the option of providing continued access to the
            existing session through the new network access
	    path in a fashion similar to
            that provided by session migration (see 
            <xref target="SEC11-trans-locking"/> of the current document).  
            To take advantage of this
            possibility, clients can perform an initial BIND_CONN_TO_SESSION,
            as in the previous case, and use CREATE_SESSION only if that 
            fails.
  	  </t>
        </list>
	</t>
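	<t>
	  The choice among the cases above can be illustrated by the
	  following sketch (Python; apart from the operation names
	  BIND_CONN_TO_SESSION and CREATE_SESSION, all names are purely
	  illustrative and not part of the protocol):
	</t>
	<figure><artwork>
```python
# Illustrative sketch (not protocol text) of a client choosing how to
# continue access when use of a particular address is discontinued.
# Only BIND_CONN_TO_SESSION and CREATE_SESSION are protocol
# operations; every other name is hypothetical.

def continue_access(active, session_trunkable, server_trunkable):
    """Return (address, action) describing how access continues.

    active            -- addresses currently in use
    session_trunkable -- valid addresses session-trunkable with the
                         discontinued one
    server_trunkable  -- valid addresses server-trunkable with it
    """
    # Prefer an address already in use; if it shares the session,
    # requests simply shift to it unchanged.
    for addr in active:
        if addr in session_trunkable:
            return addr, "reuse existing session"
        if addr in server_trunkable:
            # A different session is used on this address, so
            # requests might need to be modified accordingly.
            return addr, "reuse, different session"
    # Otherwise, bind the existing session to a connection on a
    # session-trunkable address.
    for addr in session_trunkable:
        return addr, "BIND_CONN_TO_SESSION"
    # Finally, fall back to a server-trunkable address, attempting
    # BIND_CONN_TO_SESSION first and CREATE_SESSION only if it fails.
    for addr in server_trunkable:
        return addr, "BIND_CONN_TO_SESSION, else CREATE_SESSION"
    return None, "no continued access"
```
	</artwork></figure>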
      
      </section>    
      <section title="Effecting File System Transitions (as updated)"
	       anchor="SEC11-EFF">
        <t>
          There are a range of situations in which there is a change to be 
          effected in the set of replicas used to access a particular 
          file system.  Some of these may involve an expansion or
          contraction of the set of replicas used as discussed in
          <xref target="SEC11-EFF-simul"/> below. 
        </t>
        <t>
          For reasons explained in that section, most transitions will involve
          a transition from a single replica to a corresponding replacement
          replica.  When effecting replica transition, some types of 
          sharing between the replicas may affect handling of the 
          transition as described in
          Sections <xref target="SEC11-EFF-fh" format="counter"/>
          through <xref target="SEC11-EFF-data" format="counter"/> below.
          The attribute fs_locations_info provides helpful information
          to allow the client to determine the degree of inter-replica
          sharing.
        </t>
        <t>
          With regard to some types of state, the degree of continuity
          across the transition
          depends on the occasion prompting the transition, with 
          transitions initiated by the servers 
          (i.e., migration) offering much more scope for a non-disruptive
          transition than cases in which the client on its own
          shifts its access to
          another replica (i.e., replication).  This issue
          potentially applies to 
          locking state and to session state, which are dealt with below as
          follows:
        <list style="symbols">
  	  <t>
            An introduction to the possible means of providing continuity of
            these areas appears in <xref target="SEC11-EFF-lock"/> below.
          </t>
          <t>
            Transparent State Migration is introduced in 
            <xref target="SEC11-trans-locking"/> of the current document.  
             The possible transfer of
            session state is addressed there as well.
          </t>
          <t>
            The client handling of transitions, including determining how to
            deal with the various means that the server might take to 
            supply effective continuity of locking state, is discussed in
	    <xref target="SEC11-trans-client"/> of the current document.
          </t>
          <t>
            The servers' (source and destination) responsibilities 
            in effecting Transparent Migration 
            of locking and session state are discussed in 
            <xref target="SEC11-trans-server"/> of the current document.
          </t>
        </list>
        </t>
        <section title="File System Transitions and Simultaneous Access (as updated)"
  	         anchor="SEC11-EFF-simul">
          <t>
            The fs_locations_info attribute (described in Section 11.10.1 of
	    <xref target="RFC5661"/> and
	    <xref target="SEC11-li-new"/> of this document) 
	    may indicate that two replicas
            may be used simultaneously (see Section 11.7.2.1 of 
            <xref target="RFC5661"/> for details).  Although situations
            in which multiple replicas may be accessed simultaneously are
            somewhat similar to those in which a single replica is
            accessed by multiple network addresses, there are important
            differences, since locking state is not shared among multiple
            replicas. 
          </t>
          <t>
            Because of this difference in state handling, many clients will 
            not have the ability to take advantage of the fact that such 
            replicas represent the same data.  Such clients will not be
            prepared to use multiple replicas simultaneously but will access
            each file system using only a single replica, although the
            replica selected might make multiple server-trunkable addresses
            available.
          </t>
          <t>
            Clients who are prepared to use multiple replicas simultaneously
            will divide opens among replicas however they choose.  Once that 
            choice is made,
            any subsequent transitions will treat the set of locking 
            state associated with each replica as a single entity.  
          </t>
    	  <t>
            For example, if one of the replicas becomes unavailable, 
            access will be
            transferred to a different replica, also capable of
            simultaneous access with the one still in use.
          </t>
    	  <t>
            When there is no such replica, the transition may be to the 
            replica already in use.  At this point, the client has a 
            choice between merging the locking state for the two replicas
            under the aegis of the sole replica in use or treating these 
            separately, until another replica capable of simultaneous
            access presents itself. 
          </t>
        </section>    

        <section title="Filehandles and File System Transitions (as updated)"
  	         anchor="SEC11-EFF-fh">
    
          <t>
            There are a number of ways in which filehandles can be handled
            across a file system transition.  These can be divided into 
            two broad classes depending upon whether the two file systems
            across which the transition happens share sufficient state to
            effect some sort of continuity of file system handling.
          </t>
          <t>
            When there is no such cooperation in filehandle assignment,
            the two file systems are reported as being in different 
            handle classes.  In this case,
            all filehandles are assumed to expire as part of the 
            file system transition.  Note that this behavior does not
            depend on the fh_expire_type attribute and supersedes
	    the specification
            of the FH4_VOL_MIGRATION bit, which only affects behavior when
            fs_locations_info is not available.
          </t>
          <t>
            When there is cooperation in filehandle assignment,
            the two file systems are reported as being in the same
            handle classes.  In this case,
            persistent filehandles remain valid after the file system
            transition, while volatile filehandles (excluding those 
            that are only volatile due to the FH4_VOL_MIGRATION bit) are 
            subject to expiration on the target server.
          </t>
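          <t>
            The filehandle behavior described above can be summarized
            by the following sketch (Python; the function and parameter
            names are illustrative only and not part of the protocol):
          </t>
          <figure><artwork>
```python
# Illustrative sketch (not protocol text): whether a cached filehandle
# may be presumed valid after a file system transition, based on the
# handle-class information provided by fs_locations_info.

def filehandle_survives(same_handle_class, persistent,
                        volatile_only_due_to_migration):
    # Different handle classes: no cooperation in filehandle
    # assignment, so all filehandles expire with the transition.
    if not same_handle_class:
        return False
    # Same handle class: persistent filehandles remain valid, as do
    # those volatile solely because of FH4_VOL_MIGRATION; any other
    # volatile filehandle is subject to expiration on the target.
    return persistent or volatile_only_due_to_migration
```
          </artwork></figure>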
        </section>
        <section title="Fileids and File System Transitions (as updated)"
  	         anchor="SEC11-EFF-fileid">
          <t>
            In NFSv4.0, the issue of continuity of fileids in the event
            of a file system transition was not addressed.  The general 
            expectation had been that in situations in
            which the two file system instances are created by a single vendor
            using some sort of file system image copy, fileids would be
            consistent across the transition, while in the analogous 
            multi-vendor transitions they would not.  This poses difficulties, 
            especially for the client without special knowledge  
            of the transition mechanisms adopted by the server.  Note
            that although fileid is not a REQUIRED attribute, many servers
            support fileids and many clients provide APIs that depend on fileids.
          </t>
          <t>
            It is important to note that while clients themselves may have no
            trouble with a fileid changing as a result of a file system
            transition event, applications do typically have access to the
            fileid (e.g., via stat).  The result is that an
            application may work perfectly well if there is no file system
            instance transition or if any such transition is among instances
            created by a single vendor, yet be unable to deal with the
            situation in which a multi-vendor transition occurs at the wrong
            time.
          </t>
          <t>
            Providing the same fileids in a multi-vendor (multiple server
            vendors) environment has generally been held to be quite difficult.
            While there is work to be done, it needs to be pointed out that
            this difficulty is partly self-imposed.  Servers have typically
            identified fileid with inode number, i.e., with a quantity used to
            find the file in question.  This identification poses special
            difficulties for migration of a file system between vendors
            where assigning
            the same index to a given file may not be possible.  Note here that
            a fileid is not required to be useful to find the file in
            question, only that it is unique within the given file system.  Servers
            prepared to accept a fileid as a single piece of metadata and store
            it apart from the value used to index the file information can
            relatively easily maintain a fileid value across a migration event,
            allowing a truly transparent migration event.
          </t>
          <t>
            In any case, where servers can provide continuity of fileids, they
            should, and the client should be able to find out that such
            continuity is available and take appropriate action.  Information
            about the continuity (or lack thereof) of fileids across a file
            system transition is represented by specifying whether the file systems 
            in question are of the same fileid class.
          </t>
          <t>
            Note that when consistent fileids do not exist across a 
            transition (either because there is no continuity of fileids
            or because fileid is not a supported attribute on one of the
            instances involved), and there are
            no reliable filehandles across a transition event (either because
            there is no filehandle continuity or because the filehandles are
            volatile), the client is in a position where it cannot verify
            that files it was accessing before the transition are the 
            same objects.  It is forced to assume that no object has been 
            renamed, and, unless there are guarantees that provide this
            (e.g., the file system is read-only), problems for applications
            may occur.  Therefore, use of such configurations should be 
            limited to situations where the problems that this may cause
            can be tolerated.
          </t>
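          <t>
            The conditions under which the client can verify object
            identity across a transition can be summarized by the
            following sketch (Python; all names are illustrative only
            and not part of the protocol):
          </t>
          <figure><artwork>
```python
# Illustrative sketch (not protocol text): can the client verify that
# an object accessed after a transition is the one it was accessing
# before?  Identity can be checked via consistent fileids or via
# reliable filehandles; with neither, the client must assume that no
# object has been renamed.

def can_verify_object_identity(same_fileid_class, fileid_supported,
                               filehandle_continuity, persistent_fh):
    fileids_usable = same_fileid_class and fileid_supported
    filehandles_usable = filehandle_continuity and persistent_fh
    return fileids_usable or filehandles_usable
```
          </artwork></figure>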
        </section>    
        <section title="Fsids and File System Transitions (as updated)"
  	         anchor="SEC11-EFF-fsid">
          <t>
            Since fsids are generally only unique on a per-server basis,
            it is likely that they will change during a file system
            transition.  
            Clients should not make the fsids received
            from the server visible to applications since they may not be
            globally unique, and because they may change during a file
            system transition event.  Applications are best served if they
            are isolated from such transitions to the extent possible.
          </t>
          <t>
            Although normally a single source file system will transition
            to a single target file system, there is a provision for splitting
            a single source file system into multiple target file systems, by
            specifying the FSLI4F_MULTI_FS flag.
          </t>
          <section anchor="SEC11-EFF-fsid-split"
                   title="File System Splitting (as updated)">
            <t>
              When a file system transition is made and the fs_locations_info
              indicates that the file system in question might be split into 
              multiple file systems (via the FSLI4F_MULTI_FS flag), the client 
              SHOULD do GETATTRs to determine the fsid attribute on all known 
              objects within the file system undergoing transition to determine 
              the new file system boundaries.  
            </t>
            <t>
              Clients might choose to
	      maintain the fsids passed to existing applications 
              by mapping all of the fsids for the descendant file systems to 
              the common fsid used for the original file system.  
            </t> 
            <t>
              Splitting a file system can be done on a transition between
              file systems of the same fileid 
              class, since the fact that fileids are unique within the
              source file system ensures they will be unique in each of the
              target file systems.
            </t>
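            <t>
              The fsid mapping described above can be sketched as
              follows (Python; the function names are illustrative only
              and not part of the protocol):
            </t>
            <figure><artwork>
```python
# Illustrative sketch (not protocol text): preserving the fsid seen by
# existing applications after a file system split signaled by
# FSLI4F_MULTI_FS.  The fsids of the descendant file systems are
# mapped back to the fsid of the original file system.

def build_fsid_map(original_fsid, descendant_fsids):
    """Map each descendant fsid to the fsid applications already saw."""
    return {fsid: original_fsid for fsid in descendant_fsids}

def fsid_for_application(fsid_map, fsid_from_server):
    # Unrelated file systems keep their server-provided fsid.
    return fsid_map.get(fsid_from_server, fsid_from_server)
```
            </artwork></figure>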
          </section>
        </section>    
        <section anchor="SEC11-EFF-change"
                 title= "The Change Attribute and File System Transitions (as updated)">
          <t>
            Since the change attribute is defined as a server-specific one,
            change attributes fetched from one server are normally presumed to 
            be invalid on another server.  Such a presumption is troublesome
            since it would invalidate all cached change attributes, requiring
            refetching.  Even more disruptive, the absence of any assured
            continuity for the change attribute means that even if the same
            value is retrieved on refetch, no conclusions can be drawn as to whether
            the object in question has changed.  The identical change 
            attribute could be merely an artifact of a modified file with
            a different change attribute construction algorithm, with that
            new algorithm just happening to result in an identical change 
            value.
          </t>
          <t>
            When the two file systems have consistent change attribute formats,
            and this fact is communicated to the client by reporting 
            in the same change class, the 
            client may assume a continuity of change attribute construction
            and handle this situation just as it would be handled without
            any file system transition.
          </t>
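          <t>
            The resulting client logic can be sketched as follows
            (Python; the names are illustrative only and not part of
            the protocol):
          </t>
          <figure><artwork>
```python
# Illustrative sketch (not protocol text): interpreting a refetched
# change attribute after a transition.  Values may be compared only
# when both replicas report the same change class; otherwise even an
# identical value permits no conclusion.

def object_possibly_modified(same_change_class, cached, refetched):
    if not same_change_class:
        # No assured continuity: assume the object may have changed.
        return True
    return cached != refetched
```
          </artwork></figure>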
        </section>  
        <section anchor="SEC11-EFF-wv"
                 title= "Write Verifiers and File System Transitions (as updated)">
          <t>
            In a file system transition, the two file systems might be
            clustered in the handling of unstably written data.  
            When this is the
            case, and the two file systems belong to the same
            write-verifier class, write
            verifiers returned
            from one system may be compared to those returned  by the 
            other and superfluous
            writes avoided.  
          </t>
          <t>
            When two file systems belong to different 
            write-verifier classes, any verifier
            generated by one must not be compared to one provided by the 
            other.  Instead, the two verifiers should be treated as not 
            equal even when
            the values are identical.
          </t>
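          <t>
            The write-verifier comparison rule can be sketched as
            follows (Python; the names are illustrative only and not
            part of the protocol):
          </t>
          <figure><artwork>
```python
# Illustrative sketch (not protocol text): deciding whether unstably
# written data must be re-sent after a transition, based on the
# write-verifier class of the two file systems.

def writes_must_be_resent(same_write_verifier_class,
                          old_verifier, new_verifier):
    if not same_write_verifier_class:
        # Verifiers from different classes are treated as unequal
        # even when their octets happen to be identical.
        return True
    return old_verifier != new_verifier
```
          </artwork></figure>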
        </section>  
        <section anchor="SEC11-EFF-rdc"
                 title="Readdir Cookies and Verifiers and File System Transitions (as updated)">
          <t>
            In a file system transition, the two file systems might be
            consistent in their handling of READDIR cookies and verifiers.
            When this is the
            case, and the two file systems belong to the same
            readdir class, READDIR
            cookies and verifiers
            from one system may be recognized by the other and 
            READDIR operations started on one server may be validly
            continued on the other, simply by presenting the 
            cookie and verifier returned by a READDIR operation done
            on the first file system to the second.
          </t>
          <t>
            When two file systems belong to different 
            readdir classes, any READDIR
            cookie and verifier
            generated by one is not valid on the second, and must not
            be presented to that server by the client.  The client 
            should act as if the verifier were rejected.
          </t>
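          <t>
            The cookie-handling rule can be sketched as follows
            (Python; the names are illustrative only and not part of
            the protocol):
          </t>
          <figure><artwork>
```python
# Illustrative sketch (not protocol text): resuming a READDIR across a
# transition.  In NFSv4.1 a zero cookie (with a zero verifier) starts
# the enumeration from the beginning of the directory.

def readdir_resume_args(same_readdir_class, saved_cookie,
                        saved_verifier):
    if same_readdir_class:
        # The target server recognizes the source's cookies and
        # verifiers, so the scan can continue where it left off.
        return saved_cookie, saved_verifier
    # Different readdir classes: act as if the verifier had been
    # rejected and restart the enumeration.
    return 0, 0
```
          </artwork></figure>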
        </section>  
        <section anchor="SEC11-EFF-data"
                 title= "File System Data and File System Transitions (as updated)">
          <t>
            When multiple replicas exist and are used simultaneously or in
            succession by a client, applications using them will 
            normally expect
            that they contain either the same data or data that is 
            consistent with
            the normal sorts of changes that are made by other clients
            updating the data of the file system
            (with metadata being the same to the degree indicated by the
            fs_locations_info attribute).  However, when 
            multiple file systems are 
            presented as replicas of one another, the precise relationship 
            between the data of one and the data of another is not, as a 
            general matter, specified by the NFSv4.1 protocol.  It is quite 
            possible to present as replicas file systems where the data of 
            those file systems is sufficiently different that some applications 
            have problems dealing with the transition between replicas.  The 
            namespace will typically be constructed so that applications can 
            choose an appropriate level of support, so that in one position in 
            the namespace a varied set of replicas will be listed, while in 
            another only those that are up-to-date may be considered replicas.  
            The protocol does define three special cases 
            of the relationship among 
            replicas to be specified by the server and relied upon by clients:
    
            <list style='symbols'>
              <t>
                When multiple replicas exist and are used simultaneously
                by a client (see the FSLIB4_CLSIMUL definition within
                fs_locations_info), they must designate the same
                data. Where file systems are writable, a change made on
                one instance must be visible on all instances, immediately
                upon the earlier of the return of the modifying requester
                or the visibility of that change on any of the associated
                replicas.  This allows a client to use these replicas
                simultaneously without any special adaptation to the fact
                that there are multiple replicas, beyond adapting to the fact
                that locks obtained on one replica are maintained separately
                (i.e., under a different client ID).
                In this case, locks (whether share reservations or 
                byte-range locks) and delegations obtained on one
                replica are immediately reflected on all replicas, in the
                sense that access from all other servers is prevented
		regardless of
                the replica used.  However, because the servers are 
                not required
                to treat two associated client IDs as
                representing the same client, it is best to 
                access each file using
                only a single client ID.
              </t>
              <t>
                When one replica is designated as the 
                successor instance to another
                existing instance after the return of NFS4ERR_MOVED 
                (i.e., the case of 
                migration), the client may depend on the fact that all changes
                written to stable storage on the original instance
                are written to stable storage of the successor (uncommitted 
                writes are dealt with in 
                <xref target="SEC11-EFF-wv" /> above).
              </t>
              <t>
                Where a file system is not writable but represents a read-only 
                copy (possibly periodically updated) of a writable file system,
                clients have similar requirements with regard 
                to the propagation 
                of updates.  They may need a guarantee that 
                any change visible on 
                the original file system instance must 
                be immediately visible on 
                any replica before the client 
                transitions access to that replica, 
                in order to 
                avoid any possibility that a client, 
                in effecting a transition to a
                replica, will see any reversion in file system state.  
                The specific
                means of this guarantee varies based on the value of
                the fss_type field that is
                reported as part of the fs_status attribute 
                (see Section 11.11 of <xref target="RFC5661" />).  
                Since these file systems are presumed 
                to be unsuitable for simultaneous use, 
                there is no specification of how 
                locking is handled; in general, locks obtained on one file
                system will be separate from those on others.  
                Since these are expected to be read-only file systems, 
                this is not
                likely to pose an issue for clients or applications.
              </t>
            </list>
          </t>
        </section>  
        <section title="Lock State and File System Transitions (as updated)"
  	         anchor="SEC11-EFF-lock">
          <t>
	    While accessing a file system, clients obtain locks enforced
	    by the server which may prevent actions by other clients
	    that are inconsistent with those locks.
	  </t>
          <t>
	    When access is transferred between replicas, clients need to
	    be assured that the actions disallowed by holding these locks
	    cannot have occurred during the transition.  This can be ensured
            by the methods below.  Unless at least one of these is implemented,
            clients will not be assured of continuity of lock 
            possession across a migration event.
          <list style="symbols">
    	    <t>
              Providing the client an opportunity to re-obtain its 
              locks via a per-fs grace
              period on the destination server.  Because the lock reclaim
              mechanism was originally defined to support server reboot, it 
              implicitly assumes that filehandles on reclaim will
              be the same as those at open.  In the case of migration, this 
              requires that source and destination servers use the same
              filehandles, as evidenced by using the same server scope
              (see <xref target="OTH-scope"/>  of the current document) 
              or by showing this
              agreement using fs_locations_info 
              (see <xref target="SEC11-EFF-fh"/> above). 
    	    </t>
    	    <t>
              Locking state can be transferred as part of the transition
	      by providing Transparent State Migration as
              described in <xref target="SEC11-trans-locking"/> of 
              the current document.
    	    </t>
          </list>
          </t>
          <t>
            Of these, Transparent State Migration provides the smoother
            experience for clients in that there is no grace-period-based
            delay before new locks can be obtained.  However, it requires
            a greater degree of inter-server coordination.  In general, the
            servers taking part in migration are free to provide either
            facility.  However, when the filehandles can differ across the
            migration event, Transparent State Migration is the only
            available means of providing the needed functionality.
          </t>
          <t>
            It should be noted that these two methods are not mutually 
            exclusive and that a server might well provide both.  In
            particular, if there is some circumstance preventing a 
            specific lock
            from being transferred transparently, 
            the destination server can allow it to be reclaimed, by
	    implementing a
	    per-fs grace period for the migrated file system. 
          </t>
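          <t>
            The resulting client-side decision can be sketched as
            follows (Python; the names are illustrative only and not
            part of the protocol):
          </t>
          <figure><artwork>
```python
# Illustrative sketch (not protocol text) of client-side handling of
# lock continuity when access moves to a destination server.

def lock_recovery_action(state_transparently_migrated,
                         per_fs_grace_offered):
    if state_transparently_migrated:
        # Existing stateids remain usable; no grace-period delay.
        return "use transferred state"
    if per_fs_grace_offered:
        # Re-obtain locks by reclaim during the per-fs grace period.
        return "reclaim locks"
    # Neither method available: continuity of lock possession is lost.
    return "locks lost"
```
          </artwork></figure>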

        </section>

      </section>  
      <section title="Transferring State upon Migration (to be added)"
	       anchor="SEC11-trans-locking">
        <t>
          When the transition is a result of a server-initiated decision
          to transition access and the source and destination servers have
          implemented appropriate co-operation, it is possible to:
        <list style="symbols"> 
          <t>
            Transfer locking state from the source to the destination 
            server, in a fashion similar to that provided by Transparent State
            Migration in NFSv4.0, as described in <xref target="RFC7931"/>.
            Server responsibilities are described in 
            <xref target="SEC11-XS-lock"/> of the current document.
            
          </t>
          <t>
            Transfer session state from the source to the destination 
            server.  Server responsibilities in effecting such a
            transfer are described in 
            <xref target="SEC11-XS-session"/> of the current document.  
          </t>
        </list> 
        </t>
        <t>
          The means by which the client determines which of these transfer
          events has occurred are described in 
          <xref target="SEC11-trans-client"/> of the current document.
        </t>
        <section title="Transparent State Migration and pNFS (to be added)"
                 anchor="V41p-pnfs">
          <t>
            When pNFS is involved, the protocol is capable of supporting:
          <list style ='symbols'>
            <t>
              Migration of the Metadata Server (MDS), leaving the Data
              Servers (DSs) in place.
            </t>
            <t>
              Migration of the file system as a whole, including the MDS 
              and associated DSs.
            </t>
            <t>
              Replacement of one DS by another.
            </t>
            <t>
              Migration of a pNFS file system to one in which 
              pNFS is not used.
            </t>
            <t>
              Migration of a file system not using pNFS to one in which 
              layouts are available.
            </t>
          </list>
          </t>
          <t>
            Migration of the MDS function is directly supported by 
            Transparent State Migration. Layout state will normally be 
            transparently transferred, just as other state is.
            As a result, Transparent State Migration provides a framework in 
            which, given appropriate inter-MDS data transfer, one MDS can
            be substituted for another.
          </t>
          <t>
            Migration of the file system function as a whole
            can be accomplished by
            recalling all layouts as part of the initial phase of the
            migration process.  As a result, I/O will be done through the
            MDS during the migration process, and new layouts can be granted
            once the client is interacting with the new MDS.  An MDS can
            also effect this sort of transition by revoking all layouts
            as part of Transparent State Migration, as long as the client is
            notified about the loss of locking state.
          </t>
          <t>
            In order to allow migration to a file system on which pNFS is
            not supported, clients need to be prepared for a situation in
            which layouts are not available or
	    supported on the destination file
            system and so direct I/O requests to the destination
            server, rather than depending on layouts being available.
          </t>
          <t>
            Replacement of one DS by another is not addressed by migration as
            such but can be effected by an MDS recalling layouts for the DS 
            to be replaced and issuing new ones to be served by the 
            successor DS. 
          </t>
          <t>
            Migration may transfer a file system from a server which does
            not support pNFS to one which does.  In order to properly adapt
            to this situation, clients which support pNFS, but function
            adequately in its absence should check for pNFS support when
            a file system is migrated and be prepared to use pNFS when 
            support is available on the destination. 
          </t>
        </section>

      </section>    
      <section title="Client Responsibilities when Access is Transitioned (to be added)"
	       anchor="SEC11-trans-client">
        <t>
          For a client to respond to an access transition, it must become 
          aware of it.  The ways in which this can happen are discussed
          in <xref target="V41c-clrecov"/>, which covers indications
          that a specific file system access path has transitioned, as well as
          situations in which additional activity is necessary to 
          determine the set of file systems that have been migrated.  
          <xref target="V41c-migrdisc"/> goes on to complete the discussion
          of how the set of migrated file systems might be determined.
          Sections <xref target="V41c-omoved" format="counter"/> through
          <xref target="V41c-ssnwas" format="counter"/> 
          discuss how the client should deal with
          each transition it becomes aware of, either directly or as a
          result of migration discovery.
        </t>
	<t>
	  The following terms are used to describe client activities:
	<list style="symbols">
  	  <t>
	    "Transition recovery" refers to the process of restoring access
	    to a file system on which NFS4ERR_MOVED was received.
	  </t>
  	  <t>
	    "Migration recovery" refers to that subset of transition recovery
	    which applies when the file system has migrated to a different
	    replica.
	  </t>
  	  <t>
	    "Migration discovery" refers to the process of determining which
	    file system(s) have been migrated.  It is necessary to
	    avoid a situation in
	    which leases could expire when a file system is not accessed for
	    a long period of time, since a client unaware of the migration
	    might be referencing an unmigrated file system and not renewing
	    the lease associated with the migrated file system.
	  </t>
	</list>
        </t>
        <section title="Client Transition Notifications (to be added)"
                 anchor="V41c-clrecov">
          <t>
            When there is a change in the network access
	    path which a client is to use to access a file 
            system, there 
            are a number of 
            related status indications with which clients 
            need to deal:
          <list style ='symbols'>
            <t>
              If an attempt is made to use or return a filehandle
              within a file system that is no longer accessible at the 
              address previously used to access it, the
              error NFS4ERR_MOVED is returned. 
            <vspace blankLines='1' />
              Exceptions are made to allow such file handles to be used
              when interrogating a location attribute.  This enables a
              client to determine
              a new replica's location or a new network access path.
            <vspace blankLines='1' />
              This condition continues on subsequent attempts to access
              the file system in question.  The only way the client
              can avoid the error is to cease accessing the file system
              at its old server location and instead access it
              using a different address at which it is now available.
            </t>
            <t>
              Whenever a SEQUENCE operation is sent by a client to
              a server which generated state held on that client which 
              is associated with a file system that is no longer accessible
              on the server at which it was previously available, a
	      lease-migrated indication, in the form of the
              SEQ4_STATUS_LEASE_MOVED status bit being set,
	      appears in the response.  
            <vspace blankLines='1' />
              This condition continues until the client acknowledges
              the notification by fetching a location attribute for the
              file system whose network access path is being changed.
              When there are multiple such file systems, the location
              attribute for each migrated file system needs to be fetched
              in order to clear the condition.
              Even after the condition is cleared, the
              client needs to respond by using the location information
              to access the file system at its new location
              to ensure that leases are
              not needlessly expired.
            </t>
          </list>
          </t>
          <t>
            Unlike the case of NFSv4.0, in which the corresponding
            conditions are both errors and thus mutually exclusive, 
            in NFSv4.1 the client can, 
            and often will, receive both indications on the same
            request.  As a result, implementations need to address the 
            question of how to co-ordinate
            the necessary recovery actions when both indications
            arrive in the response to the same request.  It should be noted
	    that when processing an NFSv4 COMPOUND, the server
	    will normally decide
	    whether SEQ4_STATUS_LEASE_MOVED is to be set before
            it determines which file system will be referenced or whether
            NFS4ERR_MOVED is to be returned.
          </t>
          <t>
            Since these indications are not mutually exclusive in NFSv4.1, 
            the following combinations are possible results when a COMPOUND
            is issued:
          <list style="symbols"> 
            <t>
              The COMPOUND status 
              is NFS4ERR_MOVED and SEQ4_STATUS_LEASE_MOVED is asserted.
            <vspace blankLines='1'/>
              In this case, transition recovery is required.  While it is
              possible that migration discovery is needed in addition, it
              is likely that only the accessed file system has transitioned.
              In any case, because addressing NFS4ERR_MOVED is necessary to 
              allow the rejected requests to be processed on the target,
              dealing with it will typically have priority over 
              migration discovery.  
 
            </t>
            <t>
              The COMPOUND status 
              is NFS4ERR_MOVED and SEQ4_STATUS_LEASE_MOVED is clear.
            <vspace blankLines='1'/>
              In this case, transition recovery is also required. It is 
              clear that migration discovery is not needed to find
              file systems that have been migrated other than the one
              returning NFS4ERR_MOVED.  Cases in which this
              result can arise include a referral or a migration for which
              there is no associated locking state.  This can also arise in
              cases in which an access path transition
              other than migration occurs within the same server.  In such a 
              case, there is no need to set SEQ4_STATUS_LEASE_MOVED, since 
              the lease remains associated with the current server even though 
              the access path has changed.
            </t>
            <t>
              The COMPOUND status 
              is not NFS4ERR_MOVED and SEQ4_STATUS_LEASE_MOVED is asserted. 
            <vspace blankLines='1'/>
              In this case, no transition recovery activity is required on
              the file system(s) accessed by the request.
              However, to prevent avoidable
              lease expiration, migration discovery needs to be done.
            </t>
            <t>
              The COMPOUND status 
              is not NFS4ERR_MOVED and SEQ4_STATUS_LEASE_MOVED is clear. 
            <vspace blankLines='1'/>
              In this case, neither transition-related activity nor migration 
              discovery is required.
            </t>
          </list>
          </t>
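          <t>
            As a non-normative illustration of the four combinations
            above, the dispatch of recovery work might be sketched as
            follows.  The numeric constants are those defined in RFC
            5661; the function and action names are illustrative
            assumptions, not part of the protocol.
          </t>
          <figure>
            <artwork>
```python
# Illustrative sketch only: action names are assumptions, not protocol.
NFS4ERR_MOVED = 10019                 # error code defined in RFC 5661
SEQ4_STATUS_LEASE_MOVED = 0x00000080  # SEQUENCE flag defined in RFC 5661

def recovery_actions(compound_status, seq4_status_flags):
    """Return the set of recovery activities a client should start."""
    actions = set()
    if compound_status == NFS4ERR_MOVED:
        # The accessed file system has transitioned; restoring access
        # to it typically has priority over migration discovery.
        actions.add("transition-recovery")
    if seq4_status_flags & SEQ4_STATUS_LEASE_MOVED:
        # Other file systems on this server may have migrated; scan for
        # them so that their leases are not allowed to expire.
        actions.add("migration-discovery")
    return actions
```
            </artwork>
          </figure>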
          <t>
            Note that the specified actions only need to be taken if they are
            not already going on.  For example, when NFS4ERR_MOVED is received
	    while accessing a file system
            for which transition recovery is already going on, the client
	    merely waits for
            that recovery to be completed, while receipt of a
	    SEQ4_STATUS_LEASE_MOVED indication only
            needs to initiate migration discovery for a server if such
            discovery is not already going on for that server.
          </t>
          <t>
            The fact that a lease-migrated condition does not result in
            an error in NFSv4.1 has a number of important consequences.
            In addition to the fact, discussed above, that the two 
            indications are not mutually exclusive, there are a number of
            issues that are important in considering implementation of
            migration discovery, as discussed in 
            <xref target="V41c-migrdisc"/>.
          </t>
          <t>
            Because of the absence of NFS4ERR_LEASE_MOVED, it is possible
	    for file systems whose access path has not changed to be
	    successfully accessed on a given server even though recovery
            is necessary for other file systems on the same server.  As
            a result, access can go on while:
	  <list style="symbols">
	    <t>
	      the migration discovery process is going on for that server, or
	    </t>
	    <t>
	      the transition recovery process is going on for other
	      file systems connected to that server.
	    </t>
          </list>
          </t>
	</section>
        <section title="Performing Migration Discovery (to be added)"
                 anchor="V41c-migrdisc">
          <t>
            Migration discovery can be performed in the same context as
            transition recovery, allowing recovery for  each migrated file 
            system to be invoked as it is discovered.  Alternatively, it may
            be done in a separate migration discovery thread,  allowing
            migration discovery to be done in parallel with
	    one or more instances
            of transition recovery. 
          </t>
          <t>
            In either case, because the lease-migrated indication
            does not result in an error, other access to file systems on the
            server can proceed normally, with the possibility that further 
            such indications will be received, raising the issue of how
            such indications are to be dealt with.  In general, 
          <list style ='symbols'>
            <t>
              No action needs to be taken for such indications received by
              those performing migration discovery, since continuation of that
              work will address the issue.
            </t>
            <t>
              In other cases in which migration discovery is currently 
              being performed,
              nothing further needs to be done to respond to such lease
              migration indications, as long as one can be
	      certain that the migration
	      discovery process would deal with those indications.  See below
	      for details.
            </t>
            <t>
              For such indications received in all other contexts, the 
              appropriate response is to initiate or 
              otherwise provide for the 
              execution of migration discovery for file systems
              associated with the server IP address returning the indication.
            </t>
          </list>
          </t>
          <t>
            This leaves a potential difficulty in situations in which the
            migration discovery process is near to completion but is still
            operating.  One should not ignore a LEASE_MOVED indication if 
            the migration discovery process is not able to respond to 
            the discovery of additional
            migrating file 
            systems without additional aid.  A further complexity relevant in
            addressing such situations is that a lease-migrated indication may
            reflect the server's state at the time the SEQUENCE operation
            was processed, which may be different from that in effect at the
            time the response is received.  Because new migration events
	    may occur
	    at any time, and because a LEASE_MOVED indication may reflect
	    the situation in effect a considerable time before the indication
	    is received,
	    special care needs to be taken to ensure that LEASE_MOVED
	    indications are not inappropriately ignored.
          </t>
          <t>
            A useful approach to this issue involves the use of separate 
            externally-visible migration discovery states for each server.
	    Separate values could represent the various possible states for
            the migration discovery process for a server:
	  <list style="symbols">
	    <t>
              non-operation, in which migration discovery is not being
	      performed.
	    </t>
	    <t>
	      normal operation, in which there is an ongoing scan for
	      migrated file systems.
	    </t>
	    <t>
	      completion/verification of migration discovery processing,
	      in which the possible completion of migration discovery
	      processing needs to be verified.
	    </t>
          </list>
          </t>
          <t>
            Given that framework, migration discovery processing would proceed
            as follows.
          <list style ='symbols'>
            <t>
              While in the normal-operation state, the thread performing
	      discovery would fetch, for
              successive file systems known to the client on the server being 
              worked on, a location 
              attribute plus the fs_status attribute.       
            </t>
            <t>
              If the fs_status attribute indicates that the file system
	      is a migrated one (i.e. fss_absent is true and
	      fss_type != STATUS4_REFERRAL), then it is likely
	      that the fetch of the location attribute has
              cleared one of the file systems contributing to the
	      lease-migrated indication.
            </t>
	    <t>
	      In cases in which that happened, the thread cannot know whether
	      the lease-migrated indication has been cleared
	      and so it enters the
	      completion/verification state and proceeds to issue a COMPOUND
	      to see if the LEASE_MOVED indication has been cleared.
	    </t>
	    <t>
	      When the discovery process is in the 
              completion/verification state,
	      if other requests get a lease-migrated indication,
              they note that it was received, and the existence of such
	      indications is used when the verification request completes, as
              described below.
	    </t>
          </list>
          </t>
          <t>
	    When the request used in the completion/verification state 
            completes:
	  <list style ='symbols'>
            <t>
	      If a lease-migrated indication is returned, the discovery 
              continues normally.  Note that this is so
              even if all file systems
	      have been traversed, since new migrations could have occurred
              while the process
	      was going on.
	    </t>
            <t>
	      Otherwise, if there is any record that other requests saw a 
              lease-migrated indication while the request was going on,
	      that record is cleared and the 
              verification request retried.  The discovery
	      process remains in completion/verification state. 
	    </t>
            <t>
	      If there have been no lease-migrated indications, the work of 
	      migration discovery is considered completed and it enters the
	      non-operating state.  Once it enters this state, a subsequent
              lease-migrated indication will trigger a new migration discovery
              process.
	    </t>
          </list>

          </t>
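          <t>
            The per-server discovery state machine described above might
            be sketched as follows.  The class and method names are
            illustrative assumptions for this sketch and are not taken
            from the protocol.
          </t>
          <figure>
            <artwork>
```python
# Illustrative sketch of per-server migration discovery states:
# NON_OPERATING, NORMAL scan, and completion/verification (VERIFYING).
from enum import Enum, auto

class Discovery(Enum):
    NON_OPERATING = auto()   # no discovery in progress
    NORMAL = auto()          # scanning file systems for migrations
    VERIFYING = auto()       # checking whether LEASE_MOVED has cleared

class DiscoveryContext:
    def __init__(self):
        self.state = Discovery.NON_OPERATING
        self.lease_moved_seen = False  # set by other requests in VERIFYING

    def note_lease_moved(self):
        """Called when any request on this server sees LEASE_MOVED."""
        if self.state is Discovery.NON_OPERATING:
            self.state = Discovery.NORMAL    # start a new discovery pass
        elif self.state is Discovery.VERIFYING:
            self.lease_moved_seen = True     # remember for re-verification
        # In NORMAL state, the ongoing scan will address the indication.

    def verification_done(self, lease_moved_still_set):
        """Process the result of the verification COMPOUND."""
        if lease_moved_still_set:
            self.state = Discovery.NORMAL    # new migrations; keep scanning
        elif self.lease_moved_seen:
            self.lease_moved_seen = False    # retry verification request
        else:
            self.state = Discovery.NON_OPERATING  # discovery complete
```
            </artwork>
          </figure>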
          <t>
	    It should be noted that the process described above is not
	    guaranteed to terminate, as a long series of new migration
	    events might continually delay the clearing of the LEASE_MOVED
	    indication.  To prevent unnecessary lease expiration, it is
	    appropriate for clients 
	    to use the discovery of migrations to effect lease
	    renewal immediately, rather than waiting for clearing of the
	    LEASE_MOVED indication when the complete set of migrations is
	    available.
          </t>
        </section>
        <section title="Overview of Client Response to NFS4ERR_MOVED (to be added)"
                 anchor="V41c-omoved">
          <t>
            This section outlines a way in which a client that receives
            NFS4ERR_MOVED can effect transition recovery by using a new
	    server or server endpoint 
            if one is available.  As part of that process, it will
            determine:
          <list style ='symbols'>
            <t>
              Whether the NFS4ERR_MOVED indicates migration has occurred, 
              or whether it indicates another sort of file system 
              access transition as discussed 
              in <xref target="SEC11-nwa"/> above.
            </t>
            <t>
              In the case of migration, whether Transparent State 
              Migration has occurred.
            </t>
            <t>
              Whether any state has been lost during the process of 
              Transparent State Migration.
            </t>
            <t>
              Whether sessions have been transferred as part of Transparent
              State Migration.
            </t>
          </list>
          </t>
          <t>
            During the first phase of this process, the client proceeds to
	    examine location entries to find the initial network address 
            it will use to continue access
            to the file system or its replacement.
	    For each location entry that the client examines, the process
            consists of five steps:
          <list style="numbers">
            <t>
              Performing an EXCHANGE_ID 
              directed at the location address.  This operation is used to
              register the client-owner with the server, to obtain a client ID
              to be used subsequently to communicate with it, to obtain that
              client ID's confirmation status, and to determine server_owner 
              and scope for the purpose of determining if the entry
              is trunkable with that
              previously being used to access the file system (i.e. that
              it represents another network access path to the same
	      file system and can share
              locking state with it). 
            </t> 
            <t>
	      Making an initial determination of whether migration has
	      occurred.  The initial determination will be based
	      on whether the EXCHANGE_ID results indicate that the
	      current location element is server-trunkable with that
              used to access the file system when access 
              was terminated by receiving NFS4ERR_MOVED.
	      If it is, then migration has not occurred and the transition is
	      dealt with, at least initially, as one involving continued
	      access to the same file system on the same server through
	      a new network address.
            </t> 
            <t> 
              Obtaining access to existing session state or creating new
              sessions.  How this is done depends on the initial
              determination of whether migration has occurred and
              can be done as described in <xref target="V41c-ssmig"/> below
              in the case of migration or as described in
              <xref target="V41c-ssnwas"/> below
	      in the case of a network
              address transfer without migration.
            </t> 
            <t> 
              Verification of the trunking relationship assumed in step
              2 as discussed in Section 2.10.5.1 of <xref target="RFC5661"/>.
              Although this step will generally confirm the initial
              determination, it is possible for verification to fail with 
              the result that an initial determination that a network address
              shift (without migration) has occurred may be invalidated and
              migration determined to have occurred.  There is no need to redo
	      step 3 above, since it will be possible to continue use of the
	      session established already.
            </t> 
            <t> 
              Obtaining access to existing locking state and/or
              reobtaining it.  How this is done depends on the final
              determination of whether migration has occurred and
              can be done as described below in <xref target="V41c-ssmig"/>
              in the case of migration or as described in
              <xref target="V41c-ssnwas"/>
	      in the case of a network
              address transfer without migration.

            </t> 
          </list>
          </t>
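          <t>
            The initial determination in step 2 might be sketched as
            follows.  The dictionary keys are illustrative stand-ins for
            the EXCHANGE_ID results (eir_server_scope and
            eir_server_owner.so_major_id in RFC 5661), whose comparison
            determines server-trunkability; the determination remains
            subject to the verification in step 4.
          </t>
          <figure>
            <artwork>
```python
# Illustrative sketch only; field names are stand-ins for EXCHANGE_ID
# results.  Per RFC 5661, two addresses are server-trunkable when the
# server scope and the server owner's major id both match.
def migration_occurred(old_exchid, new_exchid):
    server_trunkable = (
        old_exchid["server_scope"] == new_exchid["server_scope"]
        and old_exchid["so_major_id"] == new_exchid["so_major_id"])
    # If the new location is server-trunkable with the old one, this is
    # a network access path transition on the same server, rather than
    # a migration to a different server.
    return not server_trunkable
```
            </artwork>
          </figure>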
          <t>
	    Once the initial address has been determined, clients are free
	    to apply an abbreviated process to find additional addresses
	    trunkable with it (clients may seek session-trunkable or
	    server-trunkable addresses depending on whether they support
	    clientid trunking).  During this later phase of the process,
	    further location entries are examined using the abbreviated
            procedure specified below:
          <list style="numbers">
            <t>
	      Before doing the EXCHANGE_ID, the fs name of the location
	      entry is examined and if it
	      does not match that currently being used, the entry is ignored.
	      Otherwise, one proceeds as specified by step 1 above.
            </t>
            <t>
	      In the case that the network address is session-trunkable with
              one used previously, a BIND_CONN_TO_SESSION is used to access
              that session using the new network address.  Otherwise, or if
              the bind operation fails, a CREATE_SESSION is done.
            </t>
            <t>
	      The verification procedure referred to in step 4 above is
	      used.  However, if it fails, the entry is ignored and the next
	      available entry is used.
            </t>
	  </list>
          </t>
        </section>
 
        <section title="Obtaining Access to Sessions and State after Migration (to be added)"
                 anchor="V41c-ssmig">
          <t>
            In the event that migration has occurred, migration recovery
	    will involve determining 
	    whether Transparent State Migration has 
            occurred. This decision is made based on the client ID returned
	    by the EXCHANGE_ID
	    and the reported 
            confirmation status.
          <list style ='symbols'>
            <t>
              If the client ID is an unconfirmed client ID not previously known
              to the  client, then Transparent State 
              Migration has not occurred.
            </t>
            <t>
              If the client ID is a confirmed client ID previously known
              to the  client, then any transferred state would have been
              merged with an existing client ID representing the client to the
              destination server. In this state merger case, Transparent
              State Migration might 
              or might not have occurred and a determination as to whether
	      it has occurred is deferred until sessions are established
	      and the client is ready to begin state recovery.
            </t>
            <t>
              If the client ID is a confirmed client ID  not previously known
              to the  client, then the client can conclude that the 
              client ID was transferred as part of Transparent State Migration.
              In this transferred client ID case, Transparent State Migration 
              has occurred although some state might have been lost.
            </t>
          </list>
          </t>
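          <t>
            The classification above might be sketched as follows, with
            the function and return-value names being illustrative
            assumptions rather than protocol terms.
          </t>
          <figure>
            <artwork>
```python
# Illustrative sketch: classify migration status from whether the
# client ID returned by EXCHANGE_ID was previously known to the client
# and whether it is reported as confirmed.
def classify_migration(clientid_known, confirmed):
    if not confirmed and not clientid_known:
        # New, unconfirmed client ID: no Transparent State Migration.
        return "no-transparent-migration"
    if confirmed and clientid_known:
        # State merger: whether Transparent State Migration occurred
        # is deferred until sessions exist and recovery can begin.
        return "state-merger"
    if confirmed and not clientid_known:
        # Transferred client ID: Transparent State Migration occurred,
        # although some state might have been lost.
        return "transferred-clientid"
    # An unconfirmed client ID already known to the client is not one
    # of the cases enumerated above.
    return "indeterminate"
```
            </artwork>
          </figure>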
          <t>
	    Once the client ID has been obtained, it is necessary to
	    obtain access to sessions to continue communication with the
	    new server.
            In any of the cases in which Transparent State Migration 
            has occurred, it is possible that a session was transferred
            as well.  To deal with that possibility, clients can, after
            doing the EXCHANGE_ID, issue a BIND_CONN_TO_SESSION to 
            connect the transferred session to a connection to the new
            server.  If that fails,  it is an indication that the session
            was not transferred and that a new session needs to be created to
            take its place. 
          </t>
          <t>
            In some situations, it is possible for a BIND_CONN_TO_SESSION
            to succeed without session migration having occurred.  If
            state merger has taken place then the associated client ID
            may have already had a set of existing sessions, with it
            being possible that the sessionid of a given session is the
            same as one that might have been migrated.  In that event,
            a BIND_CONN_TO_SESSION might succeed, even though there
            could have been no migration of the session with that sessionid.
          </t>
	  <t>
            Once the client has determined the initial migration status, 
            and determined that there was a shift to a new server, it
            needs to re-establish its locking state, if possible.  To enable
            this to happen without loss of the guarantees normally provided by
            locking, the destination server needs to implement a per-fs grace
            period in all cases in which lock state was lost, including
            those in which Transparent State Migration was not
            implemented.
          </t>
          <t>
            Clients need to deal with the following cases:
          <list style ='symbols'>
            <t>
              In the state merger case, it is possible that the server
              has not attempted Transparent State Migration, 
              in which case state may have been
              lost without it being reflected in the  SEQ4_STATUS bits.
              To determine whether this has happened, the client can use 
              TEST_STATEID to check whether the stateids created on the
              source server are still accessible on the destination server.
              Once a single stateid is found to have been successfully 
              transferred, the client can conclude that Transparent State
              Migration was begun and any failure to transport all of the
              stateids will be reflected in the SEQ4_STATUS bits.  Otherwise,
	      Transparent State Migration has not occurred.
            </t>
            <t>
              In a case in which Transparent State Migration has not
              occurred, the client can use the per-fs grace period provided
              by the destination server to reclaim locks that were held on
              the source server.
            </t>
            <t>
              In a case in which Transparent State Migration has 
              occurred, and no lock state was lost (as shown by SEQ4_STATUS
              flags), no lock reclaim is necessary.
            </t>
            <t>
              In a case in which Transparent State Migration has 
              occurred, and some lock state was lost (as shown by SEQ4_STATUS
              flags), existing stateids need to be checked for validity
              using TEST_STATEID, and reclaim used to re-establish any that
              were not transferred.

            </t>
          </list>
          </t>
          <t>
            For all of the cases above, RECLAIM_COMPLETE with an rca_one_fs
	    value of TRUE needs to be done before
            normal use of the file system, including obtaining new locks for the
            file system.  This applies even if no locks were lost and there
            was no need for any to be reclaimed.
          </t>
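          <t>
            The case analysis above might be sketched as follows.  The
            parameter and plan names are illustrative assumptions;
            stateid_ok stands for the per-stateid results of
            TEST_STATEID on the destination server.
          </t>
          <figure>
            <artwork>
```python
# Illustrative sketch of the client's lock recovery decision.
def lock_recovery_plan(stateid_ok, seq4_lost_state):
    if not any(stateid_ok.values()):
        # No stateid transferred: Transparent State Migration did not
        # occur; reclaim all locks within the per-fs grace period.
        plan = "reclaim-all"
    elif seq4_lost_state:
        # Migration was begun but some state was lost (per SEQ4_STATUS
        # flags): reclaim only what TEST_STATEID shows missing.
        plan = "reclaim-missing"
    else:
        # Transparent State Migration with no loss: nothing to reclaim.
        plan = "no-reclaim"
    # In every case, RECLAIM_COMPLETE with rca_one_fs TRUE must precede
    # normal use of the migrated file system.
    return plan, "RECLAIM_COMPLETE(rca_one_fs=TRUE)"
```
            </artwork>
          </figure>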

        </section>
        <section title="Obtaining Access to Sessions and State after Network Address Transfer (to be added)"
                 anchor="V41c-ssnwas">
          <t>
            The case in which there is a transfer to a new network
            address without migration is similar to that described
            in <xref target="V41c-ssmig"/> above in that there is a need to
            obtain access to needed sessions and locking state.  However,
            the details are simpler and will vary depending on the
            type of trunking between the address receiving
            NFS4ERR_MOVED and that to which the transfer is to be made.
          </t>
          <t>
            To make a session available for use, a BIND_CONN_TO_SESSION
            should be used to obtain access to the session previously
            in use.  Only if this fails should a CREATE_SESSION be done.
            While this procedure mirrors that in <xref target="V41c-ssmig"/>
            above,
            there is an important difference in that preservation of the
            session is not purely optional but depends on the type of
            trunking.
          </t>
          <t>
            Access to appropriate locking state should need no actions beyond
	    access to the session.  However, the SEQ4_STATUS bits need to be
	    checked for lost locking state, including the need to reclaim
	    locks after a server reboot.
          </t>
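          <t>
            The bind-then-create ordering described above might be
            sketched as follows, with the two operations passed in as
            stand-in callables since the surrounding client machinery is
            not specified here.
          </t>
          <figure>
            <artwork>
```python
# Illustrative sketch: try BIND_CONN_TO_SESSION first; only if it
# fails, fall back to CREATE_SESSION.  The callables stand in for the
# corresponding NFSv4.1 operations.
def obtain_session(bind_conn_to_session, create_session, session_id):
    if bind_conn_to_session(session_id):
        return session_id          # existing session is usable
    return create_session()       # replacement session needed
```
            </artwork>
          </figure>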
        </section>
      </section>    
      <section title="Server Responsibilities Upon Migration (to be added)"
	       anchor="SEC11-trans-server">
        <t>
	  In the event of file system migration, when the client connects
	  to the destination server, that server needs to be able to
	  provide the client continued access to
	  the files it had open on the source server.  There are two ways
	  to provide this:
	<list style="symbols">
	  <t>
	    By provision of an fs-specific grace period, allowing the client the
	    ability to reclaim its locks, in a fashion similar to what would
	    have been done in the
	    case of recovery from a server restart.  See
	    <xref target="SEC11-XS-reclaim"/> for a more complete
	    discussion.
	  </t>
	  <t>
	    By implementing Transparent State Migration, possibly in
	    connection with session migration, the server can provide
	    the client immediate access, on the destination, to the state
	    built up on the source server.
	  <vspace blankLines="1"/>
	    These features are discussed separately in Sections 
            <xref target="SEC11-XS-lock" format="counter"/> and
            <xref target="SEC11-XS-session" format="counter"/>,
	    which discuss Transparent State Migration and session
	    migration respectively.
	  </t>
	</list>
	</t>
	<t>
	  All the features described above can involve transfer of
	  lock-related information between source and destination
	  servers.   In some cases this transfer is a necessary part
	  of the implementation while in other cases it is a helpful
	  implementation aid which servers might or might not use.
	  The sub-sections below discuss the information which would be
	  transferred but do not define the specifics of the transfer
          protocol.  This is left as an implementation choice although
          standards in this area could be developed at a later time.
        </t>
        <section title="Server Responsibilities in Effecting State Reclaim after Migration (to be added)"
                 anchor="SEC11-XS-reclaim"> 
          <t>
	    In this case, the destination server need have no knowledge of
	    the locks held
	    on the source server, but relies on the clients to accurately report
	    (via reclaim operations) the locks previously held, not allowing
	    new locks to be granted on the migrated file system until the grace
	    period expires.
	  </t>
	  <t>
	    During this grace period clients have the opportunity to use
	    reclaim operations to obtain locks for file system objects within
	    the migrated file system, in the same way that they do when
	    recovering from server restart, and the servers typically
	    rely on clients to accurately report their locks, although they
	    have the option of subjecting these requests to verification.
	    If the clients only reclaim locks held on the source server, no
	    conflict can arise.  Once the client has reclaimed its locks,
	    it indicates the completion of lock reclamation by performing
	    a RECLAIM_COMPLETE with rca_one_fs set to TRUE.
	  </t>
	  <t>
	    While it is not necessary for source and destination servers
	    to co-operate to transfer information about locks, implementations
	    are well-advised to consider transferring the following
	    useful information:
	  <list style="symbols">
	    <t>
	      If information about the set of clients that have
	      locking state for the transferred file system, the destination
	      server will be able to terminate the grace period once all
	      such clients have reclaimed their locks, allowing normal
	      locking activity to resume earlier than it would have otherwise.
	    </t>
	    <t>
	      Locking summary information for individual clients (at
	      various possible levels of detail) can be used to detect
	      instances in which clients do not accurately report the
	      locks held on the source server.
	    </t>
	  </list>  
	  </t>
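	  <t>
	    As a purely illustrative, non-normative sketch (the names
	    used here are not protocol elements), the early termination
	    of the grace period enabled by a transferred client list
	    might be structured as follows:
	  </t>
	  <figure>
	    <artwork>
   /* Illustrative pseudocode only; not part of the protocol. */
   on RECLAIM_COMPLETE(client, rca_one_fs == TRUE):
       mark client as finished reclaiming on the migrated fs
       if all clients on the transferred client list are finished:
           end the grace period for the migrated fs early
       /* absent such a list, the destination must wait for
          the full grace period to expire */
	    </artwork>
	  </figure>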
	</section>
        <section title="Server Responsibilities in Effecting Transparent State Migration (to be added)"
                   anchor="SEC11-XS-lock"> 
          <t>
	    The basic responsibility of the source server in effecting
	    Transparent State Migration is to make available to the
	    destination server a description of each piece of locking state
	    associated with the file system being migrated.  In addition
            to the client ID string and verifier, the source server needs
            to provide, for each stateid:
          <list style ='symbols'>
            <t>
	      The stateid including the current sequence value.
            </t>
            <t>
	      The associated client ID.
            </t>
            <t>
	      The handle of the associated file.
            </t>
            <t>
	      The type of the lock, such as open, byte-range lock,
	      delegation, or layout.
            </t>
            <t>
	      For locks such as opens and byte-range locks, there will be
	      information about the owner(s) of the lock.
            </t>
            <t>
	      For recallable/revocable lock types, the current recall status
	      needs to be included.
            </t>
            <t>
	      For each lock type, there will be type-specific information, such
	      as share and deny modes for opens and type and byte ranges for
	      byte-range locks and layouts.
            </t>
          </list>	    
          </t>
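          <t>
	    As a non-normative illustration, the per-stateid information
	    listed above might be gathered into a transfer record along
	    the lines of the following C-like sketch, in which the
	    structure and field names are hypothetical and form no part
	    of the protocol:
          </t>
          <figure>
            <artwork>
   /* Illustrative only; not a protocol element. */
   struct migrated_stateid_info {
           stateid4        msi_stateid;     /* includes seqid */
           clientid4       msi_clientid;
           nfs_fh4         msi_filehandle;
           int             msi_locktype;    /* open, byte-range,
                                               delegation, layout */
           /* owner information, for opens and byte-range locks */
           /* recall status, for recallable/revocable types */
           /* type-specific data, e.g., share/deny modes,
              byte ranges, layout information */
   };
            </artwork>
          </figure>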
          <t>
	    A further server responsibility concerns locks that are revoked
	    or otherwise lost during the process of file system migration.
	    Because locks that appear to be lost during the process of
	    migration will be reclaimed by the client, the servers have to
	    take steps to ensure that locks revoked soon before or soon
	    after migration are not inadvertently allowed to be reclaimed
	    in situations in which the continuity of lock possession
	    cannot be assured.
          <list style='symbols'>
            <t>
	      For locks lost on the source but whose loss has not yet been
	      acknowledged by the client (by using FREE_STATEID), the
	      destination must be aware of this loss so that it can deny
	      a request to reclaim them.
	    </t>
            <t>
	      For locks lost on the destination after the state transfer
	      but before the client's RECLAIM_COMPLETE is done, the
	      destination server should note these locks and not allow
	      them to be reclaimed.
	    </t>
          </list>	    
          </t>
          <t>
	    An additional responsibility of the cooperating
	    servers concerns situations
	    in which a stateid cannot be transferred transparently because it
	    conflicts with an existing stateid held by the client and
	    associated with a different file system.  In this case there
	    are two valid choices:
          <list style ='symbols'>
            <t>
	      Treat the transfer, as in NFSv4.0, as one without Transparent
	      State Migration.  In this case, conflicting locks cannot be
	      granted until the client does a RECLAIM_COMPLETE, after
	      reclaiming the locks it had, with the exception of reclaims
	      denied because they were attempts to reclaim locks that had
	      been lost.
	    </t>
            <t>
	      Implement Transparent State Migration, except for the lock
	      with the conflicting stateid.  In this case, the client will
	      be aware of a lost lock (through the SEQ4_STATUS flags) and be
	      allowed to reclaim it.
	    </t>
          </list>	    
          </t>
          <t>
            When transferring state between the source and destination, the
            issues discussed in Section 7.2 of <xref target="RFC7931"/> 
            must still be attended to.  In this case, the use of
            NFS4ERR_DELAY may still be necessary in NFSv4.1, as it was
            in NFSv4.0, to prevent locking state from changing while it
            is being transferred.
          </t>
          <t>
            There are a number of important differences in the NFSv4.1 
            context:
          <list style ='symbols'>
            <t>
              The absence of RELEASE_LOCKOWNER means that the one case
              in which an operation could not be deferred by use of
              NFS4ERR_DELAY no longer exists.
            </t>
            <t>
              Sequencing of operations is no longer done using owner-based
              operation sequence numbers.  Instead, sequencing is
              session-based.
            </t>
          </list>
          </t>
          <t>
            As a result, when sessions are not transferred, the techniques
            discussed in Section 7.2 of <xref target="RFC7931"/> 
            are adequate and will not
            be further discussed.
          </t>
        </section>
        <section title="Server Responsibilities in Effecting Session Transfer (to be added)"
                 anchor="SEC11-XS-session">
          <t>
	    The basic responsibility of the source server in effecting
	    session transfer is to make available to the
	    destination server a description of the current state of each
	    slot within the session, including:
          <list style ='symbols'>
            <t>
	      The last sequence value received for that slot.
            </t>
            <t>
	      Whether there is cached reply data for the last request
	      executed and, if so, the cached reply.
            </t>
          </list>	    
	    
          </t>
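          <t>
	    As a non-normative illustration, the per-slot data might be
	    represented as in the following C-like sketch (the names are
	    hypothetical and not protocol elements):
          </t>
          <figure>
            <artwork>
   /* Illustrative only; not a protocol element. */
   struct migrated_slot_info {
           sequenceid4     msl_sequence;   /* last sequence value
                                              received for the slot */
           bool            msl_cached;     /* is a reply cached? */
           /* cached reply data for the last request executed,
              present when msl_cached is TRUE */
   };
            </artwork>
          </figure>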
          <t>
            When sessions are transferred, there are a number of issues that
            pose challenges in terms of making the transferred state
	    unmodifiable during the period it is gathered up and
	    transferred to the destination server.
          <list style ='symbols'>
            <t>
              A single session may be used to access multiple file systems,
              not all of which are being transferred.              
            </t>
            <t>
              Requests made on a session may, even if rejected, affect
              the state of the session by advancing the sequence number 
              associated with the slot used.
            </t>
	  </list>
          </t>
          <t>
            As a result, when the file system state might otherwise be
            considered unmodifiable, the client might have any number of
            in-flight requests, each of which is capable of changing
            session state.  These requests may be of a number of types:
          <list style ='numbers'>
            <t>
              Those requests that were processed on the migrating file system,
              before migration began.
            </t>
            <t>
              Those requests which got the error NFS4ERR_DELAY because the
              file system being accessed was in the process of being
              migrated. 
            </t>
            <t>
              Those requests which got the error NFS4ERR_MOVED because the
              file system being accessed had been migrated. 
            </t>
            <t>
              Those requests that accessed the migrating file system,
              in order to obtain location or status information.
            </t>
            <t>
              Those requests that did not reference the migrating file system.
            </t>
          </list>
	  </t>
	  <t>
	    It should be noted that the history of any particular slot
	    is likely to include a number of these request classes.  In
	    the case in which a migrated session is used to access file
	    systems other than the one migrated, requests of class 5 may
	    be common and may be the last request processed, for many
	    slots.
	  </t>
          <t>
	    Since session state can change even after the locking
	    state has been fixed as part of the migration process,
	    the session state known to the client could
	    be different from that on
	    the destination server, which necessarily reflects the session
	    state on the source server, at an earlier time.
            In deciding how to deal with this situation, it is helpful to 
            distinguish between two sorts of behavioral consequences of
            the choice of initial sequence ID values. 
          <list style ='symbols'>
            <t>
              The error NFS4ERR_SEQ_MISORDERED is returned when the sequence ID
              in a request is neither equal to the last one seen for the 
              current slot nor the next greater one.
            <vspace blankLines='1' />
              In view of the difficulty of arriving at a mutually acceptable
              value for the correct last sequence value
	      at the point of migration,
              it may be necessary for the server to show some degree of
              forbearance, when the sequence ID is one that would be
              considered unacceptable if session migration were not 
              involved. 
            </t>
            <t>
              Returning the cached reply for a previously executed 
              request when the sequence ID
              in the request matches the last value recorded for the slot. 
            <vspace blankLines='1' />
              In the cases in which an error is returned and there is no
              possibility of any non-idempotent operation having been executed,
              it may not be necessary to adhere to this as strictly as might
              be proper if session migration were not 
              involved.   For example, the fact that the error NFS4ERR_DELAY
              was returned may not assist the client in any material way, while
              the fact that NFS4ERR_MOVED was returned by the source server
              may not be relevant when the request was reissued, directed 
              to the
              destination server.
            </t>
          </list>
          </t>
          <t>
            One part of the necessary adaptation to these sorts of
            issues would be to restrict enforcement of normal slot
            sequence semantics until the client itself, by issuing a
            request using a particular slot on the destination server,
            has established the new starting sequence for that slot on
            the migrated session.
          </t>
          <t>
            An important issue is that the specification needs to take note of
            all potential COMPOUNDs, even if they might be unlikely
            in practice.  For example, a COMPOUND is allowed to access 
            multiple file systems and might perform non-idempotent operations
            in some of them before accessing a file system being migrated.
            Also, a COMPOUND may return considerable data in the response, 
            before
            being rejected with NFS4ERR_DELAY or NFS4ERR_MOVED, and  may
            in addition be marked as sa_cachethis.
          </t>
          <t>
            To address these issues,  a destination server MAY do any of
            the following when implementing session transfer.
          <list style ='symbols'>
            <t>
              Avoid enforcing any sequencing semantics for a particular slot
              until the client has established the starting sequence for that
              slot on the destination server.
            </t>
            <t>
              For each slot, avoid returning a cached reply whose result
              is NFS4ERR_DELAY or NFS4ERR_MOVED until the client has
              established the starting sequence for that slot on the
              destination server.
            </t>
            <t>
              Until the client has established the starting sequence for
              a particular slot on the destination server, avoid
              reporting NFS4ERR_SEQ_MISORDERED or returning a cached
              reply whose result is NFS4ERR_DELAY or NFS4ERR_MOVED,
              where the reply consists solely of a series of operations
              whose response is NFS4_OK until the final error.
            </t>
          </list>
          </t>
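          <t>
            The forbearance described above might be realized, in a
            purely illustrative, non-normative form, using a
            hypothetical per-slot "established" flag:
          </t>
          <figure>
            <artwork>
   /* Illustrative pseudocode only; not part of the protocol. */
   on SEQUENCE(slot, seqid) at the destination server:
       if slot is not yet established on the destination:
           accept seqid and record it as the starting
           sequence for the slot; mark the slot established
           /* do not return NFS4ERR_SEQ_MISORDERED, and do not
              replay cached NFS4ERR_DELAY or NFS4ERR_MOVED
              replies in this state */
       else:
           apply normal slot sequencing semantics
            </artwork>
          </figure>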
        </section>	
      </section>
    <section title="fs_locations_info"
	       anchor="SEC11-locations-info">
      <section title="Updates to treatment of fs_locations_info"
	       anchor="SEC11-li-changes">
	<t>
	  Various elements of the fs_locations_info attribute contain
	  information that applies to either a specific filesystem replica
	  or to a network path or set of network paths used to access such
	  a replica.
	  The existing treatment of fs_locations_info (in Section 11.10 of
	  <xref target="RFC5661"/>) does not clearly distinguish these cases, in
	  part because the document did not clearly distinguish replicas from
	  the paths used to access them.
	</t>
	<t>
	  In addition, special clarification needs to be provided with
	  regard to the following:
	<list style="symbols">
	  <t>
	    With regard to the handling of FSLI4GF_GOING, it needs to be
	    made clear that this only applies to the unavailability of a
	    replica rather than to a path to access a replica.
	  </t>
	  <t>
	    In describing the appropriate value for a server to use for
	    fli_valid_for, it needs to be made clear that there is no
	    need for the client to frequently fetch the fs_locations_info
	    value to be prepared for shifts in trunking patterns.
	  </t>
	  <t>
	    Clarification of the rules for extension of fls_info needs
	    to be provided.  The existing treatment reflects the extension
	    model in effect at the time <xref target="RFC5661"/> was written
	    and needs to be updated in accordance with the extension model
	    described in <xref target="RFC8178"/>.
	  </t>
	</list> 
	</t>
      </section>
      <section title="The Attribute fs_locations_info (as updated)"
	       anchor="SEC11-li-new">
    <t>
      The fs_locations_info attribute is intended as a more functional
      replacement for the fs_locations attribute, which will continue
      to exist and be supported.  Clients can use it to get a more
      complete set of 
      data about alternative file system locations, including additional
      network paths to access replicas in use and additional replicas.
      When the server does not support
      fs_locations_info, fs_locations can be used to get a subset of the
      data.  A server that supports fs_locations_info MUST support
      fs_locations as well.
    </t>
    <t>
      There is additional data present in
      fs_locations_info that is not available in fs_locations:
    </t>
    <t>
     <list style='symbols'>
      <t>    
        Attribute continuity information. This information
        will allow a client to select a
        replica that meets the transparency requirements of the
        applications accessing the data and to leverage
        optimizations due to the server guarantees of attribute
        continuity (e.g., if the
        change attribute of a file of the file system is continuous
	between multiple replicas,
        the client does not have to invalidate the file's cache
	when switching to a different replica).
      </t>    
      <t>    
        File system identity information that indicates when multiple
        replicas, from the client's point of view, correspond to the
        same target file system, allowing them to be used
        interchangeably, without disruption, as distinct synchronized
	replicas of the same file data.
      <vspace blankLines="1"/>
        Note that having two replicas with common identity information is
        distinct from the case of two (trunked) paths to the same
	replica.
      </t>    
      <t>    
        Information that will bear on the suitability of various
        replicas, depending on the use that the client intends.  For
        example, many applications need an absolutely up-to-date copy
        (e.g., those that write), while others may only need access to
        the most up-to-date copy reasonably available.
      </t>    
      <t>    
        Server-derived preference information for replicas, which can
        be used to implement load-balancing while giving the client
        the entire file system list to be used in case the primary fails.
      </t>    
     </list>
    </t>
    <t>
      The fs_locations_info attribute is structured similarly to the
      fs_locations attribute.  A top-level structure
      (fs_locations_info4) contains the entire attribute including the root
      pathname of the file system and an array of lower-level structures that
      define replicas that share a common rootpath on their respective
      servers.  The lower-level structure in turn
      (fs_locations_item4) contains a specific pathname and information on one
      or more individual network access paths.  At the lowest level,
      fs_locations_info has an fs_locations_server4
      structure that contains per-server-replica information in addition
      to the location entry.  This per-server-replica information includes a
      nominally opaque array, fls_info, within which specific pieces
      of information
      are located at the specific indices listed below.
    </t>
    <t>
      Two fs_locations_server4 entries that are within different
      fs_locations_item4 structures are never trunkable, while two
      entries within the same fs_locations_item4 structure might or
      might not be trunkable.  Two entries that are trunkable will have
      identical identity information, although, as noted above, the
      converse is not the case.
    </t>
    <t>
      The attribute will always contain at least a single
      fs_locations_server4 entry.  Typically, there will be an entry
      with the FSLI4GF_CUR_REQ flag set, although in the case of a
      referral there will be no entry with that flag set.
    </t>
    <t>
      It should be noted that fs_locations_info attributes returned by
      servers for various replicas may differ for various reasons.
      One server may know about a set of replicas that are not known to
      other servers.  Further, compatibility attributes may differ.
      Filehandles might be of the same class going from replica A to
      replica B but not going in the reverse direction.  This might happen 
      because the filehandles are the same, but
      replica B's server implementation might not have provision to note
      and report that equivalence.
    </t>
    <t>
      The fs_locations_info attribute consists of a root
      pathname (fli_fs_root, just like fs_root in the
      fs_locations attribute), together with an array of
      fs_location_item4 structures.  The fs_location_item4
      structures in turn consist of a root pathname
      (fli_rootpath) together with an array (fli_entries)
      of elements of data type fs_locations_server4,
      all defined as follows.

    </t>
<figure>
 <artwork>
&lt;CODE BEGINS&gt;

/*
 * Defines an individual server access path
 */
struct  fs_locations_server4 {
        int32_t         fls_currency;
        opaque          fls_info&lt;>;
        utf8str_cis     fls_server;
};

/*
 * Byte indices of items within
 * fls_info: flag fields, class numbers,
 * bytes indicating ranks and orders.
 */
const FSLI4BX_GFLAGS            = 0;
const FSLI4BX_TFLAGS            = 1;

const FSLI4BX_CLSIMUL           = 2;
const FSLI4BX_CLHANDLE          = 3;
const FSLI4BX_CLFILEID          = 4;
const FSLI4BX_CLWRITEVER        = 5;
const FSLI4BX_CLCHANGE          = 6;
const FSLI4BX_CLREADDIR         = 7;

const FSLI4BX_READRANK          = 8;
const FSLI4BX_WRITERANK         = 9;
const FSLI4BX_READORDER         = 10;
const FSLI4BX_WRITEORDER        = 11;

/*
 * Bits defined within the general flag byte.
 */
const FSLI4GF_WRITABLE          = 0x01;
const FSLI4GF_CUR_REQ           = 0x02;
const FSLI4GF_ABSENT            = 0x04;
const FSLI4GF_GOING             = 0x08;
const FSLI4GF_SPLIT             = 0x10;

/*
 * Bits defined within the transport flag byte.
 */
const FSLI4TF_RDMA              = 0x01;

/*
 * Defines a set of replicas sharing
 * a common value of the rootpath
 * within the corresponding
 * single-server namespaces.
 */
struct  fs_locations_item4 {
        fs_locations_server4    fli_entries&lt;>;
        pathname4               fli_rootpath;
};

/*
 * Defines the overall structure of
 * the fs_locations_info attribute.
 */
struct  fs_locations_info4 {
        uint32_t                fli_flags;
        int32_t                 fli_valid_for;
        pathname4               fli_fs_root;
        fs_locations_item4      fli_items&lt;>;
};

/*
 * Flag bits in fli_flags.
 */
const FSLI4IF_VAR_SUB           = 0x00000001;

typedef fs_locations_info4 fattr4_fs_locations_info;

&lt;CODE ENDS&gt;
 </artwork>
</figure>
    <t>
      As noted above, the fs_locations_info attribute, when supported, may
      be requested of absent file systems without causing NFS4ERR_MOVED to
      be returned.  It is generally expected that it will be available for
      both present and absent file systems even if only a single
      fs_locations_server4 entry is present, designating the current (present)
      file system, or two fs_locations_server4 entries designating the 
      previous location of an absent file system (the one just referenced) and its
      successor location.  Servers are strongly urged to support this
      attribute on all file systems if they support it on any file system.
    </t>
    <t>
      The data presented in the fs_locations_info attribute may be obtained
      by the server in any number of ways, including specification by
      the administrator or by current protocols for transferring data
      among replicas and protocols not yet developed.  NFSv4.1 only defines
      how this information is presented by the server to
      the client.
    </t>
    <section anchor="SEC11-fsli-server" 
             title="The fs_locations_server4 Structure (as updated)">
      <t>
        The fs_locations_server4 structure consists of the following
	items, in addition to the fls_server field, which specifies a
	network address or set of addresses to be used to access the
	specified file system.  Note that both of these items (i.e.,
	fls_currency and fls_info) specify attributes of the file
	system replica and should not be different when there are
	multiple fs_locations_server4 structures for the same replica,
	each specifying a network path to that replica.
      </t>
      <t>
       <list style='symbols'>
        <t>    
          An indication of how up-to-date the file system is (fls_currency) in
          seconds.  This value
          is relative to the master copy.  A negative
          value indicates that the server is unable to give any
          reasonably useful value here.  A value of zero indicates that the
          file system is the actual writable data or a reliably coherent
          and fully up-to-date copy.  Positive values indicate how 
          out-of-date this copy can normally be before it is considered for
          update.  Such a value is not a guarantee that such updates
          will always be performed on the required schedule but instead
          serves as a hint about how far the copy of the data would be
          expected to be behind the most up-to-date copy.
        </t>    
        <t>    
          A counted array of one-byte values (fls_info) containing
          information about the particular file system instance.  This
          data includes general flags, transport capability flags,
          file system equivalence class information, and selection
          priority information.  The encoding will be discussed below.  
        </t>    
        <t>    
          The server string (fls_server).  For the case of the
          replica currently
          being accessed (via GETATTR), a zero-length string MAY be used to
          indicate the current address being used for the RPC call.
          The fls_server field can also be an IPv4 or IPv6 address,
          formatted the same way as an IPv4 or IPv6 address in the "server"
          field of the fs_location4 data type (see
	  Section 11.9 of <xref target="RFC5661"/>).
        </t>
       </list>
      </t>
      <t>
	With the exception of the transport-flag field (at offset
	FSLI4BX_TFLAGS within the fls_info array), all of this data
	applies to the replica specified by the entry, rather than to
	the specific network path used to access it.
      </t>
      <t>
        Data within the fls_info array is in the form of 8-bit data items
        with constants giving the offsets within the array of various
        values describing this particular file system instance.  
        This style of
        definition was chosen, in preference to explicit XDR
        structure definitions for these values, for a number of
        reasons.
      </t>
      <t>
      <list style='symbols'>
        <t>
          The kinds of data in the fls_info array, representing flags, 
          file system classes, and priorities among sets of file systems
          representing the same data, are such that 8 bits provide
          a quite acceptable range of values.  Even where there might 
          be more than 256 such file system instances, having more than
          256 distinct classes or priorities is unlikely.
        </t>
        <t>
          Explicit definition of the various specific data items within
          XDR would limit expandability, in that any extension
          would require yet another attribute,
          leading to specification and implementation clumsiness.
	  In the context of the NFSv4 extension model in effect at the time
	  fs_locations_info was designed (i.e., that described in
	  <xref target="RFC5661"/>), this would necessitate a new minor
	  version to effect any Standards Track extension to the data in
	  fls_info.
        </t>
      </list>
      </t>
      <t>
        The set of fls_info data is subject to expansion in a future minor 
        version, or in a Standards Track RFC, within the context of a single
        minor version.  The server SHOULD NOT send and the client MUST NOT
        use indices within the fls_info array or flag bits that are not
	defined in 
        Standards Track RFCs.
      </t> 
      <t>
	In light of the new extension model defined in <xref target="RFC8178"/>
	and the fact that the individual items within fls_info are not
	explicitly referenced in the XDR, the following practices should be
	followed when extending or otherwise changing the structure of
	the data returned in fls_info within the scope of a single minor
	version.
      <list style='symbols'>
        <t>
	  All extensions need to be described by Standards Track documents.
	  There
	  is no need for such documents to be marked as updating
	  <xref target="RFC5661"/> or this document.
        </t>
        <t>
	  It needs to be made clear whether the information in any added data
	  items applies to the replica specified by the entry or to the specific
	  network paths specified in the entry.
	</t>
        <t>
	  There needs to be a reliable way defined to determine whether the
	  server is aware of the extension.  This may be based on the
	  length field of the fls_info array, but it is more flexible to
	  provide fs-scope or server-scope attributes to indicate what
	  extensions are provided.
        </t>
      </list>
      </t>
      <t>
        This encoding scheme can be adapted to the specification of
        multi-byte numeric values, even though none are currently
        defined.  If extensions are made via Standards Track RFCs,
        multi-byte quantities will be encoded as a range of bytes 
        with a range of indices, with the bytes interpreted in big-endian
        byte order.  Further, any such index assignments will be constrained
        by the need for the relevant quantities not to
	cross XDR word boundaries.
      </t>
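      <t>
        As a purely illustrative example of these conventions, a client
        might extract a flag bit and, hypothetically, a future two-byte
        big-endian quantity as follows (FUTURE_IDX is a hypothetical
        index, not one defined by this document):
      </t>
      <figure>
        <artwork>
   /* Illustrative only. */
   gflags   = fls_info[FSLI4BX_GFLAGS];
   writable = (gflags &amp; FSLI4GF_WRITABLE) != 0;

   /* hypothetical two-byte item at indices FUTURE_IDX
      and FUTURE_IDX + 1, interpreted big-endian */
   value16  = (fls_info[FUTURE_IDX] &lt;&lt; 8) |
               fls_info[FUTURE_IDX + 1];
        </artwork>
      </figure>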
      <t>
        The fls_info array currently contains:
      </t>
      <t>
       <list style='symbols'>
         <t>
           Two 8-bit flag fields, one devoted to general file-system
           characteristics and a second reserved for transport-related
           capabilities.
         </t>
         <t>
           Six 8-bit class values that define various file system
           equivalence classes as explained below.
         </t>
         <t>
           Four 8-bit priority values that govern file system selection
           as explained below.
         </t>
       </list>
      </t>
      <t>
        The general file system characteristics flag (at byte index
        FSLI4BX_GFLAGS) has the following
        bits defined within it:
      </t>
      <t>
       <list style='symbols'>
        <t>
          FSLI4GF_WRITABLE indicates that this file system target is writable,
          allowing it to be selected by clients that may need to write
          on this file system.  When the current file system instance
          is writable and is defined as of the same simultaneous use 
          class (as specified by the value at index FSLI4BX_CLSIMUL) 
          to which the client was previously writing, then it must
          incorporate within its data any committed
          write made on the source file system instance.  See
          <xref target="SEC11-EFF-wv" />, which discusses
          the write-verifier class.  While there is no harm in not setting
          this flag for a file system that turns out to be writable,
          turning the flag on for a read-only file system can cause
          problems for clients that select a migration or replication
          target based on the flag and then find themselves unable to write.
        </t>
        <t>
          FSLI4GF_CUR_REQ indicates that this replica is the one on which
          the request is being made.  Only a single server entry may
          have this flag set and, in the case of a referral, no entry
          will have it set.  Note that this flag might be set even if the
	  request was made on a network access path different from any of
	  those specified in the current entry.
        </t>
        <t>
          FSLI4GF_ABSENT indicates that this entry corresponds to an absent
          file system replica.  It can only be set if FSLI4GF_CUR_REQ is set.
          When both such bits are set, it indicates that a file system
          instance is not usable but that the information in the entry
          can be used to determine the sorts of continuity available
          when switching from this replica to other possible replicas.
          Since this bit can only be true if FSLI4GF_CUR_REQ is true, the
          value could be determined using the fs_status attribute, but
          the information is also made available here for the
          convenience of the client.  An entry with this bit, since it
          represents a true file system (albeit absent), does not appear
          in the event of a referral, but only when a file system has
          been accessed at this location and has subsequently been migrated.
        </t>
        <t>
          FSLI4GF_GOING indicates that a replica, while still available,
          should not be used further.  The client, if using it, should
          make an orderly transfer to another file system instance as
          expeditiously as possible.  It is expected that file systems
          going out of service will be announced as FSLI4GF_GOING some time
          before the actual loss of service. It is also expected that the
	  fli_valid_for value
          will be sufficiently small to allow clients to detect and act
          on scheduled events, while large enough that the cost of the
          requests to fetch the fs_locations_info values will not be
          excessive.  Values on the order of ten minutes seem
          reasonable.
          <vspace blankLines='1' />
          When this flag is seen as part of a transition into a new
          file system, a client might choose to transfer immediately 
          to another replica, or it may reference the current file system
          and only transition when a migration event occurs.  Similarly,
          when this flag appears in a replica within a referral, clients
          would likely avoid being referred to this instance whenever
          there is another choice.
          <vspace blankLines='1' />
	  This flag, like the other items within fls_info, applies to the
	  replica rather than to a particular path to that replica.  When
	  it appears, a transition to a new replica, rather than to a
	  different path to the same replica, is indicated.
        </t>
        <t>
          FSLI4GF_SPLIT indicates that when a transition occurs from
          the current file system instance to this one, the replacement 
          may consist of multiple file systems.  In this case, the 
          client has to be prepared for the possibility that objects 
          on the same file system before migration will be on different ones 
          after.  Note that FSLI4GF_SPLIT is not incompatible with the
          file systems belonging to the same fileid
          class
          since, if one has a set of fileids that are unique within
          a file system, each subset assigned to a smaller file system after migration
          would not have any conflicts internal to that file system.
          <vspace blankLines='1' />
          A client, in the case of a split file system, will interrogate
          existing files with which it has continuing connection (it 
          is free to simply forget cached filehandles).  If the client
          remembers the directory filehandle associated with each open
          file, it may proceed upward using LOOKUPP to find the new file system
          boundaries.  Note that in the event of a referral, there will
          not be any such files and so these actions will not be performed.
	  Instead, a reference to a portion of the original
	  file system now split off into other file systems
	  will encounter an fsid change and possibly a
	  further referral.

          <vspace blankLines='1' />
          Once the client recognizes that one file system has been split 
          into two, it can prevent the disruption of running applications
          by presenting the two file systems as a single
          one until a convenient point to recognize the transition,
          such as a restart.  This would require a mapping
          from the server's fsids to fsids as seen by the client, but 
          this is already necessary for other reasons.  As noted 
          above, existing fileids within the two descendant file systems
          will not conflict.  Providing non-conflicting fileids for 
          newly created files on the split file systems
          is the responsibility of the server (or servers working in 
          concert).  The server can encode filehandles such
          that filehandles generated before the split event can be discerned
          from those generated after the split,
          allowing the server to determine when the need
          for emulating two file systems as one is over. 
          <vspace blankLines='1' />
          Although it is possible for this flag to be present in the
          event of referral, it would generally be of little interest
          to the client, since the client is not expected to have
          information regarding the current contents of the absent
          file system. 
        </t>
       </list>        
      </t>
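      <t>
        The flag bits above can be tested individually.  The following
        non-normative Python sketch (using the FSLI4GF_* bit values
        assigned in RFC 5661) decodes a GFLAGS byte into a set of flag
        names and checks the stated invariant that FSLI4GF_ABSENT may
        only be set together with FSLI4GF_CUR_REQ.
      </t>
      <figure>
        <artwork><![CDATA[
```python
# General file-system flag bits (values as assigned in RFC 5661).
FSLI4GF_WRITABLE = 0x01
FSLI4GF_CUR_REQ  = 0x02
FSLI4GF_ABSENT   = 0x04
FSLI4GF_GOING    = 0x08
FSLI4GF_SPLIT    = 0x10

_GFLAG_NAMES = {
    FSLI4GF_WRITABLE: "WRITABLE",
    FSLI4GF_CUR_REQ:  "CUR_REQ",
    FSLI4GF_ABSENT:   "ABSENT",
    FSLI4GF_GOING:    "GOING",
    FSLI4GF_SPLIT:    "SPLIT",
}

def decode_gflags(gflags):
    """Return the set of names of the flags set in a GFLAGS byte."""
    return {name for bit, name in _GFLAG_NAMES.items() if gflags & bit}

def gflags_valid(gflags):
    """FSLI4GF_ABSENT can only be set if FSLI4GF_CUR_REQ is set."""
    if gflags & FSLI4GF_ABSENT:
        return bool(gflags & FSLI4GF_CUR_REQ)
    return True
```
]]></artwork>
      </figure>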
      <t>
        The transport-flag field (at byte index FSLI4BX_TFLAGS) contains 
        the following bits related to the transport
        capabilities of the specific network path(s) specified by the
	entry.
      </t>
      <t>
       <list style='symbols'>
        <t>
          FSLI4TF_RDMA indicates that any specified network paths
	  provide NFSv4.1 clients
          access using an RDMA-capable transport.
        </t>
       </list>
      </t>
      <t>
        Attribute continuity and file system identity information are 
        expressed by defining equivalence relations on the sets of
        file systems presented to the client.  Each such relation
        is expressed as a set of file system equivalence classes.
        For each relation, a file system has an 8-bit class number.
        Two file systems belong to the same class if both have 
        identical non-zero class numbers.  Zero is treated as 
        non-matching.  Most often, 
        the relevant question for the client will be whether a
        given replica is identical to / continuous with the current one in a
        given respect, but the information should be available also as to
        whether two other replicas match in that respect as well.
      </t>
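      <t>
        The matching rule just stated (identical and non-zero) can be
        captured in a single predicate; a non-normative sketch:
      </t>
      <figure>
        <artwork><![CDATA[
```python
def same_equivalence_class(class_a, class_b):
    """Two file systems are in the same class for a given relation
    only if their class numbers are identical and non-zero.  A class
    number of zero never matches anything, including another zero.
    """
    return class_a != 0 and class_a == class_b
```
]]></artwork>
      </figure>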
      <t>
        The following fields specify the file system's class numbers
        for the equivalence relations used in determining the nature of
        file system transitions.  See Sections
	<xref target="SEC11-trans-oview" format="counter"/>
	through <xref target="SEC11-trans-server" format="counter"/>
	and their various subsections
        for details about how
        this information is to be used.  Servers may assign these values
        as they wish, so long as file system instances that share the 
        same value have the specified relationship to one another;
        conversely, file systems that have the specified relationship
        to one another share a common class value. As each instance
        entry is added, the relationships of this instance to previously
        entered instances can be consulted, and if one is found that
        bears the specified relationship, that entry's class value can
        be copied to the new entry.  When no such previous entry exists,
        a new value for that byte index (not previously used) can be 
        selected, most likely by incrementing the value of the last class
        value assigned for that index. 
      </t>
      <t>
       <list style='symbols'>
        <t>
          The field with byte index FSLI4BX_CLSIMUL defines the 
          simultaneous-use class for the file system.
        </t>
        <t>
          The field with byte index FSLI4BX_CLHANDLE defines the handle
          class for the file system.
        </t>
        <t>
          The field with byte index FSLI4BX_CLFILEID defines the fileid
          class for the file system.
        </t>
        <t>
          The field with byte index FSLI4BX_CLWRITEVER defines the
          write-verifier class for the file system.
        </t>
        <t>
          The field with byte index FSLI4BX_CLCHANGE defines the change
          class for the file system.
        </t>
        <t>
          The field with byte index FSLI4BX_CLREADDIR defines the readdir
          class for the file system.
        </t>
       </list>
      </t>
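      <t>
        The assignment procedure described above (copy the class value of
        a previously entered related instance; otherwise allocate a fresh
        value) can be sketched as follows.  This is non-normative; the
        "related" predicate stands for whatever server-side test
        establishes the relationship appropriate to the particular byte
        index, and is assumed to be an equivalence relation.
      </t>
      <figure>
        <artwork><![CDATA[
```python
def assign_class_values(instances, related):
    """Assign 8-bit class numbers for one equivalence relation.

    instances: list of server-side file system instance descriptions.
    related:   predicate telling whether two instances bear the
               specified relationship (e.g., share a handle class).
    Returns a parallel list of class numbers, starting at 1 so that
    zero (the non-matching value) is never assigned.
    """
    classes = []
    next_value = 1
    for i, inst in enumerate(instances):
        value = 0
        for j in range(i):
            if related(inst, instances[j]):
                value = classes[j]   # copy the earlier entry's class
                break
        if value == 0:
            value = next_value       # no related entry seen yet
            next_value += 1
        classes.append(value)
    return classes
```
]]></artwork>
      </figure>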
      <t>     
        Server-specified preference information is also provided via
        8-bit values within the fls_info array.  The values provide a 
        rank and an order (see below) to be used with separate values
        specifiable for the cases of read-only and writable file 
        systems.  
        These values are compared
        for different file systems to establish the server-specified 
        preference, with lower values indicating "more preferred".
      </t>
      <t>
        Rank is used to express a strict server-imposed ordering on
        clients, with lower values indicating "more preferred".  Clients
        should attempt to use all replicas with a given rank before they
        use one with a higher rank.  Only if all of those file systems are
        unavailable should the client proceed to those of a higher rank.
        Because specifying a rank will override client preferences, servers
        should be conservative about using this mechanism, particularly
        when the environment is one in which client communication characteristics
        are neither tightly controlled nor visible to the server.
      </t>
      <t>
        Within a rank, the order value is used to specify the server's
        preference to guide the client's selection when the client's own
        preferences are not controlling, with lower values of order
        indicating "more preferred".  If replicas are approximately equal
        in all respects, clients should defer to the order specified by the
        server.  When clients look at server latency as part of their
        selection, they are free to use this criterion, but it is suggested
        that when latency differences are not significant, the
        server-specified order should guide selection.

      </t>
      <t>
       <list style='symbols'>
        <t>
          The field at byte index FSLI4BX_READRANK gives the rank value to
          be used for read-only access. 
        </t>
        <t>
          The field at byte index FSLI4BX_READORDER gives the order value to
          be used for read-only access. 
        </t>
        <t>
          The field at byte index FSLI4BX_WRITERANK gives the rank value to
          be used for writable access. 
        </t>
        <t>
          The field at byte index FSLI4BX_WRITEORDER gives the order value to
          be used for writable access. 
        </t>
       </list>
      </t>
      <t>
        Depending on the potential need for write access by a given client,
        one of the pairs of rank and order values is used. 
        The read rank and order should only be used
        if the client knows that only reading will ever be done or if it is
        prepared to switch to a different replica in the event that any
        write access capability is required in the future.  
      </t>
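      <t>
        Putting the rank and order rules together, replica selection under
        server-specified preferences can be sketched as below.  This is a
        non-normative illustration: the dictionary field names mirror the
        FSLI4BX_* byte indices above, the "available" key is a
        client-side notion, and the for_write parameter selects the
        writable rank/order pair when the client anticipates needing
        write access.
      </t>
      <figure>
        <artwork><![CDATA[
```python
def select_replica(entries, for_write):
    """Pick the most-preferred available replica.

    entries:   list of dicts with "readrank", "readorder",
               "writerank", "writeorder", and "available" keys.
    for_write: use the writable rank/order pair when True.
    Lower rank is preferred; within a rank, lower order is
    preferred.  Client-specific criteria (e.g., latency) may
    further refine the choice within a rank.
    """
    rank_key = "writerank" if for_write else "readrank"
    order_key = "writeorder" if for_write else "readorder"
    candidates = [e for e in entries if e["available"]]
    if not candidates:
        return None
    return min(candidates, key=lambda e: (e[rank_key], e[order_key]))
```
]]></artwork>
      </figure>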
    </section>
    <section anchor="SEC11-fsli-info" 
             title="The fs_locations_info4 Structure (as updated)">
      <t>
        The fs_locations_info4 structure, encoding the fs_locations_info
        attribute, contains the following:
      </t>
      <t>
       <list style='symbols'>
        <t>
          The fli_flags field, which contains general flags that affect 
          the interpretation of this fs_locations_info4 structure and
          all fs_locations_item4 structures within it.  The only flag
          currently defined is FSLI4IF_VAR_SUB.  All bits in the
	  fli_flags field that are not defined should always be returned as zero.
        </t>
        <t>
          The fli_fs_root field, which contains the pathname of the root of
          the current file system on the current server, just as it does
          in the fs_locations4 structure.
        </t>
        <t>
          An array called fli_items of fs_locations_item4 structures, which contain
          information about replicas of the current file system.  Where
          the current file system is actually present, or has been
          present, i.e., this is not a referral situation, one of the
          fs_locations_item4 structures will contain an fs_locations_server4 for
          the current server.  This structure will have FSLI4GF_ABSENT set
          if the current file system is absent, i.e., normal access to it
          will return NFS4ERR_MOVED.
        </t>
        <t>
          The fli_valid_for field specifies a time in seconds
          for which it is reasonable for a client to use the fs_locations_info attribute
          without refetch.  The fli_valid_for value does not provide a
          guarantee of validity since servers can unexpectedly go out of
          service or become inaccessible for any number of reasons.
          Clients are well-advised to refetch this information for an
          actively accessed file system at every fli_valid_for seconds.  This
          is particularly important when file system replicas may go out
          of service in a controlled way using the FSLI4GF_GOING flag to
          communicate an ongoing change.  The server should set
          fli_valid_for to a value that allows well-behaved clients to
          notice the FSLI4GF_GOING flag and make an orderly switch before
          the loss of service becomes effective.  If this value is zero,
          then no refetch interval is appropriate and the client need
          not refetch this data on any particular schedule.
          In the event of a transition to a new file system instance, a
          new value of the fs_locations_info attribute will be fetched at
          the destination.  It is to be expected that this may have a
          different fli_valid_for value, which the client should then use
          in the same fashion as the previous value.  Because a refetch
          of the attribute causes information from all component entries to
          be refetched, the server will typically provide a low value for
          this field if any of the replicas are likely to go out of service
          in a short time frame.  Note that, because of the ability of the
          server to return NFS4ERR_MOVED to direct the use of different paths,
          when alternate trunked paths are available, there is generally no
          need to use low values of fli_valid_for in connection with the
          management of alternate paths to the same replica.
        </t>
       </list>
      </t>
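      <t>
        The refetch guidance above amounts to a simple schedule.  The
        following non-normative sketch shows how a client might decide
        whether an actively accessed file system's fs_locations_info is
        due for refetch; the function name and time representation
        (seconds since an arbitrary epoch) are illustrative only.
      </t>
      <figure>
        <artwork><![CDATA[
```python
def refetch_due(last_fetch, now, fli_valid_for):
    """True when fs_locations_info should be refetched.

    A fli_valid_for of zero means no refetch interval is
    appropriate, so the attribute need not be refetched on
    any particular schedule.
    """
    if fli_valid_for == 0:
        return False
    return now - last_fetch >= fli_valid_for
```
]]></artwork>
      </figure>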
      <t>
        The FSLI4IF_VAR_SUB flag within fli_flags controls whether variable
        substitution is to be enabled.  See <xref target="SEC11-fsli-item" />
        for an explanation of variable substitution.
      </t>
    </section>
    <section anchor="SEC11-fsli-item" 
             title="The fs_locations_item4 Structure (as updated)">
      <t>
        The fs_locations_item4 structure contains a pathname 
        (in the field fli_rootpath) that encodes
        the path of the target file system replicas on the set of 
        servers designated by the included fs_locations_server4 entries.
        The precise manner in which this target location
        is specified depends on the value of the FSLI4IF_VAR_SUB
        flag within the associated fs_locations_info4 structure. 
      </t>
      <t>
        If this flag is not set, then fli_rootpath simply designates
        the location of the target file system within each server's
        single-server namespace just as it does for the rootpath
        within the fs_location4 structure.  When this bit is set,
        however, component entries of a certain form are subject
        to client-specific variable substitution so as to allow
        a degree of namespace non-uniformity in order to accommodate
        the selection of client-specific file system targets to
        adapt to different client architectures or other
        characteristics.
      </t>
      <t>
        When such substitution is in effect, a variable beginning
        with the string "${" and ending with the string "}"
        and containing a colon is to be
        replaced by the client-specific value associated with
        that variable.  The string "unknown" should be used 
        by the client when it has no value for such a variable.
        The pathname resulting from such
        substitutions is used to designate the target file system,
        so that different clients may have different file systems,
        corresponding to that location in the multi-server namespace.
      </t>
      <t>
        As mentioned above, such substituted pathname variables
        contain a colon.  The part before the colon is to be a
        DNS domain name, and the part after is to be a case-insensitive
        alphanumeric string.
      </t>
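      <t>
        A non-normative sketch of the substitution step follows: each
        pathname component of the variable form is looked up in a
        client-supplied table, with the name part compared
        case-insensitively and the string "unknown" substituted when the
        client has no value.  The table layout and helper name are
        illustrative only.
      </t>
      <figure>
        <artwork><![CDATA[
```python
import re

# A component is a variable when the whole component has the form
# "${domain:name}": a DNS domain name, a colon, and a
# case-insensitive alphanumeric name (underscores appear in the
# names defined in this document, e.g. CPU_ARCH).
_VARIABLE = re.compile(r"^\$\{([A-Za-z0-9.-]+):([A-Za-z0-9_]+)\}$")

def substitute_components(components, client_values):
    """Apply client-specific variable substitution to fli_rootpath.

    components:    pathname components from fli_rootpath.
    client_values: mapping from (domain, lowercased name) to the
                   client's value for that variable.
    """
    result = []
    for comp in components:
        m = _VARIABLE.match(comp)
        if m:
            key = (m.group(1), m.group(2).lower())
            result.append(client_values.get(key, "unknown"))
        else:
            result.append(comp)
    return result
```
]]></artwork>
      </figure>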
      <t> 
        Where the domain is "ietf.org", only variable names defined
        in this document or subsequent Standards Track RFCs
        are subject to such substitution.  Organizations are
        free to use their domain names to create their own sets
        of client-specific variables, to be subject to such
        substitution.  In cases where such variables are intended
        to be used more broadly than a single organization, 
        publication of an Informational RFC defining such variables
        is RECOMMENDED. 
      </t>
      <t>
        The variable ${ietf.org:CPU_ARCH} is used to denote the
        CPU architecture for which object files are compiled.  This specification
        does not limit the acceptable values (except that they must be
        valid UTF-8 strings), but such values as "x86", "x86_64", and "sparc"
        would be expected to be used in line with industry practice.
      </t>
      <t>
        The variable ${ietf.org:OS_TYPE} is used to denote the 
        operating system, and thus the kernel and library APIs,
        for which code might be compiled.  This specification does
        not limit the acceptable values (except that they must be
        valid UTF-8 strings), but such values as "linux" and "freebsd"
        would be expected to be used in line with industry practice.
      </t>
      <t>
        The variable ${ietf.org:OS_VERSION} is used to denote the 
        operating system version, and thus the specific details
        of versioned interfaces,
        for which code might be compiled.  This specification does
        not limit the acceptable values (except that they must be
        valid UTF-8 strings). However, combinations of numbers and 
        letters with interspersed dots would be expected to be used
        in line with industry practice, with the details of the 
        version format depending on the specific value of
        the variable ${ietf.org:OS_TYPE} with which
        it is used.
      </t>
      <t>
        Use of these variables could result in the direction of different
        clients to different file systems on the same server, as
        appropriate to particular clients.  In cases in which the
        target file systems are located on different servers, a single
        server could serve as a referral point so that each valid
        combination of variable values would designate a referral
        hosted on a single server, with the targets of those referrals on
        a number of different servers.
      </t>
      <t>
        Because namespace administration is affected by the values
        selected to substitute for various variables, clients should
        provide convenient means of determining what variable 
        substitutions a client will implement, as well as, where
        appropriate, providing means to control the substitutions to
        be used.  The exact means by which this will be done is 
        outside the scope of this specification.
      </t>
      <t>
        Although variable substitution is most suitable for use
        in the context of referrals, it may be used in the context
        of replication and migration.  If it is used in these contexts,
        the server must ensure that no matter what values the
        client presents for the substituted variables, the result 
        is always a valid successor file system instance to that
        from which a transition is occurring, i.e., that the data is
        identical or represents a later image of a writable file
        system. 
      </t>
      <t>
        Note that when fli_rootpath is a null pathname (that is, one
        with zero components), the file system designated is at the
        root of the specified server, whether or not the FSLI4IF_VAR_SUB
        flag within the associated fs_locations_info4 structure is 
        set. 
      </t>
    </section>
  </section>
  </section>
    <section title="Changes to RFC5661 outside Section 11"
             anchor="OTH">
      <t>
        Besides the major rework of Section 11, there are a number of
        related changes that are necessary:
      <list style="symbols">
        <t>
          The summary that appeared in Section 1.7.3.3 of 
          <xref target="RFC5661"/> needs to be revised to reflect the changes
          called for in <xref target="SEC11"/> of the current document.
	  The updated summary 
          appears as <xref target="OTH-intro"/> below.     
        </t>
        <t>
          The discussion of server scope which appeared in Section 2.10.4 of
          <xref target="RFC5661"/> needs to be replaced, since the existing 
          text appears to require a level of inter-server coordination
          incompatible with its basic function of avoiding the need for
          a globally uniform means of assigning server_owner values.
          A revised treatment appears in <xref target="OTH-scope"/>
	  below.     

        </t>
        <t>
          While the last paragraph (exclusive of sub-sections) of 
          Section 2.10.5 in <xref target="RFC5661"/>, dealing with
          server_owner changes, is literally true, it has been a source
          of confusion.   Since the existing paragraph can be read as 
          suggesting that such changes be dealt with non-disruptively, the
          treatment in <xref target="OTH-so"/> below
	  needs to be substituted.
        </t>
        <t>
          The existing definition of NFS4ERR_MOVED (in Section 15.1.2.4 of
          <xref target="RFC5661"/>) needs to be updated to reflect the 
          different handling of unavailability of a particular fs via a
          specific network address.  Since such a situation is no longer
          considered to constitute unavailability of a file system 
          instance, the description needs 
          to change even though the set of circumstances in 
          which it is to be returned remain the same.  The updated description
          appears in <xref target="OTH-moved"/> below.     
        </t>
        <t>
          The existing treatment of EXCHANGE_ID (in Section 18.35 of
	  <xref target="RFC5661"/>) assumes that client IDs cannot be created
	  or confirmed other than by the EXCHANGE_ID and CREATE_SESSION
	  operations.  Also, the necessary use of EXCHANGE_ID in recovery
	  from migration and related situations is not addressed clearly.
	  A revised treatment of EXCHANGE_ID is necessary and it appears in 
          <xref target="EXID"/> below while the specific differences
	  between it and the treatment within <xref target="RFC5661"/>
	  are explained in <xref target="OTH-eid"/> below.
        </t>
        <t>
	  The existing treatment of RECLAIM_COMPLETE (in Section 18.51 of
	  <xref target="RFC5661"/>) is not sufficiently clear about the
	  purpose and use of the rca_one_fs argument and how the server is to
	  deal with inappropriate values of this argument.  Because the
	  resulting confusion raises interoperability issues, a new treatment
	  of RECLAIM_COMPLETE is necessary and it appears in
	  <xref target="RC"/> below while the specific differences
	  between it and the treatment within <xref target="RFC5661"/>
	  are discussed in <xref target="OTH-rc"/> below.  In addition, the
	  definitions of the reclaim-related errors receive an updated
	  treatment in <xref target="OTH-recerror"/> to reflect the fact
	  that there are multiple contexts for lock reclaim operations.
        </t>
      </list>
      </t>
      <section title="(Introduction to) Multi-Server Namespace  (as updated)"
               anchor="OTH-intro">
       <t>
          NFSv4.1 contains a number of features to allow
          implementation of namespaces that cross server boundaries
          and that allow and facilitate a non-disruptive transfer of 
          support for individual file systems between servers.  They 
          are all based upon attributes that allow one file system to
          specify alternate, additional, and new location information
          that specifies how the client may access that file system.
        </t>
        <t>
          These attributes can be used to provide for individual active
          file systems:
        <list style="symbols">
          <t>
            Alternate network addresses to access the 
            current file system instance.
          </t>
          <t>
            The locations of alternate file system instances
            or replicas to be used in the event that the current 
            file system instance becomes unavailable.
          </t>
        </list>
        </t>
        <t>
          These attributes may be used together with the concept
          of absent file systems, in which a position in the server
          namespace is associated with locations on other servers without 
          there being any corresponding file system instance on the
	  current server.
        <list style="symbols">
          <t>
            Location attributes may be used with absent file systems
            to implement referrals whereby one server may direct the
            client to a file system provided by another server.  This
            allows extensive multi-server namespaces to be constructed.
          </t>
          <t>
            Location attributes may be provided when a previously
            present file system becomes absent.  This allows 
            non-disruptive migration of file systems to alternate
            servers.
          </t>
        </list>
        </t>
      </section>    
      <section title="Server Scope (as updated)"
               anchor="OTH-scope">
        <t>
          Servers each specify a server scope value in the form
          of an opaque string eir_server_scope returned as part of
          the results of an EXCHANGE_ID operation.  The purpose of
          the server scope is to allow a group of servers to 
          indicate to clients that a set of servers sharing the 
          same server scope value has arranged to use compatible 
          values of otherwise opaque identifiers. Thus, the identifiers
          generated by two servers within that set can be assumed compatible,
          so that, in some cases, identifiers generated by one server in that
          set may be presented to another server of the same scope.
        </t>
        <t>
          The use of such compatible values does not imply that
          a value generated by one server will always be accepted
          by another.  In most cases, it will not.  However, a
          server will not accept a value generated by another
          inadvertently.  When it does accept it, it will be because
          it is recognized as valid and carrying the same meaning  
          as on another server of the same scope.
        </t>
        <t>
          When servers are of the same server scope, this compatibility
          of values applies to the following identifiers:
          <list style="symbols">
            <t>
              Filehandle values.  A filehandle value accepted by two 
              servers of the same server scope denotes the same object.
              A WRITE operation sent to one server is reflected immediately
              in a READ sent to the other.
            </t>
            <t>
              Server owner values.  When the server scope values are 
              the same, server owner values may be validly compared.  
              In cases where the server scope values are different, server 
              owner values are treated as different even if they 
              contain identical strings of bytes.
            </t>
          </list>
        </t>
        <t>
          The coordination among servers required to provide such
          compatibility can be quite minimal, and limited to a simple
          partition of the ID space.  The recognition of common values
          requires additional implementation, but this can be tailored
          to the specific situations in which that recognition is 
          desired.
        </t>
        <t>
          Clients will have occasion to compare the server scope values
          of multiple servers under a number of circumstances, each of
          which will be discussed under the appropriate functional 
          section:
          <list style="symbols">
            <t>
              When server owner values received in response to 
              EXCHANGE_ID operations sent to multiple network
              addresses are compared for the purpose of determining
              the validity of various forms of trunking, as described
              in <xref target="SEC11-USES-trunk" /> of the current document. 
            </t>
            <t>
              When network or server reconfiguration causes the same
              network address to possibly be directed to different
              servers, with the necessity for the client to determine
              when lock reclaim should be attempted, as described
              in Section 8.4.2.1 of <xref target="RFC5661" />.
            </t>
          </list>
        </t>
        <t>
          When two replies from EXCHANGE_ID, each from two different
          server network addresses, have the same server scope, there
          are a number of ways a client can validate that the common
          server scope is due to two servers cooperating in a group.
          <list style="symbols">
            <t>
              If both EXCHANGE_ID requests were sent with RPCSEC_GSS
	      (<xref target="RFC2203"/>, <xref target="RFC5403"/>,
	      <xref target="RFC7861"/>)
              authentication and the server principal is the same for 
              both targets, the equality of server scope is validated. 
              It is RECOMMENDED that two servers intending to share the
              same server scope also share the same principal name.
            </t>
            <t>
              The client may accept the appearance of the second
              server in the fs_locations or fs_locations_info attribute
              for a relevant file system.  For example, if there is
              a migration event for a particular file system
              or there are locks to be reclaimed on a particular file
              system, the attributes for that particular file system
              may be used.  The client sends the GETATTR request to 
              the first server for the fs_locations or 
              fs_locations_info attribute with RPCSEC_GSS 
              authentication.  It may need to do this in advance
              of the need to verify the common server scope.
              If the client successfully authenticates the reply 
              to GETATTR, and the GETATTR request and reply containing 
              the fs_locations or fs_locations_info attribute refers 
              to the second server, then the equality of server scope 
              is supported.  A client may choose to limit the use of
              this form of support to information relevant to the
              specific file system involved (e.g. a file system 
              being migrated).
            </t>
          </list>  
        </t>
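        <t>
          As a purely illustrative sketch (not a normative part of this
          document), the two validation approaches above might be
          combined by a client as follows; the record fields and the
          helper name are hypothetical.
          <figure>
            <artwork>
```python
# Illustrative sketch: deciding whether a common server scope
# reported by two EXCHANGE_ID replies may be treated as validated.
# Field names and helper names are hypothetical.

def scope_equality_validated(reply1, reply2):
    """reply1/reply2 are dicts describing EXCHANGE_ID replies."""
    if reply1["server_scope"] != reply2["server_scope"]:
        return False  # no common scope to validate
    # Case 1: both exchanges used RPCSEC_GSS and the authenticated
    # server principal is the same for both targets.
    if (reply1.get("gss_principal") is not None
            and reply1.get("gss_principal") == reply2.get("gss_principal")):
        return True
    # Case 2: an authenticated GETATTR of fs_locations or
    # fs_locations_info from the first server lists the second
    # server for the relevant file system.
    locations = reply1.get("fs_locations", [])
    return reply2["server_address"] in locations

# Example records (hypothetical addresses and principal):
r1 = {"server_scope": "S", "gss_principal": "nfs@example.net",
      "server_address": "10.0.0.1", "fs_locations": ["10.0.0.2"]}
r2 = {"server_scope": "S", "gss_principal": "nfs@example.net",
      "server_address": "10.0.0.2"}
```
            </artwork>
          </figure>
        </t>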
      </section>    
      <section title="Revised Treatment of NFS4ERR_MOVED"
               anchor="OTH-moved">
	<t>
	  Because of the need to appropriately address trunking-related
	  issues, some uses of the term "replica" in <xref target="RFC5661"/>
	  have become problematic since a shift in network access paths was
	  considered to be a shift to a different replica.  As a result,
	  the description of NFS4ERR_MOVED in <xref target="RFC5661"/>
	  needs to be changed to the one below.
	  The new paragraph explicitly recognizes that a different network
	  address might be used, whereas the previous description
	  misleadingly treated this as a shift between two replicas even
	  though only a single file system instance might be involved.
	<list style="none">
          <t>
            The file system that contains the current filehandle object is 
            not accessible using the address on which the request was made.
            It still might be accessible using other addresses
            server-trunkable with it or it might not be present 
            at the server.  In the latter case, it might have been relocated 
            or migrated to another server, or it might have never been 
            present.  The client may
            obtain information regarding access to the file system location 
            by obtaining the "fs_locations"
            or "fs_locations_info" attribute for the current filehandle.  For
            further discussion, refer to Section 11 of <xref target="RFC5661"/>,
            as modified by the current document.
          </t>
	</list>
	</t>
      </section>    
      <section title="Revised Discussion of Server_owner changes"
               anchor="OTH-so">
        <t>
	  Because of likely problems with the treatment of such changes, a
	  confusing paragraph which appears at the end of Section 2.5.10
	  of <xref target="RFC5661"/>, and which simply says that such changes
	  need to be dealt with, is to be replaced by the material below.
	<list style="none">
          <t>
            It is always possible that, as a result of various sorts 
            of reconfiguration events, eir_server_scope and 
            eir_server_owner values may be different on subsequent 
            EXCHANGE_ID requests made to the same network address. 
          </t>
          <t>
             In most cases such reconfiguration events will be 
             disruptive and indicate that an IP address formerly connected
             to one server is now connected to an entirely different one. 
          </t>
          <t>
             Some guidelines on client handling of such situations follow:
          <list style ='symbols'>
            <t>
              When eir_server_scope changes, the client has no assurance
              that any IDs it obtained previously (e.g. filehandles) can
              be validly used on the new server, and, even if the new
              server accepts them, there is no assurance that this is not
              due to accident.  Thus it is best to treat all such state
              as lost/stale, although a client may assume that the
              probability of inadvertent acceptance is low and treat
              this situation as covered by the next case.
            </t>
            <t>
              When eir_server_scope remains the same and 
              eir_server_owner.so_major_id changes, the client can use 
              filehandles it has and attempt reclaims.  It may find that
              these are now stale, but if NFS4ERR_STALE is not received,
              it can proceed to reclaim its opens.
            </t>
            <t>
              When eir_server_scope and 
              eir_server_owner.so_major_id remain the same,
              the client has to use the now-current values
              of eir_server_owner.so_minor_id in deciding on appropriate 
              forms of trunking.
            </t>
          </list>
          </t>
        </list>
        </t>
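        <t>
          The guidelines above can be summarized by the following
          illustrative sketch (not normative); the tuple layout and the
          helper name are hypothetical.
          <figure>
            <artwork>
```python
# Illustrative sketch: how a client might classify the result of a
# later EXCHANGE_ID done to the same network address, based on the
# guidelines above.  Names are hypothetical.

def classify_reconfiguration(old, new):
    """old/new: (eir_server_scope, so_major_id, so_minor_id) tuples."""
    old_scope, old_major, old_minor = old
    new_scope, new_major, new_minor = new
    if new_scope != old_scope:
        # No assurance that previously obtained IDs remain valid:
        # treat all such state as lost/stale.
        return "state-lost"
    if new_major != old_major:
        # Filehandles may still be usable; attempt reclaims,
        # watching for NFS4ERR_STALE.
        return "attempt-reclaims"
    if new_minor != old_minor:
        # Only the minor ID changed: re-evaluate trunking decisions.
        return "reevaluate-trunking"
    return "unchanged"
```
            </artwork>
          </figure>
        </t>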
      </section>
      <section title="Revision to Treatment of EXCHANGE_ID"
               anchor="OTH-eid">
      
        <t>
	  There are a number of issues in the original treatment of
	  EXCHANGE_ID (in <xref target="RFC5661"/>) that cause problems
	  for Transparent State Migration and for the transfer of access
	  between different network access paths
	  to the same file system instance.
        </t>
        <t>
          These issues arise from the fact that this treatment was
          written:
	<list style="symbols">
	  <t>
            Assuming that a client ID can only become known to a server
            by having been created by executing an EXCHANGE_ID, with 
            confirmation of the ID only possible by execution of a 
            CREATE_SESSION. 
          </t>
          <t>
            Considering the interactions between a client and a server 
            only on a single network address.
          </t>
        </list>
        </t>
        <t>
          As these assumptions have become invalid in the context of 
          Transparent State Migration and active use of trunking, 
          the treatment has been modified in
          several respects.   
	<list style="symbols">
	  <t>
            It had been assumed that an 
            EXCHANGE_ID executed when the server is already aware of a 
            given client instance must be either updating associated
            parameters (e.g. with respect to callbacks) or a lingering
            retransmission to deal with a previously lost reply.  As a
            result, any slot sequence returned by that operation
	    would be of no use.
	    The existing treatment
            goes so far as to say that it "MUST NOT" be used, although 
            this usage is not in accord with <xref target="RFC2119"/>.
            This created
            a difficulty when an EXCHANGE_ID is done after Transparent State
            Migration since that slot sequence would need to be used in a
            subsequent CREATE_SESSION.
          <vspace blankLines="1"/>
            In the updated treatment, CREATE_SESSION is a way that client
            IDs are confirmed but it is understood that other ways are
            possible.  The slot sequence can be used as needed and cases 
            in which it would be of no use are appropriately noted.
	  </t>    
	  <t>    
            It was assumed that the only functions of EXCHANGE_ID were to 
            inform the server of the client, create the client ID,
            and communicate it to the client.  When multiple 
            simultaneous connections are involved, as often happens when
            trunking is used, that treatment was inadequate in that it
            ignored the role of EXCHANGE_ID in associating the client ID
            with the connection on which it was done, so that it could be
            used by a subsequent CREATE_SESSION, whose parameters do not
            include an explicit client ID.  
          <vspace blankLines="1"/>
            The new treatment explicitly discusses the role of EXCHANGE_ID
            in associating the client ID with the connection so it
	    can be used
            by CREATE_SESSION and in associating a connection with an
            existing session.
	  </t>    

	</list>
        </t>
        <t>
          The new treatment can be found in <xref target="EXID"/> below.
	  It is intended to supersede the treatment in Section 18.35 of
	  <xref target="RFC5661"/>. Publishing a complete replacement for 
          Section 18.35 allows the corrected definition to be read as a whole
          once <xref target="RFC5661"/> is updated.
        </t>
      </section>
        <section title="Revision to Treatment of RECLAIM_COMPLETE"
               anchor="OTH-rc">
          <t>
	    The following changes were made to the treatment of
	    RECLAIM_COMPLETE in <xref target="RFC5661"/> to arrive at the
	    treatment in <xref target="RC"/>.
	  <list style="symbols">
	    <t>
	      In a number of places the text is more explicit about the
	      purpose of rca_one_fs and its connection to file system
	      migration.
	    </t>  
	    <t>
	      There is a discussion of situations in which either form of
	      RECLAIM_COMPLETE would need to be done.
	    </t>  
	    <t>
	      There is a discussion of interoperability issues between
	      implementations that may have arisen due to the lack of
	      clarity of the previous treatment of RECLAIM_COMPLETE.
	    </t>  
	  </list>
          </t>
        </section>
	<section title="Reclaim Errors (as updated)"
		 anchor="OTH-recerror">
          <t>
            These errors relate to the process of reclaiming locks after a
            server restart or in connection with the migration of a file
	    system (i.e. in the case in which rca_one_fs is TRUE).
         </t>
         <section title="NFS4ERR_COMPLETE_ALREADY (as updated; Error Code 10054)" 
                  anchor="err_COMPLETE_ALREADY">
           <t>
             The client previously sent a successful RECLAIM_COMPLETE
             operation specifying the same scope, whether that scope is global 
	     or for the same file system in the case of a per-fs
	     RECLAIM_COMPLETE.
	     An additional RECLAIM_COMPLETE operation is not
             necessary and results in this error.
           </t>
        </section>
        <section title="NFS4ERR_GRACE (as updated; Error Code 10013)" 
                 anchor="err_GRACE">
        <t>
          The server was in its recovery or grace period, with regard to
	  the file system object for which the lock was requested.
          The locking request was not a reclaim request and so
          could not be granted during that period.
        </t>
      </section>
      <section title="NFS4ERR_NO_GRACE (as updated; Error Code 10033)" 
               anchor="err_NO_GRACE">
        <t>
          A reclaim of client state was attempted in circumstances in 
          which the server cannot guarantee that conflicting state has 
          not been provided to another client.  This can occur because 
          the reclaim has been done outside of a grace period implemented
          by the server, after the client has done a RECLAIM_COMPLETE operation
	  which ends its ability to reclaim the requested lock,
          or because previous operations have created a situation in which
          the server is not able to determine that a reclaim-interfering
          edge condition does not exist.
        </t>
      </section>
      <section title="NFS4ERR_RECLAIM_BAD (as updated; Error Code 10034)" 
               anchor="err_RECLAIM_BAD">
        <t>

	  The server has determined that a reclaim attempted by the client 
	  is not valid, i.e. the lock specified as being reclaimed could
	  not possibly have existed before the server restart or file
	  system migration event.  A server 
	  is not obliged to make this determination and will typically rely 
	  on the client to only reclaim locks that the client was granted prior
          to restart or file system migration.  However, 
	  when a server does have reliable information to enable it to make
	  this determination, this error indicates that the reclaim has 
	  been rejected as invalid.  This is as opposed to the error
	  NFS4ERR_RECLAIM_CONFLICT (see <xref target="err_RECLAIM_CONFLICT"/>)
          where the server can only determine that 
	  there has been an invalid reclaim, but cannot determine
	  which request is invalid.

        </t>
      </section>
      <section title="NFS4ERR_RECLAIM_CONFLICT (as updated; Error Code 10035)" 
               anchor="err_RECLAIM_CONFLICT">
        <t>
          The reclaim attempted by the client has encountered a conflict
          and cannot be satisfied.  Potentially indicates a misbehaving
          client, although not necessarily the one receiving the error.
          The misbehavior might be on the part of the client that 
          established the lock with which this client conflicted.  See also
	  <xref target="err_RECLAIM_BAD"/> for the related error,
	  NFS4ERR_RECLAIM_BAD.

        </t>
      </section>
    </section>


   </section>    
      <section title="Operation 42: EXCHANGE_ID - Instantiate Client ID (as updated)"
               anchor="EXID">
        <t>
          The EXCHANGE_ID exchanges long-hand client and server identifiers 
          (owners), and provides access to a client ID, creating one 
          if necessary.  This client ID becomes associated with the connection
          on which the operation is done, so that it is available when a
          CREATE_SESSION is done or when the connection is used to issue
          a request
          on an existing session associated with the current client.      
        </t>
	        <section title="ARGUMENT"
                 anchor="EXID-arg"
                 toc="exclude">
          <t>
            <figure>
              <artwork>
&lt;CODE BEGINS&gt;

const EXCHGID4_FLAG_SUPP_MOVED_REFER    = 0x00000001;
const EXCHGID4_FLAG_SUPP_MOVED_MIGR     = 0x00000002;

const EXCHGID4_FLAG_BIND_PRINC_STATEID  = 0x00000100;

const EXCHGID4_FLAG_USE_NON_PNFS        = 0x00010000;
const EXCHGID4_FLAG_USE_PNFS_MDS        = 0x00020000;
const EXCHGID4_FLAG_USE_PNFS_DS         = 0x00040000;

const EXCHGID4_FLAG_MASK_PNFS           = 0x00070000;

const EXCHGID4_FLAG_UPD_CONFIRMED_REC_A = 0x40000000;
const EXCHGID4_FLAG_CONFIRMED_R         = 0x80000000;

struct state_protect_ops4 {
        bitmap4 spo_must_enforce;
        bitmap4 spo_must_allow;
};

struct ssv_sp_parms4 {
        state_protect_ops4      ssp_ops;
        sec_oid4                ssp_hash_algs&lt;>;
        sec_oid4                ssp_encr_algs&lt;>;
        uint32_t                ssp_window;
        uint32_t                ssp_num_gss_handles;
};

enum state_protect_how4 {
        SP4_NONE = 0,
        SP4_MACH_CRED = 1,
        SP4_SSV = 2
};

union state_protect4_a switch(state_protect_how4 spa_how) {
        case SP4_NONE:
                void;
        case SP4_MACH_CRED:
                state_protect_ops4      spa_mach_ops;
        case SP4_SSV:
                ssv_sp_parms4           spa_ssv_parms;
};

struct EXCHANGE_ID4args {
        client_owner4           eia_clientowner;
        uint32_t                eia_flags;
        state_protect4_a        eia_state_protect;
        nfs_impl_id4            eia_client_impl_id&lt;1>;
};

&lt;CODE ENDS&gt;
              </artwork>
            </figure>         
          </t>
        </section>    
        <section title="RESULT"
                 anchor="EXID-res"
                 toc="exclude">
          <t>
            <figure>
              <artwork>

&lt;CODE BEGINS&gt;

struct ssv_prot_info4 {
 state_protect_ops4     spi_ops;
 uint32_t               spi_hash_alg;
 uint32_t               spi_encr_alg;
 uint32_t               spi_ssv_len;
 uint32_t               spi_window;
 gsshandle4_t           spi_handles&lt;>;
};

union state_protect4_r switch(state_protect_how4 spr_how) {
 case SP4_NONE:
         void;
 case SP4_MACH_CRED:
         state_protect_ops4     spr_mach_ops;
 case SP4_SSV:
         ssv_prot_info4         spr_ssv_info;
};

struct EXCHANGE_ID4resok {
 clientid4        eir_clientid;
 sequenceid4      eir_sequenceid;
 uint32_t         eir_flags;
 state_protect4_r eir_state_protect;
 server_owner4    eir_server_owner;
 opaque           eir_server_scope&lt;NFS4_OPAQUE_LIMIT>;
 nfs_impl_id4     eir_server_impl_id&lt;1>;
};

union EXCHANGE_ID4res switch (nfsstat4 eir_status) {
case NFS4_OK:
 EXCHANGE_ID4resok      eir_resok4;

default:
 void;
};
 
&lt;CODE ENDS&gt;
              </artwork>
            </figure>         
          </t>
        </section>    
        <section title="DESCRIPTION"
                 anchor="EXID-desc"
                 toc="exclude">
          <t>
            The client uses the EXCHANGE_ID operation to register
            a particular client_owner with the server.  However,
	    when the client_owner has already been registered
            by other means (e.g. Transparent State Migration), the
            client may still use EXCHANGE_ID to obtain the client ID
	    assigned previously. 
          </t>
	  <t>
            The client ID returned from this
            operation will be associated with the connection 
            on which the EXCHANGE_ID is received and 
	    will serve as a parent object for 
            sessions created by the client on this connection or 
            to which the connection is bound.  As a result of using 
            those sessions to make requests involving the creation
            of state, that state will become associated with the 
            client ID returned.
          </t>
	  <t>
            In situations in which the registration of the
	    client_owner has not occurred previously, 
            the client ID must first be used, along with
            the returned eir_sequenceid, in creating an
            associated session using 
            CREATE_SESSION.  
          </t>
	  <t>
            If the flag EXCHGID4_FLAG_CONFIRMED_R is set in the
            result, eir_flags, then it is an indication that the
	    registration of the client_owner has already occurred
            and that a further CREATE_SESSION is not  needed to
            confirm it.  Of course, subsequent CREATE_SESSION
            operations may
	    be needed for other reasons.
	  </t>
          <t>
            The value eir_sequenceid is used to establish an initial
            sequence value associated with the client ID returned.  In
	    cases in which a CREATE_SESSION has already been done,
	    sequencing of such requests has already been established;
	    the client has no need for this value and will ignore it.
	  </t>
    <t>
     EXCHANGE_ID MAY be sent in a COMPOUND procedure that starts with
     SEQUENCE. However, when a client communicates with a server
     for the first time, it will not have a session, so using
     SEQUENCE will not be possible.
     If EXCHANGE_ID is sent without a preceding SEQUENCE, then it
     MUST be the only operation in the COMPOUND procedure's request. If
     it is not, the server MUST return NFS4ERR_NOT_ONLY_OP.
    </t>
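    <t>
      A minimal illustrative sketch (not normative) of this placement
      rule follows; the list of operation names and the helper are
      hypothetical, and the error is returned as a string for
      simplicity.
      <figure>
        <artwork>
```python
# Illustrative sketch: when a COMPOUND does not start with SEQUENCE,
# an EXCHANGE_ID in it must be the only operation; otherwise the
# server returns NFS4ERR_NOT_ONLY_OP.

def check_exchange_id_placement(ops):
    """ops: list of operation names in a COMPOUND request."""
    if "EXCHANGE_ID" not in ops:
        return None
    if ops[0] == "SEQUENCE":
        return None  # EXCHANGE_ID within a session is permitted
    if len(ops) != 1:
        return "NFS4ERR_NOT_ONLY_OP"
    return None
```
        </artwork>
      </figure>
    </t>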

    <t>
     The eia_clientowner field is composed of a co_verifier
     field and a co_ownerid string.  As noted in section 2.4 of
     <xref target="RFC5661"/>, the co_ownerid
     describes the client, and the co_verifier is
     the incarnation of the client. An EXCHANGE_ID
     sent with a new incarnation of the client will
     lead to the server removing lock state of the old
     incarnation. Whereas an EXCHANGE_ID sent with the
     current incarnation and co_ownerid will result in
     an error or an update of the client ID's properties,
     depending on the arguments to EXCHANGE_ID.
    </t>
    <t>
      A server MUST NOT provide the same client ID to two different
      incarnations of an eia_clientowner.
    </t>
    <t>
     In addition to the client ID and sequence ID, the server
     returns a server owner (eir_server_owner) and
     server scope (eir_server_scope).  The former field is used
     in connection with 
     network trunking as described in Section 2.10.5 of <xref
     target="RFC5661" />.  The latter field is used to
     allow clients to determine when client IDs sent by
     one server may be recognized by another in the event
     of file system migration (see <xref
     target="SEC11-EFF-lock" /> of the current document).
    </t>
    <t>
     The client ID returned by EXCHANGE_ID is only unique
     relative to the combination of eir_server_owner.so_major_id
     and eir_server_scope. Thus, if two servers return the
     same client ID, the onus is on the client to
     distinguish the client IDs on the basis of eir_server_owner.so_major_id
     and eir_server_scope. In the event two different servers
     claim matching eir_server_owner.so_major_id and eir_server_scope,
     the client can use the verification techniques discussed
     in Section 2.10.5 of <xref target="RFC5661" /> to determine if the servers
     are distinct. If they are distinct, then the client
     will need to note the destination network addresses
     of the connections used with each server, and use
     the network address as the final discriminator.
    </t>
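    <t>
      As an illustrative sketch (not normative), a client might key its
      per-server state as follows, with the destination network address
      as the final discriminator; the field names are hypothetical.
      <figure>
        <artwork>
```python
# Illustrative sketch: a client ID is only unique relative to the
# combination of eir_server_owner.so_major_id and eir_server_scope,
# so a client-side lookup key includes both, plus the destination
# network address as the final discriminator.

def client_id_key(eir, net_addr):
    """eir: dict with (hypothetical) EXCHANGE_ID result fields;
    net_addr: destination network address of the connection used."""
    return (eir["server_scope"],
            eir["server_owner_major_id"],
            eir["clientid"],
            net_addr)

table = {}
r = {"server_scope": "S", "server_owner_major_id": "M", "clientid": 7}
table[client_id_key(r, "192.0.2.1")] = "client-state"
```
        </artwork>
      </figure>
    </t>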
    <t>
     The server, as defined by the unique identity expressed
     in the so_major_id of the server owner and the server scope,
     needs to track several properties of each client ID it
     hands out. The properties apply to the client ID and all
     sessions associated with the client ID.
     The properties are derived from the
     arguments and results of EXCHANGE_ID.
     The client ID properties include:
     <list style="symbols">
     <t>
      The capabilities expressed by the following bits, which
      come from the results of EXCHANGE_ID:
        <list>
        <t>EXCHGID4_FLAG_SUPP_MOVED_REFER</t>
        <t>EXCHGID4_FLAG_SUPP_MOVED_MIGR</t>
        <t>EXCHGID4_FLAG_BIND_PRINC_STATEID</t>
        <t>EXCHGID4_FLAG_USE_NON_PNFS</t>
        <t>EXCHGID4_FLAG_USE_PNFS_MDS</t>
        <t>EXCHGID4_FLAG_USE_PNFS_DS</t>
        </list>
        These properties may be updated by subsequent
        EXCHANGE_ID operations on confirmed client IDs though the server MAY
        refuse to change them.
     </t>
     <t>
       The state protection method used, one of SP4_NONE,
       SP4_MACH_CRED, or SP4_SSV, as set by the spa_how
       field of the arguments to EXCHANGE_ID.  Once the
       client ID is confirmed, this property cannot be
       updated by subsequent EXCHANGE_ID operations.

     </t>
     <t>
       For SP4_MACH_CRED or SP4_SSV state protection:
       <list>
       <t>
	 The list of operations (spo_must_enforce) that MUST use the specified
	 state protection. This list comes
	 from the results of EXCHANGE_ID.

       </t>
       <t>
	 The list of operations (spo_must_allow) that MAY use the specified
	 state protection. This list comes
	 from the results of EXCHANGE_ID.

       </t>
       </list>
       Once the client ID is confirmed, these properties
       cannot be updated by subsequent EXCHANGE_ID
       requests.

     </t>
     <t>
      For SP4_SSV protection:
      <list>
   
      <t>
       The OID of the hash algorithm. This property is
       represented by one of the algorithms in the
       ssp_hash_algs field of the EXCHANGE_ID arguments.
       Once the client ID is confirmed, this property
       cannot be updated by subsequent EXCHANGE_ID
       requests.

      </t>
      <t>
       The OID of the encryption algorithm. This property
       is represented by one of the algorithms in the
       ssp_encr_algs field of the EXCHANGE_ID arguments.
       Once the client ID is confirmed, this property
       cannot be updated by subsequent EXCHANGE_ID
       requests.

      </t>

      <t>
       The length of the SSV. This property is
       represented by the spi_ssv_len field in the EXCHANGE_ID
       results.

       Once the client ID is confirmed,
       this property cannot be updated by 
       subsequent EXCHANGE_ID operations.

	 <vspace blankLines='1' />

       There are REQUIRED and RECOMMENDED relationships among the
       length of the key of the encryption algorithm ("key length"),
       the length of the output of the hash algorithm ("hash length"),
       and the length of the SSV ("SSV length").
       <list style="symbols">
       <t>
        key length MUST be &lt;= hash length. This is because the keys used for
        the encryption algorithm are actually subkeys derived from the SSV,
        and the derivation is via the hash algorithm. The selection of an
        encryption algorithm with a key length that exceeded the length of
        the output of the hash algorithm would require padding, and thus
        weaken the use of the encryption algorithm.
       </t>
       <t>
        hash length SHOULD be &lt;= SSV length. This is because the
        SSV is a key used to derive subkeys via an HMAC, and
        it is recommended that the key used as input to an HMAC be
        at least as long as the length of the HMAC's hash algorithm's
        output (see Section 3 of <xref target="RFC2104"/>).
       </t>

       <t>
        key length SHOULD be &lt;= SSV length. This is a transitive result of the
        above two invariants.
       </t>

       <t>
        key length SHOULD be >= hash length / 2. This is because the subkey
        derivation is via 
        an HMAC and it is recommended that if the HMAC has to be truncated,
        it should not be truncated to less than half the hash length
        (see Section 4 of <xref target="RFC2104">RFC2104</xref>).
       </t>
       </list>
      </t>

      <t>
       Number of concurrent versions of the SSV the client
       and server will support (see Section 2.10.9
       of <xref target="RFC5661"/>).
       This property is represented by spi_window
       in the EXCHANGE_ID results.  The property may be
       updated by subsequent EXCHANGE_ID operations.

      </t>
      </list>
    </t>
    <t>
     The client's implementation ID as represented by
     the eia_client_impl_id field of the arguments.
     The property may be updated by subsequent EXCHANGE_ID
     requests.
    </t>
    <t>
     The server's implementation ID as represented by
     the eir_server_impl_id field of the reply.
     The property may be updated by replies to subsequent EXCHANGE_ID
     requests.
    </t>
    </list>
    </t>
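    <t>
      The length relationships listed above can be checked
      mechanically.  The following sketch is illustrative only (lengths
      are in bytes; the function name is hypothetical).
      <figure>
        <artwork>
```python
# Illustrative sketch: checking the REQUIRED and RECOMMENDED length
# relationships for a candidate combination of SSV algorithms.

def check_ssv_lengths(key_len, hash_len, ssv_len):
    problems = []
    if key_len > hash_len:
        # Violates the MUST: key length must not exceed hash length.
        problems.append("MUST: key length exceeds hash length")
    if hash_len > ssv_len:
        # Violates a SHOULD: hash length should not exceed SSV length.
        problems.append("SHOULD: hash length exceeds SSV length")
    if key_len > ssv_len:
        # Violates a SHOULD (transitive result of the two above).
        problems.append("SHOULD: key length exceeds SSV length")
    if hash_len > 2 * key_len:
        # Violates a SHOULD: key length at least half the hash length.
        problems.append("SHOULD: key length under half the hash length")
    return problems

# An AES-128 key (16) with SHA-256 (32) and a 32-byte SSV
# satisfies all four relationships:
# check_ssv_lengths(16, 32, 32) returns []
```
        </artwork>
      </figure>
    </t>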

    <t>
      The eia_flags passed as part of the arguments and
      the eir_flags results allow the client and server
      to inform each other of their capabilities as well
      as indicate how the client ID will be used. Whether
      a bit is set or cleared on the arguments' flags
      does not force the server to set or clear the same
      bit on the results' side.  Bits not defined above
      cannot be set in the eia_flags field.  If they
      are, the server MUST reject the operation with
      NFS4ERR_INVAL.

    </t>
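    <t>
      As an illustrative sketch (not normative), a server-side check of
      eia_flags against the bits defined in the ARGUMENT section might
      look as follows; treating EXCHGID4_FLAG_CONFIRMED_R as not
      settable in the arguments reflects that it is result-only.
      <figure>
        <artwork>
```python
# Illustrative sketch: rejecting eia_flags values containing bits not
# permitted in the arguments, using the flag constants from the
# ARGUMENT section.  EXCHGID4_FLAG_CONFIRMED_R is result-only and so
# is deliberately absent from the mask.

EXCHGID4_FLAG_SUPP_MOVED_REFER    = 0x00000001
EXCHGID4_FLAG_SUPP_MOVED_MIGR     = 0x00000002
EXCHGID4_FLAG_BIND_PRINC_STATEID  = 0x00000100
EXCHGID4_FLAG_USE_NON_PNFS        = 0x00010000
EXCHGID4_FLAG_USE_PNFS_MDS        = 0x00020000
EXCHGID4_FLAG_USE_PNFS_DS         = 0x00040000
EXCHGID4_FLAG_UPD_CONFIRMED_REC_A = 0x40000000

VALID_EIA_FLAGS = (EXCHGID4_FLAG_SUPP_MOVED_REFER
                   | EXCHGID4_FLAG_SUPP_MOVED_MIGR
                   | EXCHGID4_FLAG_BIND_PRINC_STATEID
                   | EXCHGID4_FLAG_USE_NON_PNFS
                   | EXCHGID4_FLAG_USE_PNFS_MDS
                   | EXCHGID4_FLAG_USE_PNFS_DS
                   | EXCHGID4_FLAG_UPD_CONFIRMED_REC_A)

def validate_eia_flags(eia_flags):
    # eia_flags is a subset of the mask exactly when OR-ing it into
    # the mask changes nothing.
    if (eia_flags | VALID_EIA_FLAGS) != VALID_EIA_FLAGS:
        return "NFS4ERR_INVAL"
    return None
```
        </artwork>
      </figure>
    </t>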
    <t>
      The EXCHGID4_FLAG_UPD_CONFIRMED_REC_A bit can only be set
      in eia_flags; it is always off in eir_flags.
      The EXCHGID4_FLAG_CONFIRMED_R bit can only be set in
      eir_flags; it is always off in eia_flags.  If the
      server recognizes the co_ownerid and co_verifier
      as mapping to a confirmed client ID, it sets
      EXCHGID4_FLAG_CONFIRMED_R in eir_flags.
      The EXCHGID4_FLAG_CONFIRMED_R flag allows a client
      to tell if the client ID it is trying to create
      already exists and is confirmed.

    </t>

    <t>
      If EXCHGID4_FLAG_UPD_CONFIRMED_REC_A is set in eia_flags,
      this means that the client is attempting to update properties
      of an existing confirmed client ID (if the client wants to
      update properties of an unconfirmed client ID, it MUST NOT
      set EXCHGID4_FLAG_UPD_CONFIRMED_REC_A).
      If so, it is
      RECOMMENDED that the client send the update EXCHANGE_ID
      operation in the same COMPOUND as a SEQUENCE so that
      the EXCHANGE_ID is executed exactly once. Whether
      the client can update the properties of client ID
      depends on the state protection it selected when the
      client ID was created, and the principal and security
      flavor it uses when sending the EXCHANGE_ID operation.
      The situations described in items

      <xref target="case_update" format="counter"/>,

      <xref target="case_update_noent" format="counter"/>,

      <xref target="case_update_exist" format="counter"/>,

      or

      <xref target="case_update_perm" format="counter"/>

      of the second numbered list of <xref
      target="EXID-impl" /> below will apply.
      Note that if the operation succeeds
      and returns a client ID that is already
      confirmed, the server MUST set the
      EXCHGID4_FLAG_CONFIRMED_R bit in eir_flags.


    </t>

    <t>
      If EXCHGID4_FLAG_UPD_CONFIRMED_REC_A is not set in eia_flags,
      this means that the client is trying to establish a new
      client ID; it is
      attempting to trunk data communication to
      the server (See Section 2.10.5 of <xref target="RFC5661"/>); or it
      is attempting to update properties of an unconfirmed
      client ID. The
      situations described in
      items
	<xref target="case_new_owner_id" format="counter"/>,
	<xref target="case_non_update" format="counter"/>,
	<xref target="case_client_collision" format="counter"/>,
	<xref target="case_retry" format="counter"/>, or
	<xref target="case_client_restart" format="counter"/>

      of the second numbered list of <xref
      target="EXID-impl" /> below will apply.
      Note that if the operation succeeds
      and returns a client ID that was previously
      confirmed, the server MUST set the
      EXCHGID4_FLAG_CONFIRMED_R bit in eir_flags.

    </t>
    
    <t>
      When the EXCHGID4_FLAG_SUPP_MOVED_REFER flag bit
      is set, the client indicates that it is capable
      of dealing with an NFS4ERR_MOVED error as part of
      a referral sequence.  When this bit is not set, it
      is still legal for the server to perform a referral
      sequence.  However, a server may use the fact that
      the client is incapable of correctly responding
      to a referral, by avoiding it for that particular
      client.  It may, for instance, act as a proxy
      for that particular file system, at some cost in
      performance, although it is not obligated to do so.
      If the server will potentially perform a referral, it
      MUST set EXCHGID4_FLAG_SUPP_MOVED_REFER in eir_flags.

    </t>
    <t>
      When the EXCHGID4_FLAG_SUPP_MOVED_MIGR flag bit is set,
      the client indicates that it is capable of dealing
      with an NFS4ERR_MOVED error as part of a file system
      migration sequence.  When this bit is not set, it
      is still legal for the server to indicate that a
      file system has moved, when this in fact happens.
      However, a server may use the fact that the client
      is incapable of correctly responding to a migration
      in its scheduling of file systems to migrate so as to
      avoid migration of file systems being actively used.
      It may also hide actual migrations from clients
      unable to deal with them by acting as a proxy for a
      migrated file system for particular clients, at some
      cost in performance, although it is not obligated
      to do so.  If the server will potentially perform a
      migration, it MUST set EXCHGID4_FLAG_SUPP_MOVED_MIGR
      in eir_flags.

    </t>
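<t>
  As a non-normative illustration of the two paragraphs above, a server
  might compute the SUPP_MOVED bits of eir_flags as follows.  The flag
  values are those defined in RFC 5661; the helper function and its
  parameters are hypothetical:
</t>

```python
# Non-normative sketch: computing the SUPP_MOVED_* bits of eir_flags.
# Flag values are from RFC 5661; the helper and its parameters are
# illustrative only.

EXCHGID4_FLAG_SUPP_MOVED_REFER = 0x00000001
EXCHGID4_FLAG_SUPP_MOVED_MIGR = 0x00000002

def moved_result_flags(eia_flags, server_may_refer, server_may_migrate):
    """Return (eir bits, client handles referral, client handles migration).

    If the server will potentially perform a referral or a migration,
    it MUST set the corresponding bit in eir_flags.  The client's bits
    only advertise its own capability, which the server may use to
    avoid referrals or migrations for that client.
    """
    eir = 0
    if server_may_refer:
        eir |= EXCHGID4_FLAG_SUPP_MOVED_REFER
    if server_may_migrate:
        eir |= EXCHGID4_FLAG_SUPP_MOVED_MIGR
    client_refer = bool(eia_flags & EXCHGID4_FLAG_SUPP_MOVED_REFER)
    client_migr = bool(eia_flags & EXCHGID4_FLAG_SUPP_MOVED_MIGR)
    return eir, client_refer, client_migr
```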
    <t>
      When EXCHGID4_FLAG_BIND_PRINC_STATEID is set, the
      client indicates that it wants the server to bind the
      stateid to the principal. This means that when a
      principal creates a stateid, it has to be the one to
      use the stateid. If the server will perform binding,
      it will return EXCHGID4_FLAG_BIND_PRINC_STATEID. The
      server MAY return EXCHGID4_FLAG_BIND_PRINC_STATEID
      even if the client does not request it. If
      an update to the client ID changes the value
      of EXCHGID4_FLAG_BIND_PRINC_STATEID's client
      ID property, the effect applies only to new
      stateids. Existing stateids (and all stateids with
      the same "other" field) that were created with
      stateid-to-principal binding in force will continue
      to have binding in force.  Existing stateids (and all
      stateids with the same "other" field) that were created
      with stateid-to-principal binding not in force will continue
      to have binding not in force.

    </t>
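<t>
  A non-normative sketch of the binding rule above follows.  The data
  model is hypothetical; only the rule that a bound stateid is usable
  solely by its creating principal, and that later property changes
  affect only new stateids, is taken from the text:
</t>

```python
# Non-normative sketch of stateid-to-principal binding
# (EXCHGID4_FLAG_BIND_PRINC_STATEID); the classes are illustrative.

class Stateid:
    def __init__(self, other, principal, bound):
        self.other = other          # the stateid's "other" field
        self.principal = principal  # the principal that created it
        self.bound = bound          # binding in force at creation time

def may_use_stateid(stateid, principal):
    # A later change to the client ID's BIND_PRINC_STATEID property
    # does not affect existing stateids: the check depends only on
    # whether binding was in force when this stateid was created.
    if stateid.bound:
        return principal == stateid.principal
    return True
```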

    <t>
     The EXCHGID4_FLAG_USE_NON_PNFS,
     EXCHGID4_FLAG_USE_PNFS_MDS,  and
     EXCHGID4_FLAG_USE_PNFS_DS bits are described in 
     Section 13.1 of <xref target="RFC5661" /> and convey roles the
     client ID is to be used for in a pNFS environment.
     The server MUST set one of the acceptable combinations
     of these bits (roles) in eir_flags, as specified in that
     section.
     Note that the same client owner/server owner pair can
     have multiple roles. Multiple roles can be associated
     with the same client ID or with different client
     IDs. Thus, if a client sends EXCHANGE_ID from the
     same client owner to the same server owner multiple
     times, but specifies different pNFS roles each time,
     the server might return different client IDs. Given
     that different pNFS roles might have different client
     IDs, the client may ask for different properties for
     each role/client ID.

    </t>
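<t>
  The acceptable pNFS role combinations of Section 13.1 of RFC 5661
  can be captured as data.  This non-normative sketch uses the flag
  values and combinations given there:
</t>

```python
# Non-normative sketch: validating a pNFS role combination against
# the acceptable combinations of Section 13.1 of RFC 5661.

EXCHGID4_FLAG_USE_NON_PNFS = 0x00010000
EXCHGID4_FLAG_USE_PNFS_MDS = 0x00020000
EXCHGID4_FLAG_USE_PNFS_DS = 0x00040000

ACCEPTABLE_ROLES = {
    EXCHGID4_FLAG_USE_NON_PNFS,
    EXCHGID4_FLAG_USE_PNFS_MDS,
    EXCHGID4_FLAG_USE_PNFS_DS,
    EXCHGID4_FLAG_USE_PNFS_MDS | EXCHGID4_FLAG_USE_PNFS_DS,
    EXCHGID4_FLAG_USE_NON_PNFS | EXCHGID4_FLAG_USE_PNFS_DS,
}

def roles_acceptable(flags):
    # Mask off everything but the three role bits before checking.
    roles = flags & (EXCHGID4_FLAG_USE_NON_PNFS |
                     EXCHGID4_FLAG_USE_PNFS_MDS |
                     EXCHGID4_FLAG_USE_PNFS_DS)
    return roles in ACCEPTABLE_ROLES
```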

    <t>
     The spa_how field of the eia_state_protect field
     specifies how the client wants to protect its client,
     locking, and session states from unauthorized changes
     (Section 2.10.8.3 of <xref target="RFC5661"/>):

     <list style="symbols">
     <t>
      SP4_NONE. The client does not request the NFSv4.1 server
      to enforce state protection. The NFSv4.1 server MUST NOT
      enforce state protection for the returned client ID.
     </t>
     <t>
      SP4_MACH_CRED.  If spa_how is SP4_MACH_CRED, then
      the client MUST send the EXCHANGE_ID operation with RPCSEC_GSS
      as the security flavor, and with a service of
      RPC_GSS_SVC_INTEGRITY or RPC_GSS_SVC_PRIVACY. If SP4_MACH_CRED
      is specified, then the
      client wants to use an RPCSEC_GSS-based machine
      credential to protect its state. The server MUST note
      the principal the EXCHANGE_ID operation was sent
      with, and the GSS mechanism used.  These notes
      collectively comprise the machine credential.

	 <vspace blankLines='1' />

      After the client ID is confirmed, as long as the lease associated with
      the client ID is unexpired, a subsequent EXCHANGE_ID
      operation that uses the same eia_clientowner.co_owner
      as the first EXCHANGE_ID MUST also use the same
      machine credential as the first EXCHANGE_ID. The
      server returns the same client ID for
      the subsequent EXCHANGE_ID as that returned from
      the first EXCHANGE_ID.

     </t>
     <t>
      SP4_SSV. If spa_how is SP4_SSV, then
      the client MUST send the EXCHANGE_ID operation with RPCSEC_GSS
      as the security flavor, and with a service of
      RPC_GSS_SVC_INTEGRITY or RPC_GSS_SVC_PRIVACY.
      If SP4_SSV is specified, then
      the client wants to use the SSV to protect its state.
      The server records the credential used in the request
      as the machine credential (as defined above) for
      the eia_clientowner.co_owner.
      The CREATE_SESSION operation that
      confirms the client ID MUST use the same machine
      credential.

     </t>
     </list>
     </t>
     <t>
     When a client specifies SP4_MACH_CRED or SP4_SSV,
     it also provides two lists of operations (each
     expressed as a bitmap).  The first list
     is spo_must_enforce and consists of those operations
     the client MUST send (subject to the server confirming the
     list of operations in the result of EXCHANGE_ID) with the
     machine credential (if SP4_MACH_CRED protection is
     specified) or the SSV-based credential (if SP4_SSV
     protection is used).  The client MUST send the
     operations with RPCSEC_GSS credentials that specify
     the RPC_GSS_SVC_INTEGRITY or RPC_GSS_SVC_PRIVACY
     security service.  Typically, the first list of
     operations includes EXCHANGE_ID, CREATE_SESSION,
     DELEGPURGE, DESTROY_SESSION, BIND_CONN_TO_SESSION,
     and DESTROY_CLIENTID.  The client SHOULD NOT specify
     in this list any operations that require a filehandle
     because the server's access policies MAY conflict with
     the client's choice, and thus the client would then be
     unable to access a subset of the server's namespace.

     </t>
     <t>

     Note that if SP4_SSV protection is specified, and
     the client indicates that CREATE_SESSION must be
     protected with SP4_SSV, because the SSV cannot exist
     without a confirmed client ID, the first CREATE_SESSION
     MUST instead be sent using the machine credential,
     and the server MUST accept the machine credential.

     </t>
     <t>

     There is a corresponding result, also called spo_must_enforce,
     of the operations for which the server will require SP4_MACH_CRED or
     SP4_SSV protection. Normally, the server's result
     equals the client's argument, but the result MAY be different.
     If the client requests one or more operations in
     the set { EXCHANGE_ID, CREATE_SESSION,
     DELEGPURGE, DESTROY_SESSION, BIND_CONN_TO_SESSION,
     DESTROY_CLIENTID }, then the result spo_must_enforce
     MUST include the operations the client requested from that set.

     </t>
     <t>
     If spo_must_enforce in the results has BIND_CONN_TO_SESSION
     set, then connection binding enforcement is enabled, and
     the client MUST use the machine (if SP4_MACH_CRED protection is used)
     or SSV (if SP4_SSV protection is used) credential on calls
     to BIND_CONN_TO_SESSION.

     </t>
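<t>
  The rule that the server's spo_must_enforce result MUST retain any
  requested operations from the set listed above can be sketched
  non-normatively.  The operation numbers are the NFSv4.1 values from
  RFC 5661; the server_policy parameter is hypothetical:
</t>

```python
# Non-normative sketch: computing the server's spo_must_enforce result
# from the client's argument (both modeled as sets of operation numbers).

OP_DELEGPURGE = 7
OP_BIND_CONN_TO_SESSION = 41
OP_EXCHANGE_ID = 42
OP_CREATE_SESSION = 43
OP_DESTROY_SESSION = 44
OP_DESTROY_CLIENTID = 57

# If the client requests any of these, the result MUST include them.
MANDATORY_IF_REQUESTED = {
    OP_EXCHANGE_ID, OP_CREATE_SESSION, OP_DELEGPURGE,
    OP_DESTROY_SESSION, OP_BIND_CONN_TO_SESSION, OP_DESTROY_CLIENTID,
}

def result_must_enforce(requested, server_policy):
    # Normally the result equals the argument, but it MAY differ;
    # requested operations from the mandatory set are always kept.
    return (requested & MANDATORY_IF_REQUESTED) | (requested & server_policy)
```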
     <t>
     The second list is spo_must_allow and consists of those
     operations
     the client wants to have the option of sending with the machine credential or
     the SSV-based credential, even if the object the
     operations are performed on is not owned by the
     machine or SSV credential.

     </t>
     <t>

     The corresponding result, also called
     spo_must_allow, consists of the operations the server
     will allow the client to use SP4_SSV or SP4_MACH_CRED
     credentials with.
     Normally, the server's result
     equals the client's argument, but the result MAY be different.

     </t>
     <t>

     The purpose of spo_must_allow is to allow clients to
     solve the following conundrum. Suppose the client ID
     is confirmed with EXCHGID4_FLAG_BIND_PRINC_STATEID,
     and it calls OPEN with the RPCSEC_GSS credentials of
     a normal user. Now suppose the user's credentials expire,
     and cannot be renewed (e.g., a Kerberos ticket granting ticket
     expires, and the user has logged off and will not be
     acquiring a new ticket granting ticket). The client will be
     unable to send CLOSE without the user's credentials, which is to
     say the client has to either leave the state on the server
     or re-send EXCHANGE_ID with a new verifier to
     clear all state, that is, unless the client includes
     CLOSE on the list of operations in spo_must_allow and the
     server agrees.

     </t>
    <t>
     The SP4_SSV protection parameters also have:
     <list style="hanging">

     <t hangText="ssp_hash_algs:" />
     <t>
       This is the set of algorithms the client supports
       for the purpose of computing the digests needed for
       the internal SSV GSS mechanism and for the SET_SSV
       operation.  Each algorithm is specified as an object
       identifier (OID).  The REQUIRED algorithms for a
       server are id-sha1, id-sha224, id-sha256, id-sha384,
       and id-sha512 <xref target="RFC4055"/>.
       The algorithm the server selects among the
       set is indicated in spi_hash_alg, a field of
       spr_ssv_prot_info. The field spi_hash_alg is an
       index into the array ssp_hash_algs. 

       If the server
       does not support any of the offered algorithms,
       it returns NFS4ERR_HASH_ALG_UNSUPP.

       If ssp_hash_algs is empty, the server MUST return NFS4ERR_INVAL.

     </t>
     <t hangText="ssp_encr_algs:" />
     <t>
       This is the set of algorithms the client supports for the
       purpose of providing privacy protection for the internal
       SSV GSS mechanism.  Each algorithm is
       specified as an OID.
       The REQUIRED algorithm for a server is id-aes256-CBC.
       The RECOMMENDED algorithms are id-aes192-CBC and id-aes128-CBC
       <xref target="CSOR_AES" />. The selected algorithm is
       returned in spi_encr_alg, an index into ssp_encr_algs.

       If the server
       does not support any of the offered algorithms,
       it returns NFS4ERR_ENCR_ALG_UNSUPP.

       If ssp_encr_algs is empty, the server MUST return NFS4ERR_INVAL.

       Note that due to previously stated requirements and recommendations
       on the relationships between key length and hash length, some
       combinations of RECOMMENDED and REQUIRED encryption algorithm and
       hash algorithm either SHOULD NOT or MUST NOT be used.
       <xref target="algtbl"/> summarizes the illegal and discouraged
       combinations.

     </t>
     <t hangText="ssp_window:" />
     <t>
       This is the number of SSV versions the client wants
       the server to maintain (i.e., each successful call to SET_SSV
       produces a new version of the SSV). If ssp_window is zero, the
       server MUST return NFS4ERR_INVAL. The server responds
       with spi_window, which MUST NOT exceed ssp_window, and MUST 
       be at least one.
       Any requests on the backchannel or fore channel that
       are using a version of the SSV that is outside the window will fail with
       an ONC RPC authentication error, and the requester
       will have to retry them with the same slot ID and
       sequence ID.
     </t>

     <t hangText="ssp_num_gss_handles:" />
     <t>
       This is the number of RPCSEC_GSS handles the
       server should create that are based on the GSS
       SSV mechanism (see Section 2.10.9 of
       <xref target="RFC5661" />).
       It is not the total number of RPCSEC_GSS handles for
       the client ID. Indeed, subsequent calls to EXCHANGE_ID
       will add RPCSEC_GSS handles.
       The server responds with a list of handles in
       spi_handles. If the client asks for at least
       one handle and the server cannot create it,
       the server MUST return an error.  The handles in
       spi_handles are not available for use until the
       client ID is confirmed, which could be immediately
       if EXCHANGE_ID returns EXCHGID4_FLAG_CONFIRMED_R,
       or upon successful confirmation from CREATE_SESSION.
		 <vspace blankLines='1' />
       While a client ID can span all the connections
       that are connected to a server sharing the same
       eir_server_owner.so_major_id, the RPCSEC_GSS
       handles returned in spi_handles can only be used
       on connections connected to a server that returns
       the same eir_server_owner.so_major_id and
       eir_server_owner.so_minor_id on each connection.
       It is permissible for the client to set
       ssp_num_gss_handles to zero; the client can
       create more handles with another EXCHANGE_ID call.
		 <vspace blankLines='1' />
       Because each SSV RPCSEC_GSS handle shares a common SSV GSS context,
       there are security considerations specific to this situation
       discussed in Section 2.10.10 of <xref target="RFC5661"/>.
		 <vspace blankLines='1' />
       The seq_window (see Section 5.2.3.1 of <xref target="RFC2203"/>)
       of each RPCSEC_GSS handle in spi_handle
       MUST be the same as the seq_window of
       the RPCSEC_GSS handle used for the credential of the RPC request
       that the EXCHANGE_ID operation was sent as a part of.

     </t>
      
     </list>
     
    </t>
      <texttable anchor='algtbl'>
	      <ttcol align='left'>Encryption Algorithm</ttcol>
	      <ttcol align='left'>MUST NOT be combined with</ttcol>
	      <ttcol align='left'>SHOULD NOT be combined with</ttcol>
	      <c>id-aes128-CBC</c> <c></c> <c>id-sha384, id-sha512</c>
	      <c>id-aes192-CBC</c> <c>id-sha1</c> <c>id-sha512</c>
	      <c>id-aes256-CBC</c> <c>id-sha1, id-sha224</c> <c></c>
      </texttable>
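<t>
  The table above lends itself to a simple table-driven check.  This
  non-normative sketch encodes the MUST NOT and SHOULD NOT pairings as
  data; the function name is illustrative:
</t>

```python
# Non-normative sketch: classifying an encryption/hash algorithm
# pairing according to the combinations table above.

MUST_NOT = {
    "id-aes128-CBC": set(),
    "id-aes192-CBC": {"id-sha1"},
    "id-aes256-CBC": {"id-sha1", "id-sha224"},
}
SHOULD_NOT = {
    "id-aes128-CBC": {"id-sha384", "id-sha512"},
    "id-aes192-CBC": {"id-sha512"},
    "id-aes256-CBC": set(),
}

def combination_status(encr_alg, hash_alg):
    if hash_alg in MUST_NOT.get(encr_alg, set()):
        return "illegal"      # MUST NOT be used
    if hash_alg in SHOULD_NOT.get(encr_alg, set()):
        return "discouraged"  # SHOULD NOT be used
    return "ok"
```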

    <t>
      The arguments include an array of up to one
      element in length called eia_client_impl_id. If
      eia_client_impl_id is present, it contains the
      information identifying the implementation of the
      client. Similarly, the results include an array of up
      to one element in length called eir_server_impl_id
      that identifies the implementation of the server.
      Servers MUST accept a zero-length eia_client_impl_id
      array, and clients MUST accept a zero-length
      eir_server_impl_id array.
   
    </t>
    <t>
      A possible use for implementation identifiers
      would be in diagnostic software that extracts
      this information in an attempt to identify
      interoperability problems, performance workload
      behaviors, or general usage statistics.  Since the
      intent of having access to this information is for
      planning or general diagnosis only, the client and
      server MUST NOT interpret this implementation
      identity information in a way that affects
      how the implementation behaves in interacting with 
      its peer.  The client and server are not
      allowed to depend on the peer's manifesting a particular
      allowed behavior based on an implementation identifier
      but are required to interoperate as specified elsewhere
      in the protocol specification.
    </t>
    <t>
      Because it is possible that some implementations might
      violate the protocol specification and interpret
      the identity information, implementations MUST
      provide facilities to allow the NFSv4 client and server
      to be configured to
      set the contents of the nfs_impl_id structures sent
      to any specified value.

    </t>

        </section>
        <section title="IMPLEMENTATION"
                 anchor="EXID-impl"
                 toc="exclude">
    <t>
      A server's client record is a 5-tuple:
    </t>
    <t>
      <list style="numbers">
	<t>co_ownerid:
	<list style="empty">
	  <t>The client identifier string, from the eia_clientowner
	  structure of the EXCHANGE_ID4args structure.</t>
	</list></t>

	<t>co_verifier:
	<list style="empty">
	  <t>A client-specific value used to indicate incarnations (where a client restart represents a new incarnation), from the
	  eia_clientowner structure of the EXCHANGE_ID4args
	  structure.</t>
	</list></t>

	<t>principal:
	<list style="empty">
	  <t>
           The principal that was defined in the RPC header's credential
           and/or verifier at the time the client record was
           established.
         </t>
	</list></t>

	<t>client ID:
	<list style="empty">
	  <t>The shorthand client identifier, generated by the server and
	  returned via the eir_clientid field in the EXCHANGE_ID4resok
	  structure.</t>
	</list></t>

	<t>confirmed:
	<list style="empty">
	  <t>A private field on the server indicating whether or not a
	  client record has been confirmed.  A client record is
	  confirmed if there has been a successful CREATE_SESSION
	  operation to confirm it.  Otherwise, it is unconfirmed.  An
	  unconfirmed record is established by an EXCHANGE_ID call.
	  Any unconfirmed record that is not confirmed within a lease
	  period SHOULD be removed.</t>
	</list></t>
	
      </list>
    </t>
    <!-- start new list -->
    <t>
      The following identifiers represent special values for the fields
      in the records.
      <list style="hanging">
	<t hangText="ownerid_arg:"/>
	<t>
	  The value of the eia_clientowner.co_ownerid subfield of the
	  EXCHANGE_ID4args structure of the current request.
	</t>
	<t hangText="verifier_arg:"/>
	<t>
	  The value of the eia_clientowner.co_verifier subfield of the
	  EXCHANGE_ID4args structure of the current request.
	</t>
	<t hangText="old_verifier_arg:"/>
	<t>
	  A value of the eia_clientowner.co_verifier field of a client record
	  received in a previous request; this is distinct from
	  verifier_arg.
	</t>
	<t hangText="principal_arg:"/>
	<t>
	  The value of the RPCSEC_GSS principal for the current request.
	</t>
	<t hangText="old_principal_arg:"/>
	<t>
	  A value of the principal of a client record as defined by the
          RPC header's credential or verifier of a previous request.
	  This is distinct from principal_arg.
	 
	</t>
	<t hangText="clientid_ret:"/>
	<t>
	  The value of the eir_clientid field the server will return in the
	  EXCHANGE_ID4resok structure for the current request.
	</t>
	<t hangText="old_clientid_ret:"/>
	<t>
	  The value of the eir_clientid field the server returned in the
	  EXCHANGE_ID4resok structure for a previous request.  This
	  is distinct from clientid_ret.
	</t>
	<t hangText="confirmed:"/>
	<t>
          The client ID has been confirmed.
	</t>
	<t hangText="unconfirmed:"/>
	<t>
          The client ID has not been confirmed.
	</t>
      </list>
    </t>
    <t>
      Since EXCHANGE_ID is a non-idempotent operation, we must
      consider the possibility that retries occur as a result of a
      client restart, network partition, malfunctioning router, etc.
      Retries are identified by the value of the eia_clientowner field of
      EXCHANGE_ID4args, and the method for dealing with them is
      outlined in the scenarios below.
    </t>
    <t>
      The scenarios are described in terms of the
      client record(s) a server has for a given
      co_ownerid. Note that if the client ID
      was created specifying SP4_SSV state protection and
      EXCHANGE_ID as one of the operations in spo_must_allow,
      then the server MUST authorize EXCHANGE_IDs with the SSV
      principal in addition to the principal that created the
      client ID.
    </t>
    <t anchor="case_list">
      <list style="numbers">
	<t anchor="case_new_owner_id">New Owner ID
	<list style="empty">
	  <t>
	    If the server has no client records
	    with eia_clientowner.co_ownerid matching
	    ownerid_arg, and EXCHGID4_FLAG_UPD_CONFIRMED_REC_A is not
	    set in the EXCHANGE_ID, then a new shorthand
	    client ID (let us call it clientid_ret)
	    is generated, and the following unconfirmed
	    record is added to the server's state.

		 <vspace blankLines='1' />

    { ownerid_arg, verifier_arg, principal_arg, clientid_ret, unconfirmed }

		 <vspace blankLines='1' />

	    Subsequently, the server returns clientid_ret.

		 <vspace blankLines='1' />

	  </t>
	</list>
		 <vspace blankLines='1' />
	</t>

	<t anchor="case_non_update">Non-Update on Existing Client ID
	<list style="empty">
	  <t>
	    If the server has the following confirmed record, and
            the request does not have
	    EXCHGID4_FLAG_UPD_CONFIRMED_REC_A set,
	    then the request is the result of a retried request due to a
	    faulty router or lost connection, or
            the client is trying to determine if it can perform
            trunking.

		 <vspace blankLines='1' />

    { ownerid_arg, verifier_arg, principal_arg, clientid_ret, confirmed }

		 <vspace blankLines='1' />

	    Since the record has been confirmed, the client
	    must have received the server's reply from
	    the initial EXCHANGE_ID request. Since the
	    server has a confirmed record, and since
	    EXCHGID4_FLAG_UPD_CONFIRMED_REC_A is not set, with the
            possible exception of eir_server_owner.so_minor_id, the
	    server returns the same result it did when
	    the client ID's properties were last updated
	    (or if never updated, the result when the
	    client ID was created). The confirmed record
            is unchanged.
	  </t>
	</list>
	</t>

	<t anchor="case_client_collision">Client Collision
	<list style="empty">
	  <t>
	    If EXCHGID4_FLAG_UPD_CONFIRMED_REC_A is not set, and
	    if the server has the following confirmed
	    record, then this request is likely the result
	    of a chance collision between the values of
	    the eia_clientowner.co_ownerid subfield of
	    EXCHANGE_ID4args for two different clients.

	  </t>
	  <t>
      
	    { ownerid_arg, *, old_principal_arg, old_clientid_ret, confirmed }
	  </t>
	  <t>
            If there is currently no state associated with old_clientid_ret,
            or if there is state but the lease has expired, then
            this case is effectively equivalent to the
            New Owner ID case of <xref target="case_new_owner_id"/>.
            The confirmed record is deleted, the old_clientid_ret and its
            lock state are deleted, 
	    a new shorthand client ID
	    is generated, and the following unconfirmed
	    record is added to the server's state.

		 <vspace blankLines='1' />

    { ownerid_arg, verifier_arg, principal_arg, clientid_ret, unconfirmed }

		 <vspace blankLines='1' />

	    Subsequently, the server returns clientid_ret.

		 <vspace blankLines='1' />
	  </t>
          <t>
            If old_clientid_ret has an unexpired lease with state, then
	    no state of old_clientid_ret is changed or deleted.
            The server returns NFS4ERR_CLID_INUSE
	    to indicate that the client should
	    retry with a different value for the
	    eia_clientowner.co_ownerid subfield of
	    EXCHANGE_ID4args. The client record is not changed.
	  </t>
	</list>
	</t>

	<t anchor="case_retry">Replacement of Unconfirmed Record
	<list style="empty">
	  <t>
            If the EXCHGID4_FLAG_UPD_CONFIRMED_REC_A flag is not set,
	    and the server has the following unconfirmed record, then
            the client is attempting EXCHANGE_ID again on an
            unconfirmed client ID, perhaps due to a retry, a client
            restart before client ID confirmation (i.e., 
            before CREATE_SESSION was called), or
            some other reason.

		 <vspace blankLines='1' />

	    { ownerid_arg, *, *, old_clientid_ret, unconfirmed }

		 <vspace blankLines='1' />

            It is possible that
            the properties of old_clientid_ret are
            different from those specified in the current
            EXCHANGE_ID. Whether or not the properties are being updated,
            to eliminate ambiguity, the server
            deletes the unconfirmed record, generates a
            new client ID (clientid_ret), and establishes
            the following unconfirmed record:

		 <vspace blankLines='1' />

	    { ownerid_arg, verifier_arg, principal_arg, clientid_ret, unconfirmed }
		 <vspace blankLines='1' />
	  </t>
	</list>
	</t>

	<t anchor="case_client_restart">Client Restart
	<list style="empty">
	  <t>
	    If EXCHGID4_FLAG_UPD_CONFIRMED_REC_A is not set, and
	    if the server has the following confirmed client record, then
	    this request is likely from a previously confirmed client
	    that has restarted.
	  </t>
	  <t>
	    { ownerid_arg, old_verifier_arg, principal_arg, old_clientid_ret, confirmed }
	  </t>
	  <t>
	    Since the previous incarnation of the same
	    client will no longer be making requests,
	    once the new client ID is confirmed by
	    CREATE_SESSION, byte-range locks and share reservations
	    should be released immediately rather than
	    forcing the new incarnation to wait for
	    the lease time on the previous incarnation
	    to expire.	Furthermore, session state should
	    be removed since if the client had maintained
	    that information across restart, this request
	    would not have been sent.  If the server
	    supports neither the CLAIM_DELEGATE_PREV
            nor CLAIM_DELEG_PREV_FH
	    claim types, associated delegations should be
	    purged as well; otherwise, delegations are
	    retained and recovery proceeds according to
	    section 10.2.1 of <xref target="RFC5661"/>.

	  </t>
	  <t>
	    After processing, clientid_ret is returned to the client and
	    this client record is added:
	  </t>
	  <t>
	    { ownerid_arg, verifier_arg, principal_arg, clientid_ret, unconfirmed }
		 <vspace blankLines='1' />
	  </t>
          <t>
	    The previously described confirmed record
	    continues to exist, and thus the same
	    ownerid_arg exists in both a confirmed and
	    unconfirmed state at the same time. The number
	    of states can collapse to one once the server
	    receives an applicable CREATE_SESSION or
	    EXCHANGE_ID.

            <list style='symbols'>

            <t>
	     If the server subsequently receives a successful
	     CREATE_SESSION that confirms clientid_ret,
	     then the server atomically destroys the
	     confirmed record and makes the unconfirmed
	     record confirmed as described in section 16.36.3 of
	     <xref target="RFC5661" />.

            </t>

            <t>
	     If the server instead subsequently receives
	     an EXCHANGE_ID with the client owner equal
	     to ownerid_arg, one strategy is to simply
	     delete the unconfirmed record, and process the
	     EXCHANGE_ID as described in the entirety of
	     <xref target="EXID-impl"
	     />.

            </t>

	    </list>

          </t>
	</list>
	</t>

	<t anchor="case_update">Update
	<list style="empty">
	  <t>
	    If EXCHGID4_FLAG_UPD_CONFIRMED_REC_A is set, and the
	    server has the following confirmed record,
	    then this request is an attempt at an update.

		 <vspace blankLines='1' />

    { ownerid_arg, verifier_arg, principal_arg, clientid_ret, confirmed }

		 <vspace blankLines='1' />

	    Since the record has been confirmed, the client must have
	    received the server's reply from the initial EXCHANGE_ID
	    request. The server allows the update, and the client record
            is left intact.
	  </t>
	</list>
	</t>

	<t anchor="case_update_noent">Update but No Confirmed Record
	<list style="empty">
          <t>
	    If EXCHGID4_FLAG_UPD_CONFIRMED_REC_A is set, and the
            server has no confirmed record corresponding to ownerid_arg,
            then the server returns NFS4ERR_NOENT and leaves any unconfirmed
            record intact.
          </t>
	</list>
	</t>

	<t anchor="case_update_exist">Update but Wrong Verifier
	<list style="empty">
          <t>
	    If EXCHGID4_FLAG_UPD_CONFIRMED_REC_A is set, and the
	    server has the following confirmed record,
	    then this request is an illegal attempt at an
	    update, perhaps because of a retry from a previous client
            incarnation.

		 <vspace blankLines='1' />

    { ownerid_arg, old_verifier_arg, *, clientid_ret, confirmed }

		 <vspace blankLines='1' />

	    The server returns NFS4ERR_NOT_SAME and leaves the client record
            intact.
          </t>
	</list>
	</t>

	<t anchor="case_update_perm">Update but Wrong Principal
	<list style="empty">
          <t>
	    If EXCHGID4_FLAG_UPD_CONFIRMED_REC_A is set, and the
	    server has the following confirmed record,
	    then this request is an illegal attempt at an
	    update by an unauthorized principal.

		 <vspace blankLines='1' />

    { ownerid_arg, verifier_arg, old_principal_arg, clientid_ret, confirmed }

		 <vspace blankLines='1' />

	    The server returns NFS4ERR_PERM and leaves the client record
            intact.
          </t>
	</list>
	</t>

      </list>
    </t>
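<t>
  The numbered cases above amount to a dispatch over the client record
  matching ownerid_arg.  The following non-normative sketch models a
  record as a (verifier, principal, client ID, confirmed) tuple and
  returns the applicable case; storage, lease checking, and the exact
  ordering of the checks are simplified:
</t>

```python
# Non-normative sketch: selecting the applicable EXCHANGE_ID case for
# a given co_ownerid.  "record" is (verifier, principal, clientid,
# confirmed) or None; returned names refer to the numbered list above.

def exchange_id_case(update, record, verifier_arg, principal_arg,
                     lease_active=False):
    if record is None:
        return "NOENT" if update else "new_owner_id"          # cases 7, 1
    verifier, principal, clientid, confirmed = record
    if not update:
        if not confirmed:
            return "replacement_of_unconfirmed"               # case 4
        if principal != principal_arg:
            # likely a chance co_ownerid collision (case 3)
            return "CLID_INUSE" if lease_active else "new_owner_id"
        if verifier != verifier_arg:
            return "client_restart"                           # case 5
        return "non_update"                                   # case 2
    # EXCHGID4_FLAG_UPD_CONFIRMED_REC_A is set
    if not confirmed:
        return "NOENT"                                        # case 7
    if verifier != verifier_arg:
        return "NOT_SAME"                                     # case 8
    if principal != principal_arg:
        return "PERM"                                         # case 9
    return "update"                                           # case 6
```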
	  
        </section>    
      </section>      
      <section title="Operation 58: RECLAIM_COMPLETE - Indicates Reclaims Finished (as updated)"
	       anchor="RC">

  <section toc="exclude" anchor="OP_RECLAIM_COMPLETE_ARGUMENT" title="ARGUMENT">
<figure>
 <artwork>

&lt;CODE BEGINS&gt;

struct RECLAIM_COMPLETE4args {
        /*
         * If rca_one_fs TRUE,
         *
         *    CURRENT_FH: object in
         *    file system reclaim is
         *    complete for.
         */
        bool            rca_one_fs;
};

&lt;CODE ENDS&gt;
 </artwork>
</figure>
  </section>

  <section toc="exclude" anchor="OP_RECLAIM_COMPLETE_RESULTS" title="RESULTS">
<figure>
 <artwork>
&lt;CODE BEGINS&gt;

struct RECLAIM_COMPLETE4res {
        nfsstat4        rcr_status;
};

&lt;CODE ENDS&gt;
 </artwork>
</figure>
  </section>

  <section toc="exclude" anchor="OP_RECLAIM_COMPLETE_DESCRIPTION" title="DESCRIPTION">
    <t>
      A RECLAIM_COMPLETE operation is used to indicate that the client
      has reclaimed all of the locking state that it will recover using
      reclaim,
      when it is recovering state due to either a server restart or the
      migration of a file system to another server.  There are two types
      of RECLAIM_COMPLETE operations:
      <list style="symbols">
        <t>
          When rca_one_fs is FALSE, a global RECLAIM_COMPLETE is being
          done.  This indicates that recovery of all
          locks that the client held on the previous server instance
          has been completed.  The current filehandle need not be set in
	  this case.
        </t>
        <t>
          When rca_one_fs is TRUE, a file system-specific RECLAIM_COMPLETE
          is being done.  This indicates that recovery of locks
          for a single file system (the one designated by the current filehandle)
          due to the migration of the file system has been completed.  Presence
          of a current filehandle is required when rca_one_fs is set to TRUE.
	  When the current filehandle designates a filehandle in a file system
	  not in the process of migration, the operation returns NFS4_OK and
	  is otherwise ignored.
        </t>
      </list>
    </t>
    <t>
      Once a RECLAIM_COMPLETE is done, there can be no further
      reclaim operations for locks whose scope is defined as having
      completed recovery.  Once the client sends RECLAIM_COMPLETE, 
      the server will not allow the client to do
      subsequent reclaims of locking state for that scope 
      and, if these are attempted, will return NFS4ERR_NO_GRACE.
    </t>
    <t>
      Whenever a client establishes a new client ID and before it does
      the first non-reclaim operation that obtains a lock, it MUST send a
      RECLAIM_COMPLETE with rca_one_fs set to FALSE, even if there
      are no locks to 
      reclaim.  If non-reclaim
      locking operations are done before the RECLAIM_COMPLETE, an NFS4ERR_GRACE
      error will be returned.
    </t>
    <t>
      Similarly, when the client accesses a migrated file system on a new
      server, before it sends the first non-reclaim operation that
      obtains a lock on this new server, it MUST send a RECLAIM_COMPLETE
      with rca_one_fs set to TRUE and current filehandle within that file system,
      even if there are no locks to reclaim.  If non-reclaim locking
      operations are done on that file system before the
      RECLAIM_COMPLETE, an NFS4ERR_GRACE error will be returned.
    </t>
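    <t>
      As a non-normative illustration, the two forms might appear in
      COMPOUND requests as sketched below.  The filehandles and the
      particular lock-reclaiming operations shown are hypothetical.
    </t>
    <figure>
     <artwork>
   // After a server restart, on a newly established client ID:
   SEQUENCE
   PUTFH fh1
   OPEN (CLAIM_PREVIOUS)                // reclaim an open held before
   RECLAIM_COMPLETE (rca_one_fs=FALSE)  // global; no further reclaims

   // After migration of a single file system:
   SEQUENCE
   PUTFH fh2                            // object in the migrated fs
   LOCK (reclaim=TRUE)                  // reclaim a byte-range lock
   RECLAIM_COMPLETE (rca_one_fs=TRUE)   // per-fs; uses the current
                                        // filehandle
     </artwork>
    </figure>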
    <t>
      It should be noted that there are situations in which a client needs
      to issue both forms of RECLAIM_COMPLETE.  An example is an instance
      of file system migration in which the file system is migrated to a
      server for which the client has no client ID.  As a result, the client
      needs to obtain a client ID from the server (incurring the
      responsibility to do a RECLAIM_COMPLETE with rca_one_fs set to FALSE)
      as well as to do a RECLAIM_COMPLETE with rca_one_fs set to TRUE to
      complete the per-fs grace period associated with the file system
      migration.
    </t>
    <t>
      Any locks not reclaimed at the point at which RECLAIM_COMPLETE
      is done become non-reclaimable.  The client MUST NOT attempt 
      to reclaim them, either during 
      the current server instance or in any subsequent
      server instance, or on another server to which responsibility
      for that file system is transferred.  If the client were to do so, 
      it would be
      violating the protocol by representing itself as owning locks
      that it does not own, and so has no right to reclaim.  See
      Section 8.4.3 of <xref target="RFC5661"/> for a 
      discussion of edge conditions related to lock reclaim.
    </t>
    <t>
      By sending a RECLAIM_COMPLETE, the client indicates readiness
      to proceed to do normal non-reclaim locking operations.  The client
      should be aware that such operations may temporarily result in 
      NFS4ERR_GRACE errors until the server is ready to terminate its
      grace period.
    </t>
  </section>
  <section toc="exclude" anchor="OP_RECLAIM_COMPLETE_IMPLEMENTATION" title="IMPLEMENTATION">
    <t>
      Servers will typically use the information as to when reclaim
      activity is complete to reduce the length of the grace period.
      When the server maintains in persistent storage
      a list of clients that might have had locks,
      it is able to use the fact that
      all such clients have done a RECLAIM_COMPLETE to terminate the
      grace period and begin normal operations (i.e., grant requests
      for new locks) sooner than it might otherwise.
    </t>
    <t>
      Latency can be minimized by doing a RECLAIM_COMPLETE as part of
      the COMPOUND request in which the last lock-reclaiming operation
      is done.  When there are no reclaims to be done, RECLAIM_COMPLETE
      should be done immediately in order to allow the grace period 
      to end as soon as possible.
    </t>
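    <t>
      As a non-normative sketch of this latency optimization, the last
      lock-reclaiming operation and the RECLAIM_COMPLETE can share a
      single COMPOUND request (the filehandle shown is hypothetical):
    </t>
    <figure>
     <artwork>
   SEQUENCE
   PUTFH fh1
   OPEN (CLAIM_PREVIOUS)                // final remaining reclaim
   RECLAIM_COMPLETE (rca_one_fs=FALSE)  // sent in the same round trip
     </artwork>
    </figure>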
    <t>
      RECLAIM_COMPLETE should only be done once for each server instance
      or for each occasion of the transition of a file system.
      If it is done a second time, the error NFS4ERR_COMPLETE_ALREADY will 
      result.  Note that because of the session feature's retry protection,
      retries of COMPOUND
      requests containing a RECLAIM_COMPLETE operation will not result 
      in this error.
    </t>
    <t>
      When a RECLAIM_COMPLETE is sent, the client effectively acknowledges
      any locks not yet reclaimed as lost.  This allows the server to
      re-enable the client to recover locks if the occurrence of edge
      conditions, as described in Section 8.4.3 of <xref target="RFC5661"/>,
      had caused the server to disable the client's ability to
      recover locks.
    </t>
    <t>
      Because previous descriptions of RECLAIM_COMPLETE were not 
      sufficiently explicit about the circumstances in which use of
      RECLAIM_COMPLETE with rca_one_fs set to TRUE was appropriate,
      there have been cases in which it has been misused by clients and
      cases in which servers have, in various ways, not responded to
      such misuse as described above.  While clients SHOULD NOT misuse
      this feature and servers SHOULD respond to such misuse as described
      above, implementers need to be aware of the following considerations
      as they make necessary tradeoffs between interoperability with
      existing implementations and proper support for facilities to
      allow lock recovery in the event of file system migration.
    <list style="symbols">
      <t>
	When servers have no support for becoming the destination server
	of a file system subject to migration, there is no possibility of
	a per-fs RECLAIM_COMPLETE being done legitimately, and occurrences
	of it SHOULD be ignored.  However, the negative consequences of
	accepting such mistaken use are quite limited as long as the client
	does not issue it before all necessary reclaims are done.
      </t>	
      <t>
	When a server might become the destination for a file system being
	migrated, inappropriate use of a per-fs RECLAIM_COMPLETE is more
	concerning.  In the case in which the file system designated is not
	within a per-fs grace period, it SHOULD be ignored, with the
	negative consequences of accepting it being limited, as in the
	case in which migration is not supported.  However, if the per-fs
	RECLAIM_COMPLETE should encounter a file system undergoing
	migration, it cannot be accepted as if it were a global
	RECLAIM_COMPLETE without invalidating its intended use.
      </t>	
    </list>	
    </t>
  </section>
</section>
        
      <section title="Security Considerations"
	       anchor="SECCON">
      <t>
        The Security Considerations section of
        <xref target="RFC5661" /> needs the additions below to
        properly address some aspects of trunking discovery, referral,
        migration and replication.
      <list style="none">
        <t>
          The possibility that requests to determine the set of network 
          addresses corresponding to a given server might be interfered 
          with or have their responses corrupted needs to be taken into 
          account.  In light of this, the following considerations 
          should be taken note of:
        <list style="format o  ">
          <t>
            When DNS is used to convert server names to addresses 
            and DNSSEC <xref target="RFC4033"/>
	    is not available, the validity of the network
            addresses returned cannot be relied upon.  However, when the
            client uses RPCSEC_GSS to access the designated server,
            mutual authentication makes it possible to
            discover that invalid server addresses have been provided.
          </t>
          <t>
            The fetching of 
            attributes containing location information SHOULD be 
            performed using RPCSEC_GSS with integrity protection, 
            as previously 
            explained in the Security Considerations section of 
            <xref target="RFC5661"/>.  It is important to note here that 
            a client making a request of this sort without using
            RPCSEC_GSS including integrity protection needs to be aware of 
            the negative consequences of doing so, which can lead to 
            invalid host names or network addresses being returned.  
            In light of
            this, the client needs to recognize that using such returned 
            location information to access an NFSv4 server
            without use of RPCSEC_GSS (i.e.,
            by using AUTH_SYS) poses dangers, as it can result in the client
            interacting with an unverified network address posing as an
            NFSv4 server. 
          </t>
          <t>
            Despite the fact that it is a REQUIREMENT (of 
            <xref target="RFC5661"/>) that "implementations" provide
            "support" for use of RPCSEC_GSS, it cannot be assumed that 
            use of RPCSEC_GSS is always available between any particular
            client-server pair.
          </t>
          <t>
            When a client has the network addresses of a server but not the
            associated host names, that would interfere with its ability
            to use RPCSEC_GSS.
          </t>
        </list>
        </t>
        <t>
          In light of the above, a server should present location
          entries that correspond to file systems on other servers using a
          host name.  This would allow the client to interrogate the 
          fs_locations on the destination server to obtain trunking information
          (as well as replica information) using RPCSEC_GSS with integrity,
          validating the name provided while assuring that the response has
          not been corrupted. 
        </t> 
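        <t>
          As a non-normative illustration (the names and paths shown are
          hypothetical), such a location entry would carry a host name
          rather than a raw network address:
        <figure>
         <artwork>
   fs_location4 {
       server   = { "server2.example.net" };  // host name, not address
       rootpath = "/export/vol1";
   }
         </artwork>
        </figure>
        </t>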
	<t>
          When RPCSEC_GSS is not available on a server, the client needs
          to be aware of the fact that the location entries are subject to
          corruption and cannot be relied upon.  In the case of a 
          client being directed to another server after NFS4ERR_MOVED, 
          this could vitiate the
          authentication provided by the use of RPCSEC_GSS on the destination.
	  Even when RPCSEC_GSS authentication is available
	  on the destination, the server might validly represent itself
	  as the server to which the
          client was erroneously directed.  Without a way to decide whether
          the server is a valid one, the client can only determine, using 
          RPCSEC_GSS, that the server corresponds to the name provided, with
          no basis for trusting that server.  As a result, the client should
          not use such unverified location entries as a basis for migration,
	  even though RPCSEC_GSS might be available on the destination.
	</t>       
	<t>
          When a location attribute is fetched upon connecting with an 
          NFS server, it SHOULD, as stated above, be done using RPCSEC_GSS 
          with integrity protection.  When this is not possible, it is 
          generally best for the client to ignore trunking and replica 
          information or simply not fetch the location information for 
          these purposes.
	</t>
	<t>
          When location information cannot be verified, it can be subjected 
          to additional filtering to prevent the client from being 
          inappropriately directed.  For example, if a range of network 
          addresses can be determined that assures that the servers and 
          clients using AUTH_SYS are subject to the appropriate set of 
          constraints (e.g., physical network isolation, administrative 
          controls on the operating systems used), then network addresses 
          in the appropriate range can be used, with others discarded 
          or restricted in their use of AUTH_SYS.
	</t>
        <t>
          To summarize considerations regarding the use of RPCSEC_GSS in
          fetching location information, we need to consider the following 
          possibilities for requests to interrogate location information, with
          interrogation approaches on the referring and destination servers  
          arrived at separately:
        <list style="format o  ">
          <t>
            The use of RPCSEC_GSS with integrity protection is RECOMMENDED 
            in all cases, since the absence of integrity protection exposes
	    the client to the possibility of the results
            being modified in transit.
          </t>
          <t>
            The use of requests issued without RPCSEC_GSS 
            (i.e., using AUTH_SYS),
            while undesirable, may not be avoidable in all cases.  
            Where the use
            of the returned information cannot be avoided, it should be 
            subject to filtering to eliminate the possibility that the
            client would
            treat an invalid address as if it were an NFSv4 server.  The 
            specifics will vary depending on the degree of network isolation
            and whether the request is to the referring or destination servers.
          </t>
        </list>
        </t>
      </list>
      </t>
    </section>
  
    <section title="IANA Considerations"
             anchor="IANA">
      <t>
        This document does not require actions by IANA.
      </t>
    </section>        
          
  </middle>
  <back>
    <references title="Normative References">
      <?rfc include="reference.RFC.2119.xml"?>
      <?rfc include="reference.RFC.2203.xml"?>
      <?rfc include="reference.RFC.4033.xml"?> 
      <?rfc include="reference.RFC.4055.xml"?> 
      <?rfc include="reference.RFC.5403.xml"?> 
      <?rfc include="reference.RFC.5531.xml"?> 
      <?rfc include="reference.RFC.5661.xml"?> 
      <?rfc include="reference.RFC.7530.xml"?>
      <?rfc include="reference.RFC.7861.xml"?>
      <?rfc include="reference.RFC.7931.xml"?>
      <?rfc include="reference.RFC.8166.xml"?>
      <?rfc include="reference.RFC.8178.xml"?>
      <reference anchor="CSOR_AES">
      <front>
        <title>
	  Cryptographic Algorithm Object Registration
        </title>
	<author>
	  <organization>
	    National Institute of Standards and Technology
	  </organization>
	</author>
	<date month="November" year="2007" />
      </front>
        <seriesInfo name="URL" value="http://csrc.nist.gov/groups/ST/crypto_apps_infra/csor/algorithms.html" />
      </reference>
    </references>
    <references title="Informative References">
        <?rfc include="reference.RFC.2104.xml"?>
        <?rfc include="reference.I-D.cel-nfsv4-mv0-trunking-update.xml"?>
    </references>
 
    <section title="Classification of Document Sections"
	     anchor="CLASS">
      <t>
	Using the classification appearing in <xref target="PRELIM-rel"/>, we 
        can proceed through the current document and classify its sections
        as listed below.  In this listing, when we refer to a Section X and 
        there is a Section X.1 within it, the classification of Section X
	refers to the
        part of that section exclusive of subsections.  In the case when that
        portion is empty, the section is not counted. 
      <list style="symbols">
        <t>
          Sections <xref target="INTRO" format="counter"/> through
          <xref target="SEC11" format="counter"/>, a total of five
	  sections, are all explanatory.
        </t>
        <t>
	  <xref target="SEC11-msns-oview"/>
	  is a replacement section.
        </t>
        <t>
	  <xref target="SEC11-loc-attr"/> is an additional section.
        </t>
        <t>
	  <xref target="SEC11-loc-attr"/> is a replacement section.
        </t>
        <t>
	  <xref target="SEC11-uses-reorg"/> is explanatory.
        </t>
        <t>
	  <xref target="SEC11-USES"/> is a replacement section.
        </t>
        <t>
	  Sections <xref target="SEC11-USES-mult" format="counter"/> 
          through <xref target="SEC11-USES-types"  format="counter"/>, 
          a total of three sections, are all additional sections.
        </t>
        <t>
	  Sections <xref target="SEC11-USES-repl" format="counter"/> 
          through <xref target="SEC11-USES-ref"  format="counter"/>, 
          a total of three sections, are all replacement sections.
        </t>
        <t>
	  <xref target="SEC11-USES-changes"/> is an additional section.
        </t>
        <t> 
	  <xref target="SEC11-trans-reorg"/> is explanatory.
        </t>
        <t>
	  Sections <xref target="SEC11-trans-oview" format="counter"/> 
          and  <xref target="SEC11-nwa"  format="counter"/>
	  are additional sections.
        </t>
        <t>
	  Sections <xref target="SEC11-EFF" format="counter"/> 
          through  <xref target="SEC11-EFF-lock"  format="counter"/>,
	  a total of ten sections, are all replacement sections.
        </t>
        <t>
	  Sections <xref target="SEC11-trans-locking" format="counter"/> 
          through  <xref target="SEC11-XS-session"  format="counter"/>,
	  a total of twelve sections, are all additional sections.
        </t>
        <t>
	  <xref target="SEC11-li-changes"/> is explanatory.
        </t>
        <t>
	  Sections <xref target="SEC11-li-new" format="counter"/>
	  through <xref target="SEC11-fsli-item" format="counter"/>,
	  a total of four sections, are all replacement sections.
        </t>
        <t>
	  <xref target="OTH"/> is explanatory.
        </t>
        <t>
	  Sections <xref target="OTH-intro" format="counter"/>
	  and <xref target="OTH-scope" format="counter"/> are
	  replacement sections.
        </t>
        <t>
	  Sections <xref target="OTH-moved" format="counter"/> 
          and  <xref target="OTH-so"  format="counter"/>
	  are editing sections.
        </t>
        <t>
	  Sections <xref target="OTH-eid" format="counter"/>
	  and <xref target="OTH-rc" format="counter"/>
	  are explanatory.
        </t>
	<t>
	  <xref target="OTH-recerror"/> is a replacement section, which
	  consists of a total of six sections.
        </t>
        <t>
	  <xref target="EXID"/> is a replacement section, which consists
	  of a total of five sections.
        </t>
        <t>
	  <xref target="RC"/> is a replacement section, which consists
	  of a total of five sections.
        </t>
        <t>
	  <xref target="SECCON"/> is an editing section.
        </t>
        <t>
          <xref target="IANA"/> through  Acknowledgments,
	  a total of six sections, 
	  are all explanatory.
        </t>
      </list>
      </t>
      <t>
	To summarize:
      <list style="symbols">
        <t>
	  There are seventeen explanatory sections.
        </t>
        <t>
	  There are thirty-seven replacement sections.
        </t>
        <t>
	  There are eighteen additional sections.	  
        </t>
        <t>
	  There are three editing sections.
        </t>
      </list>
      </t>
    </section>
    <section title="Updates to RFC5661"
	     anchor="UPD">
      <t>
        In this appendix, we proceed through <xref target="RFC5661"/> 
        identifying sections as unchanged, modified, deleted,  
        or replaced and indicating where 
        additional sections from the current document would appear in an
        eventual consolidated description of NFSv4.1.  In this presentation,
	when section X is referred to, it denotes that section plus all
	included subsections.  When it is necessary 
        to refer to the part of a section outside any included subsections, the
        exclusion is noted explicitly.
      <list style="symbols">
        <t>
	  Section 1 is unmodified except that Section 1.7.3.3 is to be
	  replaced by <xref target="OTH-intro"/> from the current document.
        </t>
        <t>
	  Section 2 is unmodified except for the specific items listed below:
        <list style="format o  ">
          <t>
	    Section 2.10.4 is replaced by <xref target="OTH-scope"/>
	    from the current document.
          </t>
          <t>
	    Section 2.10.5 is modified as discussed in
	    <xref target="OTH-so"/> of the current document.
          </t>
        </list>
        </t>
        <t>
	  Sections 3 through 10 are unchanged.
        </t>
        <t>
	  Section 11 is extensively modified as discussed below.
        <list style="format o  ">
          <t>
	    Section 11, exclusive of subsections, is replaced by
	    Sections <xref target="SEC11-msns-oview" format="counter"/>
	    and <xref target="SEC11-loc-term" format="counter"/> from the 
            current document.
          </t>
          <t>
	    Section 11.1 is replaced by <xref target="SEC11-loc-attr"/>
	    from the current document.
          </t>
          <t>
	    Sections 11.2, 11.3, 11.3.1, and 11.3.2 are unchanged.
          </t>
          <t>
	    Section 11.4 is replaced by <xref target="SEC11-USES"/> from
	    the current document.  For details regarding subsections
	    see below.
          <list style="format o  ">
            <t>
	      New sections corresponding to Sections
	      <xref target="SEC11-USES-mult" format="counter"/>
	      through <xref target="SEC11-USES-types" format="counter"/>
	      from the current
	      document appear next.
            </t>
            <t>
	      Section 11.4.1 is replaced by <xref target="SEC11-USES-repl"/>
            </t>
            <t>
	      Section 11.4.2 is replaced by <xref target="SEC11-USES-migr"/>
            </t>
            <t>
	      Section 11.4.3 is replaced by <xref target="SEC11-USES-ref"/>
            </t>
            <t>
	      A new section corresponding to 
	      <xref target="SEC11-USES-changes"/>
	      from the current
	      document appears next.
            </t>
          </list>
          </t>
          <t>
	    Section 11.5 is to be deleted.
          </t>
          <t>
	    Section 11.6 is unchanged.
          </t>
          <t>
	    New sections corresponding to Sections
	    <xref target="SEC11-trans-oview" format="counter"/>
	    and <xref target="SEC11-nwa" format="counter"/> from the current
	    document appear next.
          </t>
          <t>
	    Section 11.7 is replaced by <xref target="SEC11-EFF"/> from
	    the current document.  For details regarding subsections
	    see below.
          <list style="format o  ">
            <t>
	      Section 11.7.1 is replaced by <xref target="SEC11-EFF-simul"/>
            </t>
            <t>
	      Sections 11.7.2, 11.7.2.1, and 11.7.2.2 are deleted.
            </t>
            <t>
	      Section 11.7.3 is replaced by <xref target="SEC11-EFF-fh"/>
            </t>
            <t>
	      Section 11.7.4 is replaced by <xref target="SEC11-EFF-fileid"/>
            </t>
            <t>
	      Sections 11.7.5 and 11.7.5.1 are replaced by Sections
	      <xref target="SEC11-EFF-fsid" format="counter"/> and
	      <xref target="SEC11-EFF-fsid-split" format="counter"/>
              respectively.
            </t>
            <t>
	      Section 11.7.6 is replaced by <xref target="SEC11-EFF-change"/>
            </t>
            <t>
	      Section 11.7.7, exclusive of subsections, is replaced 
              by <xref target="SEC11-EFF-lock"/>.  Sections 11.7.7.1 and
              11.7.7.2 are unchanged.
            </t>
            <t>
	      Section 11.7.8 is replaced by <xref target="SEC11-EFF-wv"/>
            </t>
            <t>
	      Section 11.7.9 is replaced by <xref target="SEC11-EFF-rdc"/>
            </t>
            <t>
	      Section 11.7.10 is replaced by <xref target="SEC11-EFF-data"/>
            </t>
          </list>
          </t>
          <t>
            Sections 11.8, 11.8.1, 11.8.2, and 11.9, are unchanged.
          </t>
          <t>
            Sections 11.10, 11.10.1, 11.10.2, and 11.10.3 are replaced
	    by Sections <xref target="SEC11-li-new" format="counter"/>
	    through <xref target="SEC11-fsli-item" format="counter"/>.
          </t>
          <t>
            Section 11.11 is unchanged.
          </t>
          <t>
	    New sections corresponding to Sections
	    <xref target="SEC11-trans-locking" format="counter"/>,
	    <xref target="SEC11-trans-client" format="counter"/>,
	    and <xref target="SEC11-trans-server" format="counter"/>
	    from the current
	    document appear next as additional sub-sections of
	    Section 11.  Each of these has subsections, so there is a total of
	    seventeen sections added.
          </t>
        </list>
        </t>
        <t>
	  Sections 12 through 14 are unchanged.
        </t>
	
        <t>
	  Section 15 is unmodified except that
	<list style="symbols">
	  <t>  
	    The description of
	    NFS4ERR_MOVED in Section 15.1 is revised as described in
	    <xref target="OTH-moved"/> of the current document.
          </t>
          <t>
	    The description of the reclaim-related errors in section 15.1.9
	    is replaced by the revised descriptions in
	    <xref target="OTH-recerror"/> of the current document.
          </t>
        </list>
        </t>
        <t>
	  Sections 16 and 17 are unchanged.
        </t>
        <t>
	  Section 18 is unmodified except that
	<list style="symbols">
	  <t>  
	    Section 18.35 is replaced by <xref target="EXID"/>
	    in the current document.
          </t>
          <t>
	    Section 18.51 is replaced by <xref target="RC"/>
	    in the current document.
          </t>
        </list>
        </t>
        <t>
	  Sections 19 through 23 are unchanged.
        </t>
      </list>
      </t>
      <t>
	In terms of top-level sections, exclusive of appendices:
      <list style="symbols">
        <t>
	  There is one heavily modified top-level section (Section 11)
        </t>
        <t>
	  There are four other modified top-level sections (Sections 1,
	  2, 15, and 18).
        </t>
        <t>
	  The other eighteen top-level sections are unchanged. 
        </t>
      </list>
      </t>
      <t>
        The disposition of sections of <xref target="RFC5661"/> is 
        summarized in the following table which provides counts of sections
        replaced, added, deleted, modified, or unchanged.  Separate counts 
        are provided for:
      <list style="symbols">
        <t>
          Top-level sections.
        </t>
        <t>
          Sections with TOC entries.
        </t>
        <t>
          Sections within Section 11.
        </t>
        <t>
          Sections outside Section 11.
        </t>
      </list>
      </t>
      <t>
        In this table, the counts for top-level sections and TOC entries
        are for sections including subsections while other counts 
        are for sections exclusive of included subsections.   
      </t>
      <texttable>
        <ttcol>
          Status
        </ttcol>
        <ttcol>
          Top
        </ttcol>
        <ttcol>
          TOC
        </ttcol>
        <ttcol>
          in 11
        </ttcol>
        <ttcol>
          not in 11
        </ttcol>
        <ttcol>
          Total
        </ttcol>
        <c>Replaced</c><c>0</c><c>6</c><c>21</c><c>15</c><c>36</c>
        <c>Added</c><c>0</c><c>5</c><c>24</c><c>0</c><c>24</c>
        <c>Deleted</c><c>0</c><c>1</c><c>4</c><c>0</c><c>4</c>
        <c>Modified </c><c>5</c><c>3</c><c>0</c><c>2</c><c>2</c>
        <c>Unchanged</c><c>18</c><c>210</c><c>12</c><c>910</c><c>922</c>
        <c>in RFC5661</c><c>23</c><c>220</c><c>37</c><c>927</c><c>964</c>
      </texttable>
    </section>
    <section title="Acknowledgments"
	     numbered="no"
	     anchor="ACK">
      <t>
        The authors wish to acknowledge the important role 
        of Andy Adamson of NetApp 
        in clarifying the need for trunking discovery functionality, and
        exploring the role of the location attributes in providing the
        necessary support.
      </t>
      <t>
        The authors also wish to acknowledge the work of Xuan Qi of Oracle 
        with NFSv4.1 client and server prototypes of transparent state
        migration functionality.
      </t>
      <t>
        The authors wish to thank others that brought attention to important
	issues.  The comments of Trond Myklebust of Primary Data related
	to trunking helped to clarify the role of DNS in
        trunking discovery.  Rick Macklem's comments brought attention to
	problems in the handling of the per-fs version of
	RECLAIM_COMPLETE.
      </t>
      <t>
        The authors wish to thank Olga Kornievskaia of NetApp for her helpful
        review comments.
      </t>
    </section>
  </back>
</rfc>

