<?xml version='1.0' encoding='utf-8'?>
<rfc xmlns:xi="http://www.w3.org/2001/XInclude" version="3" ipr="trust200902" docName="draft-ietf-cbor-update-8610-grammar-06" number="9682" category="std" consensus="true" submissionType="IETF" updates="8610" obsoletes="" tocInclude="true" sortRefs="true" symRefs="true" xml:lang="en" prepTime="2024-11-18T15:53:18" indexInclude="true" scripts="Common,Latin" tocDepth="3">
  <link href="https://datatracker.ietf.org/doc/draft-ietf-cbor-update-8610-grammar-06" rel="prev"/>
  <link href="https://dx.doi.org/10.17487/rfc9682" rel="alternate"/>
  <link href="urn:issn:2070-1721" rel="alternate"/>
  <front>
    <title abbrev="CDDL grammar updates">Updates to the Concise Data Definition Language (CDDL) Grammar</title>
    <seriesInfo name="RFC" value="9682" stream="IETF"/>
    <author initials="C." surname="Bormann" fullname="Carsten Bormann">
      <organization showOnFrontPage="true">Universität Bremen TZI</organization>
      <address>
        <postal>
          <street>Postfach 330440</street>
          <city>Bremen</city>
          <code>D-28359</code>
          <country>Germany</country>
        </postal>
        <phone>+49-421-218-63921</phone>
        <email>cabo@tzi.org</email>
      </address>
    </author>
    <date month="11" year="2024"/>
    <area>ART</area>
    <workgroup>cbor</workgroup>
    <keyword>Concise Data Definition Language</keyword>
    <abstract pn="section-abstract">
      <t indent="0" pn="section-abstract-1">The Concise Data Definition Language (CDDL), as defined in
RFCs 8610 and 9165,
provides an easy and unambiguous way to express structures for
protocol messages and data formats that are represented in Concise Binary Object Representation (CBOR) or
JSON.</t>
      <t indent="0" pn="section-abstract-2">This document updates RFC 8610 by addressing related errata reports and making
other small fixes for the ABNF grammar defined for CDDL.</t>
    </abstract>
    <boilerplate>
      <section anchor="status-of-memo" numbered="false" removeInRFC="false" toc="exclude" pn="section-boilerplate.1">
        <name slugifiedName="name-status-of-this-memo">Status of This Memo</name>
        <t indent="0" pn="section-boilerplate.1-1">
            This is an Internet Standards Track document.
        </t>
        <t indent="0" pn="section-boilerplate.1-2">
            This document is a product of the Internet Engineering Task Force
            (IETF).  It represents the consensus of the IETF community.  It has
            received public review and has been approved for publication by
            the Internet Engineering Steering Group (IESG).  Further
            information on Internet Standards is available in Section 2 of 
            RFC 7841.
        </t>
        <t indent="0" pn="section-boilerplate.1-3">
            Information about the current status of this document, any
            errata, and how to provide feedback on it may be obtained at
            <eref target="https://www.rfc-editor.org/info/rfc9682" brackets="none"/>.
        </t>
      </section>
      <section anchor="copyright" numbered="false" removeInRFC="false" toc="exclude" pn="section-boilerplate.2">
        <name slugifiedName="name-copyright-notice">Copyright Notice</name>
        <t indent="0" pn="section-boilerplate.2-1">
            Copyright (c) 2024 IETF Trust and the persons identified as the
            document authors. All rights reserved.
        </t>
        <t indent="0" pn="section-boilerplate.2-2">
            This document is subject to BCP 78 and the IETF Trust's Legal
            Provisions Relating to IETF Documents
            (<eref target="https://trustee.ietf.org/license-info" brackets="none"/>) in effect on the date of
            publication of this document. Please review these documents
            carefully, as they describe your rights and restrictions with
            respect to this document. Code Components extracted from this
            document must include Revised BSD License text as described in
            Section 4.e of the Trust Legal Provisions and are provided without
            warranty as described in the Revised BSD License.
        </t>
      </section>
    </boilerplate>
    <toc>
      <section anchor="toc" numbered="false" removeInRFC="false" toc="exclude" pn="section-toc.1">
        <name slugifiedName="name-table-of-contents">Table of Contents</name>
        <ul bare="true" empty="true" indent="2" spacing="compact" pn="section-toc.1-1">
          <li pn="section-toc.1-1.1">
            <t indent="0" keepWithNext="true" pn="section-toc.1-1.1.1"><xref derivedContent="1" format="counter" sectionFormat="of" target="section-1"/>.  <xref derivedContent="" format="title" sectionFormat="of" target="name-introduction">Introduction</xref></t>
            <ul bare="true" empty="true" indent="2" spacing="compact" pn="section-toc.1-1.1.2">
              <li pn="section-toc.1-1.1.2.1">
                <t indent="0" keepWithNext="true" pn="section-toc.1-1.1.2.1.1"><xref derivedContent="1.1" format="counter" sectionFormat="of" target="section-1.1"/>.  <xref derivedContent="" format="title" sectionFormat="of" target="name-conventions-and-definitions">Conventions and Definitions</xref></t>
              </li>
            </ul>
          </li>
          <li pn="section-toc.1-1.2">
            <t indent="0" pn="section-toc.1-1.2.1"><xref derivedContent="2" format="counter" sectionFormat="of" target="section-2"/>.  <xref derivedContent="" format="title" sectionFormat="of" target="name-clarifications-and-changes-">Clarifications and Changes Based on Errata Reports</xref></t>
            <ul bare="true" empty="true" indent="2" spacing="compact" pn="section-toc.1-1.2.2">
              <li pn="section-toc.1-1.2.2.1">
                <t indent="0" pn="section-toc.1-1.2.2.1.1"><xref derivedContent="2.1" format="counter" sectionFormat="of" target="section-2.1"/>.  <xref derivedContent="" format="title" sectionFormat="of" target="name-updates-to-string-literal-g">Updates to String Literal Grammar</xref></t>
                <ul bare="true" empty="true" indent="2" spacing="compact" pn="section-toc.1-1.2.2.1.2">
                  <li pn="section-toc.1-1.2.2.1.2.1">
                    <t indent="0" keepWithNext="true" pn="section-toc.1-1.2.2.1.2.1.1"><xref derivedContent="2.1.1" format="counter" sectionFormat="of" target="section-2.1.1"/>.  <xref derivedContent="" format="title" sectionFormat="of" target="name-erratum-id-6527-text-string">Erratum ID 6527 (Text String Literals)</xref></t>
                  </li>
                  <li pn="section-toc.1-1.2.2.1.2.2">
                    <t indent="0" pn="section-toc.1-1.2.2.1.2.2.1"><xref derivedContent="2.1.2" format="counter" sectionFormat="of" target="section-2.1.2"/>.  <xref derivedContent="" format="title" sectionFormat="of" target="name-erratum-id-6278-consistent-">Erratum ID 6278 (Consistent String Literals)</xref></t>
                  </li>
                  <li pn="section-toc.1-1.2.2.1.2.3">
                    <t indent="0" pn="section-toc.1-1.2.2.1.2.3.1"><xref derivedContent="2.1.3" format="counter" sectionFormat="of" target="section-2.1.3"/>.  <xref derivedContent="" format="title" sectionFormat="of" target="name-addressing-erratum-id-6526-">Addressing Erratum ID 6526 and Erratum ID 6543</xref></t>
                  </li>
                </ul>
              </li>
              <li pn="section-toc.1-1.2.2.2">
                <t indent="0" pn="section-toc.1-1.2.2.2.1"><xref derivedContent="2.2" format="counter" sectionFormat="of" target="section-2.2"/>.  <xref derivedContent="" format="title" sectionFormat="of" target="name-examples-demonstrating-the-">Examples Demonstrating the Updated String Syntaxes</xref></t>
              </li>
            </ul>
          </li>
          <li pn="section-toc.1-1.3">
            <t indent="0" pn="section-toc.1-1.3.1"><xref derivedContent="3" format="counter" sectionFormat="of" target="section-3"/>.  <xref derivedContent="" format="title" sectionFormat="of" target="name-small-enabling-grammar-chan">Small Enabling Grammar Changes</xref></t>
            <ul bare="true" empty="true" indent="2" spacing="compact" pn="section-toc.1-1.3.2">
              <li pn="section-toc.1-1.3.2.1">
                <t indent="0" pn="section-toc.1-1.3.2.1.1"><xref derivedContent="3.1" format="counter" sectionFormat="of" target="section-3.1"/>.  <xref derivedContent="" format="title" sectionFormat="of" target="name-empty-data-models">Empty Data Models</xref></t>
              </li>
              <li pn="section-toc.1-1.3.2.2">
                <t indent="0" pn="section-toc.1-1.3.2.2.1"><xref derivedContent="3.2" format="counter" sectionFormat="of" target="section-3.2"/>.  <xref derivedContent="" format="title" sectionFormat="of" target="name-non-literal-tag-numbers-and">Non-Literal Tag Numbers and Simple Values</xref></t>
              </li>
            </ul>
          </li>
          <li pn="section-toc.1-1.4">
            <t indent="0" pn="section-toc.1-1.4.1"><xref derivedContent="4" format="counter" sectionFormat="of" target="section-4"/>.  <xref derivedContent="" format="title" sectionFormat="of" target="name-security-considerations">Security Considerations</xref></t>
          </li>
          <li pn="section-toc.1-1.5">
            <t indent="0" pn="section-toc.1-1.5.1"><xref derivedContent="5" format="counter" sectionFormat="of" target="section-5"/>.  <xref derivedContent="" format="title" sectionFormat="of" target="name-iana-considerations">IANA Considerations</xref></t>
          </li>
          <li pn="section-toc.1-1.6">
            <t indent="0" pn="section-toc.1-1.6.1"><xref derivedContent="6" format="counter" sectionFormat="of" target="section-6"/>.  <xref derivedContent="" format="title" sectionFormat="of" target="name-references">References</xref></t>
            <ul bare="true" empty="true" indent="2" spacing="compact" pn="section-toc.1-1.6.2">
              <li pn="section-toc.1-1.6.2.1">
                <t indent="0" pn="section-toc.1-1.6.2.1.1"><xref derivedContent="6.1" format="counter" sectionFormat="of" target="section-6.1"/>.  <xref derivedContent="" format="title" sectionFormat="of" target="name-normative-references">Normative References</xref></t>
              </li>
              <li pn="section-toc.1-1.6.2.2">
                <t indent="0" pn="section-toc.1-1.6.2.2.1"><xref derivedContent="6.2" format="counter" sectionFormat="of" target="section-6.2"/>.  <xref derivedContent="" format="title" sectionFormat="of" target="name-informative-references">Informative References</xref></t>
              </li>
            </ul>
          </li>
          <li pn="section-toc.1-1.7">
            <t indent="0" pn="section-toc.1-1.7.1"><xref derivedContent="Appendix A" format="default" sectionFormat="of" target="section-appendix.a"/>.  <xref derivedContent="" format="title" sectionFormat="of" target="name-updated-collected-abnf-for-">Updated Collected ABNF for CDDL</xref></t>
          </li>
          <li pn="section-toc.1-1.8">
            <t indent="0" pn="section-toc.1-1.8.1"><xref derivedContent="Appendix B" format="default" sectionFormat="of" target="section-appendix.b"/>.  <xref derivedContent="" format="title" sectionFormat="of" target="name-details-about-covering-erra">Details about Covering Erratum ID 6543</xref></t>
            <ul bare="true" empty="true" indent="2" spacing="compact" pn="section-toc.1-1.8.2">
              <li pn="section-toc.1-1.8.2.1">
                <t indent="0" pn="section-toc.1-1.8.2.1.1"><xref derivedContent="B.1" format="counter" sectionFormat="of" target="section-appendix.b.1"/>.  <xref derivedContent="" format="title" sectionFormat="of" target="name-change-proposed-by-erratum-">Change Proposed by Erratum ID 6543</xref></t>
              </li>
              <li pn="section-toc.1-1.8.2.2">
                <t indent="0" pn="section-toc.1-1.8.2.2.1"><xref derivedContent="B.2" format="counter" sectionFormat="of" target="section-appendix.b.2"/>.  <xref derivedContent="" format="title" sectionFormat="of" target="name-no-further-change-needed-af">No Further Change Needed after Updating String Literal Grammar</xref></t>
              </li>
            </ul>
          </li>
          <li pn="section-toc.1-1.9">
            <t indent="0" pn="section-toc.1-1.9.1"><xref derivedContent="" format="none" sectionFormat="of" target="section-appendix.c"/><xref derivedContent="" format="title" sectionFormat="of" target="name-acknowledgments">Acknowledgments</xref></t>
          </li>
          <li pn="section-toc.1-1.10">
            <t indent="0" pn="section-toc.1-1.10.1"><xref derivedContent="" format="none" sectionFormat="of" target="section-appendix.d"/><xref derivedContent="" format="title" sectionFormat="of" target="name-authors-address">Author's Address</xref></t>
          </li>
        </ul>
      </section>
    </toc>
  </front>
  <middle>
    <section anchor="introduction" numbered="true" removeInRFC="false" toc="include" pn="section-1">
      <name slugifiedName="name-introduction">Introduction</name>
      <t indent="0" pn="section-1-1">The Concise Data Definition Language (CDDL), as defined in
<xref target="RFC8610" format="default" sectionFormat="of" derivedContent="RFC8610"/> and <xref target="RFC9165" format="default" sectionFormat="of" derivedContent="RFC9165"/>,
provides an easy and unambiguous way to express structures for
protocol messages and data formats that are represented in CBOR or
JSON.</t>
      <t indent="0" pn="section-1-2">This document updates <xref target="RFC8610" format="default" sectionFormat="of" derivedContent="RFC8610"/> by addressing errata reports and making
other small fixes for the ABNF grammar defined for CDDL.
The body of this document explains and shows motivation for the updates; the
updated collected ABNF syntax in <xref target="collected-abnf" format="default" sectionFormat="of" derivedContent="Figure 11"/> in
<xref target="collected-abnf-appendix" format="default" sectionFormat="of" derivedContent="Appendix A"/> replaces the collected ABNF syntax in
<xref section="B" sectionFormat="of" target="RFC8610" format="default" derivedLink="https://rfc-editor.org/rfc/rfc8610#appendix-B" derivedContent="RFC8610"/>.</t>
      <section anchor="conventions-and-definitions" numbered="true" removeInRFC="false" toc="include" pn="section-1.1">
        <name slugifiedName="name-conventions-and-definitions">Conventions and Definitions</name>
        <t indent="0" pn="section-1.1-1">The terminology from <xref target="RFC8610" format="default" sectionFormat="of" derivedContent="RFC8610"/> applies.
The grammar in <xref target="RFC8610" format="default" sectionFormat="of" derivedContent="RFC8610"/> is based on ABNF, which is defined in <xref target="STD68" format="default" sectionFormat="of" derivedContent="STD68"/>
and <xref target="RFC7405" format="default" sectionFormat="of" derivedContent="RFC7405"/>.</t>
      </section>
    </section>
    <section anchor="clari" numbered="true" removeInRFC="false" toc="include" pn="section-2">
      <name slugifiedName="name-clarifications-and-changes-">Clarifications and Changes Based on Errata Reports</name>
      <t indent="0" pn="section-2-1">A number of errata reports have been made regarding some details of text
string and byte string literal syntax: for example, <xref target="Err6527" format="default" sectionFormat="of" derivedContent="Err6527"/> and <xref target="Err6543" format="default" sectionFormat="of" derivedContent="Err6543"/>.
These are being addressed in this section, updating details of the
ABNF for these literal syntaxes.
Also, the changes described in <xref target="Err6526" format="default" sectionFormat="of" derivedContent="Err6526"/> need to be applied (backslashes were lost during the RFC publication process of <xref target="RFC8610" sectionFormat="of" section="G.2" format="default" derivedLink="https://rfc-editor.org/rfc/rfc8610#appendix-G.2" derivedContent="RFC8610"/>, garbling the text that explains backslash escaping).</t>
      <t indent="0" pn="section-2-2">These changes are intended to mirror the way existing implementations
have dealt with the errata reports.  This document also takes the opportunity presented
by the necessary cleanup of the grammar of string literals to make a
backward-compatible addition to the syntax for hexadecimal escapes.
The latter change is not automatically forward compatible (i.e., CDDL
specifications that make use of this syntax do not necessarily work
with existing implementations until these are updated, which is recommended by this
specification).</t>
      <section anchor="e6527" numbered="true" removeInRFC="false" toc="include" pn="section-2.1">
        <name slugifiedName="name-updates-to-string-literal-g">Updates to String Literal Grammar</name>
        <section numbered="true" anchor="err6527-text-string-literals" removeInRFC="false" toc="include" pn="section-2.1.1">
          <name slugifiedName="name-erratum-id-6527-text-string">Erratum ID 6527 (Text String Literals)</name>
          <t indent="0" pn="section-2.1.1-1">The ABNF used in <xref target="RFC8610" format="default" sectionFormat="of" derivedContent="RFC8610"/> for the content of text string literals
	  is rather permissive:</t>
          <figure anchor="e6527-orig1" align="left" suppress-title="false" pn="figure-1">
            <name slugifiedName="name-original-abnf-from-rfc-8610">Original ABNF from RFC 8610 for Strings with Permissive ABNF
                 for SESC (Which Did Not Allow Hex Escapes)</name>
            <sourcecode type="abnf" markers="false" pn="section-2.1.1-2.1">
; ABNF from RFC 8610:
text = %x22 *SCHAR %x22
SCHAR = %x20-21 / %x23-5B / %x5D-7E / %x80-10FFFD / SESC
SESC = "\" (%x20-7E / %x80-10FFFD)
</sourcecode>
          </figure>
          <t indent="0" pn="section-2.1.1-3">This allows almost any non-C0 character to be escaped by a backslash,
but critically misses out on the <tt>\uXXXX</tt> and <tt>\uHHHH\uLLLL</tt> forms
that JSON provides for specifying characters in hexadecimal
(which should
apply here according to item 6 of <xref section="3.1" sectionFormat="of" target="RFC8610" format="default" derivedLink="https://rfc-editor.org/rfc/rfc8610#section-3.1" derivedContent="RFC8610"/>).
(Note that CDDL imports from JSON the unwieldy <tt>\uHHHH\uLLLL</tt> syntax,
which represents Unicode code points beyond U+FFFF by making them look
like UTF-16 surrogate pairs; CDDL text strings do not use UTF-16 or
surrogates.)</t>
          <t indent="0" pn="section-2.1.1-4">Both problems can be solved by updating the SESC rule.
This document uses the opportunity to add a popular form of directly specifying
characters in strings using hexadecimal escape sequences of the form
<tt>\u{hex}</tt>, where <tt>hex</tt> is the hexadecimal representation of the
Unicode scalar value.
The result is the new set of rules defining SESC in <xref target="e6527-new1" format="default" sectionFormat="of" derivedContent="Figure 2"/>.</t>
          <figure anchor="e6527-new1" align="left" suppress-title="false" pn="figure-2">
            <name slugifiedName="name-update-to-string-abnf-in-al">Update to String ABNF in <xref target="RFC8610" sectionFormat="of" section="B" format="default" derivedLink="https://rfc-editor.org/rfc/rfc8610#appendix-B" derivedContent="RFC8610"/>: Allow Hex Escapes</name>
            <sourcecode type="abnf" name="cddl-new-sesc.abnf" markers="false" pn="section-2.1.1-5.1">
; new rules collectively defining SESC:
SESC = "\" ( %x22 / "/" / "\" /                 ; \" \/ \\
             %x62 / %x66 / %x6E / %x72 / %x74 / ; \b \f \n \r \t
             (%x75 hexchar) )                   ; \uXXXX
hexchar = "{" (1*"0" [ hexscalar ] / hexscalar) "}" /
          non-surrogate / (high-surrogate "\" %x75 low-surrogate)
non-surrogate = ((DIGIT / "A"/"B"/"C" / "E"/"F") 3HEXDIG) /
                ("D" %x30-37 2HEXDIG )
high-surrogate = "D" ("8"/"9"/"A"/"B") 2HEXDIG
low-surrogate = "D" ("C"/"D"/"E"/"F") 2HEXDIG
hexscalar = "10" 4HEXDIG / HEXDIG1 4HEXDIG
          / non-surrogate / 1*3HEXDIG
HEXDIG1 = DIGIT1 / "A" / "B" / "C" / "D" / "E" / "F"
</sourcecode>
          </figure>
          <aside pn="section-2.1.1-6">
            <t indent="0" pn="section-2.1.1-6.1">Notes:
In ABNF, strings such as <tt>"A"</tt>, <tt>"B"</tt>, etc., are case insensitive, as is
intended here.
The rules above could also have used <tt>%s"b"</tt>, etc., instead of <tt>%x62</tt>, but do not, in order to
maximize compatibility with ABNF tools.</t>
          </aside>
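          <t indent="0">As an informal illustration (not part of the grammar update itself), the following CDDL fragment sketches which <tt>\u{hex}</tt> forms the rules in <xref target="e6527-new1" format="default" sectionFormat="of" derivedContent="Figure 2"/> accept and which they reject. The rule names are hypothetical and chosen only for this example.</t>
          <sourcecode type="cddl" markers="false">
; each literal below matches the updated SESC rule:
nul-char = "\u{0}"       ; U+0000, matched via 1*"0"
letter-a = "\u{41}"      ; U+0041 "A", matched via 1*3HEXDIG
padded-a = "\u{000041}"  ; leading zeros are allowed; still "A"
max-char = "\u{10FFFF}"  ; largest Unicode scalar value

; the following do NOT match the updated grammar:
;   "\u{}"        no hex digit at all
;   "\u{D800}"    surrogate code point, not a scalar value
;   "\u{110000}"  beyond U+10FFFF
;   "\uD83C"      high surrogate without a following low surrogate
</sourcecode>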
          <t indent="0" pn="section-2.1.1-7">Now that SESC is more restrictively formulated, an
update to the BCHAR rule used in the ABNF syntax for byte string
literals is also required:</t>
          <figure anchor="e6527-orig2" align="left" suppress-title="false" pn="figure-3">
            <name slugifiedName="name-abnf-from-rfc-8610-for-bcha">ABNF from RFC 8610 for BCHAR</name>
            <sourcecode type="abnf" markers="false" pn="section-2.1.1-8.1">
; ABNF from RFC 8610:
bytes = [bsqual] %x27 *BCHAR %x27
BCHAR = %x20-26 / %x28-5B / %x5D-10FFFD / SESC / CRLF
bsqual = "h" / "b64"
</sourcecode>
          </figure>
          <t indent="0" pn="section-2.1.1-9">With the SESC updated as above, <tt>\'</tt> is no longer allowed in BCHAR and now needs to be explicitly included there; see <xref target="e6527-new2" format="default" sectionFormat="of" derivedContent="Figure 4"/>.</t>
        </section>
        <section numbered="true" anchor="e6278" removeInRFC="false" toc="include" pn="section-2.1.2">
          <name slugifiedName="name-erratum-id-6278-consistent-">Erratum ID 6278 (Consistent String Literals)</name>
          <t indent="0" pn="section-2.1.2-1">Updating BCHAR also provides an opportunity to address <xref target="Err6278" format="default" sectionFormat="of" derivedContent="Err6278"/>,
which points to an inconsistency in treating U+007F (DEL) between SCHAR and
BCHAR.
As U+007F is not printable, including it in a byte string literal is
as confusing as for a text string literal; therefore, it should be
excluded from BCHAR as it is from SCHAR.
The same reasoning also applies to the C1 control characters,
so the updated ABNF actually excludes the entire range from U+007F to U+009F.
The same reasoning also applies to text in comments (PCHAR).  For completeness, all these rules should also explicitly exclude the code
points that have been set aside for UTF-16 surrogates.</t>
          <figure anchor="e6527-new2" align="left" suppress-title="false" pn="figure-4">
            <name slugifiedName="name-update-to-abnf-in-bchar-sch">Update to ABNF in <xref target="RFC8610" sectionFormat="of" section="B" format="default" derivedLink="https://rfc-editor.org/rfc/rfc8610#appendix-B" derivedContent="RFC8610"/>: BCHAR, SCHAR, and PCHAR</name>
            <sourcecode type="abnf" name="cddl-new-bchar.abnf" markers="false" pn="section-2.1.2-2.1">
; new rules for SCHAR, BCHAR, and PCHAR:
SCHAR = %x20-21 / %x23-5B / %x5D-7E / NONASCII / SESC
BCHAR = %x20-26 / %x28-5B / %x5D-7E / NONASCII / SESC / "\'" / CRLF
PCHAR = %x20-7E / NONASCII
NONASCII = %xA0-D7FF / %xE000-10FFFD
</sourcecode>
          </figure>
          <t indent="0" pn="section-2.1.2-3">(Note that, apart from addressing the inconsistencies, there is no
attempt to further exclude non-printable characters from the ABNF;
doing this properly would draw in complexity from the ongoing
evolution of the Unicode standard <xref target="UNICODE" format="default" sectionFormat="of" derivedContent="UNICODE"/> that is not needed here.)</t>
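          <t indent="0">A minimal sketch of how the updated SCHAR and BCHAR rules treat the single quote may help here. The rule names below are hypothetical, and the fragment is illustrative only.</t>
          <sourcecode type="cddl" markers="false">
ok-bytes = 'it\'s'     ; \' matches the explicit "\'" alternative in BCHAR
ok-text  = "it's"      ; ' needs no escaping in a text string
ok-quote = 'say "hi"'  ; " needs no escaping in a byte string

; not matched by the updated grammar:
;   "it\'s"   (\' is not among the SESC alternatives)
</sourcecode>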
        </section>
        <section numbered="true" anchor="addressing-err6526-err6543" removeInRFC="false" toc="include" pn="section-2.1.3">
          <name slugifiedName="name-addressing-erratum-id-6526-">Addressing Erratum ID 6526 and Erratum ID 6543</name>
          <t indent="0" pn="section-2.1.3-1">The above changes also cover <xref target="Err6543" format="default" sectionFormat="of" derivedContent="Err6543"/> (a proposal to split off
qualified byte string literals from UTF-8 byte string literals) and
<xref target="Err6526" format="default" sectionFormat="of" derivedContent="Err6526"/> (lost backslashes); see <xref target="Err6543-covered" format="default" sectionFormat="of" derivedContent="Appendix B"/> for details.</t>
        </section>
      </section>
      <section anchor="examples-demonstrating-the-updated-string-syntaxes" numbered="true" removeInRFC="false" toc="include" pn="section-2.2">
        <name slugifiedName="name-examples-demonstrating-the-">Examples Demonstrating the Updated String Syntaxes</name>
        <t indent="0" pn="section-2.2-1">The CDDL example in <xref target="string-examples" format="default" sectionFormat="of" derivedContent="Figure 5"/> demonstrates various escaping
techniques now available for (byte and text) strings in CDDL.
In the literals for <tt>a</tt> and <tt>x</tt>, there is no need to escape
the second character, an <tt>o</tt>, as <tt>\u{6f}</tt>; this is just for demonstration.
Similarly, as shown in <tt>c</tt> and <tt>z</tt>, there is also no need to escape the
<u format="lit-name-num" pn="u-1">🁳</u> or <u format="lit-name-num" pn="u-2">⌘</u>; however, escaping them may be convenient in order to limit the character
repertoire of a CDDL file itself to ASCII <xref target="STD80" format="default" sectionFormat="of" derivedContent="STD80"/>.</t>
        <figure anchor="string-examples" align="left" suppress-title="false" pn="figure-5">
          <name slugifiedName="name-example-text-and-byte-strin">Example Text and Byte String Literals with Various Escaping Techniques</name>
          <sourcecode type="cddl" markers="false" pn="section-2.2-2.1">
start = [a, b, c, x, y, z]

; "🁳", DOMINO TILE VERTICAL-02-02, and
; "⌘", PLACE OF INTEREST SIGN, in a text string:
a = "D\u{6f}mino's \u{1F073} + \u{2318}"      ; \u{}-escape 3 chars
b = "Domino's \uD83C\uDC73 + \u2318"          ; escape JSON-like
c = "Domino's 🁳 + ⌘"                          ; unescaped

; in a byte string given as text, the ' needs to be escaped:
x = 'D\u{6f}mino\u{27}s \u{1F073} + \u{2318}' ; \u{}-escape 4 chars
y = 'Domino\'s \uD83C\uDC73 + \u2318'         ; escape JSON-like
z = 'Domino\'s 🁳 + ⌘'                         ; escape ' only
</sourcecode>
        </figure>
        <t indent="0" pn="section-2.2-3">In this example, the rules a to c and x to z all produce strings with
byte-wise identical content: a to c are text strings and x to z
are byte strings.
<xref target="string-examples-pretty" format="default" sectionFormat="of" derivedContent="Figure 6"/> illustrates this by showing the output generated from
the <tt>start</tt> rule in <xref target="string-examples" format="default" sectionFormat="of" derivedContent="Figure 5"/>, using pretty-printed hexadecimal.</t>
        <figure anchor="string-examples-pretty" align="left" suppress-title="false" pn="figure-6">
          <name slugifiedName="name-generated-cbor-from-cddl-ex">Generated CBOR from CDDL Example (Pretty-Printed Hexadecimal)</name>
          <sourcecode type="cbor-pretty" markers="false" pn="section-2.2-4.1">
86                                      # array(6)
   73                                   # text(19)
      446f6d696e6f277320f09f81b3202b20e28c98 # "Domino's 🁳 + ⌘"
   73                                   # text(19)
      446f6d696e6f277320f09f81b3202b20e28c98 # "Domino's 🁳 + ⌘"
   73                                   # text(19)
      446f6d696e6f277320f09f81b3202b20e28c98 # "Domino's 🁳 + ⌘"
   53                                   # bytes(19)
      446f6d696e6f277320f09f81b3202b20e28c98 # "Domino's 🁳 + ⌘"
   53                                   # bytes(19)
      446f6d696e6f277320f09f81b3202b20e28c98 # "Domino's 🁳 + ⌘"
   53                                   # bytes(19)
      446f6d696e6f277320f09f81b3202b20e28c98 # "Domino's 🁳 + ⌘"
</sourcecode>
        </figure>
      </section>
    </section>
    <section anchor="small-enabling-grammar-changes" numbered="true" removeInRFC="false" toc="include" pn="section-3">
      <name slugifiedName="name-small-enabling-grammar-chan">Small Enabling Grammar Changes</name>
      <t indent="0" pn="section-3-1">Each subsection that follows specifies a small change to the
grammar that is intended to enable certain kinds of specifications.
These changes are backward compatible (i.e., CDDL files that
comply with <xref target="RFC8610" format="default" sectionFormat="of" derivedContent="RFC8610"/> continue to match the updated grammar) but not
necessarily forward compatible (i.e., CDDL specifications that make
use of these changes cannot necessarily be processed by existing implementations of <xref target="RFC8610" format="default" sectionFormat="of" derivedContent="RFC8610"/>).</t>
      <section anchor="empty" numbered="true" removeInRFC="false" toc="include" pn="section-3.1">
        <name slugifiedName="name-empty-data-models">Empty Data Models</name>
        <t indent="0" pn="section-3.1-1"><xref target="RFC8610" format="default" sectionFormat="of" derivedContent="RFC8610"/> requires a CDDL file to have at least one rule.</t>
        <figure anchor="empty-orig" align="left" suppress-title="false" pn="figure-7">
          <name slugifiedName="name-abnf-from-rfc-8610-for-top-">ABNF from RFC 8610 for Top-Level Rule <tt>cddl</tt></name>
          <sourcecode type="abnf" markers="false" pn="section-3.1-2.1">
; ABNF from RFC 8610:
cddl = S 1*(rule S)
</sourcecode>
        </figure>
        <t indent="0" pn="section-3.1-3">This makes sense when the file has to stand alone, as a CDDL data
model needs to have at least one rule to provide an entry point (i.e., a start
rule).</t>
        <t indent="0" pn="section-3.1-4">With CDDL modules <xref target="I-D.ietf-cbor-cddl-modules" format="default" sectionFormat="of" derivedContent="CDDL-MODULES"/>, CDDL files can also include directives,
and these might be the source of all the rules that
ultimately make up the module created by the file.
Any other rule content in the file has to be available for directive
processing, making the requirement for at least one rule cumbersome.</t>
        <t indent="0" pn="section-3.1-5">Therefore, the present update extends the grammar as in <xref target="empty-new" format="default" sectionFormat="of" derivedContent="Figure 8"/>
and turns the existence of at least one rule into a semantic constraint, to
be fulfilled after processing of all directives.</t>
        <figure anchor="empty-new" align="left" suppress-title="false" pn="figure-8">
          <name slugifiedName="name-update-to-top-level-abnf-in">Update to Top-Level ABNF in Appendices B and C of RFC 8610</name>
          <sourcecode type="abnf" name="cddl-new-cddl.abnf" markers="false" pn="section-3.1-6.1">
; new top-level rule:
cddl = S *(rule S)
</sourcecode>
        </figure>
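        <t indent="0">As a sketch of what the relaxed rule permits, the following complete CDDL file contains no rule at all, only comments, and therefore matches <tt>S *(rule S)</tt> but not the original <tt>S 1*(rule S)</tt>. The <tt>;#</tt> line imitates the directive notation of <xref target="I-D.ietf-cbor-cddl-modules" format="default" sectionFormat="of" derivedContent="CDDL-MODULES"/>; its exact form is an assumption made for illustration and may differ from what that document finally specifies. To a processor that only implements the grammar in this document, it is an ordinary comment.</t>
        <sourcecode type="cddl" markers="false">
; This file defines no rules of its own.
; Assumed directive notation (illustrative only):
;# include rfc9052
</sourcecode>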
      </section>
      <section anchor="tagnum" numbered="true" removeInRFC="false" toc="include" pn="section-3.2">
        <name slugifiedName="name-non-literal-tag-numbers-and">Non-Literal Tag Numbers and Simple Values</name>
        <t indent="0" pn="section-3.2-1">The existing ABNF syntax for expressing tags in CDDL is as follows:</t>
        <figure anchor="tag-orig" align="left" suppress-title="false" pn="figure-9">
          <name slugifiedName="name-original-abnf-from-rfc-8610-">Original ABNF from RFC 8610 for Tag Syntax</name>
          <sourcecode type="abnf" markers="false" pn="section-3.2-2.1">
; extracted from the ABNF in RFC 8610:
type2 =/ "#" "6" ["." uint] "(" S type S ")"
</sourcecode>
        </figure>
        <t indent="0" pn="section-3.2-3">This means tag numbers can only be given as literal numbers (uints).
Some specifications operate on ranges of tag numbers; for example, <xref target="RFC9277" format="default" sectionFormat="of" derivedContent="RFC9277"/>
uses the range of tag numbers 1668546817 (0x63740101) to 1668612095
(0x6374FFFF) to tag specific content formats.
This cannot currently be expressed in CDDL.
Similar considerations apply to simple values (<tt>#7.</tt>xx).</t>
        <t indent="0" pn="section-3.2-4">This update extends the syntax to the following:</t>
        <figure anchor="tag-new" align="left" suppress-title="false" pn="figure-10">
          <name slugifiedName="name-update-to-tag-and-simple-va">Update to Tag and Simple Value ABNF in Appendices B and C of RFC 8610</name>
          <sourcecode type="abnf" name="cddl-new-tag.abnf" markers="false" pn="section-3.2-5.1">
; new rules collectively defining the tagged case:
type2 =/ "#" "6" ["." head-number] "(" S type S ")"
       / "#" "7" ["." head-number]
head-number = uint / ("&lt;" type "&gt;")
</sourcecode>
        </figure>
        <t indent="0" pn="section-3.2-6">For <tt>#6</tt>, the <tt>head-number</tt> stands for the tag number.
For <tt>#7</tt>, the <tt>head-number</tt> stands for the simple value if it is in
the ranges 0..23 or 32..255 (as per Section <xref target="RFC8949" section="3.3" sectionFormat="bare" format="default" derivedLink="https://rfc-editor.org/rfc/rfc8949#section-3.3" derivedContent="RFC8949"/> of RFC 8949 <xref target="STD94" format="default" sectionFormat="of" derivedContent="STD94"/>,
the simple values 24..31 are not used).
For 24..31, the <tt>head-number</tt> stands for the "additional
information", e.g., <tt>#7.25</tt> or <tt>#7.&lt;25&gt;</tt> is a float16, etc.
(All ranges mentioned here are inclusive.)</t>
        <t indent="0" pn="section-3.2-7">So the above range can be expressed in a CDDL fragment such as:</t>
        <sourcecode type="cddl" markers="false" pn="section-3.2-8">
ct-tag&lt;content&gt; = #6.&lt;ct-tag-number&gt;(content)
ct-tag-number = 1668546817..1668612095
; or use 0x63740101..0x6374FFFF
</sourcecode>
        <aside pn="section-3.2-9">
          <t indent="0" pn="section-3.2-9.1">Notes:</t>
          <ol spacing="normal" type="1" indent="adaptive" start="1" pn="section-3.2-9.2"><li pn="section-3.2-9.2.1" derivedCounter="1.">
              <t indent="0" pn="section-3.2-9.2.1.1">This syntax reuses the angle bracket syntax for generics;
this reuse is innocuous because a generic parameter or argument only ever
occurs after a rule name (<tt>id</tt>), while it occurs after the "<tt>.</tt>" (dot) character here.
(Whether there is potential for human confusion can be debated; the
above example deliberately uses generics as well.)</t>
            </li>
            <li pn="section-3.2-9.2.2" derivedCounter="2.">
              <t indent="0" pn="section-3.2-9.2.2.1">The updated ABNF grammar makes it a bit more explicit that the
number given after the optional dot is the value of the argument:
for tags and simple values, it is not giving the CBOR "additional information",
as it is with other uses of <tt>#</tt> in CDDL.
(Adding this observation to <xref section="2.2.3" sectionFormat="of" target="RFC8610" format="default" derivedLink="https://rfc-editor.org/rfc/rfc8610#section-2.2.3" derivedContent="RFC8610"/> is the subject
of <xref target="Err6575" format="default" sectionFormat="of" derivedContent="Err6575"/>; it is correctly noted in <xref section="3.6" sectionFormat="of" target="RFC8610" format="default" derivedLink="https://rfc-editor.org/rfc/rfc8610#section-3.6" derivedContent="RFC8610"/>.)
In hindsight, maybe a different character than the dot should have
been chosen for this special case; however, changing the grammar
in the current document would have been too disruptive.</t>
            </li>
          </ol>
        </aside>
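        <t indent="0">As a further non-normative sketch of the <tt>head-number</tt> semantics described above, the following fragment uses the updated syntax for simple values and instantiates the <tt>ct-tag</tt> generic. All rule names other than <tt>ct-tag</tt> are hypothetical and chosen for illustration only:</t>
        <sourcecode type="cddl" markers="false">
; Simple values 32..255, given as a type rather than a literal number
high-simple = #7.&lt;32..255&gt;

; Both forms give additional information 25, i.e., a float16
half-a = #7.25
half-b = #7.&lt;25&gt;

; A byte string wrapped in any tag from the range in ct-tag-number
ct-tagged-bytes = ct-tag&lt;bytes&gt;
</sourcecode>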
      </section>
    </section>
    <section anchor="security-considerations" numbered="true" removeInRFC="false" toc="include" pn="section-4">
      <name slugifiedName="name-security-considerations">Security Considerations</name>
      <t indent="0" pn="section-4-1">The grammar fixes and updates in this document are not believed to
create additional security considerations.
The security considerations in <xref section="5" sectionFormat="of" target="RFC8610" format="default" derivedLink="https://rfc-editor.org/rfc/rfc8610#section-5" derivedContent="RFC8610"/> apply.
Specifically, the potential for confusion is increased in an
environment that uses a combination of CDDL tools, some of which have
been updated and some of which have not, in particular based on
<xref target="clari" format="default" sectionFormat="of" derivedContent="Section 2"/>.</t>
      <t indent="0" pn="section-4-2">Attackers may want to exploit such potential confusion by crafting
CDDL models that are interpreted differently by different parts of a
system.
There will be a period of transition from the details that the grammar in
<xref target="RFC8610" format="default" sectionFormat="of" derivedContent="RFC8610"/> handled in a less well-defined way, to the updated
grammar defined in the present document. This transition might offer one (but not the only) type of opportunity
for the kind of attack that relies on differences between
implementations.
Implementations that make use of CDDL models operationally already
need to ascertain the provenance (and thus authenticity and integrity)
and applicability of models they employ.
At the time of writing, it is expected that the models will generally
be processed by a software developer, within a software development
environment.
Therefore, developers are advised to treat CDDL models with
the same care as any other source code.</t>
    </section>
    <section anchor="iana-considerations" numbered="true" removeInRFC="false" toc="include" pn="section-5">
      <name slugifiedName="name-iana-considerations">IANA Considerations</name>
      <t indent="0" pn="section-5-1">This document has no IANA actions.</t>
    </section>
  </middle>
  <back>
    <displayreference target="I-D.ietf-cbor-cddl-modules" to="CDDL-MODULES"/>
    <displayreference target="I-D.ietf-cbor-edn-literals" to="EDN-LITERALS"/>
    <references pn="section-6">
      <name slugifiedName="name-references">References</name>
      <references anchor="sec-normative-references" pn="section-6.1">
        <name slugifiedName="name-normative-references">Normative References</name>
        <reference anchor="RFC8610" target="https://www.rfc-editor.org/info/rfc8610" quoteTitle="true" derivedAnchor="RFC8610">
          <front>
            <title>Concise Data Definition Language (CDDL): A Notational Convention to Express Concise Binary Object Representation (CBOR) and JSON Data Structures</title>
            <author fullname="H. Birkholz" initials="H." surname="Birkholz"/>
            <author fullname="C. Vigano" initials="C." surname="Vigano"/>
            <author fullname="C. Bormann" initials="C." surname="Bormann"/>
            <date month="June" year="2019"/>
            <abstract>
              <t indent="0">This document proposes a notational convention to express Concise Binary Object Representation (CBOR) data structures (RFC 7049). Its main goal is to provide an easy and unambiguous way to express structures for protocol messages and data formats that use CBOR or JSON.</t>
            </abstract>
          </front>
          <seriesInfo name="RFC" value="8610"/>
          <seriesInfo name="DOI" value="10.17487/RFC8610"/>
        </reference>
        <referencegroup anchor="STD68" target="https://www.rfc-editor.org/info/std68" derivedAnchor="STD68">
          <reference anchor="RFC5234" target="https://www.rfc-editor.org/info/rfc5234" quoteTitle="true">
            <front>
              <title>Augmented BNF for Syntax Specifications: ABNF</title>
              <author fullname="D. Crocker" initials="D." role="editor" surname="Crocker"/>
              <author fullname="P. Overell" initials="P." surname="Overell"/>
              <date month="January" year="2008"/>
              <abstract>
                <t indent="0">Internet technical specifications often need to define a formal syntax. Over the years, a modified version of Backus-Naur Form (BNF), called Augmented BNF (ABNF), has been popular among many Internet specifications. The current specification documents ABNF. It balances compactness and simplicity with reasonable representational power. The differences between standard BNF and ABNF involve naming rules, repetition, alternatives, order-independence, and value ranges. This specification also supplies additional rule definitions and encoding for a core lexical analyzer of the type common to several Internet specifications. [STANDARDS-TRACK]</t>
              </abstract>
            </front>
            <seriesInfo name="STD" value="68"/>
            <seriesInfo name="RFC" value="5234"/>
            <seriesInfo name="DOI" value="10.17487/RFC5234"/>
          </reference>
        </referencegroup>
        <referencegroup anchor="STD94" target="https://www.rfc-editor.org/info/std94" derivedAnchor="STD94">
          <reference anchor="RFC8949" target="https://www.rfc-editor.org/info/rfc8949" quoteTitle="true">
            <front>
              <title>Concise Binary Object Representation (CBOR)</title>
              <author fullname="C. Bormann" initials="C." surname="Bormann"/>
              <author fullname="P. Hoffman" initials="P." surname="Hoffman"/>
              <date month="December" year="2020"/>
              <abstract>
                <t indent="0">The Concise Binary Object Representation (CBOR) is a data format whose design goals include the possibility of extremely small code size, fairly small message size, and extensibility without the need for version negotiation. These design goals make it different from earlier binary serializations such as ASN.1 and MessagePack.</t>
                <t indent="0">This document obsoletes RFC 7049, providing editorial improvements, new details, and errata fixes while keeping full compatibility with the interchange format of RFC 7049. It does not create a new version of the format.</t>
              </abstract>
            </front>
            <seriesInfo name="STD" value="94"/>
            <seriesInfo name="RFC" value="8949"/>
            <seriesInfo name="DOI" value="10.17487/RFC8949"/>
          </reference>
        </referencegroup>
      </references>
      <references anchor="sec-informative-references" pn="section-6.2">
        <name slugifiedName="name-informative-references">Informative References</name>
        <reference anchor="I-D.ietf-cbor-cddl-modules" target="https://datatracker.ietf.org/doc/html/draft-ietf-cbor-cddl-modules-03" quoteTitle="true" derivedAnchor="CDDL-MODULES">
          <front>
            <title>CDDL Module Structure</title>
            <author initials="C." surname="Bormann" fullname="Carsten Bormann">
              <organization showOnFrontPage="true">Universität Bremen TZI</organization>
            </author>
            <author initials="B." surname="Moran" fullname="Brendan Moran">
              <organization showOnFrontPage="true">Arm Limited</organization>
            </author>
            <date month="September" day="1" year="2024"/>
            <abstract>
              <t indent="0">   At the time of writing, the Concise Data Definition Language (CDDL)
   is defined by RFC 8610 and RFC 9165.  The latter has used the
   extension point provided in RFC 8610, the _control operator_.

   As CDDL is being used in larger projects, the need for features has
   become known that cannot be easily mapped into this single extension
   point.

   The present document defines a backward- and forward-compatible way
   to add a module structure to CDDL.

              </t>
            </abstract>
          </front>
          <seriesInfo name="Internet-Draft" value="draft-ietf-cbor-cddl-modules-03"/>
          <refcontent>Work in Progress</refcontent>
        </reference>
        <reference anchor="I-D.ietf-cbor-edn-literals" target="https://datatracker.ietf.org/doc/html/draft-ietf-cbor-edn-literals-13" quoteTitle="true" derivedAnchor="EDN-LITERALS">
          <front>
            <title>CBOR Extended Diagnostic Notation (EDN)</title>
            <author initials="C." surname="Bormann" fullname="Carsten Bormann">
              <organization showOnFrontPage="true">Universität Bremen TZI</organization>
            </author>
            <date month="November" day="3" year="2024"/>
            <abstract>
              <t indent="0">   The Concise Binary Object Representation (CBOR) (STD 94, RFC 8949) is
   a data format whose design goals include the possibility of extremely
   small code size, fairly small message size, and extensibility without
   the need for version negotiation.

   In addition to the binary interchange format, CBOR from the outset
   (RFC 7049) defined a text-based "diagnostic notation" in order to be
   able to converse about CBOR data items without having to resort to
   binary data.  RFC 8610 extended this into what is known as Extended
   Diagnostic Notation (EDN).

   This document consolidates the definition of EDN, sets forth a
   further step of its evolution, and is intended to serve as a single
   reference target in specifications that use EDN.

   It specifies an extension point for adding application-oriented
   extensions to the diagnostic notation.  It then defines two such
   extensions that enhance EDN with text representations of epoch-based
   date/times and of IP addresses and prefixes (RFC 9164).

   A few further additions close some gaps in usability.  The document
   modifies one extension originally specified in Appendix G.4 of RFC
   8610 to enable further increasing usability.  To facilitate tool
   interoperation, this document specifies a formal ABNF grammar, and it
   adds media types.


   // (This "cref" paragraph will be removed by the RFC editor:) The
   // present revision -13 reflects the branches "roll-up" and "roll-up-
   // 2" in the repository, an attempt to contain the entire
   // specification of EDN in this document, instead of describing
   // updates to the existing documents RFC 8949 and RFC 8610.
   // Editorial work on the branch "roll-up-2" might continue.  The
   // exact reflection of this document being a replacement for both
   // Section 8 of RFC 8949 and Appendix G of RFC 8610 needs to be
   // recorded in the metadata and in abstract and introduction.

              </t>
            </abstract>
          </front>
          <seriesInfo name="Internet-Draft" value="draft-ietf-cbor-edn-literals-13"/>
          <refcontent>Work in Progress</refcontent>
        </reference>
        <reference anchor="Err6278" target="https://www.rfc-editor.org/errata/eid6278" quoteTitle="false" derivedAnchor="Err6278">
          <front>
            <title>Erratum ID 6278</title>
            <author>
              <organization showOnFrontPage="true">RFC Errata</organization>
            </author>
          </front>
          <refcontent>RFC 8610</refcontent>
        </reference>
        <reference anchor="Err6526" target="https://www.rfc-editor.org/errata/eid6526" quoteTitle="false" derivedAnchor="Err6526">
          <front>
            <title>Erratum ID 6526</title>
            <author>
              <organization showOnFrontPage="true">RFC Errata</organization>
            </author>
          </front>
          <refcontent>RFC 8610</refcontent>
        </reference>
        <reference anchor="Err6527" target="https://www.rfc-editor.org/errata/eid6527" quoteTitle="false" derivedAnchor="Err6527">
          <front>
            <title>Erratum ID 6527</title>
            <author>
              <organization showOnFrontPage="true">RFC Errata</organization>
            </author>
            <date/>
          </front>
          <refcontent>RFC 8610</refcontent>
        </reference>
        <reference anchor="Err6543" target="https://www.rfc-editor.org/errata/eid6543" quoteTitle="false" derivedAnchor="Err6543">
          <front>
            <title>Erratum ID 6543</title>
            <author>
              <organization showOnFrontPage="true">RFC Errata</organization>
            </author>
          </front>
          <refcontent>RFC 8610</refcontent>
        </reference>
        <reference anchor="Err6575" target="https://www.rfc-editor.org/errata/eid6575" quoteTitle="false" derivedAnchor="Err6575">
          <front>
            <title>Erratum ID 6575</title>
            <author>
              <organization showOnFrontPage="true">RFC Errata</organization>
            </author>
          </front>
          <refcontent>RFC 8610</refcontent>
        </reference>
        <reference anchor="RFC7405" target="https://www.rfc-editor.org/info/rfc7405" quoteTitle="true" derivedAnchor="RFC7405">
          <front>
            <title>Case-Sensitive String Support in ABNF</title>
            <author fullname="P. Kyzivat" initials="P." surname="Kyzivat"/>
            <date month="December" year="2014"/>
            <abstract>
              <t indent="0">This document extends the base definition of ABNF (Augmented Backus-Naur Form) to include a way to specify US-ASCII string literals that are matched in a case-sensitive manner.</t>
            </abstract>
          </front>
          <seriesInfo name="RFC" value="7405"/>
          <seriesInfo name="DOI" value="10.17487/RFC7405"/>
        </reference>
        <reference anchor="RFC9165" target="https://www.rfc-editor.org/info/rfc9165" quoteTitle="true" derivedAnchor="RFC9165">
          <front>
            <title>Additional Control Operators for the Concise Data Definition Language (CDDL)</title>
            <author fullname="C. Bormann" initials="C." surname="Bormann"/>
            <date month="December" year="2021"/>
            <abstract>
              <t indent="0">The Concise Data Definition Language (CDDL), standardized in RFC 8610, provides "control operators" as its main language extension point.</t>
              <t indent="0">The present document defines a number of control operators that were not yet ready at the time RFC 8610 was completed: .plus, .cat, and .det for the construction of constants; .abnf/.abnfb for including ABNF (RFC 5234 and RFC 7405) in CDDL specifications; and .feature for indicating the use of a non-basic feature in an instance.</t>
            </abstract>
          </front>
          <seriesInfo name="RFC" value="9165"/>
          <seriesInfo name="DOI" value="10.17487/RFC9165"/>
        </reference>
        <reference anchor="RFC9277" target="https://www.rfc-editor.org/info/rfc9277" quoteTitle="true" derivedAnchor="RFC9277">
          <front>
            <title>On Stable Storage for Items in Concise Binary Object Representation (CBOR)</title>
            <author fullname="M. Richardson" initials="M." surname="Richardson"/>
            <author fullname="C. Bormann" initials="C." surname="Bormann"/>
            <date month="August" year="2022"/>
            <abstract>
              <t indent="0">This document defines a stored ("file") format for Concise Binary Object Representation (CBOR) data items that is friendly to common systems that recognize file types, such as the Unix file(1) command.</t>
            </abstract>
          </front>
          <seriesInfo name="RFC" value="9277"/>
          <seriesInfo name="DOI" value="10.17487/RFC9277"/>
        </reference>
        <referencegroup anchor="STD80" target="https://www.rfc-editor.org/info/std80" derivedAnchor="STD80">
          <reference anchor="RFC0020" target="https://www.rfc-editor.org/info/rfc20" quoteTitle="true">
            <front>
              <title>ASCII format for network interchange</title>
              <author fullname="V.G. Cerf" initials="V.G." surname="Cerf"/>
              <date month="October" year="1969"/>
            </front>
            <seriesInfo name="STD" value="80"/>
            <seriesInfo name="RFC" value="20"/>
            <seriesInfo name="DOI" value="10.17487/RFC0020"/>
          </reference>
        </referencegroup>
        <reference anchor="UNICODE" target="https://www.unicode.org/versions/latest/" quoteTitle="true" derivedAnchor="UNICODE">
          <front>
            <title>The Unicode Standard</title>
            <author>
              <organization showOnFrontPage="true">The Unicode Consortium</organization>
            </author>
          </front>
        </reference>
      </references>
    </references>
    <section anchor="collected-abnf-appendix" numbered="true" removeInRFC="false" toc="include" pn="section-appendix.a">
      <name slugifiedName="name-updated-collected-abnf-for-">Updated Collected ABNF for CDDL</name>
      <t indent="0" pn="section-appendix.a-1">This appendix is normative.</t>
      <t indent="0" pn="section-appendix.a-2">It provides the full ABNF from <xref target="RFC8610" format="default" sectionFormat="of" derivedContent="RFC8610"/> as updated by the present document.</t>
      <figure anchor="collected-abnf" align="left" suppress-title="false" pn="figure-11">
        <name slugifiedName="name-abnf-for-cddl-as-updated">ABNF for CDDL as Updated</name>
        <sourcecode type="abnf" name="cddl-updated-complete.abnf" markers="false" pn="section-appendix.a-3.1">
cddl = S *(rule S)
rule = typename [genericparm] S assignt S type
     / groupname [genericparm] S assigng S grpent

typename = id
groupname = id

assignt = "=" / "/="
assigng = "=" / "//="

genericparm = "&lt;" S id S *("," S id S ) "&gt;"
genericarg = "&lt;" S type1 S *("," S type1 S ) "&gt;"

type = type1 *(S "/" S type1)

type1 = type2 [S (rangeop / ctlop) S type2]
; space may be needed before the operator if type2 ends in a name

type2 = value
      / typename [genericarg]
      / "(" S type S ")"
      / "{" S group S "}"
      / "[" S group S "]"
      / "~" S typename [genericarg]
      / "&amp;" S "(" S group S ")"
      / "&amp;" S groupname [genericarg]
      / "#" "6" ["." head-number] "(" S type S ")"
      / "#" "7" ["." head-number]
      / "#" DIGIT ["." uint]                ; major/ai
      / "#"                                 ; any
head-number = uint / ("&lt;" type "&gt;")

rangeop = "..." / ".."

ctlop = "." id

group = grpchoice *(S "//" S grpchoice)

grpchoice = *(grpent optcom)

grpent = [occur S] [memberkey S] type
       / [occur S] groupname [genericarg]  ; preempted by above
       / [occur S] "(" S group S ")"

memberkey = type1 S ["^" S] "=&gt;"
          / bareword S ":"
          / value S ":"

bareword = id

optcom = S ["," S]

occur = [uint] "*" [uint]
      / "+"
      / "?"

uint = DIGIT1 *DIGIT
     / "0x" 1*HEXDIG
     / "0b" 1*BINDIG
     / "0"

value = number
      / text
      / bytes

int = ["-"] uint

; This is a float if it has fraction or exponent; int otherwise
number = hexfloat / (int ["." fraction] ["e" exponent ])
hexfloat = ["-"] "0x" 1*HEXDIG ["." 1*HEXDIG] "p" exponent
fraction = 1*DIGIT
exponent = ["+"/"-"] 1*DIGIT

text = %x22 *SCHAR %x22
SCHAR = %x20-21 / %x23-5B / %x5D-7E / NONASCII / SESC

SESC = "\" ( %x22 / "/" / "\" /                 ; \" \/ \\
             %x62 / %x66 / %x6E / %x72 / %x74 / ; \b \f \n \r \t
             (%x75 hexchar) )                   ; \uXXXX

hexchar = "{" (1*"0" [ hexscalar ] / hexscalar) "}" /
          non-surrogate / (high-surrogate "\" %x75 low-surrogate)
non-surrogate = ((DIGIT / "A"/"B"/"C" / "E"/"F") 3HEXDIG) /
                ("D" %x30-37 2HEXDIG )
high-surrogate = "D" ("8"/"9"/"A"/"B") 2HEXDIG
low-surrogate = "D" ("C"/"D"/"E"/"F") 2HEXDIG
hexscalar = "10" 4HEXDIG / HEXDIG1 4HEXDIG
          / non-surrogate / 1*3HEXDIG

bytes = [bsqual] %x27 *BCHAR %x27
BCHAR = %x20-26 / %x28-5B / %x5D-7E / NONASCII / SESC / "\'" / CRLF
bsqual = "h" / "b64"

id = EALPHA *(*("-" / ".") (EALPHA / DIGIT))
ALPHA = %x41-5A / %x61-7A
EALPHA = ALPHA / "@" / "_" / "$"
DIGIT = %x30-39
DIGIT1 = %x31-39
HEXDIG = DIGIT / "A" / "B" / "C" / "D" / "E" / "F"
HEXDIG1 = DIGIT1 / "A" / "B" / "C" / "D" / "E" / "F"
BINDIG = %x30-31

S = *WS
WS = SP / NL
SP = %x20
NL = COMMENT / CRLF
COMMENT = ";" *PCHAR CRLF
PCHAR = %x20-7E / NONASCII
NONASCII = %xA0-D7FF / %xE000-10FFFD
CRLF = %x0A / %x0D.0A
</sourcecode>
      </figure>
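      <t indent="0">The following non-normative fragment is a sketch that exercises several of the updated rules. The rule names are hypothetical and chosen for illustration only:</t>
      <sourcecode type="cddl" markers="false">
; text: \u{...} escape (SESC, hexchar, hexscalar)
smiley = "\u{1F600}"
; text: surrogate pair escape (high-surrogate, low-surrogate)
smiley-pair = "\uD83D\uDE00"
; bytes: escaped apostrophe in an unprefixed byte string (BCHAR)
its = 'it\'s'
; bytes: base16 content, interpreted in a semantic step after parsing
cbor-ascii = h'43424F52'
; tag number given as a type (head-number)
small-tagged = #6.&lt;0..23&gt;(any)
</sourcecode>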
    </section>
    <section anchor="Err6543-covered" numbered="true" removeInRFC="false" toc="include" pn="section-appendix.b">
      <name slugifiedName="name-details-about-covering-erra">Details about Covering Erratum ID 6543</name>
      <t indent="0" pn="section-appendix.b-1">This appendix is informative.</t>
      <t indent="0" pn="section-appendix.b-2"><xref target="Err6543" format="default" sectionFormat="of" derivedContent="Err6543"/> notes that
the ABNF used in <xref target="RFC8610" format="default" sectionFormat="of" derivedContent="RFC8610"/> for the content of byte string literals
lumps together byte strings notated as text with byte strings notated
      in base16 (hex) or base64 (but see also updated BCHAR rule in <xref target="e6527-new2" format="default" sectionFormat="of" derivedContent="Figure 4"/>):</t>
      <figure anchor="e6527-orig2a" align="left" suppress-title="false" pn="figure-12">
        <name slugifiedName="name-original-abnf-from-rfc-8610-f">Original ABNF from RFC 8610 for BCHAR</name>
        <sourcecode type="abnf" markers="false" pn="section-appendix.b-3.1">
; ABNF from RFC 8610:
bytes = [bsqual] %x27 *BCHAR %x27
BCHAR = %x20-26 / %x28-5B / %x5D-10FFFD / SESC / CRLF
</sourcecode>
      </figure>
      <section numbered="true" anchor="change-proposed-by-errata-report-6543" removeInRFC="false" toc="include" pn="section-appendix.b.1">
        <name slugifiedName="name-change-proposed-by-erratum-">Change Proposed by Erratum ID 6543</name>
        <t indent="0" pn="section-appendix.b.1-1">Erratum ID 6543 proposes handling the two cases in separate
ABNF rules (where, with an updated SESC, BCHAR obviously needs to be
updated as above):</t>
        <figure anchor="e6543-1" align="left" suppress-title="false" pn="figure-13">
          <name slugifiedName="name-proposal-from-erratum-id-65">Proposal from Erratum ID 6543 to Split the Byte String Rules</name>
          <sourcecode type="abnf" markers="false" pn="section-appendix.b.1-2.1">
; Proposal from Erratum ID 6543:
bytes = %x27 *BCHAR %x27
      / bsqual %x27 *QCHAR %x27
BCHAR = %x20-26 / %x28-5B / %x5D-10FFFD / SESC / CRLF
QCHAR = DIGIT / ALPHA / "+" / "/" / "-" / "_" / "=" / WS
</sourcecode>
        </figure>
        <t indent="0" pn="section-appendix.b.1-3">This potentially causes a subtle change, which is hidden in the WS rule:</t>
        <figure anchor="e6543-2" align="left" suppress-title="false" pn="figure-14">
          <name slugifiedName="name-abnf-definition-of-ws-from-">ABNF Definition of WS from RFC 8610</name>
          <sourcecode type="abnf" markers="false" pn="section-appendix.b.1-4.1">
; ABNF from RFC 8610:
WS = SP / NL
SP = %x20
NL = COMMENT / CRLF
COMMENT = ";" *PCHAR CRLF
PCHAR = %x20-7E / %x80-10FFFD
CRLF = %x0A / %x0D.0A
</sourcecode>
        </figure>
        <t indent="0" pn="section-appendix.b.1-5">This allows any non-C0 character in a comment, so this fragment
becomes possible:</t>
        <sourcecode type="cddl" markers="false" pn="section-appendix.b.1-6">
foo = h'
   43424F52 ; 'CBOR'
   0A       ; LF, but don't use CR!
'
</sourcecode>
        <t indent="0" pn="section-appendix.b.1-7">The current text does not state unambiguously whether the three apostrophes
need to be escaped with a <tt>\</tt> or not, as in:</t>
        <sourcecode type="cddl" markers="false" pn="section-appendix.b.1-8">
foo = h'
   43424F52 ; \'CBOR\'
   0A       ; LF, but don\'t use CR!
'
</sourcecode>
        <t indent="0" pn="section-appendix.b.1-9">... which would be supported by the existing ABNF in <xref target="RFC8610" format="default" sectionFormat="of" derivedContent="RFC8610"/>.</t>
      </section>
      <section numbered="true" anchor="no-further-change-needed-after-updating-string-literal-grammar-e6527" removeInRFC="false" toc="include" pn="section-appendix.b.2">
        <name slugifiedName="name-no-further-change-needed-af">No Further Change Needed after Updating String Literal Grammar</name>
        <t indent="0" pn="section-appendix.b.2-1">This document takes the simpler approach of leaving the processing of
the content of the byte string literal to a semantic step after
processing the syntax of the <tt>bytes</tt> and <tt>BCHAR</tt> rules, as updated by
Figures <xref target="e6527-new1" format="counter" sectionFormat="of" derivedContent="2"/> and <xref target="e6527-new2" format="counter" sectionFormat="of" derivedContent="4"/> in <xref target="e6527" format="default" sectionFormat="of" derivedContent="Section 2.1"/> (updates prompted by the combination
of <xref target="Err6527" format="default" sectionFormat="of" derivedContent="Err6527"/> and <xref target="Err6278" format="default" sectionFormat="of" derivedContent="Err6278"/>).</t>
        <t indent="0" pn="section-appendix.b.2-2">Therefore, the rules in <xref target="e6543-2" format="default" sectionFormat="of" derivedContent="Figure 14"/> (as updated by <xref target="e6527-new2" format="default" sectionFormat="of" derivedContent="Figure 4"/>) are
applied to the result of this processing where <tt>bsqual</tt> is given as
<tt>h</tt> or <tt>b64</tt>.</t>
        <t indent="0" pn="section-appendix.b.2-3">Note that this approach also works well with the use of byte strings
in <xref section="3" sectionFormat="of" target="RFC9165" format="default" derivedLink="https://rfc-editor.org/rfc/rfc9165#section-3" derivedContent="RFC9165"/>.
It does require some care when copying and pasting into CDDL models from ABNF
that contains single quotes (which may also hide as apostrophes
in comments); these need to be escaped or possibly replaced by <tt>%x27</tt>.</t>
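        <t indent="0">As a non-normative sketch of the two options just mentioned, the following fragment embeds ABNF that matches an apostrophe, once with the apostrophe escaped and once written as <tt>%x27</tt>. The rule names are hypothetical; the <tt>.abnf</tt> and <tt>.cat</tt> control operators are assumed to behave as described in RFC 9165:</t>
        <sourcecode type="cddl" markers="false">
quoted-a = text .abnf ("quoted" .cat rules-escaped)
quoted-b = text .abnf ("quoted" .cat rules-hex)

; Option 1: each apostrophe in the pasted ABNF is escaped with a backslash
rules-escaped = '
   quoted = "\'" 1*%x30-39 "\'"
'

; Option 2: the pasted ABNF is rewritten to use %x27 instead
rules-hex = '
   quoted = %x27 1*%x30-39 %x27
'
</sourcecode>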
        <t indent="0" pn="section-appendix.b.2-4">Finally, the approach taken lends support to extending <tt>bsqual</tt> in CDDL
similar to the way this is done for CBOR diagnostic notation in <xref target="I-D.ietf-cbor-edn-literals" format="default" sectionFormat="of" derivedContent="EDN-LITERALS"/>.
(Note that, at the time of writing, the processing of string literals is quite similar for both
CDDL and Extended Diagnostic Notation (EDN), except that CDDL has end-of-line comments that are "<tt>;</tt>" based and EDN has
two comment syntaxes: one in-line "<tt>/</tt>" based and one end-of-line "<tt>#</tt>" based.)</t>
      </section>
    </section>
    <section numbered="false" anchor="acknowledgments" removeInRFC="false" toc="include" pn="section-appendix.c">
      <name slugifiedName="name-acknowledgments">Acknowledgments</name>
      <t indent="0" pn="section-appendix.c-1">Many thanks go to the submitters of the errata reports addressed in
this document.
In one of the ensuing discussions, <contact fullname="Doug Ewell"/> proposed defining an
ABNF rule "NONASCII", of which we have included the essence.
Special thanks to the reviewers <contact fullname="Marco Tiloca"/>, <contact fullname="Christian Amsüss"/> (Shepherd Review and further guidance), <contact fullname="Orie Steele"/> (AD
Review and further guidance), and <contact fullname="Éric Vyncke"/>
(detailed IESG review).</t>
    </section>
    <section anchor="authors-addresses" numbered="false" removeInRFC="false" toc="include" pn="section-appendix.d">
      <name slugifiedName="name-authors-address">Author's Address</name>
      <author initials="C." surname="Bormann" fullname="Carsten Bormann">
        <organization showOnFrontPage="true">Universität Bremen TZI</organization>
        <address>
          <postal>
            <street>Postfach 330440</street>
            <city>Bremen</city>
            <code>D-28359</code>
            <country>Germany</country>
          </postal>
          <phone>+49-421-218-63921</phone>
          <email>cabo@tzi.org</email>
        </address>
      </author>
    </section>
  </back>
</rfc>
