<?xml version="1.0" encoding="UTF-8"?>
  <?xml-stylesheet type="text/xsl" href="rfc2629.xslt" ?>
  <!-- generated by https://github.com/cabo/kramdown-rfc2629 version 1.0.35 -->

<!DOCTYPE rfc SYSTEM "rfc2629.dtd" [
]>

<?rfc toc="yes"?>
<?rfc sortrefs="yes"?>
<?rfc symrefs="yes"?>

<rfc ipr="trust200902" docName="draft-strad-trans-redaction-00" category="exp">

  <front>
    <title abbrev="CT Domain Label Redaction">Certificate Transparency: Domain Label Redaction</title>

    <author initials="R." surname="Stradling" fullname="Rob Stradling">
      <organization>Comodo CA, Ltd.</organization>
      <address>
        <email>rob.stradling@comodo.com</email>
      </address>
    </author>
    <author initials="E." surname="Messeri" fullname="Eran Messeri">
      <organization>Google UK Ltd.</organization>
      <address>
        <email>eranm@google.com</email>
      </address>
    </author>

    <date year="2016" month="August" day="31"/>

    <area>Security</area>
    <workgroup>TRANS (Public Notary Transparency)</workgroup>
    <keyword>Internet-Draft</keyword>

    <abstract>


<t>We define a mechanism that keeps DNS domain name labels considered private
out of public Certificate Transparency (CT) logs, while still retaining most of
the security benefits that accrue from using Certificate Transparency
mechanisms.</t>



    </abstract>


  </front>

  <middle>


<section anchor="introduction" title="Introduction">

<t>Some domain owners regard certain DNS domain name labels within their registered
domain space as private and security sensitive. Even though these domains are
often only accessible within the domain owner’s private network, it’s common for
them to be secured using publicly trusted Transport Layer Security (TLS) server
certificates.</t>

<t>Certificate Transparency <xref target="I-D.ietf-trans-rfc6962-bis"></xref> describes a protocol for
publicly logging the existence of TLS server certificates as they are issued or
observed. Since each TLS server certificate lists the domain names that it is
intended to secure, private domain name labels within registered domain space
could end up appearing in CT logs, especially as TLS clients develop policies
that mandate CT compliance. This seems like an unfortunate and potentially
unnecessary privacy leak, because the registered domain names in each
certificate are what matter most when using CT to look for suspect
certificates.</t>

<t>TODO: Highlight better the differences between registered domains and
subdomains, referencing the relevant DNS RFCs.</t>

<t>Section TBD of <xref target="I-D.ietf-trans-rfc6962-bis"></xref> proposes two mechanisms for dealing
with this conundrum: wildcard certificates and name-constrained intermediate
CAs. However, these mechanisms are insufficient to cover all use cases.</t>

<t>TODO(eranm): Expand on when each of the other mechanisms is suitable and when
this mechanism may be suitable.</t>

<t>We define a domain label redaction mechanism that covers all use cases, at the
cost of increased implementation complexity. CAs and domain owners should note
that there are privacy considerations (<xref target="privacy_considerations"/>) and that
TLS clients may apply additional requirements (relating to the use of this
redaction mechanism) for a certificate to be considered compliant.</t>

</section>
<section anchor="requirements-language" title="Requirements Language">

<t>The key words “MUST”, “MUST NOT”, “REQUIRED”, “SHALL”, “SHALL NOT”, “SHOULD”,
“SHOULD NOT”, “RECOMMENDED”, “MAY”, and “OPTIONAL” in this document are to be
interpreted as described in <xref target="RFC2119"></xref>.</t>

</section>
<section anchor="redacting_labels" title="Redacting Labels in Precertificates">

<t>When creating a precertificate, the CA MAY include a redactedSubjectAltName
(<xref target="redacted_san_extension"/>) extension that contains, in a redacted form,
the same entries that will be included in the certificate’s subjectAltName
extension. When the redactedSubjectAltName extension is present in a
precertificate, the subjectAltName extension MUST be omitted (even though it
MUST be present in the corresponding certificate).</t>

<t>Wildcard <spanx style="verb">*</spanx> labels MUST NOT be redacted, but one or more non-wildcard labels in
each DNS-ID <xref target="RFC6125"></xref> can each be replaced with a redacted label as follows:</t>

<figure><artwork><![CDATA[
  REDACT(label) = prefix || BASE32(index || _label_hash)
    _label_hash = LABELHASH(keyid_len || keyid || label_len || label)
]]></artwork></figure>

<t><spanx style="verb">label</spanx> is the case-sensitive label to be redacted.</t>

<t><spanx style="verb">prefix</spanx> is the “?” character (ASCII value 63).</t>

<t><spanx style="verb">index</spanx> is the 1-byte index of a hash function in the CT hash algorithm registry
(section TBD of <xref target="I-D.ietf-trans-rfc6962-bis"></xref>). The value 255 is reserved.</t>

<t><spanx style="verb">keyid_len</spanx> is a 1-byte field holding the length, in bytes, of the <spanx style="verb">keyid</spanx>.</t>

<t><spanx style="verb">keyid</spanx> is the keyIdentifier from the Subject Key Identifier extension
(section 4.2.1.2 of <xref target="RFC5280"></xref>), excluding the ASN.1 OCTET STRING tag and length
bytes.</t>

<t><spanx style="verb">label_len</spanx> is a 1-byte field holding the length, in bytes, of the <spanx style="verb">label</spanx>.</t>

<t><spanx style="verb">||</spanx> denotes concatenation.</t>

<t><spanx style="verb">BASE32</spanx> is the Base 32 Encoding function (section 6 of <xref target="RFC4648"></xref>). Pad
characters MUST NOT be appended to the encoded data.</t>

<t><spanx style="verb">LABELHASH</spanx> is the hash function identified by <spanx style="verb">index</spanx>.</t>
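<t>As an illustration only, the REDACT function defined above can be sketched in
Python as follows. The sketch assumes that index 0 of the CT hash algorithm
registry selects SHA-256; that mapping, the <spanx style="verb">HASHES</spanx> table, and the
function name <spanx style="verb">redact</spanx> are assumptions made for this example and are not
defined by this document.</t>

<figure><artwork><![CDATA[
```python
import base64
import hashlib

# Assumption: index 0 selects SHA-256. The authoritative mapping is the
# CT hash algorithm registry, which this draft only references.
HASHES = {0: hashlib.sha256}


def redact(label: str, keyid: bytes, index: int = 0) -> str:
    """Return prefix || BASE32(index || _label_hash) for one label."""
    if label == "*":
        raise ValueError("wildcard labels MUST NOT be redacted")
    raw = label.encode("ascii")  # the label is hashed case-sensitively
    if len(raw) > 255 or len(keyid) > 255:
        raise ValueError("keyid_len and label_len are 1-byte fields")
    # _label_hash = LABELHASH(keyid_len || keyid || label_len || label)
    label_hash = HASHES[index](
        bytes([len(keyid)]) + keyid + bytes([len(raw)]) + raw
    ).digest()
    encoded = base64.b32encode(bytes([index]) + label_hash).decode("ascii")
    # Pad characters MUST NOT be appended, so strip the "=" padding.
    return "?" + encoded.rstrip("=")
```
]]></artwork></figure>

<t>Under the SHA-256 assumption, the input to BASE32 is 33 bytes long, so the
redacted label is 54 characters including the prefix, which fits within the
63-octet limit on a DNS label.</t>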

</section>
<section anchor="redacted_san_extension" title="redactedSubjectAltName Certificate Extension">

<t>The redactedSubjectAltName extension is a non-critical extension
(OID 1.3.101.77) that is identical in structure to the subjectAltName extension,
except that DNS-IDs MAY contain redacted labels (<xref target="redacting_labels"/>).</t>

<t>When used, the redactedSubjectAltName extension MUST be present in both the
precertificate and the corresponding certificate.</t>

<t>This extension informs TLS clients of the DNS-ID labels that were redacted and
the degree of redaction, while minimizing the complexity of TBSCertificate
reconstruction (<xref target="reconstructing_tbscertificate"/>). Hashing the redacted labels
allows the legitimate domain owner to identify whether or not each redacted
label correlates to a label they know of.</t>

<t>TODO: Consider the pros and cons of this ‘un’redaction feature. If the cons
outweigh the pros, switch to using Andrew Ayer’s alternative proposal of hashing
a random salt and including that salt in an extension in the certificate (and
not including the salt in the precertificate).</t>

<t>Only DNS-ID labels can be redacted using this mechanism. However, CAs can use
Name Constraints (section TBD of <xref target="I-D.ietf-trans-rfc6962-bis"></xref>) to allow DNS
domain name labels in other subjectAltName entries to not appear in logs.</t>

<t>TODO: Should we support redaction of SRV-IDs and URI-IDs using this mechanism?</t>

</section>
<section anchor="verifying_redacted_san" title="Verifying the redactedSubjectAltName extension">

<t>If the redactedSubjectAltName extension is present, TLS clients MUST check that
the subjectAltName extension is present, that the subjectAltName extension
contains the same number of entries as the redactedSubjectAltName extension, and
that each entry in the subjectAltName extension has a matching entry at the same
position in the redactedSubjectAltName extension. Two entries are matching if
either:</t>

<t><list style="symbols">
  <t>The two entries are identical; or</t>
  <t>Both entries are DNS-IDs, have the same number of labels, and each label in
the subjectAltName entry has a matching label at the same position in the
redactedSubjectAltName entry. Two labels are matching if either:
  <list style="symbols">
      <t>The two labels are identical; or</t>
      <t>Neither label is <spanx style="verb">*</spanx> and the label from the redactedSubjectAltName entry is
equal to REDACT(label from subjectAltName entry) (<xref target="redacting_labels"/>).</t>
    </list></t>
</list></t>

<t>If any of these checks fail, the certificate MUST NOT be considered compliant.</t>
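<t>The checks above can be sketched in Python as follows. This is a minimal
illustration, not a reference implementation: it models each subjectAltName
entry as a hypothetical (type, value) tuple, represents a missing
subjectAltName extension as None, and assumes that every redacted label was
produced with index 0, taken here to mean SHA-256. All function names are
invented for the example.</t>

<figure><artwork><![CDATA[
```python
import base64
import hashlib


def redact(label, keyid):
    """REDACT() with index 0, assumed here to select SHA-256."""
    raw = label.encode("ascii")
    digest = hashlib.sha256(
        bytes([len(keyid)]) + keyid + bytes([len(raw)]) + raw
    ).digest()
    encoded = base64.b32encode(b"\x00" + digest).decode("ascii")
    return "?" + encoded.rstrip("=")


def labels_match(san_label, red_label, keyid):
    if san_label == red_label:
        return True
    if "*" in (san_label, red_label):
        return False  # a wildcard label only ever matches itself
    return red_label == redact(san_label, keyid)


def entries_match(san_entry, red_entry, keyid):
    if san_entry == red_entry:
        return True
    (san_type, san_value), (red_type, red_value) = san_entry, red_entry
    if san_type != "dns" or red_type != "dns":
        return False  # only DNS-IDs may differ, label by label
    san_labels = san_value.split(".")
    red_labels = red_value.split(".")
    return len(san_labels) == len(red_labels) and all(
        labels_match(s, r, keyid) for s, r in zip(san_labels, red_labels)
    )


def verify_redacted_san(san, redacted_san, keyid):
    """Return True only if every check in this section passes."""
    if san is None:  # subjectAltName extension must be present
        return False
    return len(san) == len(redacted_san) and all(
        entries_match(s, r, keyid) for s, r in zip(san, redacted_san)
    )
```
]]></artwork></figure>

<t>A fuller implementation would read the hash index from the first decoded byte
of each redacted label rather than assuming a fixed value.</t>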

</section>
<section anchor="reconstructing_tbscertificate" title="Reconstructing the TBSCertificate">

<t>Section TBD of <xref target="I-D.ietf-trans-rfc6962-bis"></xref> describes how TLS clients can
reconstruct the TBSCertificate component of a precertificate from a certificate,
so that associated SCTs may be verified.</t>

<t>If the redactedSubjectAltName extension (<xref target="redacted_san_extension"/>) is present
in the certificate, TLS clients MUST also:</t>

<t><list style="symbols">
  <t>Verify the redactedSubjectAltName extension against the subjectAltName
extension according to <xref target="verifying_redacted_san"/>.</t>
  <t>Once verified, remove the subjectAltName extension from the TBSCertificate.</t>
</list></t>

</section>
<section anchor="security-considerations" title="Security Considerations">

<section anchor="avoiding-overly-redacting-domain-name-labels" title="Avoiding Overly Redacting Domain Name Labels">

<t>Redaction of domain name labels carries the same risks as the use of wildcards
(e.g., section 7.2 of <xref target="RFC6125"></xref>). If the domain space below the unredacted
part of a domain name is not entirely registered by a single domain owner
(e.g., REDACT(label).com, REDACT(label).co.uk, and other
<xref target="Public.Suffix.List"></xref> entries), then clients may consider the domain name
to be overly redacted.</t>

<t>CAs should take care to avoid overly redacting domain names in precertificates.
It is expected that monitors will treat precertificates that contain overly
redacted domain names as potentially misissued. TLS clients MAY consider a
certificate to be non-compliant if the reconstructed TBSCertificate
(<xref target="reconstructing_tbscertificate"/>) contains any overly redacted domain names.</t>
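<t>One possible way for a monitor or client to flag overly redacted names is
sketched below in Python. The sketch is an assumption-laden illustration: it
uses a tiny hard-coded sample of suffixes instead of the real Public Suffix
List, ignores that list's wildcard and exception rules, and treats any label
that starts with the "?" prefix as redacted. The names
<spanx style="verb">PUBLIC_SUFFIXES</spanx> and <spanx style="verb">is_overly_redacted</spanx> are hypothetical.</t>

<figure><artwork><![CDATA[
```python
# Assumption: a small sample standing in for the Public Suffix List.
PUBLIC_SUFFIXES = frozenset({"com", "org", "uk", "co.uk"})


def is_overly_redacted(dns_id, public_suffixes=PUBLIC_SUFFIXES):
    """Return True if nothing but a public suffix remains unredacted
    to the right of the rightmost redacted label."""
    labels = dns_id.split(".")
    positions = [i for i, lab in enumerate(labels) if lab.startswith("?")]
    if not positions:
        return False  # nothing is redacted at all
    # The unredacted part is everything right of the last redacted label.
    remainder = ".".join(labels[positions[-1] + 1:]).lower()
    return remainder == "" or remainder in public_suffixes
```
]]></artwork></figure>

<t>With this sample list, a name of the form REDACT(label).example.com is
accepted, while REDACT(label).com and REDACT(label).co.uk are flagged.</t>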

</section>
</section>
<section anchor="privacy_considerations" title="Privacy Considerations">

<section anchor="ensuring-effective-redaction" title="Ensuring Effective Redaction">

<t>Although the domain label redaction mechanism removes the need for private
labels to appear in logs, it does not guarantee that this will never happen.
Anyone who encounters a certificate could choose to submit it to one or more
logs, thereby rendering the redaction futile.</t>

<t>Domain owners are advised to take the following steps to minimize the likelihood
that their private labels will become known outside their closed communities:</t>

<t><list style="symbols">
  <t>Avoid registering private labels in public DNS.</t>
  <t>Avoid using private labels that are predictable (e.g., “www” or labels
consisting only of digits). If a label has insufficient entropy, then
redaction will only provide a thin layer of obfuscation, because it will
be feasible to recover the label via a brute-force attack.</t>
  <t>Avoid using publicly trusted certificates to secure private domain space.</t>
</list></t>
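<t>To illustrate why predictable labels gain little from redaction, the following
Python sketch recovers a four-digit label by exhaustive search. It assumes that
the attacker can read the keyid from the logged precertificate and that index 0
selects SHA-256; the function names are hypothetical.</t>

<figure><artwork><![CDATA[
```python
import base64
import hashlib
from typing import Iterable, Optional


def redact(label: str, keyid: bytes) -> str:
    """REDACT() with index 0, assumed here to select SHA-256."""
    raw = label.encode("ascii")
    digest = hashlib.sha256(
        bytes([len(keyid)]) + keyid + bytes([len(raw)]) + raw
    ).digest()
    encoded = base64.b32encode(b"\x00" + digest).decode("ascii")
    return "?" + encoded.rstrip("=")


def recover_label(redacted: str, keyid: bytes,
                  candidates: Iterable[str]) -> Optional[str]:
    """Return the first candidate whose redaction equals `redacted`."""
    for guess in candidates:
        if redact(guess, keyid) == redacted:
            return guess
    return None


keyid = bytes(range(20))          # stand-in for a real keyIdentifier
observed = redact("4711", keyid)  # what a log observer would see
# At most 10,000 hash computations are needed for all labels "0".."9999".
print(recover_label(observed, keyid, (str(n) for n in range(10000))))
```
]]></artwork></figure>

<t>The same loop works with any word list, so a label such as “www” or “mail” is
recovered just as quickly.</t>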

<t>CAs are advised to carefully consider each request to redact a label. When a CA
believes that redacting a particular label would be futile, we advise rejecting
the redaction request. TLS clients may have policies that forbid redaction, so
redaction should only be used when it’s absolutely necessary and likely to be
effective.</t>

</section>
</section>
<section anchor="acknowledgements" title="Acknowledgements">

<t>The authors would like to thank Andrew Ayer and TBD for their valuable
contributions.</t>

<t>A big thank you to Symantec for kindly donating the OID from the 1.3.101 arc
that is used in this document.</t>

</section>


  </middle>

  <back>

    <references title='Normative References'>





<reference  anchor='RFC2119' target='http://www.rfc-editor.org/info/rfc2119'>
<front>
<title>Key words for use in RFCs to Indicate Requirement Levels</title>
<author initials='S.' surname='Bradner' fullname='S. Bradner'><organization /></author>
<date year='1997' month='March' />
<abstract><t>In many standards track documents several words are used to signify the requirements in the specification.  These words are often capitalized. This document defines these words as they should be interpreted in IETF documents.  This document specifies an Internet Best Current Practices for the Internet Community, and requests discussion and suggestions for improvements.</t></abstract>
</front>
<seriesInfo name='BCP' value='14'/>
<seriesInfo name='RFC' value='2119'/>
<seriesInfo name='DOI' value='10.17487/RFC2119'/>
</reference>



<reference  anchor='RFC4648' target='http://www.rfc-editor.org/info/rfc4648'>
<front>
<title>The Base16, Base32, and Base64 Data Encodings</title>
<author initials='S.' surname='Josefsson' fullname='S. Josefsson'><organization /></author>
<date year='2006' month='October' />
<abstract><t>This document describes the commonly used base 64, base 32, and base 16 encoding schemes.  It also discusses the use of line-feeds in encoded data, use of padding in encoded data, use of non-alphabet characters in encoded data, use of different encoding alphabets, and canonical encodings.  [STANDARDS-TRACK]</t></abstract>
</front>
<seriesInfo name='RFC' value='4648'/>
<seriesInfo name='DOI' value='10.17487/RFC4648'/>
</reference>



<reference  anchor='RFC5280' target='http://www.rfc-editor.org/info/rfc5280'>
<front>
<title>Internet X.509 Public Key Infrastructure Certificate and Certificate Revocation List (CRL) Profile</title>
<author initials='D.' surname='Cooper' fullname='D. Cooper'><organization /></author>
<author initials='S.' surname='Santesson' fullname='S. Santesson'><organization /></author>
<author initials='S.' surname='Farrell' fullname='S. Farrell'><organization /></author>
<author initials='S.' surname='Boeyen' fullname='S. Boeyen'><organization /></author>
<author initials='R.' surname='Housley' fullname='R. Housley'><organization /></author>
<author initials='W.' surname='Polk' fullname='W. Polk'><organization /></author>
<date year='2008' month='May' />
<abstract><t>This memo profiles the X.509 v3 certificate and X.509 v2 certificate revocation list (CRL) for use in the Internet.  An overview of this approach and model is provided as an introduction.  The X.509 v3 certificate format is described in detail, with additional information regarding the format and semantics of Internet name forms.  Standard certificate extensions are described and two Internet-specific extensions are defined.  A set of required certificate extensions is specified.  The X.509 v2 CRL format is described in detail along with standard and Internet-specific extensions.  An algorithm for X.509 certification path validation is described.  An ASN.1 module and examples are provided in the appendices.  [STANDARDS-TRACK]</t></abstract>
</front>
<seriesInfo name='RFC' value='5280'/>
<seriesInfo name='DOI' value='10.17487/RFC5280'/>
</reference>



<reference  anchor='RFC6125' target='http://www.rfc-editor.org/info/rfc6125'>
<front>
<title>Representation and Verification of Domain-Based Application Service Identity within Internet Public Key Infrastructure Using X.509 (PKIX) Certificates in the Context of Transport Layer Security (TLS)</title>
<author initials='P.' surname='Saint-Andre' fullname='P. Saint-Andre'><organization /></author>
<author initials='J.' surname='Hodges' fullname='J. Hodges'><organization /></author>
<date year='2011' month='March' />
<abstract><t>Many application technologies enable secure communication between two entities by means of Internet Public Key Infrastructure Using X.509 (PKIX) certificates in the context of Transport Layer Security (TLS). This document specifies procedures for representing and verifying the identity of application services in such interactions.   [STANDARDS-TRACK]</t></abstract>
</front>
<seriesInfo name='RFC' value='6125'/>
<seriesInfo name='DOI' value='10.17487/RFC6125'/>
</reference>



<reference anchor='I-D.ietf-trans-rfc6962-bis'>
<front>
<title>Certificate Transparency</title>

<author initials='B' surname='Laurie' fullname='Ben Laurie'>
    <organization />
</author>

<author initials='A' surname='Langley' fullname='Adam Langley'>
    <organization />
</author>

<author initials='E' surname='Kasper' fullname='Emilia Kasper'>
    <organization />
</author>

<author initials='E' surname='Messeri' fullname='Eran Messeri'>
    <organization />
</author>

<author initials='R' surname='Stradling' fullname='Rob Stradling'>
    <organization />
</author>

<date month='July' day='27' year='2016' />

<abstract><t>This document describes a protocol for publicly logging the existence of Transport Layer Security (TLS) certificates as they are issued or observed, in a manner that allows anyone to audit certification authority (CA) activity and notice the issuance of suspect certificates as well as to audit the certificate logs themselves. The intent is that eventually clients would refuse to honor certificates that do not appear in a log, effectively forcing CAs to add all issued certificates to the logs.  Logs are network services that implement the protocol operations for submissions and queries that are defined in this document.</t></abstract>

</front>

<seriesInfo name='Internet-Draft' value='draft-ietf-trans-rfc6962-bis-18' />
<format type='TXT'
        target='http://www.ietf.org/internet-drafts/draft-ietf-trans-rfc6962-bis-18.txt' />
</reference>




    </references>

    <references title='Informative References'>

<reference anchor="Public.Suffix.List" target="https://publicsuffix.org">
  <front>
    <title>Public Suffix List</title>
    <author >
      <organization>Mozilla Foundation</organization>
    </author>
    <date year="2016"/>
  </front>
</reference>


    </references>



  </back>
</rfc>

