<?xml version="1.0" encoding="UTF-8"?>

<!DOCTYPE rfc SYSTEM "rfc2629.dtd" [

<!ENTITY rfc1033 PUBLIC ''
   'http://xml2rfc.ietf.org/public/rfc/bibxml/reference.RFC.1033.xml'>
<!ENTITY rfc1034 PUBLIC ''
   'http://xml2rfc.ietf.org/public/rfc/bibxml/reference.RFC.1034.xml'>
<!ENTITY rfc1035 PUBLIC ''
   'http://xml2rfc.ietf.org/public/rfc/bibxml/reference.RFC.1035.xml'>
<!ENTITY rfc2045 PUBLIC ''
   'http://xml2rfc.ietf.org/public/rfc/bibxml/reference.RFC.2045.xml'>
<!ENTITY rfc2119 PUBLIC ''
   'http://xml2rfc.ietf.org/public/rfc/bibxml/reference.RFC.2119.xml'>
<!ENTITY rfc2782 PUBLIC ''
   'http://xml2rfc.ietf.org/public/rfc/bibxml/reference.RFC.2782.xml'>
<!ENTITY rfc4055 PUBLIC ''
   'http://xml2rfc.ietf.org/public/rfc/bibxml/reference.RFC.4055.xml'>
<!ENTITY rfc4075 PUBLIC ''
   'http://xml2rfc.ietf.org/public/rfc/bibxml/reference.RFC.4075.xml'>
<!ENTITY rfc4279 PUBLIC ''
   'http://xml2rfc.ietf.org/public/rfc/bibxml/reference.RFC.4279.xml'>
<!ENTITY rfc5246 PUBLIC ''
   'http://xml2rfc.ietf.org/public/rfc/bibxml/reference.RFC.5246.xml'>
<!ENTITY rfc6762 PUBLIC ''
   'http://xml2rfc.ietf.org/public/rfc/bibxml/reference.RFC.6762.xml'>
<!ENTITY rfc6763 PUBLIC ''
   'http://xml2rfc.ietf.org/public/rfc/bibxml/reference.RFC.6763.xml'>
<!ENTITY rfc7626 PUBLIC ''
   'http://xml2rfc.ietf.org/public/rfc/bibxml/reference.RFC.7626.xml'>
<!ENTITY rfc7844 PUBLIC ''
   'http://xml2rfc.ietf.org/public/rfc/bibxml/reference.RFC.7844.xml'>
<!ENTITY rfc7858 PUBLIC ''
   'http://xml2rfc.ietf.org/public/rfc/bibxml/reference.RFC.7858.xml'>

<!ENTITY I-D.ietf-intarea-hostname-practice PUBLIC ''  
   "http://xml2rfc.ietf.org/public/rfc/bibxml3/reference.I-D.ietf-intarea-hostname-practice.xml"> 
<!ENTITY I-D.ietf-dprive-dnsodtls PUBLIC ''  
   "http://xml2rfc.ietf.org/public/rfc/bibxml3/reference.I-D.ietf-dprive-dnsodtls.xml">
<!ENTITY I-D.ietf-tls-tls13 PUBLIC ''  
   "http://xml2rfc.ietf.org/public/rfc/bibxml3/reference.I-D.ietf-tls-tls13.xml">
<!ENTITY I-D.ietf-dnssd-push PUBLIC ''  
   "http://xml2rfc.ietf.org/public/rfc/bibxml3/reference.I-D.ietf-dnssd-push.xml">

<!ENTITY kw14a PUBLIC ''
   "references/reference.kw14a.xml">
<!ENTITY kw14b PUBLIC ''
   "references/reference.kw14b.xml">
]>

<?xml-stylesheet type='text/xsl' href='rfc2629.xslt' ?>
<?rfc compact="yes"?>
<?rfc toc="yes"?>
<?rfc symrefs="yes"?>
<?rfc sortrefs="yes"?>

<!-- Expand crefs and put them inline -->
<?rfc comments='yes' ?>
<?rfc inline='yes' ?>

<rfc category="std" 
     docName="draft-huitema-dnssd-privacy-01.txt"
     ipr="trust200902">

<front>
    <title abbrev="DNS-SD Privacy Extensions">
      Privacy Extensions for DNS-SD
    </title>

   <author fullname="Christian Huitema" initials="C." surname="Huitema">
      <organization>Microsoft</organization>
      <address>
        <postal>
          <street> </street>
          <city>Redmond</city>
          <code>98052</code>
          <region>WA</region>
          <country>U.S.A.</country>
        </postal>
        <email>huitema@microsoft.com</email>
      </address>
    </author>

   <author fullname="Daniel Kaiser" initials="D." surname="Kaiser">
     <organization>University of Konstanz</organization>
      <address>
        <postal>
          <street> </street>
          <city>Konstanz</city>
          <code>78457</code>
          <region></region>
          <country>Germany</country>
        </postal>
        <email>daniel.kaiser@uni-konstanz.de</email>
      </address>
    </author>

    <date year="2016" />

    <abstract>
        <t> 
DNS-SD allows discovery of services published in DNS or mDNS. The publication
normally discloses information about the device publishing the services.
There are use cases where devices want to communicate without disclosing
their identity, for example two mobile devices visiting the same
hotspot.
</t>
<t>
  We propose to solve this problem by a two-stage approach.
  In the first stage, hosts discover Private Discovery Service Instances via
  DNS-SD using special formats to protect their privacy.
  These service instances correspond to Private Discovery Servers running on peers.
  In the second stage, hosts directly query these Private Discovery Servers via DNS-SD over TLS.
  A pairwise shared secret necessary to establish these connections
  is only known to hosts authorized by a pairing system.
</t>
    </abstract>
</front>

<middle>
<section title="Introduction">
<t>
DNS-SD <xref target="RFC6763" /> enables distribution and discovery in local networks 
without configuration. It is very convenient for users, but it requires the public exposure 
of the offering and requesting identities along with information about the offered and 
requested services.  Some of the information published by the 
announcements can be very revealing. These privacy issues and potential
solutions are discussed in <xref target="KW14a" /> 
and <xref target="KW14b" />.
</t>
<t>
There are cases when nodes connected to a network want to provide
or consume services without exposing their identity to the other
parties connected to the same network. Consider for example a
traveler wanting to upload pictures from a phone to a laptop
when connected to the Wi-Fi network of an Internet cafe, or
two travelers who want to share files between their laptops
when waiting for their plane in an airport lounge.
</t>
<t>
We expect that these exchanges will start with a discovery 
procedure using DNS-SD <xref target="RFC6763" />. One of the devices
will publish the availability of a service, such as a picture library
or a file store in our examples. The user of the other device will
discover this service, and then connect to it.
</t>
<t>
When analyzing these scenarios in <xref target="analysis"/>, we find that
the DNS-SD messages leak identifying information such as instance name,
host name or service properties. We review the design constraints of a solution
in <xref target="design"/>, and describe the proposed solution in
<xref target="solution"/>.
</t>
<section title="Requirements">
<t>
  The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
  "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
  document are to be interpreted as described in <xref target="RFC2119" />.
</t>
</section>
</section>

<section title="Privacy Implications of DNS-SD" anchor="analysis">
<t>
DNS-Based Service Discovery (DNS-SD) is defined in <xref target="RFC6763" />.
It allows nodes to publish the availability of an instance of a service by
inserting specific records in the DNS (<xref target="RFC1033"/>,
<xref target="RFC1034"/>, <xref target="RFC1035"/>) or by publishing
these records locally using
multicast DNS (mDNS) <xref target="RFC6762"/>.
Available services are described using three types of records:
</t>
<t>
<list style="hanging">
<t hangText="PTR Record:">Associates a service type in the domain with
an "instance" name of this service type.
</t>
<t hangText="SRV Record:">Provides the node name, port number, priority and
weight associated with the service instance, in conformance with <xref target="RFC2782" />.
</t>
<t hangText="TXT Record:">Provides a set of attribute-value pairs describing
specific properties of the service instance.
</t>
</list>
</t>
<t>
In the remaining subsections, we will review the privacy issues related to publishing
instance names, node names, service attributes and other data, as well as review 
the implications of using the discovery service as a client.
</t>

<section title="Privacy Implication of Publishing Service Instance Names" anchor="instanceLeak" >
<t>
In the first phase of discovery, the client obtains all
the PTR records associated with a service type in a given naming domain.
Each PTR record contains a Service Instance Name defined in Section 4 of <xref target="RFC6763" />:
</t>

<t>
<figure>
<artwork>
  Service Instance Name = &lt;Instance&gt; . &lt;Service&gt; . &lt;Domain&gt;
</artwork>
</figure>
</t>

<t>
The &lt;Instance&gt; portion of the Service Instance Name is meant to convey
enough information for users of discovery clients to easily select the desired service instance.
Nodes that use DNS-SD over mDNS <xref target="RFC6762" /> in a mobile environment will rely on the specificity
of the instance name to identify the desired service instance.
In our example of users wanting to upload pictures to a laptop in an Internet Cafe, the list of 
available service instances may look like:
</t>
<t>
<figure>
<artwork>
Alice's Images         . _imageStore._tcp . local
Alice's Mobile Phone   . _presence._tcp   . local
Alice's Notebook       . _presence._tcp   . local
Bob's Notebook         . _presence._tcp   . local
Carol's Notebook       . _presence._tcp   . local
</artwork>
</figure>
</t>
<t>
Alice will see the list on her phone and understand intuitively that she should
pick the first item. The discovery will "just work".
</t>
<t>
However, DNS-SD/mDNS will reveal to anybody that Alice is currently visiting the Internet Cafe.
It further discloses the fact that she uses two devices, shares an image store, 
and uses a chat application supporting the
_presence protocol on both of her devices. She might currently chat with Bob or Carol, 
as they are also using a _presence supporting chat application.
This information is available not just to devices actively browsing for and offering 
services, but to anybody passively listening to the network traffic.
</t>
</section>

<section title="Privacy Implication of Publishing Node Names">
<t>
The SRV records contain the DNS name of the node publishing the
service. Typical implementations construct this DNS name by
concatenating the "host name" of the node with the name of the 
local domain. The privacy implications of this
practice are reviewed in <xref target="I-D.ietf-intarea-hostname-practice" />.
Depending on naming practices, the host name is either a strong 
identifier of the device, or at a minimum a partial identifier.
It enables tracking of the device, and by extension of the device's owner.
</t>
</section>

<section title="Privacy Implication of Publishing Service Attributes">
<t>
The TXT record's attribute and value pairs contain information on the characteristics of
the corresponding service instance.
This in turn reveals information
about the devices that publish services. The amount of information
varies widely with the particular service and its implementation:
</t>
<t>
<list style="symbols">
<t>
Some attributes, like the paper size available in a printer, are the
same on many devices, and thus only provide limited information
to a tracker.
</t>
<t>
Attributes that have freeform values, such as the name of a directory,
may reveal much more information.
</t>
</list>
</t>
<t>
Combinations of attributes carry more information than individual attributes,
and can potentially be used for "fingerprinting" a specific device.
</t>

<t>
Information contained in TXT records not only breaches privacy by making devices
trackable, but might also directly contain private information about the device's user.
For instance, the _presence service reveals the "chat status" to everyone in the same network.
Users might not be aware of that.
</t>

<t>
  Further, TXT records often contain version information about services, allowing potential attackers
  to identify devices running exploit-prone versions of a certain service.
</t>

</section>

<section title="Device Fingerprinting" anchor="serverFingerprint">
<t>
The combination of information published in DNS-SD has the potential to
provide a "fingerprint" of a specific device. Such information includes: 
</t>
<t>
<list style="symbols">
<t>
The list of services published by the device, which can be retrieved because the
SRV records will point to the same host name.
</t>
<t>
The specific attributes describing these services.
</t>
<t>
The port numbers used by the services.
</t>
<t>
The values of the priority and weight attributes in the SRV records.
</t>
</list>
</t>
<t>
This combination of services and attributes will often be sufficient to identify
the version of the software running on a device. If a device publishes
many services with rich sets of attributes, the combination may be
sufficient to identify the specific device.
</t>
<t>
There is however an argument that devices providing services can be discovered
by observing the local traffic, because different services have different traffic 
patterns. The observation could in many cases also reveal some specificities
of the service's implementation. Even if the traffic is encrypted, the size
and the timing of packets may be sufficient to reveal that information. This
argument can be used to assess the priority of, for example, protecting the 
fact that a device publishes a particular service. However, we may assume that the 
developers of sensitive services will use counter-measures to defeat such
traffic analysis.
</t>
</section>

<section title="Privacy Implication of Discovering Services" anchor="clientPrivacy" >
<t>
The consumers of services engage in discovery, and in doing so
reveal some information such as the list of services they
are interested in and the domains in which they are looking for the
services. When the clients select specific instances of services,
they reveal their preference for these instances. This can be benign if
the service type is very common, but it could be more problematic
for sensitive services, such as for example some private messaging services.
</t>
<t>
One way to protect clients would be to somehow encrypt the requested service types.
Of course, just as we noted in <xref target="serverFingerprint"/>, traffic
analysis can often reveal the service. 
</t>
</section>
</section>

<section title="Limits of a Simple Design" anchor="towards" >
<t>
We first tried a simple design for mitigating the issues outlined in <xref target="analysis" />. 
The basic idea was to advertise obfuscated names, so as to not reveal the particularities
of the service providers. This design is tempting, because it only requires minimal
changes in the DNS-SD processing. However, as we will see in the following subsections,
it has three important drawbacks:
</t>
<t>
<list style="symbols">
<t>
The simple design leads to UI issues, because users of unmodified DNS-SD agents will see
a mix of clear text names and obfuscated names, which is unpleasant.
</t>
<t>
With this simple design, there is no good way to hide the type of services provided 
or consumed by a specific node.
</t>
<t>
The simple design either requires having a shared key between all "authorized users" of a 
service, which implies substandard key management practices, or publishing as many instances
of a service as there are authorized users, which leads to the scaling issues  
discussed in <xref target="scalingIssues"/>.
</t>
</list>
</t>
<t>
These issues are mitigated by the two-stage design presented in <xref target="design" />. The following
subsections detail the simple design and its drawbacks.
</t>
<section title="Obfuscated Instance Names" anchor="obfuscatedInstanceName" >
<t>
The privacy issues described in <xref target="instanceLeak"/> 
could be solved by obfuscating the instance names. Instead
of a user friendly description of the instance,
the nodes would publish a random looking string of characters.
To prevent tracking over time and location, different string
values would be used at different locations, or at different times.
</t>
<t>
Authorized parties have to be able to "de-obfuscate" the names,
while non-authorized third parties must not be able to. For example,
if both Alice's notebook and Bob's laptop use an obfuscation process, 
the list of available services should appear differently 
to them and to third parties. Alice's phone will be able to
de-obfuscate the name of Alice's notebook, but not that of 
Bob's laptop. Bob's phone will do the opposite. Carol will do
neither.
</t>
<t>
Alice will see something like:
</t>
<t>
<figure>
<artwork>
QwertyUiopAsdfghjk (Alice's Images)       . _imageStore._tcp . local
GobbeldygookBlaBla (Alice's Mobile Phone) . _presence._tcp   . local
MNbvCxzLkjhGfdEdhg (Alice's Notebook)     . _presence._tcp   . local
Abracadabragooklybok (Bob's Notebook)     . _presence._tcp   . local
Carol's Notebook                          . _presence._tcp   . local
</artwork>
</figure>
</t>
<t>
Bob will see:
</t>
<t>
<figure>
<artwork>
QwertyUiopAsdfghjk                    . _imageStore._tcp . local
GobbeldygookBlaBla                    . _presence._tcp   . local
MNbvCxzLkjhGfdEdhg                    . _presence._tcp   . local
Abracadabragooklybok (Bob's Notebook) . _presence._tcp   . local
Carol's Notebook                      . _presence._tcp   . local
</artwork>
</figure>
</t>
<t>
Carol will see:
</t>
<t>
<figure>
<artwork>
QwertyUiopAsdfghjk   . _imageStore._tcp . local
GobbeldygookBlaBla   . _presence._tcp   . local
MNbvCxzLkjhGfdEdhg   . _presence._tcp   . local
Abracadabragooklybok . _presence._tcp   . local
Carol's Notebook     . _presence._tcp   . local
</artwork>
</figure>
</t>
<t>
In that example, Alice, Bob and Carol will be able to select the
appropriate instance. It would probably be preferable to filter out the
obfuscated instance names, to avoid confusing the user. In our example, Alice 
and Bob have updated their software to understand obfuscation, and they
could easily filter out the obfuscated strings that they cannot de-obfuscate.
But Carol is not using this system, and we could argue that her experience 
is suboptimal.
</t>

</section>


<section title="Names of Obfuscated Services">

<t>
Instead of publishing the actual service name in the SRV records,
nodes could publish a randomized name. There are two plausible reasons
for doing that:
</t>
<t>
<list style="symbols">
<t>
Having a different service name for privacy enhanced services will ensure
that hosts that are not privacy aware are not puzzled by obfuscated service names.
</t>
<t>
Using obfuscated service names prevents third parties from discovering
which service a particular host is providing or consuming.
</t>
</list>
</t>
<t>
The first requirement can be met with a simple modification of an existing 
name. For example, instead of publishing:
</t>
<t>
<figure>
<artwork>
QwertyUiopAsdfghjk . _imageStore._tcp . local
GobbeldygookBlaBla . _presence._tcp   . local
</artwork>
</figure>
</t>
<t>
Alice could publish some kind of "translation" of the service name, such as:
</t>
<t>
<figure>
<artwork>
QwertyUiopAsdfghjk . _vzntrFgber._tcp . local
GobbeldygookBlaBla . _cerfrapr._tcp   . local
</artwork>
</figure>
</t>
<t>
The previous examples use rot13 translation. It does not provide any 
particular privacy, but it does ensure that obfuscated services are
named differently from clear text services.
</t>
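<t>
As an illustration, the rot13 translation used in the examples above can be sketched in a few lines of Python.
The helper name rot13_service_label is hypothetical, and keeping the leading underscore outside of the
translation is an assumption of this sketch rather than a requirement of DNS-SD.
</t>
<t>
<figure>
<artwork>
```python
import codecs


def rot13_service_label(label: str) -> str:
    """Apply rot13 to a DNS-SD service label such as "_presence".

    The leading underscore is kept as-is (an assumption of this sketch);
    rot13 only rotates the ASCII letters and leaves other characters alone.
    """
    if not label.startswith("_"):
        raise ValueError("service labels are expected to start with '_'")
    return "_" + codecs.encode(label[1:], "rot_13")


if __name__ == "__main__":
    for label in ("_imageStore", "_presence"):
        print(label, "->", rot13_service_label(label))
```
</artwork>
</figure>
</t>
<t>
Since rot13 is its own inverse, anybody can recover the original service name by applying the
translation a second time, which is why it separates the name spaces but provides no privacy.
</t>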
<t>
Making the service name actually private would require some actual encryption.
The main problem with such solutions is that the client needs to know the
service name in order to compose the DNS-SD query for services. There are several 
options:
</t>
<t>
<list style="symbols">
<t>
The service name is chosen by the client. For example, the client could 
encrypt the original service 
name and a nonce with a key shared between client and server. Upon receiving
the queries, the server would attempt to decrypt the service name. If that 
succeeds, the server would respond with PTR records created on the fly for
the new service name.
</t>
<t>
The service name is chosen by the server and cannot be predicted in advance by
the client. For example, the server could encrypt a nonce and
the original service name. The client retrieves such services by doing
a wild card query, then attempting to decrypt the received responses.
</t>
<t>
The service name is chosen by the server in a way that can be predicted in advance by
the client. For example, the server could encrypt some version of the date and time and
the original service name. The date and time are encoded with a coarse precision, enabling
the client to predict the value that the server is using, and to send the corresponding
queries.
</t>
</list>
</t>
<t>
None of these solutions is very attractive. Creating records on the fly is a burden
for the server. If clients must use wildcard queries, they will need to process
lots of irrelevant data. If clients need to predict different instance names for
each potential server, they will end up sending batches of queries with many
different names. All of these solutions appear to be big departures from the 
simplicity and robustness of the DNS-SD design.
</t>

</section>


<section title="Scaling Issues with Obfuscation" anchor="scalingIssues" >
<t>
In <xref target="obfuscatedInstanceName" />, we assumed that each advertised
record contains a name obfuscated with a shared key. This approach is easy
to understand, but it contains hidden assumptions.
Let's look at one of our examples:
</t>
<t>
<figure>
<artwork>
Abracadabragooklybok (Bob's Notebook) . _presence._tcp   . local
</artwork>
</figure>
</t>
<t>
We only see one record for Bob's Notebook, obfuscated using 
the unique shared secret associated with Bob's Notebook. That means that
every device paired with Bob's Notebook will have a copy of that shared secret.
This is a possible solution, but there are known issues with having a secret 
shared with multiple entities:
</t>
<t>
<list style="symbols">
<t>
If for some reason the secret needs to be changed, every paired
device will need a copy of the new secret before it can participate
again in discovery.
</t>
<t>
If one of the previous pairings becomes invalid, the only way
to block the corresponding devices from discovery is to change the
secret for all other devices.
</t>
</list>
</t>
<t>
Key management becomes much easier if it is strictly pair-wise. Two
paired devices, or two paired users, can simply renew their
pairing and get a new secret. If a device ceases to be trusted, 
the pairing data and the corresponding secret can just be 
deleted and forgotten.
But using strictly pair-wise keys yields a scaling issue.
Let's assume that:
</t>
<t>
<list style="symbols">
<t>
Each device maintains an average of N pairings.
</t>
<t>
There are on average M devices present during discovery.
</t>
</list>
</t>
<t>
In the single key scenario, after issuing a broadcast query, the querier 
will receive a series of responses, each of which may well be obfuscated 
with a different key. If the receiver has N pre-existing pairings and 
receives M obfuscated responses, the cost will scale as O(M*N), i.e. try 
all N pairing keys for each of the M responses to see what matches. But
if the keys are specific to each pair of devices, the obfuscation becomes 
complicated. When receiving a request, the publisher does not know which 
of its N keys the querier can decrypt. One simple solution would be to 
send N responses, but then the load on the querier will scale as O(M*N^2).
That can get out of hand very quickly.
</t>
<t>
To solve the scaling issue, we consider a two-stage solution that uses
an optimized discovery procedure to discover privacy-compatible devices,
and then uses point-to-point encrypted exchanges to privately discover
the available services.
</t>
</section>


</section>

<section title="Design of the Private DNS-SD Discovery Service" anchor="design" >
<t>
In this section, we present the design of a two-stage solution that enables private
use of DNS-SD, without affecting existing users, and with better scalability than 
the simple solution presented in <xref target="towards" />. The solution is largely based
on the architecture proposed in <xref target="KW14b" />, which separates the 
general private discovery problem into three components: pairing, discovery of 
a private discovery service, and actual service discovery through this private service. 
Pairing has to provide the private discovery servers with means for mutual authentication, 
e.g. with an authenticated shared secret.
The private discovery servers provide actual service discovery with an authenticated connection.
Our solution applies this architecture in the context of DNS-SD.
It is based on the following components:
</t>
<t>
<list style="symbols">
<t>
Adding a pairing system to DNS-SD, described in <xref target="pairingDesign"/>, 
through which authorized peers can establish shared secrets;
</t>
<t>
Defining the Private Discovery Service through which other services can be advertised
in a private manner;
</t>
<t>
And, publishing availability
of the Private Discovery Service using DNS-SD,
so that peers can discover their services without compromising their privacy.
</t>
</list>
</t>
<t>
These components are independent of the means used for transmitting the necessary data.
</t>

<section title="Device Pairing" anchor="pairingDesign" >


<t>
  Any private discovery solution needs to differentiate
  between authorized devices, which are allowed to get information about discoverable entities,
  and other devices, which should not be aware of the availability of private entities.
  The commonly used solution to this problem is establishing a  "device pairing".
  In our discovery scenarios, we envisage two kinds of pairings:
</t>

<t>
<list style="numbers">
<t>
Inter-user pairing is a pairing between devices of "friends".
      Since it has to be performed manually, e.g. by the means described below,
      it is important to limit it to once per pair of friends.
</t>
<t>
Intra-user pairing
      is a pairing of devices of the same user. It can be performed
      without any configuration by a meta-service (pairing data synchronization service) in
      a trusted (home) network.
    </t>
</list>
</t>

<t>
  The result of the pairing will be a shared secret, and optionally
  mutually authenticated public keys added to a local web of trust.
  Public key technology has many advantages, but shared secrets are typically easier to
  handle on small devices.
  We offer both a simple pairing just exchanging a shared secret, and an authenticated pairing
  using public key technology.
</t>

<!-- CH-Comments:
I leave this as is, because we will have time to revisit it later, but I think
that the discussion here is a bit confusing. The private discovery service
should just assume the existence of a shared secret by pair of peers. The
pairing service can establish this secret in many ways, including of
course taking advantage of already authenticated public keys. 
-->

<section title="Shared Secret">
  <t>
  The goal of the pairing process is to establish pairwise shared secrets.
  If two users can leverage a secure private off-channel,
  it suffices for one user to generate the shared secret and transmit it over this
  off-channel.
  The users could also meet and orally agree on a password that
  both enter in their devices. This has two disadvantages: user-chosen passwords tend to
  have low entropy, and typing the password is inconvenient.
  Leveraging QR-codes can overcome these disadvantages:
  one user generates a shared secret and displays it in the form of a QR-code, and the other user scans this code.
  Strictly speaking, displaying and scanning QR-codes does not establish a secure private channel,
  as others could also photograph this code; but it is reasonably secure for the application area of private service discovery.
  Using Bluetooth LE might also be considered satisfactory as a compromise between
  convenience and security.
</t>
</section>


<section title="Secure Authenticated Pairing Channel">
<t>
  Optionally, various versions of authenticated Diffie-Hellman (DH) can be used to exchange a mutually authenticated shared secret
  (which, among other possibilities, can leverage QR-codes for key fingerprint verification).
  Using DH offers provable security and makes it possible to perform a pairing without meeting in person.
  Further, when DH is used to generate the shared secret, both parties contribute to it (multiparty computation).
</t>
</section>


<section title="Public Authentication Keys">
  <t>
  The public/private key pair is used, if at all, only for the aforementioned authenticated DH, which
  yields a mutually authenticated shared secret.
  Obtaining and verifying a friend's public key can be achieved by different means.
  For obtaining the keys, we can either leverage an existing PKI, e.g. the PGP web of trust,
  or generate our own key pairs (and exchange them right before verifying).
  For authenticating the keys, which boils down to comparing fingerprints over an off-channel,
  we distinguish between means that require users to be in close proximity of each other,
  and means where users do not have to meet in person.
  The former can be realized, e.g., by verifying a fingerprint using QR-codes;
  the latter by reading a fingerprint during a phone call or by using the Socialist Millionaires' protocol.
</t>
</section>

</section>


<section title="Discovery of the Private Discovery Service" anchor="stage1Design">
<t>
The first stage of service discovery is to check whether 
instances of compatible Private Discovery Services are available in the local scope.
The goal of that stage is to identify devices that share a pairing with the querier, and
are available locally. The service instances can be discovered 
using regular DNS-SD procedures, but the list of
discovered services will have to be filtered so only paired
devices are retained.
</t>
<t>
We have demonstrated in <xref target="scalingIssues" /> that 
simple obfuscation would require publishing as many records per
publisher as there are pairings, which ends up scaling as O(M*N^2)
in which M is the number of devices present and N is the number of
pairings per device. We can mitigate this problem by using a special
encoding of the instance name. Suppose that the publisher manages 
N pairings with the associated keys K1, K2, ... Kn. The instance
name will be set to an encoding of N "proofs" of the N keys,
where each proof is computed as function of the key and a nonce:
</t>
<t>
<list>
<t>
instance name = &lt;nonce&gt;&lt;F1&gt;&lt;F2&gt;..&lt;Fn&gt;
</t>
<t>
Fi = hash (nonce, Ki), where hash is a cryptographic hash function.
</t>
</list>
</t>
<t>
The querier can test the instance name by computing the same "proof" for
each of its own keys. Suppose that the receiver manages P pairings, with
the corresponding keys X1, X2, .. Xp. The receiver verification
procedure will be:
</t>
<t>
<figure>
<artwork>
   for each received instance name:
      retrieve nonce from instance name
      for (j = 1 to P)
         retrieve the key Xj of pairing number j
         compute F = hash(nonce, Xj)
         for (i=1 to N)
            retrieve the proof Fi
            if F is equal to Fi
               mark the pairing number j as available
</artwork>
</figure>
</t>
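<t>
The procedure above can be sketched in Python as follows. This is a minimal sketch under stated
assumptions, not a normative definition: the text only requires "a cryptographic hash function",
so the use of SHA-256 over the concatenation of nonce and key, the truncation to 4 bytes, and the
helper names proof, build_instance and available_pairings are all choices made for this illustration.
The sketch works on raw bytes and leaves the textual encoding of the label aside. It also looks up
each computed proof in a set instead of looping over the N published proofs, which yields the same
result as the inner loop.
</t>
<t>
<figure>
<artwork>
```python
import hashlib
import os
from typing import List, Sequence

PROOF_LEN = 4  # bytes per proof (assumption: 32-bit truncated proofs)


def proof(nonce: bytes, key: bytes) -> bytes:
    """Fi = hash(nonce, Ki); SHA-256 and truncation are assumptions."""
    return hashlib.sha256(nonce + key).digest()[:PROOF_LEN]


def build_instance(nonce: bytes, keys: Sequence[bytes]) -> bytes:
    """Publisher side: nonce followed by one proof per pairing key."""
    return nonce + b"".join(proof(nonce, k) for k in keys)


def available_pairings(instance: bytes, nonce_len: int,
                       my_keys: Sequence[bytes]) -> List[int]:
    """Querier side: indexes of own pairing keys matched by a proof."""
    nonce, body = instance[:nonce_len], instance[nonce_len:]
    # Split the remainder of the instance name into fixed-size proofs.
    proofs = {body[i:i + PROOF_LEN]
              for i in range(0, len(body), PROOF_LEN)}
    return [j for j, key in enumerate(my_keys)
            if proof(nonce, key) in proofs]


if __name__ == "__main__":
    shared = os.urandom(16)  # secret established by a pairing
    publisher_keys = [os.urandom(16), shared, os.urandom(16)]
    querier_keys = [os.urandom(16), shared]
    nonce = (1462000000).to_bytes(4, "big")  # example timestamp
    instance = build_instance(nonce, publisher_keys)
    print(available_pairings(instance, 4, querier_keys))
```
</artwork>
</figure>
</t>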
<t>
The procedure presented here requires on average O(M*N) iterations of the
hash function, which is the same scaling as the "shared secret" variant.
It requires O(M*N^2) comparison operations, but these are less onerous
than cryptographic operations.
Further, when setting the nonce to a timestamp, the Fi have to be
calculated only once per time interval.
</t>
<t>
The number of pairing proofs that can be encoded in a single record is
limited by the maximum size of a DNS label, which is 63 bytes. Since
labels carry characters rather than raw binary values, the nonce and proofs
will have to be encoded using BASE64 (<xref target="RFC2045" /> section 6.8),
resulting in at most 378 bits (63 characters of 6 bits each). The nonce should
not be repeated, and the simplest way to achieve that is to set
the nonce to a 32-bit timestamp value. The remaining 346 bits could encode
up to 10 proofs of 32 bits each, which would be sufficient for many
practical scenarios.
</t>
<t>
In practice, a 32-bit proof should be sufficient to
distinguish between available devices. However, there is clearly a risk
of collision. The Private Discovery Service as described here will
find the available pairings, but it might also find a small number of
spurious "false positives." The chances of that happening are, however, quite small:
less than 0.02% for a device managing 10 pairings and processing 10000
responses.
</t>
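<t>
As a rough check of this estimate, the sketch below computes the probability
of at least one spurious match. It assumes that every comparison of two
unrelated 32-bit proofs is an independent trial that collides with
probability 2^-32. The function name and its parameters are illustrative
and are not part of this specification.
</t>
<t>
<figure>
<artwork><![CDATA[
```python
import math

def false_positive_probability(pairings, proofs_compared, bits=32):
    """Chance of at least one spurious match.

    Assumption: each comparison is an independent trial that
    collides with probability 2**-bits.
    """
    comparisons = pairings * proofs_compared
    per_trial = 2.0 ** -bits
    # 1 - (1 - p)**n, computed stably for very small p.
    return -math.expm1(comparisons * math.log1p(-per_trial))

# 10 local pairings tested against 10000 received proofs.
p = false_positive_probability(10, 10000)
```
]]></artwork>
</figure>
</t>
<t>
Under these assumptions, 10 pairings tested against 10000 received proofs
give a probability of roughly 0.002%. If each of the 10000 responses
carried 10 proofs, the figure would rise to roughly 0.02%.
</t>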

</section>

<section title="Private Discovery Service" >
<t>
Discovery of the Private Discovery Service yields
a list of available paired devices, and verifies that either party knows the corresponding
shared secret. At that point, the querier can engage in a series of
directed discoveries.
</t>
<t>
We have considered defining an ad-hoc protocol for the private discovery service, but
found that just using TLS would be much simpler. The Directed Private Discovery service
is just a regular DNS-SD service, accessed over TLS, using the encapsulation of DNS over
TLS defined in <xref target="RFC7858" />. The main difference from simple DNS over TLS is
the need for authentication.
</t>
<t>
We assume that the pairing process has provided each pair of authorized client and server
with a shared secret. We can use that shared secret to provide mutual authentication of
clients and servers using "Pre Shared Key" authentication, as defined in <xref target="RFC4279" />
and incorporated in the latest version of TLS <xref target="I-D.ietf-tls-tls13" />.
</t>
<t>
One difficulty is the reliance on a key identifier in the protocol. 
For example, in TLS 1.3 the PSK extension is defined as:
</t>
<t>
<figure>
<artwork>
   opaque psk_identity&lt;0..2^16-1&gt;;

   struct {
       select (Role) {
           case client:
               psk_identity identities&lt;2..2^16-1&gt;;

           case server:
               uint16 selected_identity;
       }
   } PreSharedKeyExtension
</artwork>
</figure>
</t>
<t>
According to the protocol, the PSK identity is passed in clear text at the beginning of
the key exchange. This is logical, since server and clients need to identify the secret
that will be used to protect the connection. But if we used a static identifier for the
key, adversaries could use that identifier to track server and clients. The solution
is to use a time-varying identifier, constructed exactly like the "hint" described in
<xref target="stage1Design" />, by concatenating a nonce and the hash of the nonce with
the shared secret.
</t>


<section title="A Note on Private DNS Services" >
<t>
Our solution uses a variant of the DNS over TLS  protocol 
<xref target="RFC7858" /> defined by the DNS Private Exchange working group
(DPRIVE). DPRIVE is also working on a UDP variant,
DNS over DTLS <xref target="I-D.ietf-dprive-dnsodtls" />, which
would also be a candidate.
</t>
<t>
DPRIVE and Private Discovery, however, solve two somewhat different
problems. DPRIVE is concerned with the confidentiality of DNS transactions,
addressing the problems outlined in <xref target="RFC7626" />. However,
DPRIVE does not address the confidentiality or privacy issues with
publication of services, and is not a direct solution to DNS-SD privacy:
</t>
<t>
<list style="symbols" >
<t>
Discovery queries are scoped by the domain name within which services
are published. As nodes move and visit arbitrary networks, there
is no guarantee that the domain services for these networks
will be accessible using DNS over TLS or DNS over DTLS.
</t>
<t>
Information placed in the DNS is considered public. Even if
the server does support DNS over TLS, third parties will 
still be able to discover the content of PTR, SRV and TXT
records.
</t>
<t>
Neither DNS over TLS nor DNS over DTLS applies to mDNS.
</t>
</list>
</t>
<t>
In contrast, we propose using mutual authentication of the client and server
as part of the TLS solution, to ensure that only authorized parties learn
the presence of a service.
</t>
</section>


 </section>



<section title="Randomized Host Names" >
<t>
Instead of publishing their actual name in the SRV records, nodes 
could publish a randomized name. That is the solution argued for
in <xref target="I-D.ietf-intarea-hostname-practice" />.
</t>
<t>
Randomized host names will prevent some of the tracking.
Host names are typically not visible to users, and
randomizing host names will probably not cause many
usability issues.
</t>
</section>


<section title="Timing of Obfuscation and Randomization" anchor="timing" >
<t>
It is important that the obfuscation of instance names is performed at the right time,
and that the obfuscated names change in synchrony with other identifiers,
such as MAC Addresses, IP Addresses or host names.
If the randomized host name changed
but the instance name remained constant, an adversary would have no difficulty
linking the old and new host names. Similarly, if IP or MAC addresses changed but 
host names remained constant, the adversary could link the new addresses to the
old ones using the published name.
</t>
<t>
The problem is handled in <xref target="I-D.ietf-intarea-hostname-practice" />,
which recommends picking a new random host name at the time of connecting to
a new network. New instance names for the Private Discovery Services should be
composed at the same time.
</t>

</section>
</section>

<section title="Private Discovery Service Specification" anchor="solution" >
<t>
The proposed solution uses the following components:
</t>

<t>
<list style="symbols">
<t>
Host name randomization to prevent tracking.
</t>
<t>
Device pairing yielding pairwise shared secrets.
</t>
<t>
A Private Discovery Server (PDS) running on each host.
</t>
<t>
Discovery of the PDS instances using DNS-SD.
</t>
</list>
</t>

<t>
These components are detailed in the following subsections.
</t>

<section title="Host Name Randomization" >
<t>
Nodes publishing services with DNS-SD and concerned about their privacy MUST
use a randomized host name. The randomized name MUST be changed when
network connectivity changes, to avoid the correlation issues described in
<xref target="timing" />. The randomized host name MUST be used in
the SRV records describing the service instance, and the corresponding 
A or AAAA records MUST be made available through DNS or MDNS, within the
same scope as the PTR, SRV and TXT records used by DNS-SD.
</t>
<t>
If the link-layer address of the network connection is properly obfuscated
(e.g. using MAC Address Randomization),
the randomized host name MAY be computed using the algorithm described
in section 3.7 of <xref target="RFC7844" />.
If this is not possible, the randomized host name SHOULD be constructed by
picking a 48-bit random number meeting the
Randomness Requirements for Security expressed in <xref target="RFC4075" />,
and then using the hexadecimal representation of this number as the
obfuscated host name.
</t>
</section>

<section title="Device Pairing" anchor="solution:pairing">
<t>
  Nodes that want to leverage the Private Discovery Service for private service discovery among peers
  MUST share a secret with each of these peers. The shared secret MUST be a 256-bit randomly chosen number.
  The secret SHOULD be exchanged via device pairing. The pairing process SHALL establish a mutually authenticated secure channel to
  perform the shared secret exchange.
  It is RECOMMENDED for both parties to contribute to the shared secret, e.g. by using a Diffie-Hellman key exchange.
</t>
<t>
TODO: need to define the pairing service, or API. The API approach assumes that pairing is outside our scope,
and is done using BT-LE, or any other existing mechanism. This is a bit of a cop-out. We could also define
a pairing system that just sets the pairing with equivalent security as the "push button" or "PIN" solutions
used for BT or Wi-Fi. And we could at this stage leverage a pre-existing security association, e.g. PGP
identities or other certificates. If we do that, we should probably dedicate a top level section to
specifying the minimal pairing service.

Using a pre-existing asymmetric security association, we can use a key exchange similar to
IKEv2 (RFC 7296). IKEv2 leverages the SIGMA protocols, which provide various methods of authenticated DH.
It would also be possible to authenticate DH using symmetric passwords (e.g. Bellovin-Merritt).
</t>
</section>


<section title="Private Discovery Server" anchor="solution:pns">
<t>
  A Private Discovery Server (PDS) is a minimal DNS server running on each host.
  Its task is to offer resource records corresponding to private services only to
  authorized peers. These peers MUST share a secret with the host 
  (see <xref target="solution:pairing" />). To ensure privacy of the requests, the service is 
  only available over TLS <xref target="RFC5246" />, and the shared secrets
  are used to mutually authenticate peers and servers.
</t>
<t>
  The Private Discovery Server SHOULD support DNS push notifications <xref target="I-D.ietf-dnssd-push" />,
  e.g. to facilitate an up-to-date contact list in a chat application without polling.
</t>

<section title="Establishing TLS Connections" anchor="solution:tls" >
<t>
  The PDS MUST only answer queries via DNS over TLS <xref target="RFC7858"/> and MUST use
  a PSK authenticated TLS handshake <xref target="RFC4279"/>. The client and server
  should negotiate a forward-secure cipher suite such as DHE-PSK or ECDHE-PSK when
  available. The shared secret exchanged during pairing MUST be used as the PSK.
</t>
<t>
  When using PSK-based authentication, the "psk_identity" parameter identifying
  the pre-shared key MUST be composed as follows, using the conventions
  of TLS <xref target="RFC5246"/>:
</t>
<t>
<figure>
<artwork>
   struct {
       uint32 gmt_unix_time;
       opaque random_bytes[4];
   } nonce;

   long_proof = HASH(nonce | pairing_key)
   proof = first 12 bytes of long_proof
   psk_identity = BASE64(nonce) "." BASE64(proof)
</artwork>
</figure>
</t>
<t>
In this formula, HASH SHOULD be the function SHA256
defined in <xref target="RFC4055"/>. Implementers MAY eventually
replace SHA256 with a stronger algorithm, in which case both
clients and servers will have to agree on that algorithm during
the pairing process. The first 32 bits of the nonce are set
to the current time and date in standard UNIX 32-bit format
(seconds since the midnight starting Jan 1, 1970, UTC, ignoring
leap seconds) according to the client's internal clock. The
next 32 bits of the nonce are set to a value generated by
a secure random generator.
</t>
<t>
In this formula, the identity is finally set to a character
string, using BASE64 (<xref target="RFC2045" /> section 6.8).
This transformation is meant to comply with the PSK identity encoding
rules specified in section 5.1 of <xref target="RFC4279"/>.
</t>
<t>
The server will check the received key identity, trying it against the valid
keys established through pairing. If one of the keys matches, the TLS connection is
accepted; otherwise it is declined.
</t>
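<t>
The composition of the identity and the server-side check described above
can be sketched as follows. This is a minimal illustration, not a normative
implementation: the function names are hypothetical, and encoding the
timestamp in network byte order is an assumption that the text above does
not state.
</t>
<t>
<figure>
<artwork><![CDATA[
```python
import base64
import hashlib
import hmac
import os
import struct
import time

def make_psk_identity(pairing_key, now=None, rand=None):
    """Return BASE64(nonce) + "." + BASE64(proof)."""
    now = int(time.time()) if now is None else now
    rand = os.urandom(4) if rand is None else rand
    # Assumption: the 32-bit UNIX time is in network byte order.
    nonce = struct.pack("!I", now & 0xFFFFFFFF) + rand
    proof = hashlib.sha256(nonce + pairing_key).digest()[:12]
    b64 = base64.b64encode
    return (b64(nonce) + b"." + b64(proof)).decode("ascii")

def find_pairing(psk_identity, pairing_keys):
    """Return the index of the matching key, or None."""
    try:
        nonce_b64, proof_b64 = psk_identity.split(".")
        nonce = base64.b64decode(nonce_b64, validate=True)
        proof = base64.b64decode(proof_b64, validate=True)
    except ValueError:
        return None            # malformed identity
    if len(nonce) != 8 or len(proof) != 12:
        return None
    for index, key in enumerate(pairing_keys):
        expected = hashlib.sha256(nonce + key).digest()[:12]
        if hmac.compare_digest(expected, proof):
            return index
    return None
```
]]></artwork>
</figure>
</t>
<t>
In this sketch the server computes one hash per pairing key for each
incoming identity, and a return value of None corresponds to declining
the connection.
</t>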
</section>
</section>

<section title="Publishing Private Discovery Service Instances" anchor="solution:publishPds" >
<t>
Nodes that provide the Private Discovery Service SHOULD advertise their
availability by publishing instances of the service through DNS-SD.
</t>
<t>
The DNS-SD service type for the Private Discovery Service is "_pds._tls".
</t>
<t>
Each published instance describes one server and up to 10 pairings.
In the case where a node manages more than 10 pairings, it should
publish as many instances as necessary to advertise all available
pairings.
</t>
<t>
Each instance name is composed as follows:
</t>
<t>
<figure>
<artwork>
   pick a 32 bit nonce, e.g. using the Unix GMT time.
   set the binary identifier to the nonce.

   for each of up to 10 pairings
      hint = first 32 bits of HASH(&lt;nonce&gt;|&lt;pairing key&gt;)
      concatenate the hint to the binary identifier

   set instance-ID = BASE64(binary identifier)
</artwork>
</figure>
</t>
<t>
In this formula, HASH SHOULD be the function SHA256
defined in <xref target="RFC4055"/>, and BASE64 is defined
in section 6.8 of <xref target="RFC2045" />. The concatenation
of a 32-bit nonce and up to 10 pairing hints results in a bit string
at most 352 bits long. The BASE64 conversion will produce
a string that is up to 59 characters long (60 with the trailing
pad character), which fits
within the 63-character limit defined in
<xref target="RFC6763"/>.
</t>
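<t>
The composition steps above can be sketched as follows. The function name
is hypothetical, and the use of network byte order for the nonce is an
assumption made for this illustration only.
</t>
<t>
<figure>
<artwork><![CDATA[
```python
import base64
import hashlib
import struct

def make_instance_id(nonce, pairing_keys):
    """Encode a 32-bit nonce and up to 10 hints."""
    if len(pairing_keys) > 10:
        raise ValueError("at most 10 pairings per instance")
    # Assumption: the nonce is encoded in network byte order.
    nonce_bytes = struct.pack("!I", nonce)
    identifier = nonce_bytes
    for key in pairing_keys:
        # hint = first 32 bits of HASH(nonce | pairing key)
        digest = hashlib.sha256(nonce_bytes + key).digest()
        identifier += digest[:4]
    return base64.b64encode(identifier).decode("ascii")
```
]]></artwork>
</figure>
</t>
<t>
With 10 pairings, the 44-byte identifier encodes to 60 characters
including one pad character, which stays within the label size limit.
</t>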
</section>

<section title="Discovering Private Discovery Service Instances"  anchor="solution:discoverPds" >
<t>
Nodes that wish to discover Private Discovery Service Instances will issue a DNS-SD
discovery request for the service type. This request will return a series
of PTR records, providing the names of the instances present in the scope.
</t>
<t>
The querier SHOULD examine each instance to see whether it hints at one
of its available pairings, according to the following conceptual algorithm:
</t>
<t>
<figure>
<artwork>
   for each received instance name:
      convert the instance name to binary using BASE64
      if the conversion fails,
         discard the instance.
      if the binary instance length is not a multiple of 32 bits,
         discard the instance.

      nonce = first 32 bits of binary.
      for each 32 bit hint after the nonce
         for each available pairing
            retrieve the key Xj of pairing number j
            compute F = first 32 bits of HASH(nonce | Xj)
            if F is equal to the 32 bit hint
               mark the pairing number j as available
</artwork>
</figure>
</t>
<t>
Once a pairing has been marked available, the querier SHOULD
try connecting to the corresponding instance, using the selected key.
The connection is likely to succeed, but it may fail for a variety
of reasons. One of these reasons is the probabilistic nature of the
hint, which entails a small chance of a "false positive" match. This
will occur if hashing the nonce with two different keys produces
the same 32-bit hint. In that case, the TLS connection will fail with
an authentication error or a decryption error.
</t>
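<t>
The conceptual algorithm for examining an instance name can be sketched
as follows. The function name is hypothetical. The sketch computes one
hash per local pairing and then looks the result up in the set of received
hints, which is one possible way to order the loops and not the only one.
</t>
<t>
<figure>
<artwork><![CDATA[
```python
import base64
import hashlib

def match_instance(instance_id, pairing_keys):
    """Return indexes of local pairings hinted at."""
    try:
        binary = base64.b64decode(instance_id, validate=True)
    except ValueError:
        return []      # not BASE64: discard the instance
    if len(binary) < 8 or len(binary) % 4 != 0:
        return []      # need a nonce plus whole 32-bit hints
    nonce = binary[:4]
    hints = {binary[i:i + 4] for i in range(4, len(binary), 4)}
    matches = []
    for index, key in enumerate(pairing_keys):
        # One hash per local key, then a cheap set lookup.
        if hashlib.sha256(nonce + key).digest()[:4] in hints:
            matches.append(index)
    return matches
```
]]></artwork>
</figure>
</t>
<t>
Each index returned identifies a pairing for which a TLS connection
attempt is worthwhile. A match remains probabilistic until the TLS
handshake succeeds.
</t>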
</section>

<section title="Using the Private Discovery Service" >
<t>
Once instances of the Private Discovery Service have been discovered, 
peers can establish TLS connections and send DNS requests over
these connections, as specified in DNS-SD.
</t>
</section>

</section>

<section title="Security Considerations">
<t> 
This document specifies a method to protect the privacy of 
service publishing nodes. This is especially useful when operating
in a public space.
Hiding the identity of the publishing nodes prevents
some forms of "targeting" of high value nodes. However,
adversaries can attempt various attacks to break the anonymity
of the service, or to deny it. These attacks and their
mitigations are described in the following sections.
</t>

<section title="Attacks Against the Pairing System" >
<t>
There are a variety of attacks against pairing systems. They 
may result in compromised pairing keys. If an adversary manages to
acquire a compromised key, the adversary will be able to perform 
private service discovery according to <xref target="solution:discoverPds" />.
This will allow tracking of the service. The adversary will also
be able to discover which private services are available for
the compromised pairing.
</t>
<t>
To mitigate such attacks, nodes MUST be able to quickly revoke 
a compromised pairing. This is however not sufficient, as the 
compromise of the pairing key could remain undetected for
a long time. For further safety, nodes SHOULD assign a time limit
to the validity of pairings, discard the corresponding keys when
the time has passed, and establish new pairings.
</t>
<t>
This latter requirement of limiting the Time-To-Live can raise
doubts about the usability of the protocol. The usability issues
would be mitigated if the initial pairing provided both
a shared secret and the means to renew that secret over time,
e.g. using authenticated public keys.
</t>
</section>

<section title="Denial of Discovery of the Private Discovery Service" >
<t>
The algorithm described in <xref target="solution:discoverPds" /> scales as
O(M*N), where M is the number of pairings per node and N is the number of nodes in
the local scope. Adversaries can attack this service by publishing "fake"
instances, effectively increasing the number N in that scaling equation.
</t>
<t>
Similar attacks can be mounted against DNS-SD: creating fake instances
will generally increase the noise in the system and make discovery less
usable. Private Discovery Service discovery SHOULD use the same
mitigations as DNS-SD.
</t>
<t>
The attack is amplified because the clients need to compute proofs for
all the nonces presented in Private Discovery Service Instance names. One
possible mitigation would be to require that such nonces correspond to
rounded timestamps. If we assume that timestamps must not be too old, there
will be a finite number of valid rounded timestamps at any time. Even
if there are many instances present, they would all pick their nonces
from this small number of rounded timestamps, and a smart client
could make sure that proofs are only computed once per valid
timestamp.
</t>
</section>

<section title="Replay Attacks Against Discovery of the Private Discovery Service" >
<t>
Adversaries can record the service instance names published by
Private Discovery Service instances, and replay them later in different
contexts. Peers engaging in discovery can be misled into believing
that a paired server is present. They will attempt to connect to the
absent peer, and in doing so will disclose their presence in a 
monitored scope.
</t>
<t>
The binary instance identifiers defined in <xref target="solution:publishPds"/>
start with 32 bits encoding the "UNIX" time. In order to
protect against replay attacks, clients MAY verify that this time
is reasonably recent.
</t>
<t>
TODO: should we somehow encode the scope in the identifier? Having both scope and
time would really mitigate that attack.
</t>
</section>


<section title="Denial of Private Discovery Service" >
<t>
The Private Discovery Service is only available through a 
mutually authenticated TLS connection, which provides good
protections. However, adversaries can mount a denial of service 
attack against the service. In the absence of shared secrets,
the connections will fail, but the servers will expend some
CPU cycles defending against them.
</t>
<t>
To mitigate such attacks, nodes SHOULD restrict the 
range of network addresses from which they accept connections,
matching the expected scope of the service. 
</t>
<t>
This mitigation will not prevent denial of service attacks performed by locally connected 
adversaries; but protecting against local denial of service attacks is generally very difficult.
For example, local attackers can also attack mDNS and DNS-SD by generating a large number of
multicast requests.
</t>

</section>

<section title="Replay Attacks against the Private Discovery Service" >
<t>
Adversaries may record the PSK Key Identifiers used in successful
connections to a private discovery service. They could attempt
to replay them later against nodes advertising the private 
service at other times or at other locations. If the PSK Identifier
is still valid, the server will accept the TLS connection, and in doing 
so will reveal being the same server observed at a previous time or
location.
</t>
<t>
The PSK identifiers defined in <xref target="solution:tls"/>
start with 32 bits encoding the "UNIX" time. In order to
mitigate replay attacks, servers SHOULD verify that this time
is reasonably recent, and fail the connection if it is too old,
or if it occurs too far in the future. 
</t>
<t>
The processing of
timestamps is however limited by the accuracy of computer clocks.
If the check is too strict, legitimate connections could fail. To
further mitigate replay attacks, servers MAY record the list of
valid PSK identifiers received in the recent past, and fail connections
if one of these identifiers is replayed.
</t>
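<t>
The two server-side checks described in this section, a bounded clock
skew and a record of recently seen identifiers, could be combined as in
the sketch below. The class name and the 300-second tolerance are
assumptions chosen for illustration. The sketch expects to be called only
for identifiers that have already been matched to a valid pairing key.
</t>
<t>
<figure>
<artwork><![CDATA[
```python
import time

class ReplayGuard:
    """Reject stale or repeated PSK identities (illustrative)."""

    def __init__(self, max_skew=300):
        self.max_skew = max_skew   # assumed tolerance, seconds
        self.seen = {}             # identity -> nonce timestamp

    def accept(self, identity, timestamp, now=None):
        now = int(time.time()) if now is None else now
        # Forget identities that the time check alone rejects.
        self.seen = {i: t for i, t in self.seen.items()
                     if abs(now - t) <= self.max_skew}
        if abs(now - timestamp) > self.max_skew:
            return False           # too old or too far ahead
        if identity in self.seen:
            return False           # replay of a recent identity
        self.seen[identity] = timestamp
        return True
```
]]></artwork>
</figure>
</t>
<t>
Because entries are dropped once their timestamp falls outside the
accepted window, the record stays bounded by the number of connections
accepted within that window.
</t>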
</section>

</section>

<section title="IANA Considerations" anchor="iana">
<t> 
This draft does not require any IANA action. (Or does it? What about the _pds tag?)
</t> 
</section>

<section title="Acknowledgments">
    <t>
This draft results from initial discussions with Dave Thaler, and encouragement from the DNS-SD working group members.
    </t>
</section>
</middle>

<back>
<references title="Normative References">
       &rfc2045;
       &rfc2119;
       &rfc4055;
       &rfc4075;
       &rfc6763;
       &rfc4279;
       &rfc5246;
</references>
<references title="Informative References">
       &rfc1033;
       &rfc1034;
       &rfc1035;
       &rfc2782;
       &rfc6762;
       &rfc7626;
       &rfc7844;
       &rfc7858;
       &I-D.ietf-intarea-hostname-practice;
       &I-D.ietf-dprive-dnsodtls;
       &I-D.ietf-tls-tls13;
       &I-D.ietf-dnssd-push;

<reference anchor="KW14a" target="http://ieeexplore.ieee.org/xpl/articleDetails.jsp?arnumber=7011331">
  <front>
    <title>Adding Privacy to Multicast DNS Service Discovery</title>
    <author initials="D." surname="Kaiser" fullname="Daniel Kaiser">
      <organization/>
    </author>
    <author initials="M." surname="Waldvogel" fullname="Marcel Waldvogel">
      <organization/>
    </author>
    <date year="2014"/>
  </front>
  <seriesInfo name="DOI" value="10.1109/TrustCom.2014.107"/>
</reference>

<reference anchor="KW14b" target="http://ieeexplore.ieee.org/xpl/articleDetails.jsp?arnumber=7056899">
  <front>
    <title>Efficient Privacy Preserving Multicast DNS Service Discovery</title>
    <author initials="D." surname="Kaiser" fullname="Daniel Kaiser">
      <organization/>
    </author>
    <author initials="M." surname="Waldvogel" fullname="Marcel Waldvogel">
      <organization/>
    </author>
    <date year="2014"/>
  </front>
  <seriesInfo name="DOI" value="10.1109/HPCC.2014.141"/>
</reference>


</references>  

</back>
</rfc>
