INTERNET-DRAFT                                                   N. Popp
February 8, 2000                                          RealNames Inc.
Expires August 8, 2000                                       M. Mealling
draft-ietf-cnrp-02.txt                           Network Solutions, Inc.
                                                              M. Moseley
                                                           Netword, Inc.

                      CNRP PROTOCOL SPECIFICATION

1. Status of this memo

This document is an Internet-Draft and is in full conformance with all provisions of Section 10 of RFC2026.

Internet-Drafts are working documents of the Internet Engineering Task Force (IETF), its areas, and its working groups. Note that other groups may also distribute working documents as Internet-Drafts.

Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress."

The list of current Internet-Drafts can be accessed at http://www.ietf.org/ietf/1id-abstracts.txt

The list of Internet-Draft Shadow Directories can be accessed at http://www.ietf.org/shadow.html.

Please send comments on this draft to CNRP-IETF@LISTS.INTERNIC.NET.

2. Abstract

People often refer to things in the real world by a common name or phrase, e.g., a trade name, company name, or a book title. These names are sometimes easier for people to remember and type than URLs. Furthermore, because of the limited syntax of URLs, companies and individuals are finding that the ones that might be most reasonable for their resources are being used elsewhere and so are unavailable. Services are arising that offer a mapping from common names to Internet resources (e.g., as identified by a URI). These services often resolve common name categories such as company names, trade names, or common keywords.
Thus, such a resolution service may operate in one or a small number of categories or domains, or may expect the client to limit the resolution scope to a limited number of categories or domains. For example, the phrase "Internet Engineering Task Force" is a common name in the "organization" category, as is "Moby Dick" in the book category.

Two classes of clients of such services are being built, browser improvements and web accessible front-end services. Browser enhancements modify the "open" or "address" field of a browser so that a common name can be entered instead of a URL. Internet search sites integrate common name resolution services as a complement to search. In both cases, these may be clients of back-end resolution services. In the browser case, the browser must talk to a service that will resolve the common name. The search sites are accessed via a browser. In some cases, the search site may also be the back-end resolution service, but in others, the search site is a front-end to a collection of back-end services.

This effort is about the creation of a protocol for client applications to communicate with common name resolution services, as exemplified in both the browser enhancement and search site paradigms. Although the protocol's primary function is resolution, it is intended to address the issues of internationalization and privacy as well. Name resolution services are not generic search services and thus do not need to provide complex Boolean query, relevance ranking or similar capabilities. The protocol is a simple, minimal interoperable core. Mechanisms for extension are provided, so that additional capabilities can be added.

Several other issues, while of importance to the deployment of common name resolution services, are outside of the resolution protocol itself and are not in the initial scope of the proposed effort.
These include discovery and selection of resolution service providers, administration of resolution services, name registration, name ownership, and methods for creating, identifying or ensuring unique common names.

3. Introduction

For the purposes of this document, a "common name" is a word or a phrase, without imposed syntactic structure, that may be associated with a resource. These common names will be used primarily by humans, as opposed to machine agents. A common name "resolution service" handles these associations between common names and data (resources, information about resources, pointers to locations, etc). A single common name may be associated with different data records, and more than one resolution service is expected to exist. Any common name may be used in any resolution service.

Common names are not URIs (Uniform Resource Identifiers) in that they lack the syntactic structure imposed by URIs; furthermore, unlike URNs, there is no requirement of uniqueness or persistence of the association between a common name and a resource. (Note: common names may be expressed in a URI, the syntax for which is described herein.)

This document will define a protocol for the parameterized resolution necessary to make common names useful. "Resolution" is defined as the retrieval of data associated (a priori) with descriptors that match the input request. "Parameterized" means the ability to have a multi-component descriptor. Descriptors are not required to provide unique identification; therefore zero or more records may be returned to meet a specific input query.

4. Basic object model

The protocol will consist of a simple request/response mechanism. There will be two types of queries.

1. A `special' initial query that establishes the schema for a particular CNRP database and communicates that to the client. The CNRP client will send this query, and in turn receive an XML document defining the query properties that the database supports.
(In CNRP, XML is used to define and express all objects.) This query is called the ServiceQuery in the DTD.

2. A `standard' query, which is the submission of the CNRP search string to the database. The query will conform to the previously established schema. There will be a set of query properties, listed below, treated as hints by the server.

Note: a CNRP database will accept any correctly encoded CNRP query property; the extent to which a query result is responsive to those properties is a service differentiator.

The base properties that are always supported are common name, language, geography, category, and range (start and length of the result set). CNRP allows database service providers to create unique data types and surface them to any CNRP client via the CNRP schema XML documents.

Note: the descriptive portions of this document contain pieces of XML code that are *illustrative examples only*. Section 6 of this document contains the XML DTD for CNRP, which is definitive.

4.1 Hints

A hint is an assertion by the user about himself or herself and the context in which he or she is operating. There is no data type `hint'; a hint is expressed within the structure of the query itself and is limited or enabled by the richness of the defined query namespace. In effect, a query and any property within it is a hint.

An example of this would be the required property "language", in which a query might be created that specifies the primary language in which you want to see results, the secondary language, and so on. So seeing results in US English followed by European French and Latin American Spanish would be:

   en-US
   fr-FR
   es-MX

Note that the property statements say nothing about whether the language is primary, secondary, etc. In this example the ordering of the statements controls that--the first statement, being first, means that US English is the primary language. The second statement specifies the second region/language, and so on.
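The ordering rule can be illustrated with a minimal sketch. It assumes a hypothetical in-memory result list and a hypothetical helper, rank_by_language, neither of which is defined by CNRP. It uses the first two language hints from the example and shows one way a service might honor their order while keeping results in other languages, since hints do not filter.

```python
from typing import Dict, List, Sequence


def rank_by_language(preferences: Sequence[str],
                     results: List[Dict[str, str]]) -> List[Dict[str, str]]:
    """Order results so that earlier language hints come first.

    Results in a language that was not hinted are kept at the end
    rather than dropped, because a hint is not a filter.
    """
    position: Dict[str, int] = {}
    for index, language in enumerate(preferences):
        # Keep the first position if a language is listed twice.
        position.setdefault(language.lower(), index)
    fallback = len(preferences)
    # sorted() is stable: results with equal rank keep their original order.
    return sorted(
        results,
        key=lambda r: position.get(r.get("language", "").lower(), fallback),
    )


# Hypothetical data, for illustration only.
hints = ["en-US", "fr-FR"]
results = [
    {"commonname": "bmw", "language": "fr-FR"},
    {"commonname": "bmw", "language": "de-DE"},
    {"commonname": "bmw", "language": "en-US"},
]
ranked = rank_by_language(hints, results)
print([r["language"] for r in ranked])
```

A real service is free to weigh language against other hints; this sketch ranks on language alone.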
*But this is only an example.* The extent to which hints are supported (or not) is a service differentiator. The fact that a hint exists does not mean that a CNRP database must respond to it.

This best-effort approach is similar to relevance ranking in a search engine (high precision, low recall); hints are similar to a search engine's selection criteria. CNRP services will attempt to return the results "closest" to the selection criteria. This is quite different from a SQL database approach, where a SQL query returns the entire results set and each result in the set must match all the requirements expressed by the qualifier (the SQL WHERE clause).

4.2 Transport independence

This document defines CNRP in terms of an object model, the encoding scheme used to express it (XML documents), and a request/response interaction model. Therefore it is transport-independent. It is expected that the primary transport used for CNRP will be HTTP, but that is certainly not a requirement.

Most aspects of authentication and security are a requirement of the transport and not of CNRP. The protocol does not, in and of itself, support any authentication and security.

Discovery of the transport associated with a CNRP database is accomplished through DNS. The syntax for a CNRP URI is:

   CNRP:<[host]>:<[port]>/path/;paraname=value, paraname=value,...

"CNRP", in conjunction with the URI content, denotes a DNS entry containing a Naming Authority Pointer (NAPTR). The NAPTR specifies how a CNRP URI is dynamically rewritten by the client to adhere to some transport (HTTP, GOPHER, etc.). Because this rewrite can be a URL, a CNRP URI can thus be cached and assigned a time-to-live (TTL). The CNRP URI scheme is fully defined in "A URI Scheme for the Common Name Resolution Protocol".

5. Object Model:

5.1 Properties:

5.1.1 Base properties

In CNRP, objects are property lists. A property has a unique name and type. Some properties can be part of the query or the results list or both.
For simplicity, CNRP limits property values to string values.

CNRP introduces a set of base properties. Among these properties, CNRP distinguishes between core properties and optional properties. Core properties are the minimal set of properties that all CNRP services MUST support. The core properties define the level of interoperability between CNRP services. The proposed core properties are:

1. CommonName: the common name associated with a resource.
2. ID: an opaque string that serves as a unique identifier (typically a database ID).
3. URI: a URI as defined by RFC 2396.

In addition to core properties, CNRP introduces optional properties to enable a wider range of CNRP based services. Although these properties are not required, it is expected that many services, especially large ones, will implement them. An equally important goal for introducing additional properties is to provide a powerful results filtering mechanism. This is a requirement for large namespaces that contain several million names. The optional properties are:

1. Language: The language associated with a resource.
2. Geography: The geographical region or location associated with a resource.
3. Category: The category associated with a resource.
4. Description: A short text abstract associated with a resource.
5. Range: The range is a results set control property. The range property is used to specify the starting point and the length of a results set (e.g. I want 5 records starting at the 10th record).

The language property is expressed using language values as defined by RFC 1766.

5.1.2 Multi type properties

The "geography" and "category" properties introduced in the CNRP model can be expressed using many different value sets. For example, geography can be specified in terms of a country code, a postal code or in terms of spatial coordinates. Therefore, for such properties, CNRP introduces a "type" attribute.
To facilitate interoperability, CNRP defines the main primitive types as well. Property types can be extended by a specific service through the definition of new type values (see extensibility section). The multi-type properties and the main types are defined below:

5.1.2.1 Geography:

1. type = "freeform"
   value = a free form expression for a geographical location (e.g. "palo alto in california").
2. type = "ISO3166-1"
   value = a geographical region expressed using a standard country code as defined by ISO3166-1 (e.g. "US").
3. type = "ISO3166-2"
   value = a geographical region expressed using standard region and country codes as defined by ISO3166-2 (e.g. "US-CA").
4. type = "GPS"
   value = a geographical location expressed using the standard GPS coordinates system.
5. type = "ISO6709"
   value = a geographical location expressed using the Latitude-Longitude-Elevation coordinates system.

5.1.2.2 Category

1. type = "freeform"
   value = a free form expression for a category (e.g. "movies").
2. type = "NAICS"
   value = a code from the North American Industry Classification System.

When the "type" is unspecified, the value defaults to "freeform". The free form type value is important because it allows a very simple user interface where the user can enter a value in a text field. It is up to the service to interpret the value correctly and take advantage of it to increase the relevance of results (using specialized dictionaries for instance).

5.1.2.3 Common name - String encoding and equivalence rules

CNRP specifies that common name strings should be encoded using UTF-8.

CNRP does not specify any string equivalence rules for matching a common name in the query against a common name of a Resource. String equivalence rules are language and service dependent. They are specific to relevance ranking algorithms, and are hence treated as a matter for each CNRP service. Consequently, string equivalence rules are not part of the CNRP protocol specification.
For example, the query member:

   bmw

should be read as a selection criterion for a resource with a common name LIKE (similar to) the string "bmw", where the exact definition of the LIKE operator is intuitive, yet specific to the queried CNRP service.

5.2 Objects:

5.2.1 Query:

The Query object encapsulates all the query properties such as CommonName, ID, language, geography, category, and range. A Query cannot be empty. A Query must contain either a common name or an ID. A Query can also contain the custom properties defined by a specific CNRP service. For example, a query for the first 5 resources whose common name is like "bmw" would be expressed as:

   bmw
   1-5

5.2.1.1 Logical operations within a Query

The Query syntax is extremely simple. CNRP does not extensively support Boolean logic operators such as OR, AND or NOT. However, there exist two implicit logical operations that can be expressed through the Query object and its properties.

First, a query with multiple property-value pairs implicitly expresses an AND operation on the query terms. For instance, the CNRP query to request all the resources whose common name is like "bmw", AND whose language is "German" can be expressed as:

   bmw
   de-DE

Note however, that because the server is only trying to best match the Query criteria, there is no guarantee that all or any of the resources in the results match both requirements.

In addition, for enumerated value types only (e.g. language), CNRP allows the client to express a logical OR by specifying multiple values for the same property within the Query. For example, the logical expression:

   property = value1 OR property = value2 ... OR property = valueN

will be expressed as:

   value1
   value2
   valueN

So if there are different properties expressed, CNRP ANDs them; if there are multiples of the same property expressed, CNRP ORs them. It is important to underline that this form is only applicable to enumerated types.
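The two implicit operations can be sketched as a strict filter. This is only an interpretation for illustration: a CNRP server merely best-matches the criteria and need not filter this way. The names like and matches, the substring-based LIKE, and the sample resources are all assumptions, not part of the protocol.

```python
from typing import Dict, List


def like(pattern: str, candidate: str) -> bool:
    # Hypothetical stand-in for a service-specific LIKE operator:
    # a case-insensitive substring test.
    return pattern.lower() in candidate.lower()


def matches(common_name: str,
            enumerated: Dict[str, List[str]],
            resource: Dict[str, str]) -> bool:
    """Strict reading of the implicit Query logic.

    Different properties are ANDed; several values for the same
    enumerated property are ORed. The common name takes one value only.
    """
    if not like(common_name, resource.get("commonname", "")):
        return False
    return all(
        resource.get(prop) in values          # OR within one property
        for prop, values in enumerated.items()  # AND across properties
    )


# Hypothetical data, for illustration only.
resources = [
    {"commonname": "BMW", "language": "de-DE"},
    {"commonname": "BMW", "language": "fr-FR"},
    {"commonname": "BMW Motorrad", "language": "en-US"},
    {"commonname": "Audi", "language": "de-DE"},
]
query_languages = {"language": ["de-DE", "fr-FR"]}
selected = [r for r in resources if matches("bmw", query_languages, r)]
print(selected)
```

Here the query reads as: common name like "bmw" AND (language is de-DE OR language is fr-FR).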
In particular, logical OR operations on the common name are not supported. Note that the ordering of the property-value pairs in the query implies a precedence. As a consequence, CNRP also introduces one special string value: "*". Not surprisingly, "*" means all admissible values for the typed property. For example, the following query requests all the resources whose common name is like BMW and whose language is preferably German or French or any other language.

   bmw
   de-DE
   fr-FR
   *

5.3 Results:

The results object is a container for CNRP results. The type of objects contained in Results can be: Resource, Service, Error, Referral and Schema.

5.3.1 Resource

A Resource object describes a resource (e.g. a Web page, a person, an object identified by a URI). The Resource object can contain the commonname, URI, ID, description, language, geography, and category of the resource. A Resource can also be augmented using custom properties. Lastly, a Resource can also reference a service object to indicate its origin.

   bmw
   de-DE
   carcompanies
   DE
   foo.com:234364
   http://www.bmw.de
   Wunderbar BMWs!

5.3.2 Service

The Service object provides an encapsulation of an instance of a CNRP service. A service is uniquely identified through the ServiceURI property. Services can also include a description, a brief textual description of the service.

   http://cnrp.foo.com
   foo.com is a CNRP service specialized on cocktail recipes

The service object can also be extended by including existing properties to further describe the service. For instance, a service that focuses on French companies could be expressed as:

   http://cnrp.foo.com
   companies
   FR

The service object also encapsulates a list of server objects. The server object is used to describe a CNRP server (or a cluster of servers). A server is identified through its serverURI. A server can be further described using existing properties. For example, the following defines two clusters of CNRP servers, one in the US and one in France.
   http://cnrp.foo.com
   http://router.us.widgetco.com:4321/foo?
   US
   http://router.fr.acmeco.com:4321/foo?
   FR

5.3.3 Error

An Error object indicates an error in the results set. The error object encapsulates two properties: an error number and an error description.

   345
   The CNRP foo.com database is temporarily unreachable

5.3.4 Referral

A Referral object in the results set is a placeholder for un-fetched results from a different service. Referrals typically occur when a CNRP server knows of another service capable of providing relevant results for the query and wants to notify the client about this possibility. The client can decide whether it wants to follow the referral and resolve the extra results by contacting the referred-to service using the information contained within the Referral object (a Service object). The Referral is a simple mechanism to enable hierarchical resolution as well as to join multiple resolution services together.

   http://cnrp.bar.com/
   bmw
   de-DE
   foo.com:234364
   http://www.bmw.de/

5.3.5 ServiceQuery & schema:

A subclass of Query, the ServiceQuery object supports the dynamic discovery of a specific CNRP service's characteristics. To give a full description of a CNRP service, the response to a ServiceQuery returns the Service object described in section 5.3.2 with the following schema information.

1. The new properties introduced by the CNRP service (Property schema).
2. The properties used to describe the Service object (Service schema).
3. The properties that belong to the query interface (Query schema).
4. The properties that belong to a resource within the results (Resource schema).

This leads to the following new object definitions:

* PropertySchema -- A property schema describes all the custom properties introduced by the service.
* PropertyDefinition -- A property definition describes a custom property. A property definition has a name and a type (the name and the type of the property).
* PropertyReference -- A property reference is a reference to a property definition so that it can be included within a given schema (a service, query or resource schema).
* ServiceSchema -- The service schema defines the properties used to describe the service.
* QuerySchema -- A query schema describes the structure of a query handled by the CNRP service.
* ResourceSchema -- A resource schema describes the resource returned as a result by the CNRP service.

For example, a CNRP query to discover a service's capabilities will be in the form:

And for a CNRP service for cocktail recipes in French, the corresponding response would be:

   cocktailrecipe
   freeform
   language

6. XML DTD for CNRP

6.1 Examples

6.2 Service Description Request

This is what the client sends when it is requesting a server's schema.

This is the result. Notice how the Service tag is used to allow the service to describe itself in its own terms.

   urn:foo:bar
   http://host1.acmecorp.com:4321/foo?
   smtp://host2.acmecorp.com:4321/foo?
   This is the AcmeCorp CNRP Service
   544554
   http://adserver.acmecorp.com/
   workgroupID
   freeform
   domainname
   BannerAdServer
   URI

6.3 Sending A Query and Getting A Response

This is the query that is sent from the client to the server:

   Fido
   CA-QC
   CA
   fr-CA

This is the result set. It is sent back in response to the query. This result set includes a referral and a non-fatal error.

   http://acmecorp.com
   http://serverfarm.acmecorp.com
   http://servers.acmecorp.co.uk
   Fidonet
   1333459455
   http://www.fidonet.ca
   This is ye olde Canadian Fidonet
   Fidonet
   1333459455
   http://host:port/bla

6.4 Examples to be done:

6.4.1 Complex Result

6.4.2 No Results

6.4.3 Error Conditions

7. Transport

Two CNRP transport protocols are specified. HTTP is used due to its popularity and ease of integration with other web applications. SMTP is also used as a way to illustrate a protocol that has a much different range of latency than most protocols.

7.1.1 HTTP transport

The HTTP transport is fairly simple.
The client connects to an HTTP based CNRP server and issues the POST method with the Content-Type and Accept headers set to "application/xml". The content of the POST body is the CNRP XML document that is being sent. The results are sent back to the client with a Content-Type of "application/xml". The body of the result is the CNRP XML document being sent to the client.

7.1.2 SMTP transport

The SMTP transport is very similar to the HTTP transport. Since there is no method to specify, the CNRP XML document is simply sent to a particular SMTP endpoint with its Content-Type set to "application/xml". The server responds by sending a response to the originator of the request with the results in the body and the Content-Type set to "application/xml".

8. Security Considerations

This is where we talk about the various security threats. Two that need to be addressed are Man in the Middle attacks and posing as a service by spoofing a Service object.

The proposed solution for man in the middle attacks is to utilize transport level authentication and encryption where available. In the case where the transport can't provide the level of required authentication, individual entries or the entire response can be signed/encrypted.

In the case where a service attempts to pose as another by spoofing the serviceURI in the Service object, the Service object should be signed. A client can then verify the Service object's veracity by verifying the signature. How the client obtains that authoritative public key is out of scope since it depends on the service discovery problem.

8.1 XXX To be done: add additional threat scenarios

9. IANA Considerations

The major consideration for the IANA is that the IANA will be registering well known properties and property types. It will not register values. Since this document does not discuss CNRP service discovery, the IANA will not be registering the existence of servers or Server objects.
There are two types of entities the IANA can register: properties and property types. If a property or type is not registered with the IANA then its name must start with "x-".

The required information for the registration of a new property is the property's name, its default type, and a general description. A new type requires the type's name, what properties it is valid for, and a description.

See Appendix A for some example property and type registrations.

10. Appendix A: Well Known Property and Type Registration Templates

Property Name: Geography
Default Type: ISO3166-1
Description: A geographic location

Property Name: Language
Default Type: RFCXXXX3
Description: A language specification

Property Name: Category
Default Type: freeform
Description: A node in some system of semantic relationships that is considered relevant to the common-name.

Property Name: Range
Default Type: range
Description: A range given in the format "x,y" where x is the starting point and y is the length. This property is used by the client to tell the server that it is requesting a subrange of the results.

Types:

Type: freeform
Property: ALL
Description: The value is to be interpreted by the server the best way it knows how. This value has no defined structure.

Type: ISO3166-2
Property: Geography
Description: The combination of country and sub-region codes.

Type: ISO3166-1
Property: Geography
Description: Country Codes

Type: POSTALCODE
Property: Geography
Description: A postal code that is valid for some region. A good example is the Zip code system used in the US.

Type: GPS
Property: Geography
Description: A code in the format used by the Global Positioning System.

Type: ISO639
Property: Language
Description: language codes

Type: NAICS
Property: Category
Description: North American Industry Classification System

11. Author contact information

Nico Popp
RealNames Corporation
2 Circle Star Way, 2nd Floor
San Carlos, CA 94070-1350
Phone: (650) 298 8080
Email: nico@realnames.com

Michael Mealling
Network Solutions, Inc.
505 Huntmar Park Drive
Herndon, VA 22070
Phone: (703) 742-0400
Email: michaelm@netsol.com

Marshall Moseley
Netword, Inc.
702 Russell Avenue
Gaithersburg, MD 20877-2606
Phone: (240) 631-1100
Email: marshall@netword.com
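As a worked illustration of the Range registration in Appendix A, the following minimal sketch parses a value in the "x,y" format and applies it to a results list. The function names and sample records are hypothetical, and counting records from 1 is an assumption based on the "5 records starting at the 10th record" wording in section 5.1.1.

```python
from typing import List, Sequence, Tuple


def parse_range(value: str) -> Tuple[int, int]:
    """Parse "x,y" into (start, length).

    Assumption: the starting point counts records from 1.
    """
    start_text, length_text = value.split(",")
    start, length = int(start_text), int(length_text)
    if start < 1 or length < 0:
        raise ValueError(f"invalid range: {value!r}")
    return start, length


def apply_range(results: Sequence[str], value: str) -> List[str]:
    """Return the subrange of results selected by a Range value."""
    start, length = parse_range(value)
    first = start - 1  # convert the 1-based start to a 0-based index
    return list(results[first:first + length])


# Hypothetical data, for illustration only.
records = [f"record-{n}" for n in range(1, 21)]
window = apply_range(records, "10,5")
print(window)
```

A range that runs past the end of the results simply yields fewer records in this sketch; the draft does not say how a server should treat that case.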