<?xml version="1.0" encoding="US-ASCII"?>
<!DOCTYPE rfc SYSTEM "rfc2629.dtd" [
<!ENTITY RFC2119 SYSTEM "http://xml.resource.org/public/rfc/bibxml/reference.RFC.2119.xml">
<!ENTITY RFC4566 SYSTEM "http://xml.resource.org/public/rfc/bibxml/reference.RFC.4566.xml">
<!ENTITY RFC5646 SYSTEM "http://xml.resource.org/public/rfc/bibxml/reference.RFC.5646.xml">
<!ENTITY I-D.ietf-slim-negotiating-human-language SYSTEM "http://xml.resource.org/public/rfc/bibxml3/reference.I-D.ietf-slim-negotiating-human-language.xml">
]>
<?xml-stylesheet type="text/xsl" href="rfc2629.xslt" ?> <!-- used by XSLT processors -->
<!-- OPTIONS, known as processing instructions (PIs) go here. -->
<!-- For a complete list and description of PIs,
     please see http://xml.resource.org/authoring/README.html. -->
<!-- Below are generally applicable PIs that most I-Ds might want to use. -->
<?rfc strict="yes" ?> <!-- give errors regarding ID-nits and DTD validation -->
<!-- control the table of contents (ToC): -->
<?rfc toc="yes"?> <!-- generate a ToC -->
<?rfc tocdepth="2"?> <!-- the number of levels of subsections in ToC. default: 3 -->
<!-- control references: -->
<?rfc symrefs="yes"?> <!-- use symbolic references tags, i.e, [RFC2119] instead of [1] -->
<?rfc sortrefs="yes" ?> <!-- sort the reference entries alphabetically -->
<!-- control vertical white space: 
     (using these PIs as follows is recommended by the RFC Editor) -->
<?rfc compact="yes" ?> <!-- do not start each main section on a new page -->
<?rfc subcompact="yes" ?> <!-- keep one blank line between list items -->
<!-- end of popular PIs -->
<rfc  category="std" docName="draft-hellstrom-slim-modalitypref-00" ipr="trust200902">
  <front>
    <title abbrev="Negotiating Modality">Negotiating Modality in Real-Time Communications</title>
    <author fullname="Gunnar Hellstrom" initials="G" surname="Hellstrom">
      <organization>Omnitor</organization>
      <address>
        <postal>
      <street>Hammarby Fabriksvag 23</street>
      <city>Stockholm</city>
<!-- <region/> -->
      <code>120 30</code>
      <country>Sweden</country>
        </postal>
      <phone>+46 708 204 288</phone>
<!-- <facsimile/> -->
      <email>gunnar.hellstrom@omnitor.se</email>
<!-- <uri/> -->
      </address>
    </author>
    <date year="2017" />
      <area>ART</area>
      <workgroup>slim</workgroup>
      <keyword>modality</keyword>
      <keyword>language</keyword>
      <keyword>sdp</keyword>
      <keyword>preference</keyword>
    <abstract>
      <t>
When negotiating language for a real-time session, users may have very specific preferences for using one modality (spoken, written or signed) over other possible but less preferred modalities. This specification introduces an indication of modality preference to be used in session negotiation in combination with a previously specified mechanism for language preference negotiation.
   </t>
    </abstract>
  </front>
  <middle>
    <section title="Introduction">
	 <t>
        A mechanism for negotiating human language for real-time communication is specified in <xref target="I-D.ietf-slim-negotiating-human-language"/>. The language preference is expressed per media stream using the SDP <xref target="RFC4566"/> attributes 'hlang-send' and 'hlang-recv'. Language is negotiated by the answering party selecting from the language, media and direction alternatives expressed by the offering party. Languages are expressed using language tags as specified in BCP 47 <xref target="RFC5646"/>.
	  </t>
	  <t>
		When starting a conversation in a media-rich environment, the users may have very specific preferences for using one modality (spoken, written or signed) over other possible but less preferred modalities. 
		In traditional call establishment, the answering party is expected to start the conversation with a greeting. In the media-rich environment, the modality and language of this greeting set the expectations for which modality and language will mainly be used in the session. Deviation from this initial expectation is usually possible during the session by mutual agreement between the participants, but may be time-consuming and cause uncertainty.
	  </t>	
	  <t>
		Letting the parties indicate not only alternative languages and modalities for each communication direction in the session, but also a preference for specific modalities per direction, makes it possible to describe the desired language communication for a session more precisely while still providing information about less preferred alternatives. This specification extends <xref target="I-D.ietf-slim-negotiating-human-language"/> with a mechanism for indicating modality preference through a condensed notation integrated with the syntax of the language indications of <xref target="I-D.ietf-slim-negotiating-human-language"/>.
      </t>
	  <t>
		The expected application area is wide. Traditionally, the most common modality for real-time interaction is spoken communication. In some settings, e.g. where silence is required, it may be desirable to express a preference for written communication while still leaving open the possibility of traditional spoken communication by indicating it at a lower preference level. Persons who are fully able to use both sign language and spoken language, but who do not want to force the other party to bring a sign language interpreter into the call, may want to indicate the sign language capability at a lower preference level and the spoken language capability at a higher level. Some persons with disabilities may strongly prefer to conduct a written conversation, while still wanting to express that a spoken conversation is possible as a last resort. Many other situations exist in the media-rich communication environment where the modality preference indication is of value for a smooth initiation of a real-time session.
	  </t>

    </section>    
	<section anchor="Terminology" title="Terminology">
	  
      <t>
        The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
        "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in
        this document are to be interpreted as described in
        <xref target="RFC2119"/>.
      </t>
	  
      

    </section>

	<section anchor="Modality-Preference" title="Modality Preference Indication">
	 <t>
	 This specification extends the use of the asterisk in the 'hlang-send' and 'hlang-recv' SDP <xref target="RFC4566"/> attributes introduced by <xref target="I-D.ietf-slim-negotiating-human-language"/>.
	 </t>
	 <t>
	 In <xref target="I-D.ietf-slim-negotiating-human-language"/>, the asterisk appended at the end of the attribute value indicates a preference to not get the call denied if no languages match. 
	 </t>
	 <t>
	 This specification adds the following meaning of the asterisk:
	 </t>
	 <t>
	In an offer or answer, a 'hlang-send' or 'hlang-recv' attribute value MAY have an asterisk appended as the final token. An asterisk appended to a value in an offer indicates that
     the caller has a higher preference for the corresponding modality
     to be used in the specified direction than for other modalities offered for that
     direction without an asterisk.
     In an answer, the asterisk indicates a modality that the callee prefers to use in the session.
    </t>
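    <t>
    As a non-normative illustration of the rule above, the following Python sketch separates the language tags of a 'hlang-send' or 'hlang-recv' value from a trailing asterisk. The names HlangValue and parse_hlang_value are hypothetical and are not defined by any specification. The sketch also assumes that the tokens of a value are separated by whitespace, as in the examples in this document.
    </t>
    <figure><artwork><![CDATA[
```python
from typing import List, NamedTuple


class HlangValue(NamedTuple):
    """Parsed 'hlang-send' or 'hlang-recv' value (hypothetical)."""
    languages: List[str]  # language tags in order of appearance
    preferred: bool       # True if the final token is an asterisk


def parse_hlang_value(value: str) -> HlangValue:
    """Split an attribute value into language tags and a flag.

    Assumption: tokens are separated by whitespace.
    """
    tokens = value.split()
    preferred = bool(tokens) and tokens[-1] == "*"
    if preferred:
        tokens = tokens[:-1]  # drop the asterisk, keep the tags
    return HlangValue(languages=tokens, preferred=preferred)
```
]]></artwork></figure>
    <t>
    Under these assumptions, the value "en *" yields the tag list ["en"] with the preferred flag set, while "ase" yields ["ase"] with the flag cleared.
    </t>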
	<t>
   A user may have a clear preference to use one specific modality in a
   direction, while use of other modalities may be acceptable but lower in
   preference. This condition MAY be indicated by appending an asterisk
   as the final token in the corresponding 'hlang-' value.
   Note that the asterisk appended at the end of a 'hlang-' attribute value
   should also be seen as a preference to not have the call
   denied even if no indicated languages are in common, as specified in <xref target="I-D.ietf-slim-negotiating-human-language"/>.
   </t>
   <t>
   When negotiating language use for a direction, languages and modalities
  marked with the asterisk should be given preference in the selection.
   </t>
   <t>
   If there is no specific preference between modalities in the same direction, this condition should
   be indicated by appending an asterisk to all or none of the 'hlang-' values for that direction.
   </t>
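   <t>
   The preference and tie rules above could be applied, for one direction, roughly as in the following Python sketch. It is a sketch under stated assumptions rather than a normative algorithm: the helper names are hypothetical, and the input is assumed to be a mapping from media type to the 'hlang-' value offered for that direction.
   </t>
   <figure><artwork><![CDATA[
```python
from typing import Dict, List


def has_asterisk(value: str) -> bool:
    """True if the final token of the value is an asterisk."""
    tokens = value.split()
    return bool(tokens) and tokens[-1] == "*"


def rank_media(offered: Dict[str, str]) -> List[str]:
    """Order the media of one direction, asterisked values first.

    sorted() is stable: when all or none of the values carry an
    asterisk, the input order is returned unchanged, which this
    sketch treats as "no preference between modalities".
    """
    def key(medium: str) -> bool:
        return not has_asterisk(offered[medium])
    return sorted(offered, key=key)
```
]]></artwork></figure>
   <t>
   Because the sort is stable, marking every value or no value with an asterisk leaves the order untouched, which mirrors the rule that both forms express the absence of a modality preference.
   </t>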

   
    </section>    

	<section anchor="Interaction-with-Denial" title="Interaction with Call Denial Indication">
	  <t>
	   If no 'hlang-' attribute value has an asterisk attached, and thus no modality
preference is indicated, this should also be taken as a preference by the caller to have the call denied
if no languages are in common between the caller and the callee.
      </t>
	  <t>
A caller with language capabilities in multiple media, but no specific modality preferences, should attach the asterisk to all 'hlang-' attributes in at least one direction to indicate that the call should not be denied.
	  </t>
	  <t>
   If there is a preference for denying the call when no languages match, no asterisk should be appended to any 'hlang-' attribute value, in which case it is not possible to indicate a preferred modality at the same time.
      </t>
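      <t>
      The interaction described in this section can be summarized with a short Python sketch. It is illustrative only: the function names are hypothetical, the input is assumed to be the list of all 'hlang-' attribute values found in the offer, and whether any language is in common is assumed to have been determined elsewhere.
      </t>
      <figure><artwork><![CDATA[
```python
from typing import Iterable


def has_asterisk(value: str) -> bool:
    """True if the final token of the value is an asterisk."""
    tokens = value.split()
    return bool(tokens) and tokens[-1] == "*"


def caller_prefers_denial(hlang_values: Iterable[str]) -> bool:
    """True if no 'hlang-' value in the offer has an asterisk."""
    return not any(has_asterisk(v) for v in hlang_values)


def denial_preferred(hlang_values: Iterable[str],
                     languages_in_common: bool) -> bool:
    """Denial applies only when nothing matches and no asterisk
    is present anywhere in the offer."""
    if languages_in_common:
        return False
    return caller_prefers_denial(hlang_values)
```
]]></artwork></figure>
      <t>
      A single asterisk anywhere in the offer is enough to make denial_preferred return False, which is why a modality preference and a denial preference cannot be expressed together.
      </t>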
    </section>

	<section anchor="Interaction-with-Simultaneity" title="Interaction with Simultaneity Indication">
	  <t>
	  - - Interaction with simultaneity indication - -
	  </t>
    </section>
	<section anchor="Examples" title="Examples">
	 <t>
		   An offer requesting the following media streams: audio for the caller
   to send and receive spoken English (most preferred modality),
   video for the caller to send and receive American Sign Language
   (less preferred modality), and supplemental text. The offer also requests that the
   call proceed even if the callee does not support any of the
   languages. The offer is likely from a hearing person with knowledge of sign language:
    </t>
	<t>
	<list style="empty">
	 <t>
      m=text 45020 RTP/AVP 103 104
	 </t>
	 <t>

      m=audio 49250 RTP/AVP 20
	  	 </t>
	 <t>
      a=hlang-recv:en *
	 </t>
	 <t>
      a=hlang-send:en *
	 </t>
	 <t>

     m=video 51372 RTP/AVP 31 32
	 
	 </t>
	 <t>
     a=hlang-recv:ase
	 </t>
	 <t>
     a=hlang-send:ase
	 </t>
	 </list>
	 </t>
	
     <t>
 An answer to the above offer, indicating video in which the callee will send and receive American Sign Language because the callee has no capability for spoken English. The text and audio streams are opened as supplementary streams.
     </t>
	 <t>
	 <list style="empty">
     <t>
      m=text 45020 RTP/AVP 103 104
	  	 </t>
	 <t>
      
      m=audio 49250 RTP/AVP 20
	 </t>
	 <t>

      m=video 51372 RTP/AVP 31 32
	  	 </t>
	 <t>
      a=hlang-send:ase
	  	 </t>
	 <t>
      a=hlang-recv:ase
     </t>
	 </list>
	 </t>
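	 <t>
	 The offer/answer pair above can be traced with the following Python sketch of how an answerer might pick a medium for one direction. It is not a normative procedure: select_medium is a hypothetical name, language tags are compared by simple string equality rather than by BCP 47 matching rules, and the callee's languages are assumed to be given as a set of tags.
	 </t>
	 <figure><artwork><![CDATA[
```python
from typing import Dict, Optional, Set, Tuple


def select_medium(offered: Dict[str, str],
                  callee_tags: Set[str]
                  ) -> Optional[Tuple[str, str]]:
    """Return (medium, tag) for the first match, asterisked first."""
    def starred(value: str) -> bool:
        return value.split()[-1:] == ["*"]

    # Stable sort: media whose value ends with an asterisk first.
    ordered = sorted(offered, key=lambda m: not starred(offered[m]))
    for medium in ordered:
        for tag in offered[medium].split():
            if tag != "*" and tag in callee_tags:
                return medium, tag
    return None  # no language in common


offer = {"audio": "en *", "video": "ase"}
print(select_medium(offer, {"ase"}))        # ('video', 'ase')
print(select_medium(offer, {"en", "ase"}))  # ('audio', 'en')
```
]]></artwork></figure>
	 <t>
	 With a callee that only knows American Sign Language, the sketch selects video, matching the answer above. A callee that also speaks English would be given audio, the caller's preferred modality.
	 </t>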
	  <t>
		   An offer requesting the following media streams: audio for the caller
   to send using spoken French (most preferred modality) or written French (less preferred modality),
   text for the caller to receive written French. The offer also requests that the
   call proceed even if the callee does not support any of the
   languages. Video is supplemental. The offer is likely from a hard-of-hearing person with no use for received spoken language and a preference to use spoken language rather than type French:
    </t>
	<t>
	<list style="empty">
	 <t>
      m=text 45020 RTP/AVP 103 104
	  </t>
	 <t>
	  a=hlang-send:fr 
	  </t>
	 <t>
	  a=hlang-recv:fr 
	 </t>
	 <t>

      m=audio 49250 RTP/AVP 20
	  	 </t>

	 <t>
      a=hlang-send:fr *
	 </t>
	 <t>

     m=video 51372 RTP/AVP 31 32
	 
	 </t>

	 </list>
	 </t>
	
     <t>
 An answer for the above offer, indicating text in which the callee will send written French, and audio in which the callee is prepared to receive spoken French. The video stream is opened as a supplementary stream.
     </t>
	 <t>
	 <list style="empty">
     <t>
      m=text 45020 RTP/AVP 103 104
	  	 </t>
		 <t>
	 a=hlang-send:fr
	</t>
	 <t>
      
      m=audio 49250 RTP/AVP 20
	  </t>
	 <t>
	  a=hlang-recv:fr
	 </t>
	 <t>

      m=video 51372 RTP/AVP 31 32
	  	 </t>

	 </list>
	 </t>
    </section>
    <section anchor="Acknowledgements" title="Acknowledgements">
	<t>
	Thanks to Randall Gellens for providing the background for this extension, and to Brian Rosen and Paul Kyzivat for thorough discussions and guidance.
	</t>
    </section>
    <section anchor="IANA" title="IANA Considerations">
    </section>
    <section anchor="Security" title="Security Considerations">
    </section>
  </middle>
  <back>
    <references title="Normative References">
      &RFC2119;
      &RFC4566;
      &RFC5646;
    </references>
    <references title="Informative References">
      &I-D.ietf-slim-negotiating-human-language;
    </references>
  </back>
</rfc>
