<?xml version="1.0" encoding="utf-8"?>
<!DOCTYPE rfc SYSTEM 'rfc2629.dtd' [
<!ENTITY rfc2119 PUBLIC '' 'http://xml.resource.org/public/rfc/bibxml/reference.RFC.2119.xml'>
<!ENTITY rfc6716 PUBLIC '' 'http://xml.resource.org/public/rfc/bibxml/reference.RFC.6716.xml'>
<!ENTITY rfc7845 PUBLIC '' 'http://xml.resource.org/public/rfc/bibxml/reference.RFC.7845.xml'>
]>
<?rfc toc="yes" symrefs="yes" ?>

<rfc ipr="trust200902" category="std" docName="draft-ietf-codec-ambisonics-00">

<front>
<title abbrev="Opus Ambisonics">Ambisonics in an Ogg Opus Container</title>
<author initials="M.G." surname="Graczyk" fullname="Michael Graczyk">
<organization>Google Inc.</organization>
<address>
<postal>
<street>1600 Amphitheatre Parkway</street>
<city>Mountain View</city>
<region>CA</region>
<code>94043</code>
<country>USA</country>
</postal>
<email>mgraczyk@google.com</email>
</address>
</author>

<date day="19" month="July" year="2016"/>
<area>RAI</area>
<workgroup>codec</workgroup>

<abstract>
<t>
This document defines an extension to the Ogg format to encapsulate
 ambisonics coded using the Opus audio codec.
</t>
</abstract>
</front>

<middle>
<section anchor="intro" title="Introduction">
<t>
Ambisonics is a representation format for three-dimensional sound fields that
 can be used for surround sound and immersive virtual reality playback.
See <xref target="gerzon75"/> and <xref target="daniel04"/> for technical
 details on the ambisonics format.
For the purposes of this document, ambisonics can be considered a
 multichannel audio stream.
Ogg is a general-purpose container, supporting audio, video, and other media.
It can be used to encapsulate audio streams coded using the Opus codec.
See <xref target="RFC6716"/> and <xref target="RFC7845"/> for technical details
 on the Opus codec and its encapsulation in the Ogg container respectively.
</t>

<t>
This document extends the Ogg format by defining a new channel mapping family for
encoding ambisonics. The Ogg Opus format is extended indirectly by adding an
item with value 2 to the IANA "Opus Channel Mapping Families" registry. When
2 is used as the Channel Mapping Family Number in an Ogg stream, the semantic
meaning of the channels in the multichannel Opus stream is the ambisonics layout
defined in this document. This mapping can also be used in other contexts which
make use of the channel mappings defined by the Opus Channel Mapping Families
registry.
</t>
</section>

<section anchor="terminology" title="Terminology">
<t>
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD",
 "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and "OPTIONAL" in this
 document are to be interpreted as described in <xref target="RFC2119"/>.
</t>

</section>

<section anchor="ogg_extension" title="Ambisonics With Ogg Opus">
<t>
Ambisonics MAY be encapsulated in the Ogg format by encoding with the Opus codec
and setting the Channel Mapping Family value to 2 in the Ogg Identification
Header. A demuxer implementation encountering Channel Mapping Family 2 MUST
interpret the Opus stream as containing ambisonics with the format described in
<xref target="channel_mapping"/>.
</t>
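<t>
As a non-normative illustration, the following Python sketch shows how a
demuxer might detect Channel Mapping Family 2. It assumes the identification
header layout given in <xref target="RFC7845"/>, in which the family octet
follows the 8-octet "OpusHead" magic signature, the version, the channel
count, the pre-skip, the input sample rate, and the output gain. The function
and constant names are hypothetical and do not belong to any Opus library.
</t>
<figure>
<artwork><![CDATA[
```python
AMBISONICS_FAMILY = 2  # value defined by this document

# Offset of the family octet: 8 (magic) + 1 (version)
# + 1 (channel count) + 2 (pre-skip) + 4 (sample rate)
# + 2 (output gain) = 18. Assumed from RFC 7845.
FAMILY_OFFSET = 18


def channel_mapping_family(id_header: bytes) -> int:
    """Return the Channel Mapping Family octet of an ID header."""
    if len(id_header) <= FAMILY_OFFSET:
        raise ValueError("identification header is truncated")
    if id_header[:8] != b"OpusHead":
        raise ValueError("missing OpusHead magic signature")
    return id_header[FAMILY_OFFSET]


def is_ambisonics(id_header: bytes) -> bool:
    """True if the stream is to be interpreted as ambisonics."""
    return channel_mapping_family(id_header) == AMBISONICS_FAMILY
```
]]></artwork>
</figure>
<t>
The length check precedes any indexing so that a truncated header is rejected
rather than read past its end.
</t>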

<section anchor="channel_mapping" title="Channel Mapping Family 2">
<t>
The allowed numbers of channels are (1 + n)^2 for n = 0...14, that is,
1, 4, 9, 16, 25, 36, 49, 64, 81, 100, 121, 144, 169, 196, and 225.
These correspond to periphonic ambisonics from zeroth to fourteenth order.
</t>
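<t>
The channel count rule above can be checked with a short, non-normative
Python sketch. The function name and the choice to raise ValueError are
assumptions made for illustration only.
</t>
<figure>
<artwork><![CDATA[
```python
import math

MAX_ORDER = 14  # highest order allowed by this mapping family


def ambisonic_order(channel_count: int) -> int:
    """Return n such that channel_count == (1 + n) ** 2.

    Raises ValueError for counts this mapping family disallows.
    """
    if channel_count < 1:
        raise ValueError("channel count must be positive")
    root = math.isqrt(channel_count)  # exact integer square root
    if root * root != channel_count:
        raise ValueError("channel count is not a perfect square")
    if root - 1 > MAX_ORDER:
        raise ValueError("ambisonic order exceeds 14")
    return root - 1
```
]]></artwork>
</figure>
<t>
An integer square root is used instead of floating-point arithmetic so that
the perfect-square test is exact.
</t>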

<t>
This channel mapping uses the same channel mapping table format used by channel
mapping families 1 and 255. Each output channel is assigned to an ambisonic
component in Ambisonic Channel Number (ACN) order. The ambisonic component with
order n and degree m corresponds to channel (n * (n + 1) + m). The reverse
correspondence can also be computed for a channel with index k.
</t>

<figure align="center">
<artwork align="center"><![CDATA[
order   n = floor(sqrt(k)),
degree  m = k - n * (n + 1).
]]></artwork>
</figure>
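<t>
The following non-normative Python sketch illustrates both directions of the
ACN correspondence. The function names are hypothetical. The order is taken
as the integer square root of k, which makes the round trip with
n * (n + 1) + m hold for every channel index, including k = 0.
</t>
<figure>
<artwork><![CDATA[
```python
import math


def acn_index(order: int, degree: int) -> int:
    """Channel index of the component with order n, degree m."""
    if order < 0 or abs(degree) > order:
        raise ValueError("require n >= 0 and -n <= m <= n")
    return order * (order + 1) + degree


def acn_order_degree(k: int):
    """Return (order n, degree m) for channel index k."""
    if k < 0:
        raise ValueError("channel index must be non-negative")
    n = math.isqrt(k)      # largest n with n * n <= k
    m = k - n * (n + 1)    # lies in the range -n..n
    return n, m
```
]]></artwork>
</figure>
<t>
For example, channel 0 is the single zeroth-order component, and channels 1,
2, and 3 are the first-order components with degrees -1, 0, and 1.
</t>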

<t>
Channels are normalized with Schmidt Semi-Normalization (SN3D).
The interpretation of the ambisonics signal as well as detailed definitions of
 ACN channel ordering and SN3D normalization are described in Section 2.1 of
 <xref target="ambix"/>.
</t>

</section>

<section anchor="downmixing" title="Downmixing">
<t>
An Ogg Opus player MAY use the matrix in Figure
 <xref target="stereo_downmix_matrix" format="counter"/> to downmix
 multichannel files that use Channel Mapping Family 2
 (<xref target="channel_mapping"/>) to stereo. This matrix is known to give
 acceptable results. The first and second ambisonic channels are known as "W"
 and "Y", respectively.
</t>

<figure anchor="stereo_downmix_matrix" title="Stereo Downmixing Matrix" align="center">
<artwork align="center"><![CDATA[
/   \   /                  \ /  W  \
| L |   | 0.5  0.5 0.0 ... | |  Y  |
| R | = | 0.5 -0.5 0.0 ... | | ... |
\   /   \                  / \ ... /
]]></artwork>
</figure>
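<t>
As a non-normative illustration, the matrix can be applied to one frame of
decoded samples as in the Python sketch below. The function name is
hypothetical, and the handling of a single-channel (zeroth-order) stream,
where no Y channel exists, is an assumption of this sketch rather than a
requirement of this document.
</t>
<figure>
<artwork><![CDATA[
```python
def downmix_to_stereo(frame):
    """Downmix one ambisonic sample frame (ACN order) to (L, R).

    frame[0] is W and frame[1] is Y; all higher channels have a
    weight of 0.0 in the matrix and are ignored.
    """
    if len(frame) < 1:
        raise ValueError("frame has no channels")
    w = frame[0]
    # Assumption: with only W present, Y is treated as silence,
    # so both outputs carry 0.5 * W.
    y = frame[1] if len(frame) > 1 else 0.0
    left = 0.5 * w + 0.5 * y
    right = 0.5 * w - 0.5 * y
    return left, right
```
]]></artwork>
</figure>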

<t>
The first ambisonic channel (W) is a mono audio stream which represents the
average audio signal over all directions. Since W is not directional, Ogg Opus
players MAY use W directly for mono playback.
</t>

</section>

</section>

<section anchor="security" title="Security Considerations">
<t>
Implementations of the Ogg container need to take appropriate security
 considerations into account, as outlined in Section 10 of <xref target="RFC7845"/>.
The extension defined in this document requires that semantic meaning be
 assigned to more channels than the existing Ogg format requires.
Since more allocations will be required to encode and decode these semantically
 meaningful channels, care should be taken in any new allocation paths.
Implementations MUST NOT overrun their allocated memory nor read from
 uninitialized memory when managing the ambisonic channel mapping.
</t>
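<t>
One way to avoid out-of-bounds accesses is to validate the channel mapping
table before any buffer is allocated or indexed. The non-normative Python
sketch below assumes the table constraints of <xref target="RFC7845"/>: a
stream count N of at least 1, a coupled stream count M no larger than N, at
most 255 decoded channels (N + M), and table entries that are either smaller
than N + M or equal to 255 (silence). The function name is hypothetical.
</t>
<figure>
<artwork><![CDATA[
```python
SILENCE = 255  # special index assumed from RFC 7845


def validate_mapping_table(channels, streams, coupled, mapping):
    """Raise ValueError unless the table is safe to index with."""
    if streams < 1 or coupled < 0 or coupled > streams:
        raise ValueError("invalid stream counts")
    decoded = streams + coupled  # number of decoded channels
    if decoded > 255:
        raise ValueError("too many decoded channels")
    if len(mapping) != channels:
        raise ValueError("table length differs from channel count")
    for out_channel, index in enumerate(mapping):
        if index != SILENCE and not 0 <= index < decoded:
            raise ValueError(
                "output channel %d maps to missing decoded "
                "channel %d" % (out_channel, index))
```
]]></artwork>
</figure>
<t>
Rejecting the stream at this point means later code can index decoded
channels by table entry without further range checks.
</t>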

</section>

<section anchor="iana" title="IANA Considerations">
<t>
This document updates the IANA Media Types registry "Opus Channel Mapping
Families" to add a new assignment.
</t>
<texttable>
<ttcol>Value</ttcol><ttcol>Reference</ttcol>
<c>2</c><c>This document, <xref target="channel_mapping"/></c>
</texttable>

</section>

<section anchor="Acknowledgments" title="Acknowledgments">
<t>
Thanks to Timothy Terriberry and Marcin Gorzel for their guidance and
 valuable contributions to this document.
</t>
</section>

</middle>
<back>
<references title="Normative References">
 &rfc2119;
 &rfc6716;

 &rfc7845;

<reference anchor="ambix"
 target="http://iem.kug.ac.at/fileadmin/media/iem/projects/2011/ambisonics11_nachbar_zotter_sontacchi_deleflie.pdf">
  <front>
    <title>AMBIX - A SUGGESTED AMBISONICS FORMAT</title>
    <author initials="C." surname="Nachbar" fullname="Christian Nachbar"/>
    <author initials="F." surname="Zotter" fullname="Franz Zotter"/>
    <author initials="E." surname="Deleflie" fullname="Etienne Deleflie"/>
    <author initials="A." surname="Sontacchi" fullname="Alois Sontacchi"/>
    <date month="June" year="2011"/>
  </front>
</reference>

</references>


<references title="Informative References">

<reference anchor="gerzon75"
 target="http://www.michaelgerzonphotos.org.uk/articles/Ambisonics%201.pdf">
  <front>
    <title>Ambisonics. Part one: General system description</title>
    <author initials="M." surname="Gerzon" fullname="Michael Gerzon"/>
    <date month="August" year="1975"/>
  </front>
</reference>

<reference anchor="daniel04"
 target="http://pcfarina.eng.unipr.it/Public/phd-thesis/aes116%20high-passed%20hoa.pdf">
  <front>
    <title>Further Study of Sound Field Coding with Higher Order Ambisonics</title>
    <author initials="J." surname="Daniel" fullname="Jérôme Daniel"/>
    <author initials="S." surname="Moreau" fullname="Sébastien Moreau"/>
    <date month="May" year="2004"/>
  </front>
</reference>

</references>

</back>
</rfc>

