Internet Engineering Task Force                 Johan Sjoberg, Ericsson
Audio Video Transport WG                    Magnus Westerlund, Ericsson
INTERNET-DRAFT                                     Ari Lakaniemi, Nokia
August 14, 2000                                Petri Koskelainen, Nokia
Expires: February 14, 2001                     Bernhard Wimmer, Siemens
                                               Tim Fingscheidt, Siemens


                      RTP payload format for AMR

Status of this Memo

   This document is an Internet-Draft and is in full conformance with
   all provisions of Section 10 of RFC2026.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF), its areas, and its working groups. Note that
   other groups may also distribute working documents as
   Internet-Drafts.

   Internet-Drafts are draft documents valid for a maximum of six
   months and may be updated, replaced, or obsoleted by other documents
   at any time. It is inappropriate to use Internet-Drafts as reference
   material or cite them other than as "work in progress".

   The list of current Internet-Drafts can be accessed at
   http://www.ietf.org/ietf/1id-abstracts.txt

   The list of Internet-Draft Shadow Directories can be accessed at
   http://www.ietf.org/shadow.html

   This document is an individual submission to the IETF. Comments
   should be directed to the authors.

Abstract

   This document describes a proposed real-time transport protocol
   (RTP) [8] payload format for AMR [1] encoded speech signals. The AMR
   payload format is designed to be able to interoperate with existing
   AMR transport formats. This document also includes a MIME type
   registration for AMR. The MIME type is specified for both real-time
   transport and storage.

Sjoberg/Westerlund/Lakaniemi/Koskelainen/Wimmer/Fingscheidt     [Page 1]

INTERNET-DRAFT        RTP Payload Format for AMR         August 14, 2000

1. Introduction

   The adaptive multi-rate (AMR) speech codec was developed by the
   European Telecommunications Standards Institute (ETSI). The AMR
   codec is standardized for GSM, and is also chosen by 3GPP as the
   mandatory codec for third generation systems. It is currently under
   standardization for TDMA. The AMR codec will therefore be widely
   used in cellular systems. The AMR codec was developed to preserve
   high speech quality under a wide range of transmission conditions.

   The AMR codec is a multi-mode codec with 8 narrow band modes with
   bit rates between 4.75 and 12.2 kbps. The sampling frequency is
   8000 Hz and processing is done on 20 ms frames, i.e. 160 samples per
   frame. The AMR modes are closely related to each other and use the
   same coding framework. Three of the AMR modes have already been
   adopted as standards of their own: the 6.7 kbps mode as PDC-EFR [7],
   the 7.4 kbps mode as the IS-641 codec in TDMA [6], and the 12.2 kbps
   mode as GSM-EFR [5].

   The AMR codec is designed with a voice activity detector (VAD) and
   generation of comfort noise (CN) parameters during silence periods.
   Hence, the AMR codec can reduce the number of transmitted bits and
   packets during silence periods to a minimum. The operation of
   sending CN parameters at regular intervals during silence periods is
   usually called discontinuous transmission (DTX) or source controlled
   rate (SCR) operation.

   AMR implementations must support all 8 speech coding modes, and mode
   switching can occur to any mode at any time. The mode information
   must therefore be transmitted together with the speech encoded bits,
   to indicate the mode.

   The AMR speech codec is designed with modes producing different bit
   rates so that the source bit rate can be adapted to the radio link
   quality in mobile phone systems. The objective was to give the
   highest possible speech quality under a variety of radio channel
   conditions. To realize rate adaptation, the decoder needs to signal
   to the encoder the mode it prefers to receive.

   Due to its flexibility and robustness, AMR is also suitable for
   purposes other than circuit switched cellular systems. Other
   suitable applications are real-time services over packet switched
   networks, e.g. over RTP.
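The frame arithmetic stated earlier in this section (8000 Hz sampling, 20 ms frames, one bit rate per mode) can be worked through with a small C sketch. The function and table names are hypothetical, and the six intermediate bit rates are an assumption taken from the AMR specification [1]; this section itself names only the 4.75 to 12.2 kbps range and a few individual modes.

```c
/* Bit rates of the 8 AMR speech modes in bit/s, indexed by mode 0..7.
 * Assumption: values follow the AMR specification [1]; the names used
 * here are illustrative and not defined by this document. */
static const int amr_rate_bps[8] = {
    4750, 5150, 5900, 6700, 7400, 7950, 10200, 12200
};

enum { AMR_SAMPLE_RATE_HZ = 8000, AMR_FRAME_MS = 20 };

/* Samples covered by one frame: 8000 Hz * 20 ms = 160. */
static int amr_samples_per_frame(void)
{
    return AMR_SAMPLE_RATE_HZ * AMR_FRAME_MS / 1000;
}

/* Speech bits produced for one 20 ms frame of the given mode,
 * or -1 if the mode index is outside 0..7. */
static int amr_bits_per_frame(int mode)
{
    if (mode < 0 || mode > 7)
        return -1;
    return amr_rate_bps[mode] * AMR_FRAME_MS / 1000;
}
```

Under these assumptions the 5.9 kbps mode yields 118 bits per frame and the 6.7 kbps mode 134 bits, which are the frame sizes used in the examples of section 6.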

   To be optimized for transmission over networks with high packet loss
   rates, the possibility to use extra redundancy is built into the RTP
   payload format for AMR.

   The speech encoded bits have different perceptual sensitivity to bit
   errors, and cellular systems exploit this by using unequal error
   protection and detection (UEP and UED). This mechanism concentrates
   the correction and detection of corrupted bits on the perceptually
   most sensitive bits. A frame is only regarded as lost or damaged if
   errors are detected in the most sensitive bits. UED can also be
   employed on RTP if UDP lite is used as the transport layer protocol
   (UDP lite [10] is work in progress). To enable this, the bits in the
   payload have to be ordered in sensitivity order. The AMR encoded
   bits are defined in sensitivity order in [2]. If the receiver
   supports the option to transmit redundant frames, the differing
   sensitivity can also be used to retransmit only the most sensitive
   bits of a redundant frame.

   The special problems with IP real-time traffic over cellular access
   networks are further discussed in [9].

   Other AMR scenarios are possible, e.g. one end is a circuit switched
   GSM system connected through a gateway to an IP network, with an IP
   terminal at the other end. To improve quality, frames damaged on the
   GSM radio link should also be transmitted to the decoder in the IP
   network. To make this possible, frame quality information has to be
   transmitted over the IP network. The quality bit is also needed for
   the AMR RTP payload format to interwork with, for example, the ATM
   AAL2 AMR profile.

2. Requirements

   The AMR payload format for RTP was designed to meet the following
   requirements:

   o Different levels of robustness must be supported, from no
     redundant data to extreme robustness capable of handling very high
     packet loss rates with little or no speech quality degradation.

   o Fast, bandwidth efficient, frame-wise AMR mode adaptation must be
     supported. This means that it must be possible to send Codec Mode
     Requests back from the receiving side to the transmitting side
     with information on the preferred mode.

   o Source controlled rate operation (SCR) (also called DTX) and
     comfort noise parameter (CN) transmission defined in AMR must be
     supported.

3. Payload format

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in RFC2119 [3].

   The AMR payload format is designed to be flexible, ranging from very
   low overhead to an extended format with the possibility to send
   redundancy information and several speech frames in one packet.

   The payload format consists of a payload header and one or more
   payload frames. Neither the payload header nor the payload frames
   are octet aligned on their own, but the full payload is. If the
   option to transmit robust sorted payload is enabled and employed,
   the full payload SHALL finally be ordered in descending bit error
   sensitivity order to be prepared for unequal error protection or
   unequal error detection schemes, e.g. UDP lite [10]. The AMR encoded
   bit streams are defined in sensitivity order in Annex B of [2]; the
   original order as delivered from the speech encoder is defined in
   [1]. The last octet of an AMR payload packet is padded with zeroes
   at the end if not all bits are used.

   The AMR frame types, or modes, are defined in [2]. Frame type 15, no
   transmission, is needed to indicate frames that were not transmitted
   or were lost. "Not transmitted" can mean either that the speech
   encoder produced no data for this frame or that no data for it is
   carried in this payload, i.e. valid data for this frame could be
   sent in another payload.
   For example, when multiple frames are sent in each payload and
   comfort noise starts, the frame type sequence of a payload with 8
   frames, in which speech frames with AMR mode 7 are interrupted by CN
   in the fifth frame, could look like: {7,7,7,7,8,15,15,8}. The AMR
   SCR is described in [4].

   The AMR payload format supports robust transmission, multiple frames
   in one payload packet, and the use of fast codec mode adaptation.

   The robust behavior is accomplished by the optional retransmission
   of previously transmitted frames together with the current frame or
   frames. The redundant frames could be transmitted in their entirety
   or only partly. If only a part of the redundant frame is
   transmitted, the least sensitive bits are omitted. A partially
   transmitted redundant frame SHALL completely fill the octets used
   for that frame. If partial redundancy is used, the bits in the
   payload are sorted in descending sensitivity order to support UED,
   as in UDP lite [10]. Each full AMR speech frame SHALL be transmitted
   at least once.

   The bits in redundant frames that are not transmitted MUST be
   reconstructed on the receiver side when the partial redundant frame
   is used for speech decoding. It is RECOMMENDED to produce the bits
   that were not received with state of the art error concealment unit
   (ECU) actions. A method that results in worse quality than randomly
   generated bits SHOULD NOT be used. The use of a fixed pattern SHOULD
   be avoided for speech quality reasons.

   A frame quality indicator is included for interoperability with the
   ATM payload format described in ITU-T I.366.2, the UMTS Iu interface
   [13] and other transport formats. The speech quality is
   significantly increased if damaged frames are forwarded to the
   speech decoder error concealment unit and not dropped. In many
   communication scenarios the AMR encoded bits will be transmitted
   from one IP/UDP/RTP terminal to a terminal in a system with another
   transport format and/or vice versa.
   The transport format transcoding will be done in a gateway. A second
   likely scenario is that IP/UDP/RTP is used as transport between
   other systems, i.e. IP is originated and terminated in gateways on
   both sides of the IP transport.

   AMR over
   I.366.{2,3} or +------+                        +----------+
   3G Iu or       |      |    IP/UDP/RTP/AMR      |          |
   -------------->|  GW  |----------------------->| TERMINAL |
   GSM Abis       |      |                        |          |
   etc.           +------+                        +----------+

               Figure 1: GW to VoIP terminal scenario

   AMR over                                             AMR over
   I.366.{2,3} or +------+                     +------+ I.366.{2,3} or
   3G Iu or       |      |   IP/UDP/RTP/AMR    |      | 3G Iu or
   -------------->|  GW  |-------------------->|  GW  |--------------->
   GSM Abis       |      |                     |      | GSM Abis
   etc.           +------+                     +------+ etc.

                    Figure 2: GW to GW scenario

3.1. The payload header

   The payload header has a variable length of 3 or 6 bits. The bits in
   the header are specified as follows:

   S (1 bit): Indicates, if set, that the payload is robust sorted;
      otherwise simple payload sorting is employed. Note that this bit
      can be set only if the receiver has signaled support for the
      option robust payload sorting.

   L (1 bit): Indicates the existence of LEN fields in the payload
      frames. Note that this bit can be set only if the receiver has
      signaled support for the option to transmit redundant data.

   R (1 bit): Indicates, if set, that the Codec Mode Request (CMR) is
      sent.

   CMR (3 bits): OPTIONAL field, present only if the R bit is set.
      Requested codec mode for the other communication direction. The
      mapping of existing AMR modes to CMR is given by the three least
      significant bits in Table 1a in [2].

    0
    0 1 2
   +-+-+-+
   |S|L|R|
   +-+-+-+

   Figure 3: AMR payload header, when R=0

    0
    0 1 2 3 4 5
   +-+-+-+-+-+-+
   |S|L|R| CMR |
   +-+-+-+-+-+-+

   Figure 4: AMR payload header, when R=1

3.2. AMR payload frame

   An AMR payload frame represents one encoded speech frame. Each
   payload frame includes several specified fields as follows:

   F (1 bit): Indicates whether this frame is followed by further
      frames. F=1: further frames follow; F=0: last frame.

   Q (1 bit): The payload quality bit indicates, if not set, that the
      payload is severely damaged and the receiver should set the
      RX_TYPE, see [4], to SPEECH_BAD or SID_BAD depending on the frame
      type (FT).

   FT (4 bits): Frame type indicator, indicating the AMR speech coding
      mode or comfort noise (CN) mode. The mapping of existing AMR
      modes to FT is given in Table 1a in [2]. If FT=15 (No
      transmission), no LEN or AMR encoded bits follow.

   LEN (5 bits): OPTIONAL field, present if the payload header bit L is
      set (L=1). LEN specifies the number of octets used for the AMR
      encoded bits field in this frame. If LEN indicates more bits than
      the AMR mode in the FT field implies, the number of bits implied
      by FT is the valid number of AMR encoded bits. If LEN indicates
      fewer bits than the AMR mode in the FT field implies, LEN gives
      the length of the AMR encoded bits field. If a frame is
      transmitted only partially, the least sensitive bits at the end
      of the frame are omitted. This use is intended for partial
      redundant data.

   AMR encoded bits: This is the speech codec encoded data field. The
      length of this field is defined either implicitly by the AMR mode
      in the FT field, or by the LEN field.

   The last payload frame SHALL always contain a full AMR frame, i.e.
   no LEN field is needed or used.
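The field definitions of sections 3.1 and 3.2 can be sketched as a small bit-level parser in C. This is a minimal sketch under stated assumptions, not a definitive implementation: all type and function names are hypothetical, bit 0 is taken to be the most significant bit of octet 0 with multi-bit fields read most significant bit first, and only simple payload sorting (S=0) is handled, since with robust sorting the frame fields are interleaved bit by bit. LEN is read only when L=1, F=1 and FT is not 15, following the statement that the last frame carries no LEN field.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper types; the names are illustrative only. */
struct amr_bitreader {
    const uint8_t *buf;
    size_t nbits;  /* total number of bits available */
    size_t pos;    /* next bit to read; bit 0 = MSB of octet 0 (assumed) */
};
struct amr_payload_hdr { int s, l, r, cmr; };  /* cmr = -1 when R=0 */
struct amr_frame_hdr   { int f, q, ft, len; }; /* len = -1 when absent */

/* Read n bits (n <= 16), most significant bit first.
 * Returns -1 when fewer than n bits remain. */
static int amr_read_bits(struct amr_bitreader *br, unsigned n)
{
    int v = 0;
    if (br->pos + n > br->nbits)
        return -1;
    while (n--) {
        unsigned bit = (br->buf[br->pos >> 3] >> (7 - (br->pos & 7))) & 1u;
        v = (v << 1) | (int)bit;
        br->pos++;
    }
    return v;
}

/* Parse S, L, R and, when R=1, the 3-bit CMR. Returns 0 on success. */
static int amr_parse_payload_hdr(struct amr_bitreader *br,
                                 struct amr_payload_hdr *h)
{
    if ((h->s = amr_read_bits(br, 1)) < 0) return -1;
    if ((h->l = amr_read_bits(br, 1)) < 0) return -1;
    if ((h->r = amr_read_bits(br, 1)) < 0) return -1;
    h->cmr = -1;
    if (h->r && (h->cmr = amr_read_bits(br, 3)) < 0) return -1;
    return 0;
}

/* Parse F, Q, FT and, when present, LEN of one payload frame under
 * simple payload sorting. l_flag is the L bit of the payload header. */
static int amr_parse_frame_hdr(struct amr_bitreader *br, int l_flag,
                               struct amr_frame_hdr *fh)
{
    if ((fh->f  = amr_read_bits(br, 1)) < 0) return -1;
    if ((fh->q  = amr_read_bits(br, 1)) < 0) return -1;
    if ((fh->ft = amr_read_bits(br, 4)) < 0) return -1;
    fh->len = -1;
    if (l_flag && fh->f && fh->ft != 15 &&
        (fh->len = amr_read_bits(br, 5)) < 0) return -1;
    return 0;
}
```

Applied to the first two octets of the one-frame example in section 6.1 (0x09 followed by an octet whose first bit is 0), this sketch would report S=0, L=0, R=0 and then F=0, Q=1, FT=2, with the first speech bit at bit position 9.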

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |F|Q|  FT   |   LEN   |                                         |
   +-+-+-+-+-+-+-+-+-+-+-+                                         +
   |                                                               |
   +                                                               +
   /                       AMR encoded bits                        /
   +                                                 +-+-+-+-+-+-+-+
   |                                                 |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

   Figure 5: Payload frame format, F=1 and L=1

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |F|Q|  FT   |                                                   |
   +-+-+-+-+-+-+                                                   +
   |                                                               |
   +                                                               +
   /                       AMR encoded bits                        /
   +                                             +-+-+-+-+-+-+-+-+-+
   |                                             |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

   Figure 6: Payload frame format, F=0 or L=0

3.3. Compound AMR payload

   The compound AMR payload consists of one AMR payload header and one
   or more AMR payload frames, see sections 3.1 and 3.2. These can be
   put together with robust or simple payload sorting. The payload
   header bit S indicates the method used.

   Definitions for describing the compound AMR payload:

   b(m)   - bit m of the compound AMR payload
   f(n,m) - bit m in payload frame n
   F(n)   - number of bits in payload frame n, defined by FT or by LEN
   h(m)   - bit m of payload header
   H      - number of payload header bits, 3 or 6 bits
   N      - number of payload frames in the payload
   S      - number of unused (padding) bits; not to be confused with
            the payload header bit S

   Payload frames f(n,m) are placed in consecutive order, where frame n
   precedes frame n+1. Within one payload all frames between the oldest
   and most recent must be present. If speech data is missing for one
   frame, due to e.g. DTX, send the NO_TRANSMISSION frame type.

   Before sorting, the payload consists of data ordered as described in
   Figure 7.

   +-------------+
   | h(0)-h(H-1) |
   +-------------+-----------+
   | f(0,0) ... f(0,F(0)-1)  |
   +-------------------------+---+
   | f(1,0) ... f(1,F(1)-1)      |
   +-------------------------+---+
   | f(2,0) ... f(2,F(2)-1)  |
   +-------------------------+
   \                         \
   +-------------------------------+
   | f(N-1,0) ... f(N-1,F(N-1)-1)  |
   +-------------------------------+

   Figure 7: The payload header and N payload frames before sorting.

3.3.1. Robust payload sorting

   A bit error in a more sensitive bit is subjectively more annoying
   than in a less sensitive bit. Therefore, to be able to protect the
   most sensitive bits in a payload packet with a forward error
   detection code, e.g. a CRC outside RTP, the bits inside a frame are
   ordered into sensitivity order. If the option to transmit redundant
   data is employed, the full RTP payload MUST be further sorted into
   sensitivity order. The protection SHOULD then cover an appropriate
   number of octets from the beginning of the payload, covering at
   least the AMR payload header, the F, Q, FT and LEN bits, and the
   class A bits (see [2]). Exactly how many octets need protection
   depends on the channel and application.

   To maintain sensitivity ordering inside the AMR payload when more
   than one speech frame is transmitted in one payload, reordering of
   the data is needed. The reordering SHALL be performed on bit level.
   The AMR payload header SHALL still be placed unchanged at the
   beginning of the payload. Thereafter, the payload frames are sorted
   by taking one bit alternately from each payload frame.

   The robust payload sorting algorithm is defined in C-style as:

      /* copy the payload header unchanged */
      for (i = 0; i < H; i++){
         b(i) = h(i);
      }

      /* interleave: bit i of every frame that still has a bit i */
      max = max(F(0),..,F(N-1));
      k = H;
      for (i = 0; i < max; i++){
         for (j = 0; j < N; j++){
            if (i < F(j)){
               b(k++) = f(j,i);
            }
         }
      }

      /* pad the last octet with zeroes */
      S = 8 - k%8;
      if (S < 8){
         for (i = 0; i < S; i++){
            b(k++) = 0;
         }
      }

3.3.2. Simple payload sorting

   If multiple new frames are encapsulated into the payload and robust
   payload sorting is not used, the payload is formed by concatenating
   the payload header and the bits from each AMR frame in the payload.
   The bits inside each frame are nevertheless ordered into sensitivity
   order as defined in [2].

   The simple payload sorting algorithm is defined in C-style as:

      /* copy the payload header unchanged */
      for (i = 0; i < H; i++){
         b(i) = h(i);
      }

      /* append each frame in turn */
      k = H;
      for (j = 0; j < N; j++){
         for (i = 0; i < F(j); i++){
            b(k++) = f(j,i);
         }
      }

      /* pad the last octet with zeroes */
      S = 8 - k%8;
      if (S < 8){
         for (i = 0; i < S; i++){
            b(k++) = 0;
         }
      }

3.4. Decoding security consideration

   If the payload length calculated from the F, FT and LEN fields does
   not match the size of the payload actually received, the payload
   MUST be dropped. Decoding a packet that has errors in the length
   indicator bits could severely degrade the speech quality.

4. RTP header usage

   The RTP header marker bit (M) is used to mark (M=1) the packets
   containing the first speech frame after CN. For all other packets
   the marker bit is set to 0 (M=0).

   The timestamp corresponds to the sampling time of the first sample
   encoded for the first encoded speech frame in the packet. The
   timestamp unit is in samples. The duration of one AMR speech frame
   is 20 ms and the sampling frequency is 8 kHz, corresponding to 160
   encoded speech samples per frame. Thus, the timestamp is increased
   by 160 for each consecutive frame. All frames in a packet MUST be
   successive 20 ms frames.

5. Congestion Control

   The need for congestion control for data transported with RTP is
   addressed in [14]. AMR speech data has some elastic properties due
   to the different bandwidth demand of each mode. Another parameter
   that can reduce the bandwidth demand for AMR is the number of speech
   frames encapsulated in each payload.
   Sending more frames per payload reduces the number of packets and
   the overhead from IP/UDP/RTP headers. If FEC is used, its amount
   also needs to be regulated so that the FEC itself does not worsen
   the problem. Therefore, it is RECOMMENDED that applications using
   this payload format implement congestion control. The actual
   mechanism for congestion control is not specified but should be
   suitable for real-time flows, e.g. "Equation-Based Congestion
   Control for Unicast Applications" [15].

6. Examples

6.1. Simple example

   In the simple example one full frame (L=0) is sent in each RTP
   packet, no Codec Mode Request (CMR) is sent (R=0), and the payload
   was not damaged at IP origin (Q=1). One frame encoded with the
   5.9 kbps mode (FT=2) is transmitted. The speech encoded bits are put
   into f(0) to f(117) in descending sensitivity order according to
   [2]. Simple payload sorting is used, S=0.

      |                            Bit no.                            |
   Oct|   0       1       2       3       4       5       6       7   |
   ---+-------+-------+-------+-------+-------+-------+-------+-------+
    0 |  S=0  |  L=0  |  R=0  |  F=0  |  Q=1  |   0   |   0   |   1   |
   ---+-------+-------+-------+-------+-------+-------+-------+-------+
    1 |   0   | f(0)  | f(1)  | f(2)  |  ...  |  ...  |  ...  |  ...  |
   ---+-------+-------+-------+-------+-------+-------+-------+-------+
   15 |  ...  |  ...  |  ...  |  ...  | f(115)| f(116)| f(117)|   0   |
   ---+-------+-------+-------+-------+-------+-------+-------+-------+

   Figure 8: One frame per packet example.

6.2. Example with partial redundancy

   In this example the 6.7 kbps mode (FT=3) is sent with one redundant
   frame, also FT=3. Only a part of the redundant frame is sent, in
   this example 12 octets (L=1, LEN=12). A mode request is sent (R=1),
   requesting the 10.2 kbps mode for the other link (CMR=6). The
   redundant frame (12 octets including FT) is r(0) to r(91) and the
   current frame (134 bits) is f(0) to f(133).

      |                            Bit no.                            |
   Oct|   0       1       2       3       4       5       6       7   |
   ---+-------+-------+-------+-------+-------+-------+-------+-------+
    0 |  S=1  |  L=1  |  R=1  |   1   |   1   |   0   |  F=1  |  F=0  |
   ---+-------+-------+-------+-------+-------+-------+-------+-------+
    1 |  Q=1  |  Q=1  |   0   |   0   |   1   |   0   |   1   |   1   |
   ---+-------+-------+-------+-------+-------+-------+-------+-------+
    2 |   0   |   1   |   0   | f(0)  |   0   | f(1)  |   0   | f(2)  |
   ---+-------+-------+-------+-------+-------+-------+-------+-------+
    3 |   1   | f(3)  |   1   | f(4)  | r(0)  | f(5)  | r(1)  | f(6)  |
   ---+-------+-------+-------+-------+-------+-------+-------+-------+
    4 | r(2)  | f(7)  | r(3)  | f(8)  |  ...  |  ...  |  ...  |  ...  |
   ---+-------+-------+-------+-------+-------+-------+-------+-------+
   26 | r(90) | f(95) | r(91) | f(96) | f(97) | f(98) |  ...  |  ...  |
   ---+-------+-------+-------+-------+-------+-------+-------+-------+
   30 |  ...  |  ...  |  ...  |  ...  |  ...  |  ...  | f(131)| f(132)|
   ---+-------+-------+-------+-------+-------+-------+-------+-------+
   31 | f(133)|   0   |   0   |   0   |   0   |   0   |   0   |   0   |
   ---+-------+-------+-------+-------+-------+-------+-------+-------+

   Figure 9: Example with partial redundancy.

6.3. Example with multiple frames per payload

   In this example two 5.9 kbps mode (FT=2) frames are sent in one
   packet. No partial redundancy is used (L=0). A mode request is sent
   (R=1), requesting the 7.95 kbps mode for the other link (CMR=5). The
   first frame is represented by the 118 bits f(0) to f(117) and the
   subsequent frame by g(0) to g(117). Robust sorting is not used.

      |                            Bit no.                            |
   Oct|   0       1       2       3       4       5       6       7   |
   ---+-------+-------+-------+-------+-------+-------+-------+-------+
    0 |  S=0  |  L=0  |  R=1  |   1   |   0   |   1   |  F=1  |  Q=1  |
   ---+-------+-------+-------+-------+-------+-------+-------+-------+
    1 |   0   |   0   |   1   |   0   | f(0)  | f(1)  |  ...  |  ...  |
   ---+-------+-------+-------+-------+-------+-------+-------+-------+
   15 |  ...  |  ...  |  ...  |  ...  |  ...  |  ...  |  ...  | f(115)|
   ---+-------+-------+-------+-------+-------+-------+-------+-------+
   16 | f(116)| f(117)|  F=0  |  Q=1  |   0   |   0   |   1   |   0   |
   ---+-------+-------+-------+-------+-------+-------+-------+-------+
   17 | g(0)  | g(1)  | g(2)  |  ...  |  ...  |  ...  |  ...  |  ...  |
   ---+-------+-------+-------+-------+-------+-------+-------+-------+
   31 |  ...  |  ...  |  ...  |  ...  | g(116)| g(117)|   0   |   0   |
   ---+-------+-------+-------+-------+-------+-------+-------+-------+

   Figure 10: Example with two frames per payload.

7. The AMR MIME type registration

   This section defines the MIME type for the Adaptive Multi-Rate (AMR)
   speech codec [1]. The data format and parameters are specified for
   both real-time transport and storage type applications (e.g. e-mail
   attachment, multimedia messaging). The former is referred to as RTP
   mode and the latter as storage mode.

   AMR implementations according to [1] MUST support all eight coding
   modes. The mode change can occur at any time during operation and
   therefore the mode information is transmitted in-band together with
   the speech bits to allow mode change without any additional
   signaling. In addition to the speech codec, the AMR specifications
   also include Discontinuous Transmission / comfort noise (DTX/CN)
   functionality [11]. DTX/CN switches the transmission off during
   silent parts of the speech, and only CN parameter updates are sent
   at regular intervals.

7.1. RTP mode

   The decoder may want to receive a certain AMR mode or a subset of
   AMR modes, due to link limitations in some cellular systems, e.g.
   the GSM radio link can only use a subset of at most four modes.
   Therefore, it is possible to request a specific set of AMR modes in
   the capability description, and the encoder MUST abide by this
   request. If no mode set is requested, any mode may be used or
   requested.
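How an encoder might honor a requested mode set can be sketched in C as follows. This is only an illustration: the function names are hypothetical, the textual form assumed is the comma separated list of modes 0..7 used by the mode-set parameter in section 7.3, and returning 0 for a malformed list is a choice made for this sketch rather than behavior required by the format.

```c
#include <stddef.h>

#define AMR_ALL_MODES 0xFFu  /* no restriction: modes 0..7 all allowed */

/* Convert a mode-set string such as "0,2,5,7" into a bitmask with bit m
 * set for each allowed mode m. NULL or an empty string means that no
 * mode set was requested, so every mode is allowed. Returns 0 when the
 * list is malformed (assumption of this sketch). */
static unsigned amr_parse_mode_set(const char *s)
{
    unsigned mask = 0;

    if (s == NULL || *s == '\0')
        return AMR_ALL_MODES;
    for (;;) {
        if (*s < '0' || *s > '7')
            return 0;                 /* not a single-digit mode 0..7 */
        mask |= 1u << (*s - '0');
        s++;
        if (*s == '\0')
            return mask;              /* end of list */
        if (*s != ',')
            return 0;                 /* unexpected separator */
        s++;
    }
}

/* Nonzero if the encoder may use (or request) the given mode. */
static int amr_mode_allowed(unsigned mask, int mode)
{
    return mode >= 0 && mode <= 7 && ((mask >> mode) & 1u);
}
```

An encoder following this sketch would check amr_mode_allowed() before every mode switch and before sending a Codec Mode Request, and would fall back to an allowed mode otherwise.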

   Although in principle the AMR codec can perform a mode change at any
   time between any two modes, it is possible to set limitations on
   mode changes. The decoder can define the minimum number of frames
   between mode changes and can limit mode changes to neighboring modes
   only. This, too, is motivated by limitations on the GSM radio link.

   It is also possible to limit the number of AMR frames encapsulated
   into one RTP packet. This is an optional feature, and if no
   parameter is given in the capability description, the transmitter
   can encapsulate any number of AMR speech frames into one RTP packet.

   There is also an option to retransmit one or more previously
   transmitted frames together with a new frame to help the receiver
   recover from packet losses in difficult transmission conditions. It
   is also possible to transmit these frames only partially, in such a
   way that only the most sensitive bits are retransmitted. Since the
   transmission of partly redundant frames is an optional property, it
   can be used only if the receiver has signaled support for this
   functionality in the capability description. It is RECOMMENDED that
   partial redundancy be implemented and turned on at least for
   conversational services.

   To support unequal error protection and/or detection, the payload
   format supports robust payload sorting. Robust payload sorting is an
   optional feature and can only be used if the receiver has signaled
   support for this functionality in the capability description.

7.2. Storage mode

   For storing AMR frames, e.g. as a file or e-mail attachment, the AMR
   frames must be encapsulated in consecutive compound AMR payloads,
   see section 3.
   Some limitations on the storage format are needed, since no
   particular coding considerations can be signaled before downloading
   or receiving stored AMR data, and no timestamp information is
   available in the file.

   The receiving entity (AMR decoder) MUST be able to decode all eight
   coding modes as well as the AMR DTX/CN [11]. The compound AMR
   payload SHALL be stored without partial redundancy and with simple
   payload sorting, see section 3.3. Frames that were not transmitted,
   for example during DTX, MUST be stored as NO_TRANSMISSION frames to
   keep synchronization with the original media.

7.3. MIME Registration

   The MIME name for the AMR codec is allocated from the IETF tree
   since AMR is expected to be a widely used speech codec in VoIP
   applications. Some parts of this section distinguish between RTP and
   storage modes.

   Media Type name: audio

   Media subtype name: AMR

   Required parameters: none

   Optional parameters for RTP mode:

   ptime: Definition as usual in RTP audio.

   mode-set: Requested AMR mode set. Restricts the active codec mode
      set to a subset of all modes. Possible values are a comma
      separated list of modes: 0,...,7 (see Table 1a in [2]; an example
      is given in section 7.4). If not present, all speech modes are
      available.

   mode-change-period: Defines a number N which restricts mode changes
      in such a way that they are only allowed on multiples of N
      frames; the initial state of the phase is arbitrary. If this
      parameter is not present, mode changes can happen at any time.

   mode-change-neighbor: If present, mode changes SHALL only be made to
      neighboring modes in the active codec mode set. If not present,
      change between any two modes is allowed.

   maxframes: Maximum number of AMR speech frames in one RTP packet.
      The receiver may set this parameter in order to limit the
      buffering requirements or delay.

   redundancy: If present, transmission of partly redundant frames is
      supported, otherwise not supported.

   robust-sorting: If present, robust payload sorting is supported,
      otherwise not supported and simple payload sorting SHALL be used.

   Optional parameters for storage mode: none

   Encoding considerations for RTP mode: See section 3 in this
      document.

   Encoding considerations for storage mode: The AMR speech frames are
      packed into consecutive compound AMR payloads, see section 3. The
      compound AMR payloads must be stored in sequential order. This
      implies that the first octet after payload n must be the first
      octet of payload (n+1). Furthermore, missing frames and
      non-received frames between CN updates during non-speech periods
      must be encapsulated into a compound AMR payload as
      NO_TRANSMISSION frames (frame type 15, see definition in [2]).
      Each receiving entity that accepts this MIME type must be able to
      decode all eight AMR coding modes [1] and the AMR DTX/CN [11].

   Security considerations: none

   Public specification: please refer to section 8 "References".

   Additional information for storage mode:
      Magic number: none
      File extensions: amr, AMR
      Macintosh file type code: none
      Object identifier or OID: none

   Person & email address to contact for further information:
      johan.sjoberg@ericsson.com
      ari.lakaniemi@nokia.com
      Bernhard.Wimmer@mch.siemens.de

   Intended usage: COMMON. It is expected that many VoIP applications
      (as well as mobile applications) will use this type.

   Author/Change controller:
      johan.sjoberg@ericsson.com
      ari.lakaniemi@nokia.com
      Bernhard.Wimmer@mch.siemens.de

7.4. Mapping to SDP Parameters

   Please note that this section applies to the RTP mode only.

   Parameters are mapped to SDP [12] as usual. Example usage in SDP:

      m=audio 49120 RTP/AVP 97
      a=rtpmap:97 AMR/8000
      a=fmtp:97 mode-set=0,2,5,7; maxframes=2

8. References

   [1]  3G TS 26.090, "Adaptive Multi-Rate (AMR) speech transcoding".

   [2]  3G TS 26.101, "AMR Speech Codec Frame Structure".

   [3]  IETF RFC 2119, "Key words for use in RFCs to Indicate
        Requirement Levels".

   [4]  3G TS 26.093, "AMR Speech Codec; Source Controlled Rate
        operation".

   [5]  GSM 06.60, "Enhanced Full Rate (EFR) speech transcoding".

   [6]  TIA/EIA-136-Rev.A, part 410, "TDMA Cellular/PCS - Radio
        Interface, Enhanced Full Rate Voice Codec (ACELP)". Formerly
        IS-641. TIA published standard, 1998.

   [7]  ARIB, RCR STD-27H, "Personal Digital Cellular Telecommunication
        System RCR Standard".

   [8]  IETF RFC 1889, "RTP: A Transport Protocol for Real-Time
        Applications".

   [9]  IETF draft-westberg-realtime-cellular-01.txt, "Realtime Traffic
        over Cellular Access Networks".

   [10] IETF draft-larzon-udplite-03.txt, "The UDP Lite Protocol".

   [11] GSM 06.92, "Comfort noise aspects for Adaptive Multi-Rate (AMR)
        speech traffic channels".

   [12] M. Handley and V. Jacobson, "SDP: Session Description
        Protocol", RFC 2327, April 1998.

   [13] 3G TS 25.415, "UTRAN Iu Interface User Plane Protocols".

   [14] IETF draft-ietf-rtp-new-08.txt, Chapter 10, "RTP: A Transport
        Protocol for Real-Time Applications".

   [15] S. Floyd, M. Handley, J. Padhye, J. Widmer, "Equation-Based
        Congestion Control for Unicast Applications", ACM SIGCOMM 2000,
        Stockholm, Sweden.

9. Authors' addresses

   Johan Sjoberg
   Ericsson Research
   Ericsson Radio Systems AB
   Torshamnsgatan 23
   SE-164 80 Stockholm
   SWEDEN
   E-mail: Johan.Sjoberg@ericsson.com

   Magnus Westerlund
   Ericsson Research
   Ericsson Radio Systems AB
   Torshamnsgatan 23
   SE-164 80 Stockholm
   SWEDEN
   E-mail: Magnus.Westerlund@ericsson.com

   Ari Lakaniemi
   Nokia Research Center
   P.O.Box 407
   FIN-00045 Nokia Group
   Finland
   E-mail: ari.lakaniemi@nokia.com

   Petri Koskelainen
   Nokia Research Center
   P.O.Box 100
   FIN-33721 Tampere
   Finland
   E-mail: petri.koskelainen@nokia.com

   Tim Fingscheidt
   Siemens AG, ICP CD
   Grillparzerstrasse 10-18
   D - 81675 Munich
   Germany
   Phone: +49 89 722 57658
   Fax: +49 89 722 46489
   E-mail: Tim.Fingscheidt@mch.siemens.de

   Bernhard Wimmer
   Siemens AG, ICP CD
   Grillparzerstrasse 10-18
   D - 81675 Munich
   Germany
   Phone: +49 89 722 23247
   Fax: +49 89 722 46489
   E-mail: Bernhard.Wimmer@mch.siemens.de

This Internet-Draft expires February 14, 2001.