Internet Engineering Task Force                                  AVT WG
Internet Draft                               J.Rosenberg,H.Schulzrinne
draft-ietf-avt-fec-00.txt                            Bell Laboratories
                                                             July 1997
                                                 Expires: January 1998

  An A/V Profile Extension for Generic Forward Error Correction in RTP

STATUS OF THIS MEMO

   This document is an Internet-Draft. Internet-Drafts are working
   documents of the Internet Engineering Task Force (IETF), its areas,
   and its working groups. Note that other groups may also distribute
   working documents as Internet-Drafts.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time. It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as ``work in progress''.

   To learn the current status of any Internet-Draft, please check the
   ``1id-abstracts.txt'' listing contained in the Internet-Drafts Shadow
   Directories on ftp.is.co.za (Africa), nic.nordu.net (Europe),
   munnari.oz.au (Pacific Rim), ds.internic.net (US East Coast), or
   ftp.isi.edu (US West Coast).

   Distribution of this document is unlimited.

1 Abstract

   This document specifies an extension to RFC 1890 which allows for
   forward error correction (FEC) of continuous media encapsulated in
   RTP. The profile is engineered for FEC algorithms based on the
   exclusive or (parity) operation, although it can be used with other
   techniques. The profile extension allows end systems to transmit
   using arbitrary block lengths and parity schemes. It also allows for
   the recovery of both the payload and critical RTP header fields. It
   is backwards compatible with existing RFC 1890 implementations, so
   that receivers which do not wish to implement FEC can just ignore the
   extensions.

2 Background

   The quality of packet voice on the Internet has been mediocre due, in
   part, to high packet loss rates. This is especially true on wide-area
   connections. Unfortunately, the strict delay requirements of
   real-time multimedia usually eliminate the possibility of
   retransmissions.
   It is for this reason that forward error correction (FEC) has been
   proposed to compensate for packet loss in the Internet [1]. In
   particular, the use of traditional error correcting codes, such as
   parity, Reed-Solomon, and Hamming codes, has attracted attention.

J.Rosenberg,H.Schulzrinne        July 30, 1997                  [Page 1]
Internet Draft          Generic Error Correction               July 1997

   To support these mechanisms, protocol support is required. Budge,
   McKenzie, Mills, Diss, and Long have proposed a payload format for
   RTP which allows for the encapsulation of FEC-protected media on top
   of RTP [2]. We briefly summarize their proposal, and urge the reader
   to consult their draft for more details.

   They define a new RTP payload type which identifies the packet
   contents as FEC-protected media. The RTP payload format in their
   proposal consists of two elements, the media-correction header and
   the payload. The media-correction header is 24 bits, and consists of
   three fields. The first is called the scheme, the second the mode,
   and the third, the length. The scheme identifies the particular error
   correction scheme in use. In particular, it defines the set of data
   packets over which the FEC is applied, and the order in which the
   packets (data and FEC) are sent. The mode identifies which packet in
   a group of data and FEC packets (typically called a block) this
   particular one corresponds to. For packets that contain just data
   (and not FEC), the length field contains the length of the payload.
   For packets which contain FEC, the length field contains the xor of
   the length fields of the packets which are covered by the FEC. Since
   packets must be padded out with zeroes (to be equal lengths) in order
   to perform the xor operation, the length field allows recovery of the
   actual length of the pre-padded packets.

3 Motivation

   The payload format proposed in [2] works quite well, but has a number
   of drawbacks:

   o It does not indicate the media type of the actual data being
     protected.
     This is because the RTP PT field always indicates that the payload
     format is "FEC-protected media". Since many applications will need
     to change media payload types mid-stream (for example, sending DTMF
     tones in-band), the presence of this field is important.

   o The RTP timestamp field and marker bit are not covered by FEC.
     When a packet is lost and then reconstructed, the timestamp and
     marker bit are copied from another packet. Correct recovery of
     these fields is important.

   o It defines four very specific schemes (one of which is no error
     correction), and assigns a value for the scheme field in the
     header to each. New schemes must be registered with IANA, the
     details written up, and receivers and senders alike must be
     upgraded to recognize and support them. This makes backwards
     compatibility difficult, requiring capabilities negotiation. It
     also means that transmitters are restricted to using the schemes
     defined thus far. The three non-null schemes defined in [2] use
     heavy forward error correction. These schemes are not appropriate
     for all loss conditions.

   o It results in substantial overhead: an additional 24 bits per
     packet.

   It is our aim to generalize the payload format for forward error
   protection. To do this, the details of the scheme are transmitted
   inside the data packets with minimal overhead. This allows
   sender-based adaptation of the FEC schemes. This adaptation can be
   static or dynamic, and based on any information available at the
   sender. Changing schemes mid-stream is then trivially supported,
   whereas special protocol support is required in [2]. Capability
   exchanges are avoided, simplifying the protocol and eliminating
   compatibility problems.

4 Protocol Overview

   Before discussing the profile, we define a few terms for clarity. A
   media payload is a piece of raw, un-protected user data.
   A media header is the RTP header which would be used for this media
   payload if no error correction were to be applied. The combination of
   a media payload and media header is called a media packet.

   The forward error correction algorithms at the transmitter take the
   media packets as an input. They output both the media packets that
   they are passed, and new packets called FEC packets. The FEC packet
   contains an FEC header and FEC payload. Each FEC packet is said to be
   associated with one or more media packets when those media packets
   are used to generate the FEC packet (by use of the exclusive or
   operation, for example).

   At the receiver, arriving FEC and media packets are used to generate
   a stream of media packets for direct use by the application. This
   results in a clean separation of error protection from the
   application.

   The protocol operates by assuming that the error correction algorithm
   works by applying some function f to one or more media payloads,
   which are specified as the arguments to f. The result of this
   function is an FEC payload. When the function is applied to just a
   single media payload, the result is that media payload (f(a) = a).
   When the function is applied to multiple media payloads, the result
   is some combination of those payloads (the exclusive or would be
   defined as: f(a,b,c) = a xor b xor c). We assume f can combine any
   number of payloads, each with arbitrary lengths. If some media packet
   xi is lost, recovery of its payload pi is accomplished if
   f(p0,..,pn) is received, 0 <= i <= n, and the remaining n payloads
   combined by f are received. Any function which meets these
   assumptions can be used by the protocol.

   We have been careful to avoid discussing how media headers are
   combined to generate FEC headers. The details of this operation are
   defined in Section 5.2.

   For example, consider the case where f is the exclusive or.
   Media packets w, x, y, and z, with media payloads a, b, c, and d, are
   to be transmitted. Pairs of media payloads will be xor'ed together to
   generate the FEC payloads. We would denote the resulting network
   payload stream as:

      a, b, f(a,b), c, d, f(c,d)

   In this example, the error correction scheme introduces a 50%
   overhead. But if b is lost, a and f(a,b) can be used to recover b.

   The way in which the various schemes differ is in the set of media
   payloads over which the exclusive-or (or more generally, f(.)) is
   applied, and the order in which the resulting packets are sent. For
   example, Budge et al. describe four schemes, 0, 1, 2, and 3, which
   take media payloads a,b,c,d, etc., and generate FEC payloads as
   follows:

   Scheme 0
   --------
   This scheme is null, and has no error correction. The scheme is
   formally defined as:

      a,b,c,d, ... -> a, b, c, d, ....

   Scheme 1
   --------
   This scheme is similar to the one in the example above. Switching the
   positions of f(b) and f(a,b) allows some bursts of two consecutive
   packet losses to be recovered. It is defined as:

      a,b,c,d,e,f -> a, f(a,b), b, f(b,c), c, f(c,d), d, ...

   Scheme 2
   --------
   This scheme allows for recovery of all single packet losses and some
   consecutive packet losses, but with less overhead than scheme 1:

      a,b,c,d,e,f,g -> f(a,b),f(a,c),f(a,b,c),f(c,d),f(c,e),f(c,d,e)...

   Scheme 3
   --------
   This scheme requires 4 packet delays to recover the original media
   payloads, but it can recover from 1, 2, or 3 consecutive packet
   losses:

      a,b,c,d,e,f -> f(a),f(b),f(a,b,c),f(c),f(a,c,d),f(a,b,d),f(d), ...

   In order to decode the FEC payloads to media payloads, all that is
   necessary is for the receiver to know the function being applied, and
   the set of media payloads in each FEC payload to which it is applied.
   This is exactly the information provided by the profile extension.
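
   The pairwise exclusive-or example above can be sketched in a few
   lines of Python. This is only an illustration under stated
   assumptions: f is taken to be a byte-wise xor, shorter payloads are
   zero-padded to the longest one, and the names f and protect_pairs
   are hypothetical helpers, not part of the profile.

```python
def f(*payloads: bytes) -> bytes:
    """Hypothetical f: byte-wise xor of the payloads, zero-padded to the longest."""
    out = bytearray(max(len(p) for p in payloads))
    for payload in payloads:
        for i, byte in enumerate(payload):
            out[i] ^= byte
    return bytes(out)


def protect_pairs(payloads: list[bytes]) -> list[bytes]:
    """Build the stream a, b, f(a,b), c, d, f(c,d), ... from media payloads."""
    stream = []
    for i in range(0, len(payloads) - 1, 2):
        first, second = payloads[i], payloads[i + 1]
        stream += [first, second, f(first, second)]
    if len(payloads) % 2:
        # An unpaired trailing payload is sent without protection (assumption).
        stream.append(payloads[-1])
    return stream


a, b, c, d = b"aaaa", b"bbbb", b"cccc", b"dddd"
stream = protect_pairs([a, b, c, d])

# Suppose b (stream[1]) is lost: a and f(a,b) are enough to rebuild it.
recovered_b = f(a, stream[2])
```

   Because xor is its own inverse, the same helper serves for both
   generation and recovery; a different choice of f would need a
   separate inverse operation.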

   To determine which media payloads are associated with the FEC
   payload, the semantics of the Sequence Number (SN) field in the FEC
   header are redefined. Instead of incrementing monotonically as in
   RFC 1889 [4], the SN field is defined to be the minimum of the SN
   fields of the media headers of the media payloads associated with the
   FEC packet. For example, assume two consecutive media packets x and y
   have media payloads a and b, and media sequence numbers 5 and 6. An
   FEC packet is to be transmitted with a network payload that is the
   xor of a and b. The SN field in the header of this FEC packet, z, is
   min(5,6) = 5. Note that this will cause FEC packets to frequently
   have the same sequence number as media packets which preceded them.

   An additional field is present in the FEC header, called the offset
   mask. If bit k in the mask is set to 1, then the media packet with
   sequence number M + k is associated with this FEC packet, where M is
   defined as the value of the SN field in the FEC header. Based on the
   definition of the SN field, bit zero of the offset mask is always a
   one. From the example above, the offset mask in FEC packet z is set
   to 0b11. This indicates that two media packets (with media sequence
   numbers 5 and 5+1 = 6) are associated with this FEC packet.

   This modified SN field and offset mask are sufficient to signal
   arbitrary forward error correction schemes with little overhead.

5 Protocol Specifics

   The following section fills in the details based on the general
   discussion above.

5.1 RTP Media Packet Structure

   Not all packets transmitted by the source contain FEC. Many contain
   just regular media information, which would be sent if no error
   correction were used. The syntax and semantics of the RTP header and
   payload fields are identical to those defined in RFC 1889 and
   RFC 1890.
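
   The SN and offset mask rules described in Section 4 can be sketched
   as follows. This is a minimal illustration, assuming that bit k is
   the k-th least significant bit of the mask (as the 0b11 example
   suggests), a 16 bit mask, and no sequence number wraparound; the
   function names are hypothetical.

```python
def build_offset_mask(media_seq_numbers, mask_bits=16):
    """Return (M, mask): M is the minimum SN, bit k is set for each SN M + k."""
    base = min(media_seq_numbers)
    mask = 0
    for sn in media_seq_numbers:
        k = sn - base
        if k >= mask_bits:
            raise ValueError("offset does not fit in the offset mask")
        mask |= 1 << k
    return base, mask


def associated_seq_numbers(base, mask):
    """Scan the mask and list the media sequence numbers M + k with bit k set."""
    return [base + k for k in range(mask.bit_length()) if (mask >> k) & 1]


# The example from the text: media packets 5 and 6 give SN 5 and mask 0b11.
sn, mask = build_offset_mask([5, 6])
covered = associated_seq_numbers(sn, mask)
```

   Since M is always the minimum, bit zero of the result is always set,
   which matches the observation made about the offset mask in
   Section 4.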

5.2 RTP FEC Packet Structure

   When a packet is to be transmitted which contains FEC data (i.e., its
   payload is derived from one or more media payloads), the semantics of
   the RTP header are changed. The format of the FEC packets is as
   follows:

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |V=2|P|X|  CC   |M|     PT      |       sequence number         |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                           timestamp                           |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |           synchronization source (SSRC) identifier            |
   +=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+
   |            contributing source (CSRC) identifiers             |
   |                             ....                              |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |      defined by profile       |            length             |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |        length recovery        |          Offset Mask          |
   +=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+
   |                    Additional Offset Mask                     |
   |                             ....                              |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

   The semantics of these fields are as follows:

   o Version. This field is always 2.

   o Padding. This field has the same semantics as in RFC 1889.

   o Extension. This bit is always set to 1, indicating the presence of
     a header extension following the CSRC list.

   o CC. This field has the same semantics as in RFC 1889.

   o Marker. This bit is set to the f operator applied to the marker
     bits of all the media packets associated with this FEC packet.

   o PT. This field is set to the f operator applied to the PT fields of
     all of the media packets associated with this FEC packet.

   o Sequence Number. This field is set to the minimum of the sequence
     numbers of the media packets associated with this FEC packet.

   o Timestamp.
     This field is set to the f operator applied to the timestamps of
     the media packets associated with this FEC packet.

   o SSRC. This field has the same semantics as in RFC 1889.

   o CSRC list. This field has the same semantics as in RFC 1889.

   o Defined by Profile. This field is part of the header extension. It
     allows for end-systems to determine which extension mechanism is
     in use. It is always set to 0x3a to identify this extension as
     conforming to the syntax and semantics defined here.

   o Length. This field is part of the header extension. It indicates
     the number of 32-bit words in the extension. When it is one, a
     single word follows, with 16 bits length recovery and 16 bits
     offset mask. When it is greater than one (say L), a 16 bit length
     recovery field is present, followed by an offset mask of length
     32*L - 16 bits.

   o Length recovery. This field is set to be equal to the f operator
     applied to the lengths of the media payloads associated with this
     FEC packet. It allows for variable length payloads to be protected
     by FEC.

   o Offset mask. This field indicates the sequence numbers of the
     media packets associated with this FEC packet. If bit k in the
     mask is set to 1, then the media packet with sequence number M+k
     is associated with this FEC packet. This implies that its media
     payload has been operated on by f to generate the FEC payload. It
     also implies that its sequence number, timestamp, marker bit, and
     PT field have been used to generate the sequence number,
     timestamp, marker bit, and PT field for the FEC packet. Since this
     field is variable length, arbitrary media packets can be
     associated with any FEC packet.

   End systems which cannot recognize the extension should discard the
   FEC packet. This provides backwards compatibility. The media packets
   are still recognizable by any application.
   End systems which can recognize the extension can additionally
   process the FEC packets. This makes this extension ideally suited for
   multicast scenarios, where there is a mix of FEC-capable and
   non-FEC-capable receivers. It also makes for good efficiency. Unlike
   in [2], the extra header fields are only present in FEC packets.

5.3 Transmit Procedures

   This section describes how a transmitter sets the fields in the
   header for FEC packets. It is assumed that a transmitter will
   occasionally send an FEC packet which is derived from one or more
   media packets. The protocol does not in any way mandate when to send
   an FEC packet, or determine which media packets the FEC is derived
   from. It is assumed that transmitters generate FEC packets in a
   reasonable fashion so that they can actually be used for recovery.

   Define the list of media packets over which the FEC is derived as T.
   The FEC packet is generated as follows:

   1. If the lengths of the payloads of packets in T are not equal,
      they are padded with zeroes to be as long as the longest payload.
      The original, unpadded length of each packet is stored.

   2. The possibly padded payloads are operated on by f. The result is
      placed in the FEC packet payload.

   3. The SSRC, CC, version, padding, and CSRC list in the FEC packet
      header are copied from one of the media packets in T.

   4. The timestamp, marker bit, and PT fields in the FEC packet header
      are computed by applying f to the corresponding fields of the
      media packets in T.

   5. The sequence number in the FEC packet is set to the minimum of
      the sequence numbers of the media packets in T. Call this
      sequence number M.

   6. For each media packet in T, the difference between its sequence
      number and M is computed. Call this difference k. Bit k in the
      offset mask field is set to 1.

   7. The length recovery field is set to f applied to the original,
      unpadded lengths of the media packets in T.

   8.
      The defined by profile field in the extension is set to 0x3a.

   This procedure defines all of the fields in the FEC packet.

5.4 Recovery Procedures

   The FEC packets allow end systems to recover from the loss of media
   packets. Both the payload and the RTP header of the media packet can
   be reconstructed. This section describes the procedure for performing
   this recovery.

   Assume a receiver has received several media and FEC packets, but the
   media packet with sequence number xi was lost. When an FEC packet
   arrives (FEC packets are identified via the extension bit), the
   following steps are taken:

   1. The end system checks if the FEC packet can be used to
      reconstruct packet xi. To do this, the sequence number from the
      FEC packet is read (call it M). The list of media packets
      associated with this FEC packet, S, is initialized to be empty.
      The offset mask is scanned. If bit k is 1, sequence number M+k is
      added to the list.

   2. If the list of sequence numbers includes xi, and the remaining
      sequence numbers in the list correspond to media packets which
      have all been received (or were recovered), this FEC packet can
      be used to recover xi.

   3. The payload of xi is recovered by applying the inverse of f to
      the other received media payloads and the recently received FEC
      payload. In the case of xor, this would imply xor'ing the
      payloads together.

   4. This payload may have padding in it. The length of the actual
      payload is computed via the length recovery field. The inverse of
      f is applied to the payload lengths of the other received media
      packets and the length recovery field of the recently received
      FEC packet. The result is the unpadded payload length.

   5. The timestamp, marker bit, and PT fields of the RTP header for
      media packet xi are recovered in the same fashion. The inverse of
      f is applied to the fields of the other received media packets
      and the field in the recently received FEC packet.

   6.
      The SSRC, CC, version, padding, and CSRC list for media packet xi
      are copied from one of the media or FEC packets used to recover
      it. This implies that end systems should not change these fields
      frequently, as they may not be recovered properly.

   7. The extension bit in the header of packet xi is set to zero. This
      means that FEC cannot be applied to packets with other header
      extensions.

   This procedure completely recovers the lost packet, including the
   payload and RTP header fields.

5.5 Example

   Consider 2 media packets to be sent, x and y. We wish to protect them
   by sending one FEC packet which is derived from x and y. The f
   operator is implemented using xor. The three packets are:

   Media Packet x
   --------------
   Version:    2   (10)
   Padding:    0   (0)
   Extension:  0   (0)
   Marker:     0   (0)
   PTI:        11  (01011)
   SN:         8   (1000)
   TS:         3   (011)
   SSRC:       2   (10)
   The payload length is 10 bytes.

   Media Packet y
   --------------
   Version:    2   (10)
   Padding:    0   (0)
   Extension:  0   (0)
   Marker:     1   (1)
   PTI:        18  (10010)
   SN:         9   (1001)
   TS:         5   (101)
   SSRC:       2   (10)
   The payload length is 11 bytes.

   The FEC packet is then:

   FEC Packet (payload is the xor of the payloads of x and y)
   ----------------------------------------------------------
   Version:    2   (10)
   Padding:    0   (0)
   Extension:  1   (1)
   Marker:     1   (1)      (NOTE: 0 xor 1 = 1)
   PTI:        25  (11001)  (NOTE: 11 xor 18 = 01011 xor 10010 = 11001)
   SN:         8   (1000)   (NOTE: min(8,9) = 8)
   TS:         6   (110)    (NOTE: 3 xor 5 = 011 xor 101 = 110)
   SSRC:       2   (10)
   ext. def.:  0x3a
   length:     1
   len. rec.:  1   (1)      (NOTE: 10 xor 11 = 1010 xor 1011 = 0001)
   mask:       (0000000000000011)
   The payload length is 11 bytes.

6 Open Issues

   There are a number of open issues to be resolved. The change in
   definition of the RTP header fields will affect many of the
   parameters sent in RTCP packets. For example, highest sequence number
   received and jitter computations may have to exclude FEC packets.
   Octet counts and number of transmitted packets probably should
   include FEC, however.

   To simplify some of the sequence number based computations, an
   alternate semantic for the SN field in the FEC packets is possible.
   All packets can have sequence numbers which are one higher than the
   previous transmitted packet, FEC or media. The offset mask field in
   FEC packets then covers positive and negative offsets. This makes
   less efficient use of the offset mask, but makes the sequence numbers
   more meaningful.

7 Conclusion

   This draft has presented an extension to RFC 1890 which allows for
   forward error correction of audio-visual media. It is generic,
   allowing any sender-defined error correction scheme which meets the
   required criteria to be used (any xor based strategy meets the
   criteria). It is also backwards compatible with existing
   implementations. Receivers which cannot understand FEC can discard
   the FEC packets, and still receive the media packets.

8 Security Considerations

   There are no security considerations beyond those discussed in [3]
   and [4].

9 Authors' Addresses

   Jonathan Rosenberg
   Lucent Technologies, Bell Laboratories
   101 Crawfords Corner Rd.
   Holmdel, NJ 07733
   Rm. 4D-534B
   email: jdrosen@bell-labs.com

   Henning Schulzrinne
   Columbia University
   M/S 0401
   1214 Amsterdam Ave.
   New York, NY 10027-7003
   email: schulzrinne@cs.columbia.edu

10 Bibliography

   [1] J.-C. Bolot and A. Garcia, The case for FEC-based error control
       for packet audio in the Internet, Multimedia Systems, 1997.

   [2] D. Budge, R. McKenzie, W. Mills, and P. Long, Media-independent
       error correction using RTP, (internet draft), Internet
       Engineering Task Force, May 1996. Work in Progress.

   [3] H. Schulzrinne, RTP profile for audio and video conferences with
       minimal control, Request for Comments (Proposed Standard)
       RFC 1890, Internet Engineering Task Force, Jan. 1996.

   [4] H. Schulzrinne, S. Casner, R.
       Frederick, and V. Jacobson, RTP: a transport protocol for
       real-time applications, Request for Comments (Proposed Standard)
       RFC 1889, Internet Engineering Task Force, Jan. 1996.