Internet Engineering Task Force                 Audio-Video Transport WG
INTERNET-DRAFT                     Schulzrinne/Casner/Frederick/Jacobson
draft-ietf-avt-rtp-06.txt                              GMD/ISI/Xerox/LBL
                                                       November 28, 1994
                                                         Expires: 3/1/95

          RTP: A Transport Protocol for Real-Time Applications

Status of this Memo

This document is an Internet Draft. Internet Drafts are working documents of the Internet Engineering Task Force (IETF), its Areas, and its Working Groups. Note that other groups may also distribute working documents as Internet Drafts.

Internet Drafts are draft documents valid for a maximum of six months. Internet Drafts may be updated, replaced, or obsoleted by other documents at any time. It is not appropriate to use Internet Drafts as reference material or to cite them other than as a ``working draft'' or ``work in progress.''

Please check the I-D abstract listing contained in each Internet Draft directory to learn the current status of this or any other Internet Draft.

Distribution of this document is unlimited.

Abstract

This memorandum describes the real-time transport protocol, RTP. RTP provides end-to-end network transport functions suitable for applications transmitting real-time data, such as audio, video or simulation data over multicast or unicast network services. RTP does not address resource reservation and does not guarantee quality-of-service for real-time services. The data transport is augmented by a control protocol (RTCP) designed to provide minimal control and identification functionality, particularly in multicast networks. RTP and RTCP are designed to be independent of the underlying transport and network layers. The protocol supports the use of RTP-level translators and mixers.

***** DISCLAIMER: This document is not completed.
See the Open Issues Section.

This specification is a product of the Audio/Video Transport working group within the Internet Engineering Task Force. Comments are solicited and should be addressed to the working group's mailing list at rem-conf@es.net and/or the authors.

Contents

1 Introduction
   1.1 Changes
   1.2 Open Issues and Items to be Completed
2 RTP Use Scenarios
   2.1 Simple Multicast Audio Conference
   2.2 Mixers
   2.3 Translators
   2.4 Security
3 Definitions
4 Byte Order, Alignment, and Reserved Values
5 RTP Data Transfer Protocol
   5.1 RTP Fixed Header Fields
   5.2 SSRC Random Identifier Allocation
   5.3 RTP Header Extension
6 RTP Control Protocol --- RTCP
   6.1 Introduction
   6.2 RTCP packet format
   6.3 SR: Sender report
   6.4 RR: Receiver report
   6.5 SDES: Source description
      6.5.1 CNAME: Canonical end-point identifier
      6.5.2 NAME: User name
      6.5.3 EMAIL: User's electronic mail address
      6.5.4 PHONE: User's phone number
      6.5.5 LOC: Geographic user location
      6.5.6 TXT: Text describing the source
      6.5.7 TOOL: Name of application or tool
      6.5.8 PRIV: Private extensions
   6.6 BYE: Goodbye
   6.7 APP: Application-defined
7 RTP Translators and Mixers
   7.1 General Description
   7.2 Behavior of Mixers/Translators
   7.3 Cascaded Mixers
8 Security
   8.1 Security Considerations
   8.2 Confidentiality
9 RTP over Network and Transport Protocols
10 Summary of Protocol Constants
   10.1 RTCP packet types
   10.2 SDES types
11 RTP Profiles and Payload Format Specifications
A Implementation Notes
   A.1 RTP Header Consistency Check
   A.2 Parsing RTCP Packets
   A.3 Generating SDES RTCP Packets
   A.4 Parsing SDES RTCP Packets
   A.5 Generating a Random 32-bit Identifier
   A.6 Computing the RTCP Transmission Period
   A.7 Estimating the Interarrival Jitter
   A.8 Determining the Expected Number of RTP Packets
B Addresses of Authors

1 Introduction

This memorandum specifies the real-time transport protocol (RTP), which provides end-to-end delivery services for data with real-time characteristics, for example, interactive audio and video. RTP itself does not provide any mechanism to ensure timely delivery or provide other quality-of-service guarantees, but relies on lower-layer services to do so. It does not guarantee delivery or prevent out-of-order delivery, nor does it assume that the underlying network is reliable and delivers packets in sequence. The sequence numbers included in RTP allow the end system to reconstruct the sender's packet sequence, but sequence numbers might also be used to determine the proper location of a packet, for example in video decoding, without necessarily decoding packets in sequence.

RTP typically runs on top of UDP but may be used with other suitable underlying network or transport protocols (see Section 9). RTP transfers data in a single direction, possibly to multiple destinations if supported by the underlying network.

RTP is intended to follow the principles of Application Level Framing and Integrated Layer Processing proposed by Clark and Tennenhouse [1]. That is, RTP is intended to be malleable to provide the information required by a particular application and will often be integrated into the application processing rather than being implemented as a separate layer.

While RTP is primarily designed to satisfy the needs of multi-participant multimedia conferences, it is not limited to that particular application. Storage of continuous data, interactive distributed simulation, active badge, and control and measurement applications may also find RTP applicable.

This document defines RTP, consisting of two closely-linked parts:

o the real-time transport protocol (RTP), for exchanging data that has real-time properties.

o the RTP control protocol (RTCP), for monitoring quality of service and for conveying information about the participants in an on-going session. The latter aspect of RTCP is used for "loosely controlled" sessions, i.e., where there is no explicit membership control and set-up. This functionality may be fully or partially subsumed by a session control protocol, which is beyond the scope of this document.

In addition to this document, a complete specification of RTP for a particular application will require one or more companion documents (see Section 11):

o a profile specification document, which defines payload type codes and which may be used to define extensions or modifications to RTP that are specific to a particular class of applications. Typically an application will operate under only one profile. A profile for audio and video data may be found in the companion Internet draft draft-ietf-avt-profile(1).

o payload format specification documents, which define how a particular payload, such as an audio or video encoding, is to be carried in RTP.

A discussion of real-time services and algorithms for their implementation and background on some of the RTP design decisions can be found in [2].

The current Internet does not support the widespread use of real-time services. High-bandwidth services using RTP, such as video, can potentially seriously degrade other network services. Thus, implementors should take appropriate precautions to limit accidental bandwidth usage. Application documentation should clearly outline the limitations and possible operational impact of high-bandwidth real-time services on the Internet and other network services.
------------------------------
1. ftp://ds.internic.net/internet-draft/draft-ietf-avt-profile-03.txt

1.1 Changes

This section highlights the changes since the July 1994 draft.

o Length fields in RTCP all have zero as their lowest valid value to simplify error checking.

o The algorithm determining the RTCP send frequency has been specified.

o The RTP header file has been brought into agreement with the specification.

o The intended use of the RTP header extension mechanism has been clarified.

o A separate table calls out the protocol constants.

o The name 'bridge' has been changed to 'mixer'; generally, the behavior of mixers and translators has been clarified. The description now follows the description of the protocol to avoid forward references.

o A start has been made toward defining the delay jitter algorithm. A few variations are being discussed.

o PHONE and TOOL SDES items have been added as standard types, as these are likely to be used by a large number of applications.

o For private, application-specific extensions, the PRIV SDES type has been added.

o The implementation appendix adds parsing of SDES items.

o The implementation appendix emphasizes that the header file is valid for big-endian bit order only.

1.2 Open Issues and Items to be Completed

Several items were not completed in time for the Internet Draft submission deadline, or need wider input before a decision can be made. Because of these open items, this draft should not be considered complete and ready to implement.

o Additional explanation is needed for the algorithms to calculate the RTCP report rate, to calculate the interarrival jitter report value, to perform SSRC ID collision and loop detection, and to perform RTP and RTCP header validation.

o Guidelines on the use of SDES items other than CNAME are needed.
Anything other than limited use of these values can negatively impact the RTCP reception reporting mechanism.

o The numeric values assigned to the RTCP types still need to be decided. There are implementations using SR=0, some using SR=1, and in addition a recommendation to set SR=201 in order to aid in header validity checking.

o In the common case where no session member has transmitted anything, the receiver report would be empty. Should it be permissible to simply omit it? Is there anything to be gained by mandating its inclusion given that an application should probably not fall over when it is missing?

2 RTP Use Scenarios

The following sections describe some aspects of the use of RTP. The examples were chosen to illustrate the basic operation of applications using RTP, not to limit what RTP may be used for. In these examples, RTP is carried on top of IP and UDP, and follows the conventions established by the profile for audio and video specified in the companion Internet draft draft-ietf-avt-profile.

2.1 Simple Multicast Audio Conference

A working group of the IETF meets to discuss the latest protocol draft, using the IP multicast services of the Internet for voice communications. Through some allocation mechanism the working group chair obtains a multicast group address and pair of ports. One port is used for control (RTCP) packets, and the other is used for audio data. This address and port information is distributed to the intended participants. The exact details of the allocation and distribution mechanism are beyond the scope of RTP.

The audio conferencing application used by each conference participant sends audio data in small chunks of, say, 20 ms duration. Each chunk of audio data is preceded by an RTP header; RTP header and data are in turn contained in a UDP packet.
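The packetization just described can be sketched as follows. This is an illustration only, not part of the specification: the 8000 Hz sampling rate, the one-octet-per-sample encoding, the payload type value 0, the fixed starting values and the function name build_rtp_packet are all assumptions made for the example. The twelve-octet header layout is the one given in Section 5.1.

```python
import struct

SAMPLE_RATE_HZ = 8000   # assumed sampling rate; not fixed by this memo
CHUNK_MS = 20           # chunk duration from the example above
SAMPLES_PER_CHUNK = SAMPLE_RATE_HZ * CHUNK_MS // 1000   # 160 samples

def build_rtp_packet(seq, timestamp, ssrc, payload, payload_type=0, marker=0):
    """Prepend a 12-octet fixed RTP header (no CSRC list, no extension)."""
    byte0 = 2 << 6                                    # T=2, P=0, X=0, CC=0
    byte1 = ((marker & 1) << 7) | (payload_type & 0x7F)
    header = struct.pack("!BBHII", byte0, byte1,      # network byte order
                         seq & 0xFFFF, timestamp & 0xFFFFFFFF, ssrc)
    return header + payload

# Two consecutive 20 ms chunks of (silent) one-octet-per-sample audio.
# Real senders start from random sequence number and timestamp values;
# fixed values are used here only to keep the example reproducible.
chunk = bytes(SAMPLES_PER_CHUNK)
pkt1 = build_rtp_packet(1000, 5000, 0x12345678, chunk)
pkt2 = build_rtp_packet(1001, 5000 + SAMPLES_PER_CHUNK, 0x12345678, chunk)
```

Each resulting byte string would then be handed to the UDP layer as the contents of one datagram addressed to the session's data port.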

The Internet, like other packet networks, occasionally loses and reorders packets and delays them by variable amounts of time. To cope with these impairments, the RTP header contains timing information and a sequence number that allow the receivers to reconstruct the timing seen by the source, so that, in this example, a chunk of audio is delivered to the speaker every 20 ms. The sequence number can also be used by the receiver to estimate how many packets are being lost. Each RTP packet also indicates what type of audio encoding (such as PCM, ADPCM or GSM) is being used, so that senders can change the encoding during a conference, for example, to accommodate a new participant that is connected through a low-bandwidth link.

Each audio source has to have its timing reconstructed separately at the receiver. Sources are identified by the synchronization source identifier (SSRC), not their network address. The SSRC identifier is a randomly chosen value meant to be globally unique within a particular conference.

Since members of the working group join and leave during the conference, it is useful to know who is participating at any moment and how well they are receiving the audio data. For that purpose, each instance of the audio application in the conference periodically multicasts a reception report plus the name of its user on the RTCP (control) port. The email address and other user information may also be included. A site sends the RTCP BYE (Section 6.6) packet when it leaves a conference. The RTCP reception report indicates how well the current speaker is being received and may be used to control adaptive encodings.

2.2 Mixers

So far, we have assumed that all sites want to receive audio data in the same format. However, this may not always be appropriate.
Consider the case where participants in one area are connected through a low-speed link to the majority of the conference participants, who enjoy high-speed network access. Instead of forcing everyone to use a lower-bandwidth, reduced-quality audio encoding, a mixer is placed near the low-bandwidth area. This mixer resynchronizes incoming audio packets to reconstruct the constant 20 ms spacing generated by the sender, mixes these reconstructed audio streams, translates the audio encoding to a lower-bandwidth one and forwards the lower-bandwidth packet stream to the low-bandwidth sites.

Since the mixer has constructed a new (mixed) stream of audio, it is now the synchronization source for the stream. In order to preserve the identity of the sites which are speaking, the mixer inserts one or more contributing source (CSRC) identifiers after the fixed RTP header. These identifiers are the synchronization source identifiers (SSRC) of those sites that contributed to the mixed packet. An example of this is shown for mixer M1 in Fig. 1.

As name and location information is received by the mixer in RTCP packets from the high-speed sites, that information is passed on to the receivers served by the mixer, either aggregated or as received.

2.3 Translators

Not all sites are reachable by IP multicast. For these sites, mixing may not be necessary, but a translation of the underlying transport protocol is. RTP-level gateways that do not mix packets from different sources are called translators in this document.

Application-level firewalls, for example, will not let any IP packets pass. Two translators are installed, one on either side of the firewall, the outside one funneling all multicast packets received through the secure connection to the translator inside the firewall. The translator inside the firewall sends them again as multicast packets to a multicast group restricted to the site's internal network.
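The difference between the two kinds of intermediate system can be made concrete with a small sketch. It is illustrative only: the RtpPacket class, the function names translate and mix, and the saturating sum used as a stand-in for real audio mixing are assumptions, not part of RTP. The identifier values are taken from mixer M1 and end system E4 in Fig. 1.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class RtpPacket:
    """Hypothetical in-memory view of an RTP packet (illustration only)."""
    ssrc: int
    payload: bytes
    csrcs: List[int] = field(default_factory=list)

def translate(pkt: RtpPacket) -> RtpPacket:
    """A translator forwards the packet with its SSRC left intact."""
    return RtpPacket(ssrc=pkt.ssrc, payload=pkt.payload, csrcs=list(pkt.csrcs))

def mix(mixer_ssrc: int, inputs: List[RtpPacket]) -> RtpPacket:
    """A mixer becomes the synchronization source of the combined stream
    and lists the SSRCs of the contributing packets as CSRCs.
    Needs at least one input packet."""
    length = min(len(p.payload) for p in inputs)
    # Placeholder for real audio mixing: saturating sum of unsigned octets.
    mixed = bytes(min(255, sum(p.payload[i] for p in inputs))
                  for i in range(length))
    # The CSRC list is limited to 15 entries (4-bit CC field).
    return RtpPacket(ssrc=mixer_ssrc, payload=mixed,
                     csrcs=[p.ssrc for p in inputs][:15])

from_e2 = RtpPacket(ssrc=1, payload=bytes([10, 200]))
from_e1 = RtpPacket(ssrc=17, payload=bytes([5, 100]))
mixed = mix(48, [from_e2, from_e1])          # M1:48 (1,17) in Fig. 1
forwarded = translate(RtpPacket(ssrc=47, payload=b"\x00"))   # E4:47
```

After mixing, receivers see SSRC 48 and can still indicate the talkers from the CSRC list (1, 17); after translation, the packet from E4 still carries SSRC 47.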

Other examples include the connection of a group of hosts speaking only IP/UDP to a group of hosts that understand only ST-II. The packet-by-packet encoding translation of single sources is another example. The SSRC identifier makes it possible to identify individual sources even though they all pass through the same translator, i.e., carry the same network source address. In Fig. 1, hosts T1 and T2 are translators.

     [E1]                                    [E6]
      |                                       |
E1:17 |                                 E6:15 |
      |                                       |   E6:15
      V  M1:48 (1,17)         M1:48 (1,17)    V   M1:48 (1,17)
     (M1)-------------><T1>-----------------><T2>--------------->[E7]
      ^                 ^     E4:47           ^   E4:47
 E2:1 |           E4:47 |                     |   M3:89 (64,45)
      |                 |                     |
     [E2]              [E4]     M3:89 (64,45) |
                                              |         legend:
[E3] --------->(M2)----------->(M3)-----------|         [End system]
       E3:64        M2:12 (64)  ^                       (Mixer)
                                | E5:45                 <Translator>
                                |
                               [E5]          source: SSRC (CSRCs)
                                             ------------------->

Figure 1: Sample RTP network with end systems, mixers and translators

2.4 Security

Conference participants would often like to ensure that nobody else can listen to their deliberations. Encryption provides that privacy. In Section 8.1, RTP specifies a mechanism for using encryption, but the actual key distribution must be accomplished by external means.

3 Definitions

RTP payload is the data following the RTP fixed header and the CSRC list. The payload format and interpretation are beyond the scope of this memo. Examples of payload include audio samples and video data.

RTP packets consist of the fixed RTP header, a possibly empty list of contributing sources (CSRC list), and the payload, if any. Some underlying protocols may require an encapsulation of the RTP packet to be defined. A single packet of the underlying protocol may contain several RTP packets if permitted by the encapsulation method.

(protocol) port is the "abstraction that transport protocols use to distinguish among multiple destinations within a given host computer.
TCP/IP protocols identify ports using small positive integers." [4] The transport selectors (TSEL) used by the OSI transport layer are equivalent to ports.

Synchronization source: All packets from a synchronization source form part of the same timing and sequence number space. Examples of synchronization sources are a microphone, a mixer or a camera. A receiver groups packets by synchronization source for playback. Typically a single synchronization source emits a single medium (e.g., audio or video). A synchronization source may change its data format, e.g., audio encoding, over time. Synchronization sources are identified by the SSRC value, a numeric identifier contained in the RTP header. SSRC is defined in Section 5.2.

Contributing source: A contributing source is a source that contributed to the data coming from a synchronization source. Contributing source identifiers are used by mixers (see below) to indicate which sources were combined to generate a particular packet. An example application is audio conferencing where a mixer could indicate all the speakers whose speech was combined to produce the outgoing packet, allowing the receiver to indicate the current speaker, even though all audio packets originated from the same synchronization source.

End system: An end system generates the content to be sent in RTP packets and consumes the content of received RTP packets. An end system can act as one or more synchronization sources in a given media session, but typically only one.

Mixer: A mixer receives RTP packets from one or more sources, possibly changes their data format, combines them in some manner and then forwards a new RTP packet. Since the timing among multiple input sources will not generally be synchronized, the mixer will make timing adjustments among the streams and generate its own timing for the combined stream.
Thus, all data packets originating from a mixer will be identified as having the mixer as their synchronization source. A mixer may indicate the contributing sources (see above) for the convenience of the receiver.

Translator: A translator forwards RTP packets with their synchronization source intact. Examples of translators include devices that convert encodings without mixing or convert from multicast to unicast, and application-level filters in firewalls.

QOS monitor: A (QOS) monitor is an application that receives RTCP messages, including quality-of-service reports, and estimates the current quality of service for monitoring, fault diagnosis and long-term statistics.

Recorder: A recorder records RTP and RTCP packets for later playback. A recorder is usually separate from an end system. It should try to recreate the timing at the sender, without the jitter introduced by the network, using the RTP timestamp. A recorder may not have access to the same encryption keys as the other participants in a session, in which case sender timing must be estimated if the RTP timestamps are encrypted.

Non-RTP mechanisms: Other protocols and mechanisms that may be needed to provide a usable service. In particular, for multimedia conferences, a conference control application may distribute multicast addresses and keys for encryption and authentication, negotiate the encryption algorithm to be used, and determine the mapping from the RTP format field to the actual data format used. For simple applications, electronic mail or a conference database may also be used. The specification of such mechanisms is outside the scope of this memorandum.

4 Byte Order, Alignment, and Reserved Values

All integer fields are carried in network byte order, that is, most significant byte (octet) first. This byte order is commonly known as big-endian.
The transmission order is described in detail in [5], Appendix A. Unless otherwise noted, numeric constants are in decimal (base 10).

All header data is aligned to its natural length, i.e., 16-bit words are aligned on even byte addresses, 32-bit long words are aligned at addresses divisible by four, etc. Octets designated as padding have the value zero. Fields designated as "reserved" or R are set aside for future use; they should be set to zero by senders and ignored by receivers.

NTP timestamps are represented as 64-bit unsigned fixed-point numbers, in seconds relative to 0h UTC on 1 January 1900. The integer part is in the first 32 bits and the fraction part in the last 32 bits [6].

5 RTP Data Transfer Protocol

5.1 RTP Fixed Header Fields

The RTP header has the following format:

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |T=2|P|X|  CC   |M|     PT      |       sequence number         |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                           timestamp                           |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |           synchronization source (SSRC) identifier            |
   +=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+
   |            contributing source (CSRC) identifiers             |
   |                             ....                              |
   +=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+

The first twelve octets are present in every RTP packet, while the list of CSRC identifiers is present only when inserted by a mixer. The fields have the following meaning:

type (T): 2 bits
Identifies the type of RTP packet. The type of the packet described here is two (2). (The value of 2 was chosen to easily distinguish packets from those of the prior version of RTP and the protocol used by the vat audio tool.)

padding (P): 1 bit
If the padding bit is one, the packet contains one or more additional octets at the end which are not part of the payload. The very last octet of the packet is a count of how many padding octets should be ignored. Padding may be needed by some encryption algorithms with fixed block sizes or for carrying several RTP packets in a lower-layer protocol data unit.

extension (X): 1 bit
If the extension bit is one, the fixed header is followed by exactly one header extension, with a format defined in Section 5.3.

CSRC count (CC): 4 bits
This field contains the number of CSRC identifiers that follow the fixed header.

marker (M): 1 bit
The interpretation of this field is defined by a profile. A profile may define additional marker bits by reducing the number of bits in the payload type field.

payload type (PT): 7 bits
The payload type forms an index into a table defined through profiles or non-RTP mechanisms (see Section 3). The mapping establishes the format of the RTP payload and determines its interpretation by the application. A profile specifies a standard mapping. An initial set of default mappings for audio and video is specified in the companion profile document RFC TBD, and may be extended in future editions of the Assigned Numbers RFC.

sequence number: 16 bits
The sequence number counts RTP packets. The sequence number increments by one for each packet sent. The sequence number may be used by the receiver to detect packet loss and to restore packet sequence. The initial value of the sequence number is random (unpredictable) to make known-plaintext attacks on encryption more difficult, even if the source itself does not encrypt, because the packets may flow through a translator that does.

timestamp: 32 bits
The timestamp reflects the sampling instant of the first octet in the RTP data packet.
The timestamp is incremented with the nominal clock frequency determined by the format of data carried as payload. For example, for fixed-rate audio, the timestamp would likely increment by one for each sample. The clock frequency is determined statically for each payload type by a profile or payload format specification, or dynamically through non-RTP means. If RTP packets are generated periodically, the nominal sampling instant is to be used, not a reading of the system clock. For example, for 160-octet audio packets and a one-octet-per-sample encoding, the timestamp should be increased by 160 for each block of 160 samples read from the input device, whether the block is transmitted in a packet or dropped as silent. All samples must be counted so that the clock is stable. Several consecutive RTP packets may have equal timestamps if they are (logically) generated at once, e.g., belong to the same video frame. The initial value of the timestamp is random, as for the sequence number.

SSRC: 32 bits
Synchronization source identifier. This value is chosen randomly, with the intent that no two synchronization sources within the same media session will have the same SSRC value. Details are described in Section 5.2.

CSRC: up to 15 items, 32 bits each
Zero or more contributing source identifiers. The number of identifiers is given by CC. There can be no more than 15 contributing sources identified. CSRC identifiers are inserted by mixers, using the SSRC identifiers of contributing sources. For example, for audio packets, all sources that were mixed together to create a packet are enumerated, allowing correct talker indication at the receiver.

5.2 SSRC Random Identifier Allocation

The SSRC identifier described above is a random 32-bit quantity that is intended to be globally unique within a media session. In particular, a local network address, such as the IPv4 address, is not to be used as an SSRC identifier.
An example of how to generate such an identifier is presented in Section A.5.

If a source discovers at any time that another source is already using the same SSRC identifier, it randomly chooses a different SSRC identifier. If a source has transmitted packets with the colliding identifier, it should send a BYE control packet with the old SSRC identifier before switching to allow applications to clear any records for this SSRC. Statistics for receiver reports are keyed to SSRC (not CNAMEs or other identifiers); thus, a receiver does not have to attempt to carry over statistics when a source changes SSRC identifiers. A source that changes its SSRC identifier should reset the statistics transmitted through sender reports.

If N is the number of sources and L the length of the identifier (here, 32 bits), the probability that two sources independently pick the same value can be approximated for large N [7, p. 33] as

   1 - exp(- N**2 / 2**(L+1)).

For N=1000, the probability is roughly 0.01%.

Because the random identifiers are globally unique, they can be used to detect loops that may be introduced by mixers. For each CSRC, the application should check that packets contain a single SSRC value. However, duplicate SSRC values may also indicate a collision resolution in progress.

5.3 RTP Header Extension

The existing RTP data packet header is believed to be complete for the set of functions required in common across all the application classes that RTP might support. If a particular class of applications, operating under one profile, needs additional functionality, that profile may define additional fixed fields to follow the SSRC field of the existing fixed header. If it turns out that additional functionality is needed across all profiles, then a new version of RTP should be defined to make a permanent change to the fixed header.
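As an aside on Section 5.2, the collision estimate given there can be checked numerically. The following is a minimal sketch; the function name collision_probability is chosen here for illustration, and math.expm1 is used only to keep precision for very small probabilities.

```python
import math

def collision_probability(n_sources: int, id_bits: int = 32) -> float:
    """Approximate probability that two of n_sources independently chosen
    random identifiers of id_bits bits coincide:
    1 - exp(-N**2 / 2**(L+1))."""
    x = n_sources ** 2 / 2 ** (id_bits + 1)
    return -math.expm1(-x)        # equals 1 - exp(-x)

p_1000 = collision_probability(1000)     # about 1.16e-4, i.e. roughly 0.01%
p_10000 = collision_probability(10000)   # grows roughly with N**2
```

The estimate grows roughly with the square of the session size, so ten times as many sources give about a hundred times the collision probability.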

However, an escape hatch is provided to allow individual implementations to experiment with new mechanisms that require additional information to be carried in the RTP data packet header. The header extension mechanism is designed so that it may be ignored by other interoperating implementations that have not been extended.

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |      defined by profile       |            length             |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

If the X bit in the RTP header is one, a variable-length header extension is appended to the RTP header, following the CSRC list if present. The header extension contains a 16-bit length field that counts the number of 32-bit words in the extension, excluding the four-octet extension header (therefore zero is a valid length). Only a single extension may be appended to the RTP data header. To allow multiple interoperating implementations to each experiment independently with different header extensions, or to allow a particular implementation to experiment with more than one type of header extension, the first 16 bits of the header extension are left open for distinguishing identifiers or parameters. The format of these 16 bits is to be defined by the profile specification under which the implementations are operating. This RTP specification does not define any header extensions itself.

Note that this mechanism is intentionally cumbersome. Many candidate uses would better be done another way, for example with a profile-specific extension to the fixed header. In particular, additional information required for a particular payload type, such as a video encoding, should be carried in the payload section of the packet.
This might be in a header that is always present at the start of the payload section, or might be indicated by a reserved value in the data pattern.

Every conformant RTP application needs to be able to skip the header extension, but need not process it.

6 RTP Control Protocol --- RTCP

6.1 Introduction

The RTP control protocol (RTCP) provides two functions: (1) monitoring the distribution of data, and (2) conveying minimal session information. The first function is performed by the RTCP sender or receiver report packets, described below. This function is an integral part of RTP's role as a transport protocol, and is mandatory when RTP is used in the IP multicast environment. The second RTCP function provides support for "loosely controlled" sessions, i.e., where participants enter and leave without membership control and parameter negotiation.

RTCP packets are sent to all members of a session, using the same distribution mechanism as for data packets. The underlying protocol must provide multiplexing of the data and control packets, for example using separate port numbers with UDP.

The period between RTCP packets should be varied randomly to avoid synchronization of all sources. Its mean should increase with the number of participants in the session to limit the growth of the overall network and host interrupt load to a small fraction of the load induced by the media data. An algorithm for calculating the period is given in Appendix A.6. The length of the RTCP period determines, for example, how long a receiver joining a session has to wait until it can identify the source. A receiver may remove from its list of active sites a site that it has not heard from for a given time-out period; the time-out period may depend on the number of sites or the observed average interarrival time of RTCP messages. A small multiple of the RTCP period is suggested to allow for packet loss.
Not every RTCP message has to contain all SDES descriptions for a source; for example, SDES EMAIL might only be sent every few messages.

6.2 RTCP packet format

Each RTCP packet begins with a fixed part similar to that of RTP data packets, followed by structured elements that may be of variable length but always end on a 32-bit boundary. The length field and alignment requirement are included to make RTCP packets "stackable". Multiple RTCP packets may be sent in a single packet of the lower layer protocol such as UDP to combine as much information as possible into one packet, particularly for translators and mixers. This is advisable since per-packet processing overhead in the network and in many operating systems is high. For example, in a Unix operating system running the X windowing system, each packet is likely to cause a hardware interrupt, a software interrupt, a context switch and an X event.

Any combination of RTCP packets may be stacked in one lower-layer packet, and each RTCP packet is processed independently. An application may skip RTCP packets with types unknown to it. Additional RTCP packet types may be registered with the Internet Assigned Numbers Authority.

The first RTCP packet is always a report packet, which may be in either of two forms: a sender report (SR) for sources that have recently transmitted RTP data packets, or a receiver report (RR) for sources that have not recently sent RTP data. It may optionally be followed by more receiver report (RR) packets if the number of sources being reported exceeds 31, the number that will fit into one SR or RR packet. These one or more report packets are followed by an SDES packet containing at least the CNAME item. Finally, APP, BYE or other, yet to be defined packet types may follow in any order. Packet types may appear more than once.
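The stacking rule above can be illustrated with a short sketch. The following Python fragment is not part of this specification; it assumes network byte order and the common header layout used by the packet formats in Section 6.3 (type in the top two bits, a five-bit count, an eight-bit packet type, and a 16-bit length giving the packet size in 32-bit words minus one). The function name split_stacked_rtcp is hypothetical.

```python
import struct


def split_stacked_rtcp(datagram: bytes):
    """Yield (packet_type, count, body) for each RTCP packet stacked in one
    lower-layer datagram. 'body' is everything after the 4-octet header."""
    offset = 0
    while offset + 4 <= len(datagram):
        first, packet_type, length_words = struct.unpack_from("!BBH", datagram, offset)
        if first >> 6 != 2:
            raise ValueError("unexpected type (T) value at offset %d" % offset)
        # The length field counts 32-bit words minus one, header included.
        size = 4 * (length_words + 1)
        if offset + size > len(datagram):
            raise ValueError("truncated RTCP packet at offset %d" % offset)
        count = first & 0x1F  # RC, CC or subtype, depending on the packet type
        yield packet_type, count, datagram[offset + 4:offset + size]
        offset += size
```

Because every packet carries its own length, a receiver can step over a packet whose type it does not recognize and continue with the next one, which is what makes unknown types skippable.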
6.3 SR: Sender report

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |T=2|P|   RC    | PT=RTCP_SR=0  |            length             |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                        SSRC of sender                         |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |             NTP timestamp, most significant word              |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |             NTP timestamp, least significant word             |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                         RTP timestamp                         |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                     sender's packet count                     |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                     sender's octet count                      |
   +=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+
   |                 SSRC_1 (SSRC of first source)                 |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |             cumulative number of packets received             |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |             cumulative number of packets expected             |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                      interarrival jitter                      |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                         last SR (LSR)                         |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                  delay since last SR (DLSR)                   |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                SSRC_2 (SSRC of second source)                 |
   :                              ...                              :
   +=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+
   |                application-specific extensions                |
   +=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+

The sender report packet consists of two sections. The first section, the actual sender report, is 24 octets long and is present in every sender report packet.
The second section contains zero or more reception reports depending on the number of sources heard since the last report. The fields have the following meaning:

type (T): 2 bits
   The current value of the type identifier is 2 (two), as in RTP packets.

padding (P): 1 bit
   If the padding bit is one, the packet contains some additional octets at the end which are not part of the payload. The very last octet of the packet is a count of how many padding octets should be ignored. Padding may be needed by some encryption algorithms with fixed block sizes.

reception report count (RC): 5 bits
   This field contains the number of reception report blocks contained in this packet. A value of zero is valid.

packet type (PT): 8 bits
   The value of the packet type identifier is the constant RTCP_SR, defined in Appendix A.

length: 16 bits
   The length of this RTCP packet in 32-bit words minus one, including the header and any padding.(2)

SSRC: 32 bits
   Synchronization source identifier for the sender of this RTCP packet.

NTP timestamp: 64 bits
   The NTP timestamp corresponds to the wallclock time when this traffic report is sent so that it may be used in combination with timestamps returned in reception reports from other receivers to measure round-trip propagation to those receivers. Receivers should expect that the measurement accuracy of the timestamp may be limited to far less than the resolution of the NTP timestamp. The measurement uncertainty of the timestamp is not transmitted as it is usually difficult to estimate with any degree of reliability. A sender that can keep track of real time but has no notion of wallclock time may use the elapsed time of the session instead. It is permissible to use the sampling clock to estimate elapsed wallclock time. This is assumed to be less than 68 years, so the high bit will be zero.
   A sender that has no notion of wallclock time may set the NTP timestamp to zero.

RTP timestamp: 32 bits
   Reference timestamp that corresponds to the same time as the NTP timestamp (above). This correspondence may be used for intra- and inter-media synchronization for sources whose NTP timestamps are synchronized, and may be used by media-independent receivers to estimate the nominal RTP clock frequency. This RTP timestamp is calculated from the corresponding NTP timestamp using the relationship between the RTP timestamp counter and real time as maintained by periodically checking the real time at a sampling instant.

sender's packet count: 32 bits
   Counts the total number of RTP packets transmitted by the source since the source has started transmission and until the time this SR packet was generated.

sender's octet count: 32 bits
   Counts the total number of octets transmitted in RTP packets by the source since the source has started transmission and until the time this sender report packet was generated. The octet count includes only the payload of RTP data packets. This field can be used to estimate the overall payload data rate.

------------------------------
2. The offset of one makes zero a valid length and avoids possible infinite loops.

Each reception report in the second section of the sender report packet conveys statistics on the reception of RTP packets from a single synchronization source. These statistics are:

SSRC_n (source identifier): 32 bits
   SSRC identifier of the source to which the information in this reception report pertains.

cumulative number of packets received: 32 bits
   The field contains the total number of RTP packets received from the source since the beginning of reception.
   By taking the difference in this number between two reception reports from a given source, and dividing by the interval between those two reports, a received packet rate may be calculated.

cumulative number of packets expected: 32 bits
   The field contains the total number of packets expected by the receiver, which may be computed according to the algorithm in Appendix A.8. Together with the cumulative number of packets received, a monitor can measure the packet loss rate over both short and long time periods. The number of packets expected may also be used to judge the statistical validity of any loss estimates. (For example, 1 out of 5 packets lost has a different significance than 200 out of 1000.) There will be no loss indication (and likely no reception report issued) for a source if all recent packets from that source have been lost.

interarrival jitter: 32 bits
   The interarrival jitter field should be an estimate of the statistical variance of the RTP data interarrival time, measured in timestamp units and expressed as an unsigned integer. A particular algorithm is not prescribed, but a sample algorithm is shown in Section A.7. If a receiver cannot estimate this value, it should use a value of zero.

last SR timestamp (LSR): 32 bits
   The middle 32 bits of the last NTP timestamp (bytes 11 to 14) received as part of the RTCP sender report (SR) packet from the source being reported.

delay since last SR (DLSR): 32 bits
   Delay, expressed in units of 1/65536 seconds, between receiving the last SR packet from the source being reported and sending this report packet.

The 'last SR' and 'delay since last SR' fields allow the computation of round trip time by the sender of the SR. This may be used to cluster nodes according to propagation delay. If the reception report for SSRC S from receiver R arrives at time A at S, S can compute the round-trip time to R as A - LSR - DLSR.
Note that round-trip time may be of limited use for many real-time applications and that some links have very asymmetric delays.

All reported numbers except interarrival jitter are cumulative. The difference between two reports can be used to estimate recent quality of the distribution. A fixed clock (NTP timestamp) is chosen so that quality monitors do not have to be cognizant of the clock rate for the current encoding. If a source cannot compute a particular value, it inserts a value of zero.

A receiver (end system or mixer) should send sender/receiver report packets including a reception report for each source from which it has received RTP packets since the last report, or for as many such sources as will fit. A mixer should not send reception reports on one side for sources it has received on the other side.

A profile may define application-specific extensions to the sender report if there is additional information that should be reported regularly about the sender or receivers. If information about receivers is to be included, that data may be structured as an array of blocks parallel to the array of receiver reports in the second section of the sender report.
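The round-trip formula and the "difference between two reports" idea above can be sketched as follows. This Python fragment is illustrative only and not part of the specification. It assumes that the arrival time A is expressed in the same units as LSR (the middle 32 bits of an NTP timestamp, i.e. 1/65536 second), that the 32-bit fields wrap modulo 2**32, and that each report is represented as a (received, expected) pair; all function names are hypothetical.

```python
def ntp_middle32(ntp_seconds: int, ntp_fraction: int) -> int:
    """Middle 32 bits of a 64-bit NTP timestamp: low 16 bits of the seconds
    word followed by the high 16 bits of the fraction word."""
    return ((ntp_seconds & 0xFFFF) << 16) | ((ntp_fraction >> 16) & 0xFFFF)


def round_trip_seconds(arrival_mid32: int, lsr: int, dlsr: int) -> float:
    """A - LSR - DLSR, computed modulo 2**32 in units of 1/65536 second."""
    units = (arrival_mid32 - lsr - dlsr) & 0xFFFFFFFF
    return units / 65536.0


def interval_loss_fraction(previous, current):
    """Loss fraction between two reception reports from the same receiver.
    Each argument is a (cumulative_received, cumulative_expected) pair.
    Returns None when no packets were expected in the interval."""
    received = (current[0] - previous[0]) & 0xFFFFFFFF
    expected = (current[1] - previous[1]) & 0xFFFFFFFF
    if expected == 0:
        return None
    return (expected - received) / expected
```

For instance, an SR stamped at NTP second 1000, held by the receiver for two seconds (DLSR = 2 * 65536), and answered by a report arriving at second 1002.5 gives a round-trip time of half a second.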
6.4 RR: Receiver report

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |T=2|P|   RC    | PT=RTCP_RR=1  |            length             |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                     SSRC of packet sender                     |
   +=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+
   |                 SSRC_1 (SSRC of first source)                 |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |             cumulative number of packets received             |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |             cumulative number of packets expected             |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                      interarrival jitter                      |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                         last SR (LSR)                         |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                  delay since last SR (DLSR)                   |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                SSRC_2 (SSRC of second source)                 |
   :                              ...                              :
   +=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+
   |                application-specific extensions                |
   +=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+

The RR packet is issued in place of an SR packet only if the application has not recently sent any RTP data packets. (Unless specified by a profile, the timeout delay between sending the last RTP packet and ceasing to send SR packets should be a small multiple of the current reporting interval.) The packet fields have the same meaning as for the SR packet. Additional RR packets may follow the initial SR or RR packet if there are more than 31 sources to be reported.
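As an illustration of the RR layout above, the following Python sketch decodes one receiver report packet into its reception report blocks. It is a minimal example under stated assumptions, not a reference implementation: fields are taken to be in network byte order, each reception report block is taken to be the six 32-bit words shown in the diagram, any application-specific extensions are ignored, and the names ReceptionReport and parse_receiver_report are hypothetical.

```python
import struct
from typing import List, NamedTuple, Tuple

RTCP_RR = 1  # packet type value shown in the RR diagram


class ReceptionReport(NamedTuple):
    ssrc: int
    packets_received: int
    packets_expected: int
    jitter: int
    lsr: int
    dlsr: int


def parse_receiver_report(packet: bytes) -> Tuple[int, List[ReceptionReport]]:
    """Return (SSRC of packet sender, list of reception report blocks)."""
    first, packet_type, length_words = struct.unpack_from("!BBH", packet, 0)
    if first >> 6 != 2 or packet_type != RTCP_RR:
        raise ValueError("not an RTCP receiver report")
    report_count = first & 0x1F  # 5-bit RC field, hence at most 31 blocks
    declared_size = 4 * (length_words + 1)
    needed = 8 + 24 * report_count  # header + sender SSRC + 24 octets per block
    if declared_size > len(packet) or needed > declared_size:
        raise ValueError("inconsistent RTCP length")
    (sender_ssrc,) = struct.unpack_from("!I", packet, 4)
    reports = [
        ReceptionReport(*struct.unpack_from("!6I", packet, 8 + 24 * i))
        for i in range(report_count)
    ]
    return sender_ssrc, reports
```

The five-bit RC field is the reason a single SR or RR packet can describe at most 31 sources; additional sources require further RR packets stacked behind the first.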
6.5 SDES: Source description

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |T=2|P|   CC    |PT=RTCP_SDES=2 |            length             |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                          SSRC/CSRC_1                          | chunk
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                          SDES items                           |
   |                              ...                              |
   +=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+
   |                          SSRC/CSRC_2                          | chunk
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                          SDES items                           |
   |                              ...                              |
   +=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+

The SDES packet is composed of a header and zero or more chunks containing items describing the sources identified in those chunks. The items are described individually below.

type (T), padding (P), packet type (SDES), length:
   As described for the SR packet.

CC: 5 bits
   This field contains the number of SSRC/CSRC chunks included in this SDES packet.

Each chunk consists of an SSRC/CSRC identifier followed by a list of zero or more items, which carry information about the SSRC/CSRC. Each chunk starts on a 32-bit boundary. Each item consists of an 8-bit type field, an 8-bit octet count describing the length of the text (thus, not including this two-octet header) and text. The text is encoded according to the UTF-2 encoding specified in Annex F of ISO standard 10646 [8,9]. This encoding is also known as UTF-8 or UTF-FSS. It is described in ``File System Safe UCS Transformation Format (FSS_UTF)'', X/Open Preliminary Specification, Document Number: P316 and Unicode Technical Report #4. US-ASCII is a subset of this encoding and requires no additional encoding. The presence of multi-octet encodings is indicated by setting the most significant bit to a value of one.
Items are contiguous, i.e., items are not individually padded to a 32-bit boundary. Text is not zero terminated. The list of items in each chunk is terminated by one or more binary zeroes to denote the end of the list and pad until the next 32-bit boundary. An SDES packet with zero chunks or a chunk with zero items is valid but useless.

End systems send one SDES packet containing their own source identifier (the same as the SSRC in the fixed RTP header). A mixer sends one SDES packet containing a chunk for each contributing source from which it is receiving SDES information, or more than one SDES packet if there are more than 31 such sources.

The following SDES items are currently defined. Additional items may be defined in a profile; some items shown here may be useful for particular profiles only. Not all items need to be sent with every SDES packet, except for the CNAME item, which is mandatory.(3)

------------------------------
3. Items are defined here rather than in the profile to simplify profile-independent applications, using common type numbers.

6.5.1 CNAME: Canonical end-point identifier

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |    CNAME=1    |    length     | user and domain name        ...
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

The CNAME identifier has the following properties:

   o  Because the randomly allocated SSRC identifier may change if a conflict is discovered or if a program is restarted, the CNAME item is required to provide the binding to an identifier for the source that remains constant.

   o  Like the SSRC identifier, the CNAME identifier should also be unique within one medium of a session.
   o  To provide a binding among multiple media tools used in a session by one participant, the CNAME should be fixed for that participant.

   o  To facilitate third-party monitoring, the CNAME should be suitable for either a program or a person to locate the source.

Therefore, the CNAME should be derived algorithmically and not entered manually, when possible. To meet these requirements, the following format should be used unless a profile specifies an alternate syntax or semantics. The CNAME item should have the format "user@host" or "host", where "host" is the fully qualified domain name of the host from which the real-time data originates, formatted according to the rules specified in RFC 1034, RFC 1035 and Section 2.1 of RFC 1123. The "host" form may be used if a user name is not available, for example on single-user systems. A system may use the printable representation of its lowest numbered numeric network address only if it cannot obtain a valid domain name. Hosts using IP Version 4 use the 'dotted decimal' (also known as 'dotted quad') representation.

Application writers should be aware that address assignments such as the Net-10 assignment proposed in RFC 1597 may create IP network addresses that are not globally unique. This may create difficulties if sites that do not have direct IP connectivity to the public Internet forward RTP packets to the public Internet through an RTP-level firewall. (See also RFC 1627.) To handle this case, applications should provide a means to configure a unique name.

Examples are: "doe@sleepy.megacorp.com" or "sleepy.megacorp.com" or "doe@192.35.149.160" or "192.35.149.160"

The user name should be in a form that a program such as "finger" or "talk" could use, i.e., it typically is the login name rather than the real-life name. The host name is not necessarily identical to the electronic mail address of the participant.
This syntax will not provide unique identifiers for each source if an application permits a user to generate multiple sources from one host. Such an application would have to rely on the SSRC to further identify the source, or the profile for that application would have to specify additional syntax for the CNAME identifier.

6.5.2 NAME: User name

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |    NAME=2     |    length     | common name of source       ...
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

The real name used to describe the source, e.g., "John Doe, Bit Recycler, Megacorp". This name may be in any form desired by the user. For applications such as conferencing, this form of name may be the most desirable for display in participant lists, and therefore might be sent most frequently (profiles may establish such priorities). The NAME value is expected to remain constant at least for the duration of a session. It should not be relied upon to be unique across the session.

6.5.3 EMAIL: User's electronic mail address

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |    EMAIL=3    |    length     | email address of source     ...
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

The email address is formatted according to RFC 822, for example, "John.Doe@megacorp.com". The EMAIL value is expected to remain constant for the duration of a session.

6.5.4 PHONE: User's phone number

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |    PHONE=4    |    length     | phone number of source      ...
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

The phone number should be formatted with the plus sign replacing the international access code. For example, "+1 908 555 1212" for a number in the United States.

6.5.5 LOC: Geographic user location

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |     LOC=5     |    length     | geographic location of site ...
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

Depending on the application, different degrees of detail are appropriate for this item. For conference applications, a string like "Murray Hill, New Jersey" may be sufficient, while, for an active badge system, strings like "Room 2A244, AT&T BL MH" might be appropriate. The degree of detail is left to the implementation and/or user, but format and content may be prescribed by a profile. The LOC value is expected to remain constant for the duration of a session, except for mobile hosts.

6.5.6 TXT: Text describing the source

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |     TXT=7     |    length     | text describing source      ...
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

Message describing the current state of the source, e.g., "can't talk, having lunch". During a seminar, this field might be used to convey the title of the talk. The TXT value is likely to change during a session.

6.5.7 TOOL: Name of application or tool

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |    TOOL=6     |    length     | name/version of source appl. ...
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

String giving the name and possibly version of the application generating the stream, e.g., "videotool 1.2". This information may be useful for debugging purposes and is similar to the Mailer or Mail-System-Version SMTP headers.

6.5.8 PRIV: Private extensions

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |    PRIV=8     |    length     | length of type| type string ...
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   ...                             | value string                ...
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

This type is used to define experimental or application-specific SDES extensions. The item contains a prefix consisting of a length-string pair, followed by the value string filling the remainder of the item. The prefix length field is one octet long. The prefix string is a name chosen by the person defining the PRIV item to be unique with respect to other PRIV items this application might receive. The application creator might choose to use the application name plus an additional subtype identification if needed. Alternatively, it is recommended that others choose a name based on the entity they represent, then coordinate the use of the name within that entity.

Note that the prefix consumes some space within the item's total length of 255 octets, so the prefix should be kept as short as possible. The second string is the value, that is, the information carried by this item.

SDES PRIV types will not be registered by IANA. If a type proves to be of general utility, it should be assigned a regular SDES type and registered with IANA instead for ease of handling and transmission efficiency.
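The SDES chunk and item encoding described in Section 6.5 can be sketched as follows. The Python fragment is a minimal illustration, not a normative encoder: it assumes network byte order, uses the item type values shown in the diagrams (CNAME=1, PRIV=8), and introduces the hypothetical helper names encode_sdes_chunk and priv_item_text.

```python
import struct

SDES_CNAME = 1  # item type values as shown in the item diagrams
SDES_PRIV = 8


def priv_item_text(prefix: bytes, value: bytes) -> bytes:
    """Text of a PRIV item: one-octet prefix length, prefix, then value."""
    if len(prefix) > 255:
        raise ValueError("PRIV prefix too long")
    return bytes([len(prefix)]) + prefix + value


def encode_sdes_chunk(ssrc: int, items) -> bytes:
    """Encode one chunk: SSRC/CSRC, contiguous items, zero-terminated and
    padded to the next 32-bit boundary. 'items' is a list of
    (item_type, text_bytes) pairs; text is expected to be UTF-8 already."""
    out = bytearray(struct.pack("!I", ssrc))
    for item_type, text in items:
        if len(text) > 255:
            raise ValueError("SDES item text exceeds 255 octets")
        out += bytes([item_type, len(text)]) + text  # no per-item padding
    out.append(0)  # at least one zero octet ends the item list
    while len(out) % 4:
        out.append(0)  # pad the chunk to a 32-bit boundary
    return bytes(out)
```

For example, a chunk carrying only the CNAME "doe@sleepy.megacorp.com" occupies 4 + 2 + 23 octets plus one terminating zero, which is then padded to 32 octets.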
6.6 BYE: Goodbye

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |T=2|P|   CC    | PT=RTCP_BYE=3 |            length             |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                           SSRC/CSRC                           |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   :                              ...                              :
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                      reason for leaving                     ...
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

The BYE packet indicates that one or more sources are no longer active.

type (T), padding (P), packet type (BYE), length:
   As described for the SR packet.

CC: 5 bits
   This field contains the number of SSRC/CSRC identifiers included in this BYE packet. A count value of zero is valid, but meaningless.

If a BYE packet is received by a mixer, the mixer forwards the BYE packet with the SSRC/CSRC identifier(s) unchanged. If a mixer shuts down, it should send a BYE packet listing all contributing sources it handles, as well as its own SSRC identifier. Optionally, the BYE packet may include an octet count followed by the indicated number of characters indicating the reason for leaving, e.g., "camera malfunction". The string has the same encoding as that described for SDES. If the string fills the RTCP packet to the next 32-bit boundary, the string is not zero terminated. If not, the RTCP BYE packet is padded with zeroes.
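The BYE layout, including the optional reason string and its padding rule, can be sketched as follows. This Python fragment is illustrative and makes several assumptions: network byte order, the packet type value 3 shown in the diagram, the padding bit left at zero, and a hypothetical function name build_bye.

```python
import struct

RTCP_BYE = 3  # packet type value shown in the BYE diagram


def build_bye(ssrcs, reason: str = "") -> bytes:
    """Build a BYE packet listing the given SSRC/CSRC identifiers, with an
    optional reason encoded as an octet count followed by UTF-8 text."""
    if len(ssrcs) > 31:
        raise ValueError("CC is a 5-bit field; at most 31 identifiers fit")
    body = b"".join(struct.pack("!I", ssrc) for ssrc in ssrcs)
    if reason:
        text = reason.encode("utf-8")
        if len(text) > 255:
            raise ValueError("reason text exceeds 255 octets")
        body += bytes([len(text)]) + text
        body += b"\x00" * (-len(body) % 4)  # zero-pad to a 32-bit boundary
    length_words = (4 + len(body)) // 4 - 1  # 32-bit words minus one
    header = struct.pack("!BBH", (2 << 6) | len(ssrcs), RTCP_BYE, length_words)
    return header + body
```

With one identifier and the reason "camera malfunction" (18 octets), the body is 4 + 1 + 18 = 23 octets, so a single zero octet of padding is appended and the packet is 28 octets long.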
6.7 APP: Application-defined

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |T=2|P| subtype | PT=RTCP_APP=4 |            length             |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                           SSRC/CSRC                           |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                         name (ASCII)                          |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                  application-dependent data                 ...
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

The APP packet is intended for experimental use as new applications and new features are developed, without requiring packet type value registration. APP packets with unrecognized names should be ignored. After testing and if wider use is justified, it is recommended that each APP packet be redefined without the subtype and name fields and registered with the Internet Assigned Numbers Authority using an RTCP packet type.

type (T), padding (P), packet type (APP), length:
   As defined for the SR packet.

subtype: 5 bits
   May be used as a subtype to allow a set of APP packets to be defined under one unique name, or for any application-dependent data.

name: 4 octets
   A name chosen by the person defining the set of APP packets to be unique with respect to other APP packets this application might receive. The application creator might choose to use the application name, and then coordinate the allocation of subtype values to others who want to define new packet types for the application. Alternatively, it is recommended that others choose a name based on the entity they represent, then coordinate the use of the name within that entity. The name is interpreted as a sequence of four ASCII characters, with uppercase and lowercase characters treated as distinct.
application-dependent data: variable length
   Application-dependent data may or may not appear in an APP packet. It is interpreted by the application and not RTP itself.

7 RTP Translators and Mixers

7.1 General Description

Besides end-systems, RTP also supports the notion of "translators" and "mixers", which could be considered as "intermediate systems" at the RTP level. A translator connects two or more transport-level "clouds". Typically, each cloud is defined by a common transport level port, a multicast address and a transport protocol (e.g., UDP). (Exceptions are network-level protocol translators, which we ignore here.) The use of translators and mixers must not result in loops.

In the explanations below, we use the short-hand terms left and right cloud for conciseness to refer to two clouds connected by a translator/mixer. The concepts apply naturally in the other direction or if a translator/mixer joins more than two clouds.

All RTP end systems that can communicate through one or more RTP translators/mixers share the same SSRC space, that is, the SSRC identifiers must be unique among all these end systems.

Translators may change the encoding of the data (and thus the RTP data payload type and timestamp) and may combine several data packets. If they combine several data packets, they have to change the sequence number in each.

We distinguish three basic kinds of translators and mixers:

invisible translator:
   Invisible translators cannot be detected by end systems unless the end systems know what payload type or transport address was used by the sender. They do not have their own SSRC identifier. [other terms: anonymous translator; translator without a personality :-); shy translator :-)]

visible translator:
   A visible translator has its own SSRC identifier. It forwards packets with their original SSRC identifier.
   [other terms: self-identifying translator?]

mixer:
   A mixer has its own SSRC identifier and forwards all incoming data packets, combined (mixed) into a single stream, with its own SSRC identifier. A mixer may indicate the sources that contributed to a particular packet by adding CSRC identifiers to the RTP data packet. However, this is not required and may be ill advised for some applications using low-bandwidth links. A mixer that is also a contributing source for some packet must explicitly include an identifier for itself in the CSRC list for that packet.

Fig. 1 shows a combination of mixers and translators and their effect on CSRC and SSRC identifiers. In the figure, end systems are shown as rectangles (named E), translators as triangles (named T) and mixers as ovals (named M). The notation "M1: 48(1,17)" designates a packet originating at mixer M1, identified with a random SSRC value of 48 and two CSRC identifiers, 1 and 17.

7.2 Behavior of Mixers/Translators

                     invisible   visible   mixing
   own SSRC             no         yes      yes
   insert CSRC          no         no       may
   send own RR          no         yes      yes
   send own SR          no         no       yes
   send own BYE         no         yes      yes

The processing of RTCP by translators and mixers is governed by these rules:

SDES:
   If and only if a translator/mixer has its own SSRC, it must send SDES CNAME information about itself. Invisible and visible translators typically forward SDES information unchanged from one cloud to the other, but may, for example, decide to filter non-CNAME SDES information if bandwidth is scarce. Mixers must forward SDES CNAME information for the CSRCs they include in RTP data packets.

RR:
   For sources in one cloud, the mixer generates its own reception reports and sends them to the same cloud. It does not send these reception reports to the other cloud.
     Invisible and visible translators forward reception reports with
     their SSRC identifier unchanged between the left and right cloud.

SR: An invisible translator does not generate its own sender reports,
     but rather forwards those received in one cloud to the other,
     suitably modified. In particular, the RTP timestamp, the sender's
     packet and octet count may have to be modified if the encoding is
     changed.

BYE: Translators forward BYE packets unchanged. Mixers only need to
     forward BYE packets if they use CSRC identifiers. Mixers and
     visible translators should generate BYE packets with their own SSRC
     identifiers if they are about to cease forwarding packets.

7.3 Cascaded Mixers

8 Security

8.1 Security Considerations

RTP suffers from the same security liabilities as the underlying
protocols. For example, an impostor can fake source or destination
network addresses, or change the header or payload. For example, the
CNAME and NAME information may be used to impersonate another
participant. In addition, RTP may be sent via IP multicast, which
provides no direct means for a sender to know all the receivers of the
data sent and therefore no measure of privacy. Rightly or not, users may
be more sensitive to privacy concerns with audio and video communication
than they have been with more traditional forms of network communication
[10]. Therefore, the use of security mechanisms with RTP is important.

As a first step, RTCP makes it easy for all participants in a session to
identify themselves; if deemed important for a particular application,
it is the responsibility of the application writer to make listening
without identification difficult. It should be noted, however, that
privacy of the payload can generally be assured only by encryption.

The security measures described below can be used to implement
confidentiality.
Authentication and message integrity are not defined in the current
specification of RTP. Security services might also be provided at the IP
layer as security mechanisms are developed for that layer.

The periodic transmission of RTCP or empty RTP packets from sources that
are otherwise idle may make it possible to detect denial-of-service
attacks, as the receiver can detect the absence of these expected
messages. The messages that are received must be verified for integrity
and authenticated before being accepted for this purpose.

Key distribution and certificates are outside the scope of this
document.

The section below defines a confidentiality security service and defines
standard algorithms for both RTP and RTCP. Other services, other
implementations of services and other algorithms may be defined in the
future. The selection presented here is meant to simplify implementation
of interoperable, secure applications and provide guidance to
implementors. No claim is made that the methods presented here are
appropriate for a particular security need. A profile specifies which of
the services and algorithms should be offered by applications, and may
provide guidance as to their appropriate use.

8.2 Confidentiality

Confidentiality means that only the intended receiver(s) can decode the
received packets; for others, the packet contains no useful information.
Confidentiality of the content is achieved by encryption.

All RTP and RTCP packets in a single lower-layer protocol data unit are
encrypted as a unit. For RTCP, some such lower-layer packets may be sent
encrypted and others in the clear. (This accommodates monitors that are
not privy to the encryption key.) For RTP, no additional data structures
are required. For RTCP, a 32-bit random number is prepended to the unit
before encryption to deter known plaintext attacks.
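
As a rough illustration of the RTCP prefix described above, the
following C sketch assembles the octets that would be handed to the
cipher: a 32-bit random number followed by the RTCP packets of one
lower-layer unit. The function name rtcp_prepend_prefix() and its
parameters are assumptions made for this example and are not defined by
RTP. The octet order of the prefix is also an assumption. The random
value is supplied by the caller (for instance from a generator such as
the one in Section A.5), and the encryption step itself is not shown.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical helper (not part of RTP): copy a 32-bit random prefix
 * followed by the RTCP packets of one lower-layer unit into 'out'.
 * The prefix is written most significant octet first; this ordering is
 * an assumption of the sketch. Returns the number of octets written,
 * or 0 if 'out' is too small.
 */
size_t rtcp_prepend_prefix(uint32_t prefix,
                           const uint8_t *rtcp, size_t rtcp_len,
                           uint8_t *out, size_t out_cap)
{
    assert(rtcp != NULL && out != NULL);

    /* need room for the 4-octet prefix plus the RTCP data */
    if (out_cap < 4 || rtcp_len > out_cap - 4)
        return 0;

    out[0] = (uint8_t)(prefix >> 24);
    out[1] = (uint8_t)(prefix >> 16);
    out[2] = (uint8_t)(prefix >> 8);
    out[3] = (uint8_t)prefix;
    memcpy(out + 4, rtcp, rtcp_len);

    return rtcp_len + 4;
}
```

The buffer returned by such a helper would then be encrypted as a whole;
a receiver holding the key discards the first four octets after
decryption.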

The presence of encryption and the use of the correct key are confirmed
by the receiver through header or payload consistency checks. An example
of such a consistency check is given in Section A.1.

The default encryption algorithm is the Data Encryption Standard (DES)
algorithm in CBC (cipher block chaining) mode, as described in Section
1.1 of RFC 1423 [11], except that padding to a multiple of 8 octets is
indicated as described for the P bit in Section 5.1. The initialization
vector is zero because random values are supplied in the RTP header or
by the random prefix for RTCP packets. For details on the use of CBC
initialization vectors, see [12]. Implementations that support
encryption should always support the DES algorithm in CBC mode.

As an alternative to encryption at the RTP level as described above,
profiles may define additional payload types for encrypted encodings.
Those encodings must specify how padding and other aspects of the
encryption should be handled. This method allows encrypting only the
data while leaving the headers in the clear for applications where that
is desired. It may be particularly useful for hardware devices that will
handle both decryption and decoding.

9 RTP over Network and Transport Protocols

This section describes issues specific to carrying RTP packets within
particular network and transport protocols. The following rules apply
unless superseded by protocol-specific definitions outside this
specification.

RTP relies on the underlying protocol(s) to provide demultiplexing. For
UDP and similar protocols, RTP uses an even port number and the
corresponding RTCP stream uses the next higher port number.

RTP packets contain no length field or other delineation; therefore, RTP
relies on the underlying protocol(s) to provide a length indication. The
maximum length of RTP packets is limited only by the underlying
transport mechanism.
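
Returning to the port rule for UDP given earlier in this section, the
convention can be sketched in C as follows. The helper rtp_port_pair()
and struct rtp_ports are hypothetical names introduced only for this
illustration. Rejecting odd or out-of-range port numbers is likewise a
choice of the sketch, since the text does not say how such input should
be handled.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Illustrative container for the two ports of one RTP session. */
struct rtp_ports {
    uint16_t rtp;   /* even port carrying RTP data */
    uint16_t rtcp;  /* next higher port carrying RTCP */
};

/*
 * Hypothetical helper: given a candidate RTP port, fill in the
 * RTP/RTCP pair. Returns 0 on success, or -1 if the port is zero,
 * odd, or leaves no room for the RTCP port (assumed policy).
 */
int rtp_port_pair(unsigned int rtp_port, struct rtp_ports *out)
{
    assert(out != NULL);

    if (rtp_port == 0 || rtp_port > 65534 || (rtp_port & 1) != 0)
        return -1;

    out->rtp = (uint16_t)rtp_port;
    out->rtcp = (uint16_t)(rtp_port + 1);
    return 0;
}
```

For example, a caller that passes the arbitrary even port 5004 obtains
5004 for RTP and 5005 for RTCP, while an odd port is refused.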

If RTP packets are to be carried in an underlying protocol that provides
the abstraction of a continuous octet stream rather than messages
(packets), an encapsulation of the RTP packets must be defined to
provide a framing mechanism. TCP is an example of such a protocol.
Framing is also needed if the underlying protocol may contain padding so
that the extent of the RTP payload cannot be determined. The framing
mechanism is not defined here.

A profile may specify a framing method to be used even when RTP is
carried in protocols that do provide framing in order to allow carrying
several RTP packets in one lower-layer protocol data unit, such as a UDP
packet. Carrying several RTP packets in one network or transport packet
reduces header overhead and may simplify synchronization between
different streams.

10 Summary of Protocol Constants

In this section, the symbolic constants used in the text are assigned
numeric values. The following constants are defined in profiles rather
than this document: RTP payload type (PT).

10.1 RTCP packet types

  abbrev.   name                  value
  SR        sender report             0
  RR        receiver report           1
  SDES      source description        2
  BYE       goodbye                   3
  APP       application-defined       4

Other constants are assigned by IANA.

10.2 SDES types

  abbrev.   name                               value
  END       end of SDES list                       0
  CNAME     canonical name                         1
  NAME      user name                              2
  EMAIL     user's electronic mail address         3
  PHONE     user's phone number                    4
  LOC       geographic user location               5
  TOOL      name of application or tool            6
  TXT       text describing the source             7
  PRIV      private extensions                     8

Other constants are assigned by IANA. Constants not assigned by IANA are
available for experimental use.

11 RTP Profiles and Payload Format Specifications

RTP may be used for a variety of applications with somewhat differing
requirements.
The flexibility to adapt to those requirements is provided by allowing
multiple choices in the main protocol specification and then selecting
the appropriate choices for a particular class of applications and
environment in a separate profile document. Typically an application
will operate under only one profile and there is no explicit indication
of which profile is in use. A profile for audio and video applications
may be found in the companion Internet draft draft-ietf-avt-profile.

Within this specification, the following possible uses of a profile have
been identified, but this list is not meant to be exclusive:

  o Define a set of payload formats (e.g., media encodings) and a
    default mapping of those formats to payload type values. Where
    known, the nominal data rate of these encodings should be provided
    as the RTCP packet rate depends on this parameter.

  o Define the number and interpretation of the RTP marker bits, if
    different from the default specified in Section 5.1.

  o Define an extension to the fixed RTP data header if some additional
    functionality is required across the class of applications
    independent of payload type, and define the first 16 bits of the RTP
    data header extension if implementation-specific extensions are to
    be allowed (see Section 5.3).

  o Define new application-class-specific RTCP packets, or the data
    format, preferred use, or required use of particular RTCP packets.
    In particular, SR and RR packets may be extended if there is
    additional information that should be reported regularly about the
    sender or receivers.

  o Specify that a particular underlying network or transport layer
    protocol will be used to carry RTP packets.

  o Specify the mapping of RTP and RTCP to transport-level names, e.g.,
    UDP ports, if different from the mapping defined in Section 9.

  o Specify encapsulation of RTP packets that are to be used always or
    with particular underlying protocols.

It is not expected that a new profile will be required for every
application. Within one application class, it would be better to extend
an existing profile rather than make a new one. For example, additional
RTCP packet types or payload type values may be defined and registered
through the Internet Assigned Numbers Authority for publication in the
Assigned Numbers RFC as an alternative to publishing a new profile
specification.

A payload format document specifies how a particular kind of payload
data, such as H.261 encoded video, should be carried in RTP. Payload
formats may be useful under multiple profiles and may therefore be
defined independently of any particular profile. The profile document is
then responsible for assigning a default mapping of that format to a
payload type value if needed.

A Implementation Notes

We describe aspects of the receiver implementation in this section.
There may be other implementation methods that are faster in particular
operating environments or have other advantages. These implementation
notes are for informational purposes only.

The following definitions are used for all examples; the structure
definitions are valid for 32-bit big-endian (most significant octet
first) architectures only. Bit fields are assumed to be packed tightly
in big-endian bit order, with no additional padding.

  #include <sys/types.h>

  /*
   * The type definitions below are valid for 32-bit architectures and
   * may have to be adjusted for 16- or 64-bit architectures.
   */
  typedef unsigned char  u_int8;
  typedef unsigned short u_int16;
  typedef unsigned int   u_int32;

  /*
   * rtp.h  --  RTP header file
   */
  #include <sys/types.h>

  #define RTP_SEQ_MOD  (1<<16)
  #define RTP_TS_MOD   (0xffffffff)
  #define RTP_MAX_SDES 256   /* maximum text length for SDES */

  typedef enum {
    RTCP_SR,
    RTCP_RR,
    RTCP_SDES,
    RTCP_BYE,
    RTCP_APP
  } rtcp_type_t;

  typedef enum {
    RTCP_SDES_END,
    RTCP_SDES_CNAME,
    RTCP_SDES_NAME,
    RTCP_SDES_EMAIL,
    RTCP_SDES_PHONE,
    RTCP_SDES_LOC,
    RTCP_SDES_TOOL,
    RTCP_SDES_TXT,
    RTCP_SDES_PRIV
  } rtcp_sdes_type_t;

  typedef struct {
    unsigned int type:2;  /* packet type */
    unsigned int p:1;     /* padding flag */
    unsigned int x:1;     /* header extension flag */
    unsigned int cc:4;    /* CSRC count */
    unsigned int m:1;     /* marker bit */
    unsigned int pt:7;    /* payload type */
    u_int16 seq;          /* sequence number */
    u_int32 ts;           /* timestamp */
    u_int32 ssrc;         /* synchronization source */
    u_int32 csrc[1];      /* optional CSRC list */
  } rtp_hdr_t;

  typedef struct {
    unsigned int type:2;  /* packet type */
    unsigned int p:1;     /* padding flag */
    unsigned int count:5; /* varies by payload type */
    unsigned int pt:8;    /* payload type */
    u_int16 length;       /* packet length in words, without this word */
  } rtcp_common_t;

  /* reception report */
  typedef struct {
    u_int32 ssrc;         /* data source being reported */
    u_int32 received;     /* cumulative number of packets received */
    u_int32 expected;     /* cumulative number of packets expected */
    u_int32 jitter;       /* interarrival jitter */
    u_int32 lsr;          /* last SR packet from this source */
    u_int32 dlsr;         /* delay since last SR packet */
  } rtcp_rr_t;

  typedef struct {
    u_int8 type;          /* type of SDES item (rtcp_sdes_type_t) */
    u_int8 length;        /* length of SDES item (in octets) */
    char data[1];         /* text, not zero-terminated */
  } rtcp_sdes_item_t;

  /* one RTCP packet */
  typedef struct {
    rtcp_common_t common; /* common header */
    union {
      /* sender report (SR) */
      struct {
        u_int32 ssrc;     /* source this RTCP packet refers to */
        u_int32 ntp_sec;  /* NTP timestamp */
        u_int32 ntp_frac;
        u_int32 rtp_ts;   /* RTP timestamp */
        u_int32 psent;    /* packets sent */
        u_int32 osent;    /* octets sent */
        /* variable-length list */
        rtcp_rr_t rr[1];
      } sr;

      /* reception report (RR) */
      struct {
        u_int32 ssrc;     /* source generating this report */
        /* variable-length list */
        rtcp_rr_t rr[1];
      } rr;

      /* BYE */
      struct {
        u_int32 src[1];   /* list of sources */
        /* can't express trailing text */
      } bye;

      /* source description (SDES) */
      struct rtcp_sdes_t {
        u_int32 src;              /* first SSRC/CSRC */
        rtcp_sdes_item_t item[1]; /* list of SDES items */
      } sdes;
    } r;
  } rtcp_t;

A.1 RTP Header Consistency Check

The following checks may be used to determine whether an RTP header is
likely to be valid, given a previously received RTP packet:

  o RTP type field value equal to 2

  o payload type defined

  o RTP sequence number one higher than previous packet

  o if packets contain fixed number of timestamp counts, comparison of
    timestamp increment with sequence number increment

  o length of RTP packet consistent with CC and payload type

Depending on the application, algorithms may exploit additional
knowledge, e.g., the expected increment in timestamps between packets.
Note that this algorithm is likely to occasionally create false alarms.

A.2 Parsing RTCP Packets

The following code fragment walks through one or more RTCP packets,
checking for invalid length fields. It may also be advisable to treat
the packet type and payload type as a single field for checking and
branching.

  int len;     /* length of combined RTCP packets in words; signed so
                  that the overrun check below can become negative */
  rtcp_t *r;   /* RTCP header */

  while (len > 0) {
    len -= r->common.length + 1;
    if (len < 0) {
      /* something wrong with packet format */
      break;
    }
    switch (r->common.pt) {
    case RTCP_SR:
      break;
    default:
      /* invalid type */
      break;
    }
    r = (rtcp_t *)((u_int32 *)r + r->common.length + 1);
  }

A.3 Generating SDES RTCP Packets

  #include <string.h>   /* strlen(), memcpy(), memset() */

  /*
   * Function adds a single item 'item' to buffer 'b'.
   * Returns updated buffer pointer.
   */
  char *rtcp_sdes_add(char *b, rtcp_sdes_type_t type, char *item)
  {
    rtcp_sdes_item_t *rsp;
    size_t len = strlen(item);

    rsp = (rtcp_sdes_item_t *)b;
    rsp->type = type;
    rsp->length = len;
    memcpy(rsp->data, item, len);   /* text is not zero-terminated */
    b += len + 2;
    return b;
  }

  /*
   * Write SDES chunk into buffer 'b' from arrays type[] and value[] with
   * argc members.
   * Return pointer to next available location within 'b'.
   */
  char *rtp_write_sdes(char *b, u_int32 src, int argc,
                       rtcp_sdes_type_t type[], char *value[])
  {
    u_int32 *src_pt = (u_int32 *)b;
    int i;
    int pad;   /* octets for padding */

    /* SSRC header */
    *src_pt = src;
    b += 4;

    /* SDES items */
    for (i = 0; i < argc; i++) {
      b = rtcp_sdes_add(b, type[i], value[i]);
    }

    /* terminate with end marker */
    *b++ = RTCP_SDES_END;

    /* if necessary, pad with zeroes to next 4-octet boundary */
    pad = (4 - ((unsigned long)b & 0x3)) & 0x3;
    memset(b, RTCP_SDES_END, pad);
    b += pad;

    return b;
  }

A.4 Parsing SDES RTCP Packets

The function below parses one SDES chunk and calls a function
'member_sdes' that sets the corresponding information for a session
member 'm' (not defined here). It expects 'b' to point to the first item
for this chunk.

  /* round a number 'n' up to a multiple of the size of data type 't' */
  #define ROUND(n,t) (((long)(n) + sizeof (t) - 1) & ~ (sizeof (t) - 1))

  char *rtp_read_sdes(member_t m, char *b)
  {
    rtcp_sdes_item_t *rsp = (rtcp_sdes_item_t *)b;

    for (; rsp->type;
         rsp = (rtcp_sdes_item_t *)((char *)rsp + rsp->length + 2)) {
      if (rsp->type > RTCP_SDES_TXT) return 0;
      member_sdes(m, rsp->type, rsp->data, rsp->length);
    }

    /* skip the end marker, then any padding up to a 4-octet boundary */
    b = (char *)rsp + 1;
    return (char *)ROUND(b, u_int32);
  }

A.5 Generating a Random 32-bit Identifier

The following subroutine generates a random 32-bit identifier using the
MD5 routines published in RFC 1321. The system routines may not be
present on all operating systems, but they should serve as hints as to
what kinds of information may be used. Other system calls that may be
appropriate include getdomainname() and getwd(). "Live" video or audio
samples are also a good source of random numbers, but care must be taken
not to use a turned-off microphone or blinded camera as a source.

  /*
   * Generate a random 32-bit quantity.
   */
  #include <sys/types.h>   /* u_long */
  #include <sys/time.h>    /* gettimeofday() */
  #include <unistd.h>      /* get..() */
  #include <stdio.h>       /* printf() */
  #include "global.h"      /* from RFC 1321 */
  #include "md5.h"         /* from RFC 1321 */

  #define MD_CTX MD5_CTX
  #define MDInit MD5Init
  #define MDUpdate MD5Update
  #define MDFinal MD5Final

  static u_long md_32(char *string, int length)
  {
    MD_CTX context;
    union {
      char   c[16];
      u_long x[4];
    } digest;
    u_long r;
    int i;

    MDInit (&context);
    MDUpdate (&context, string, length);
    MDFinal ((unsigned char *)&digest, &context);
    r = 0;
    for (i = 0; i < 3; i++) {
      r ^= digest.x[i];
    }
    return r;
  }                               /* md_32 */

  /*
   * Return random unsigned 32-bit quantity.
   */
  u_long random32(void)
  {
    struct {
      struct timeval tv;
      pid_t  pid;
      u_long hostid;
      uid_t  uid;
      gid_t  gid;
      char   name[8];
    } s;

    gettimeofday(&s.tv, 0);
    s.pid = getpid();
    s.hostid = gethostid();
    s.uid = getuid();
    s.gid = getgid();
    gethostname(s.name, sizeof(s.name));

    return md_32((char *)&s, sizeof(s));
  }                               /* random32 */

A.6 Computing the RTCP Transmission Period

The RTCP messages emitted by all session members should not consume more
than a small fraction of the total data bandwidth used. Thus, the time
between transmitting RTCP messages must increase with the number of
session members and the size of the previous RTCP message sent. A
suggested value for the bandwidth used by RTCP messages is 5% of the
single-sender data bandwidth, including any lower-layer protocols. [TBD:
Having to know the total lower-layer overhead may not be a good idea,
given that we probably don't want to start adding ATM AAL overhead, PPP
overhead, etc., although they might be more significant.]

To reduce the sender load for very small sessions and to provide
statistically meaningful sender reports, the minimum RTCP message
interval is (arbitrarily) set to 5 seconds.

The following function returns the time until the next transmission,
measured in seconds. It should be called after sending an RTCP message.
The parameters have the following meaning:

rtcp_bw: The desired RTCP bandwidth, in octets per second.

nsenders: Number of active senders since last report, known from
     construction of receiver reports for this report. Includes
     ourselves, if we also sent.

members: The estimated number of session members, including ourselves.
     Incremented as we discover new session members, decremented as
     session members time out after not having been heard from. On the
     first call, this parameter should be one.
     Session members are timed out if they have not been heard from for
     TBD times the current mean RTCP interval. [move this sentence
     elsewhere?]

packet_size: The size of the last RTCP packet, in octets.

we_sent: Flag that is true if we have sent something since the last RTCP
     message. If the flag is true, the RTCP message just sent contained
     an SR packet.

  double rtcp_period(int members, int nsenders, double rtcp_bw,
                     int packet_size, int we_sent)
  {
    /*
     * Minimum time between RTCP packets from this site (in seconds).
     * This time prevents the reports from `clumping' when sessions are
     * small and the law of large numbers isn't helping to smooth out
     * the traffic. It also keeps the report interval from becoming
     * ridiculously small during transient outages like a network
     * partition.
     */
    double const RTCP_MIN_TIME = 5.;
    /*
     * Fraction of the rtcp bandwidth to be shared among active senders.
     * (This fraction was chosen so that in a typical session with one or
     * two active senders, the computed report time would be roughly
     * equal to the min report time so that we don't unnecessarily slow
     * down receiver reports.) The receiver fraction must be 1 - the
     * sender fraction.
     */
    double const RTCP_SENDER_BW_FRACTION = 0.25;
    double const RTCP_RECEIVER_BW_FRACTION = (1 - RTCP_SENDER_BW_FRACTION);
    /*
     * Gain (smoothing constant) for the low-pass filter that estimates
     * the average rtcp packet size.
     */
    double const RTCP_SIZE_GAIN = (1./16.);

    double t;                   /* interval */
    double rtcp_min_time = RTCP_MIN_TIME;
    /* The avg. rtcp size is initialized to 128 bytes which is
     * conservative (it assumes everyone else is generating SRs instead
     * of RRs).
     */
    static double avg_rtcp_size = 128;
    int n;                      /* number of members for computation */

    /* initial */
    if (members == 1) {
      rtcp_min_time /= 2;
    }

    /* Compute estimated number of session members. */
    n = members;
    if (nsenders > 0) {
      if (we_sent) {
        rtcp_bw *= RTCP_SENDER_BW_FRACTION;
        n = nsenders;
      } else {
        rtcp_bw *= RTCP_RECEIVER_BW_FRACTION;
        n -= nsenders;
      }
    }

    /* Update avg. size of message [Is this really helpful?] */
    avg_rtcp_size += (packet_size - avg_rtcp_size) * RTCP_SIZE_GAIN;

    /* compute interval */
    t = avg_rtcp_size * n / rtcp_bw;

    /* enforce minimum spacing */
    if (t < rtcp_min_time) t = rtcp_min_time;

    /*
     * To avoid traffic bursts from unintended synchronization with
     * other sites, we then pick our actual next report interval as a
     * random number uniformly distributed between 0.5*t and 1.5*t.
     */
    return t * (drand48() + 0.5);
  }

A.7 Estimating the Interarrival Jitter

The interarrival jitter field in receiver reports should be an estimate
of the statistical variance of the RTP data playout delay. The following
algorithm may be suitable:

  double alpha = 0.01;

  avg = alpha * slack + (1-alpha) * avg;
  jitter = alpha * (slack - avg) + (1-alpha) * jitter;

A.8 Determining the Expected Number of RTP Packets

In order to compute packet loss rates, the number of packets expected
and the number actually received need to be known. The number of packets
expected can be computed by the receiver by tracking the first sequence
number received (seq0), the last sequence number received (seq), and the
number of complete sequence number cycles:

  expected = cycles * 65536 + seq - seq0 + 1;

The cycle count cycles is updated for each packet, where seq_prior is
the sequence number of the prior packet. The cycle count is incremented
when the sequence number wraps around in the "forward" direction, and
needs to be decremented if the sequence number wraps around in the
"backward" direction.

  unsigned short seq, seq_prior;

  if (seq > seq_prior) {
    if (seq - seq_prior > 32768) {
      /* out-of-order packet with wrap-around
         (e.g., 65530 preceded by 3) */
      cycles--;
    }
  } else if (seq < seq_prior) {
    if (seq_prior - seq < 32768) {
      /* out-of-order packet (e.g., 2 preceded by 3) */
    } else {
      /* wrap-around (e.g., 3 preceded by 65530) */
      cycles++;
    }
  }
  seq_prior = seq;

Acknowledgments

This memorandum is based on discussions within the IETF Audio/Video
Transport working group chaired by Stephen Casner. The current protocol
has its origins in the Network Voice Protocol and the Packet Video
Protocol (Danny Cohen and Randy Cole) and the protocol implemented by
the vat application (Van Jacobson and Steve McCanne). Christian Huitema
provided ideas for the random identifier generator.

B Addresses of Authors

Henning Schulzrinne
GMD Fokus
Hardenbergplatz 2
D-10623 Berlin
Germany
electronic mail: hgs@fokus.gmd.de

Stephen Casner
University of Southern California/Information Sciences Institute
4676 Admiralty Way
Marina del Rey, CA 90292-6695
United States
electronic mail: casner@isi.edu

Ron Frederick
Xerox Palo Alto Research Center
3333 Coyote Hill Road
Palo Alto, CA 94304
United States
electronic mail: frederic@parc.xerox.com

Van Jacobson
MS 46a-1121
Lawrence Berkeley Laboratory
Berkeley, CA 94720
United States
electronic mail: van@ee.lbl.gov

References

[1]  D. D. Clark and D. L. Tennenhouse, "Architectural considerations
     for a new generation of protocols," in SIGCOMM Symposium on
     Communications Architectures and Protocols, (Philadelphia,
     Pennsylvania), pp. 200--208, IEEE, Sept. 1990.

[2]  H. Schulzrinne, "Issues in designing a transport protocol for audio
     and video conferences and other multiparticipant real-time
     applications." expired Internet draft, Oct. 1993.
     ftp://gaia.cs.umass.edu/pub/hgschulz/rtp/draft-ietf-avt-issues-01.ps

[3]  J.-C. Bolot, T. Turletti, and I. Wakeman, "Scalable feedback
     control for multicast video distribution in the internet," in
     SIGCOMM Symposium on Communications Architectures and Protocols,
     (London, England), pp. --, ACM, Aug. 1994.
     ftp://cs.ucl.ac.uk/darpa/multicast-congestion.ps.Z

[4]  D. E. Comer, Internetworking with TCP/IP, vol. 1. Englewood Cliffs,
     New Jersey: Prentice Hall, 1991.

[5]  J. Postel, "Internet protocol," Request for Comments (Standard) RFC
     791, Internet Engineering Task Force, Sept. 1981. Obsoletes
     RFC0760. ftp://ds.internic.net/rfc/rfc791.txt

[6]  D. Mills, "Network time protocol (v3)," Request for Comments
     (Proposed Standard) RFC 1305, Internet Engineering Task Force, Apr.
     1992. Obsoletes RFC1119. ftp://ds.internic.net/rfc/rfc1305.ps

[7]  W. Feller, An Introduction to Probability Theory and its
     Applications, vol. 1. New York, New York: John Wiley and Sons,
     third ed., 1968.

[8]  International Standards Organization, "ISO/IEC DIS 10646-1:1993
     information technology -- universal multiple-octet coded character
     set (UCS) -- part I: Architecture and basic multilingual plane,"
     1993.

[9]  The Unicode Consortium, The Unicode Standard. New York, New York:
     Addison-Wesley, 1991.

[10] S. Stubblebine, "Security services for multimedia conferencing," in
     16th National Computer Security Conference, (Baltimore, Maryland),
     pp. 391--395, Sept. 1993.

[11] D. Balenson, "Privacy enhancement for internet electronic mail:
     Part III: algorithms, modes, and identifiers," Request for Comments
     (Proposed Standard) RFC 1423, Internet Engineering Task Force, Feb.
     1993. Obsoletes RFC1115. ftp://ds.internic.net/rfc/rfc1423.txt

[12] V. L. Voydock and S. T. Kent, "Security mechanisms in high-level
     network protocols," ACM Computing Surveys, vol. 15, pp. 135--171,
     June 1983.