Network Working Group                                          M. West
Internet-Draft                                               S. McCann
Expires: September 1, 2003                          Siemens/Roke Manor
                                                         March 3, 2003

                        TCP/IP Field Behavior
              draft-ietf-rohc-tcp-field-behavior-02.txt

Status of this Memo

This document is an Internet-Draft and is in full conformance with all provisions of Section 10 of RFC2026.

Internet-Drafts are working documents of the Internet Engineering Task Force (IETF), its areas, and its working groups. Note that other groups may also distribute working documents as Internet-Drafts.

Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress."

The list of current Internet-Drafts can be accessed at http://www.ietf.org/ietf/1id-abstracts.txt.

The list of Internet-Draft Shadow Directories can be accessed at http://www.ietf.org/shadow.html.

This Internet-Draft will expire on September 1, 2003.

Copyright Notice

Copyright (C) The Internet Society (2003). All Rights Reserved.

Abstract

This memo describes TCP/IP field behavior in the context of header compression. Header compression is possible thanks to the fact that most header fields do not vary randomly from packet to packet. Many of the fields exhibit static behavior or change in a more or less predictable way. When designing a header compression scheme, it is of fundamental importance to understand the behavior of the fields in detail. An example of this analysis can be seen in RFC 3095 [31]. This memo performs a similar role for the compression of TCP/IP.

West & McCann           Expires September 1, 2003              [Page 1]
Internet-Draft            TCP/IP Field Behavior              March 2003

Table of Contents

1.    Introduction
2.    Terminology
3.    General classification
3.1   IP header fields
3.1.1 IPv6 header fields
3.1.2 IPv4 header fields
3.2   TCP header fields
3.3   Summary for IP/TCP
4.    Classification of replicable header fields
4.1   IPv4 Header (inner and/or outer)
4.2   IPv6 Header (inner and/or outer)
4.3   TCP Header
4.4   TCP Options
4.5   Summary of replication
5.    Analysis of change patterns of header fields
5.1   IP header
5.1.1 IP Traffic-Class / Type-Of-Service (TOS)
5.1.2 ECN Flags
5.1.3 IP Identification
5.1.4 Don't Fragment (DF) flag
5.1.5 IP Hop-Limit / Time-To-Live (TTL)
5.2   TCP header
5.2.1 Sequence number
5.2.2 Acknowledgement number
5.2.3 Reserved
5.2.4 Flags
5.2.5 Checksum
5.2.6 Window
5.2.7 Urgent pointer
5.3   Options
5.3.1 Options overview
5.3.2 Option field behavior
6.    Other observations
6.1   Implicit acknowledgements
6.2   Shared data
6.3   TCP header overhead
6.4   Field independence and packet behavior
6.5   Short-lived flows
6.6   Master Sequence Number
6.7   Size constraint for TCP options
7.    Security considerations
8.    Acknowledgements
      References
      Authors' Addresses
      Full Copyright Statement

1. Introduction

This document describes TCP/IP field behavior, as it is essential to understand this before correct assumptions about header compression can be made. Since the IP header does exhibit some slightly different behavior from that previously presented in RFC 3095 [31] for the RTP case, it is also included in this document.

It is intentional that much of the classification text from RFC 3095 [31] has been borrowed. This is for easier reading rather than inserting many references to that document.

Again based on the format presented in RFC 3095 [31], TCP/IP header fields are classified and analyzed in two steps. First, we have a general classification in Section 3 where the fields are classified on the basis of stable knowledge and assumptions. The general classification does not take into account the change characteristics of changing fields because those will vary more or less depending on the implementation and on the application used. Section 4 considers how field values can be used to optimize short-lived flows.
A less stable but more detailed analysis of the change characteristics is then done in Section 5. Finally, Section 6 summarizes with conclusions about how the various header fields should be handled by the header compression scheme to optimize compression and functionality.

A general question raised by this analysis is what 'baseline' definition of all possible TCP/IP implementations should be considered. For the purposes of this document, a relatively up-to-date (as of the time of writing) implementation is considered, with a view to ensuring compatibility with legacy implementations.

The general requirement for transparency is also seen to be more interesting. A number of recent proposals for extensions to TCP make use of some of the previously 'reserved' bits. It is therefore clear that a 'reserved' bit cannot be taken to have a guaranteed zero value, but may change. Ideally, this should be accommodated by the compression profile.

It is unclear exactly how reserved bits should be handled, given that the possible future uses cannot be predicted. It is accepted that if these currently reserved bits were used, then efficiency may be reduced. However, the compression scheme should still offer a useful solution.

2. Terminology

The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in RFC 2119 [21].

3. General classification

The following definitions (and some text) are copied from RFC 3095 [31] Appendix A. Differences in IP field behavior between RFC 3095 [31] (i.e. IP/UDP/RTP behavior for audio and video applications) and this document have been identified.
At a general level, the header fields are separated into 5 classes:

o  INFERRED

   These fields contain values that can be inferred from other values, for example the size of the frame carrying the packet, and thus do not have to be handled at all by the compression scheme.

o  STATIC

   These fields are expected to be constant throughout the lifetime of the packet stream. Static information must in some way be communicated once.

o  STATIC-DEF

   STATIC fields whose values define a packet stream. They are in general handled as STATIC.

o  STATIC-KNOWN

   These STATIC fields are expected to have well-known values and therefore do not need to be communicated at all.

o  CHANGING

   These fields are expected to vary in some way: randomly, within a limited value set or range, or in some other manner.

In this section, each of the IP and TCP header fields is assigned to one of these classes. For all fields except those classified as CHANGING, the motives for the classification are also stated. In Section 5, CHANGING fields are further examined and classified on the basis of their expected change behavior.

3.1 IP header fields

3.1.1 IPv6 header fields

+---------------------+-------------+----------------+
| Field               | Size (bits) | Class          |
+---------------------+-------------+----------------+
| Version             |      4      | STATIC         |
| DSCP*               |      6      | CHANGING       |
| ECT flag*           |      1      | CHANGING       |
| CE flag*            |      1      | CHANGING       |
| Flow Label          |     20      | STATIC-DEF     |
| Payload Length      |     16      | INFERRED       |
| Next Header         |      8      | STATIC         |
| Hop Limit           |      8      | CHANGING       |
| Source Address      |    128      | STATIC-DEF     |
| Destination Address |    128      | STATIC-DEF     |
+---------------------+-------------+----------------+
* differs from RFC 3095 [31]

[The DSCP, ECT and CE flags were amalgamated into the Traffic Class octet in RFC 3095.]

o  Version

   The version field states which IP version is used.
   Packets with different values in this field must be handled by different IP stacks. All packets of a packet stream must therefore be of the same IP version. Accordingly, the field is classified as STATIC.

o  Flow Label

   This field may be used to identify packets belonging to a specific packet stream. If not used, the value should be set to zero. Otherwise, all packets belonging to the same stream must have the same value in this field, it being one of the fields that define the stream. The field is therefore classified as STATIC-DEF.

o  Payload Length

   Information about packet length (and, consequently, payload length) is expected to be provided by the link layer. The field is therefore classified as INFERRED.

o  Next Header

   This field will usually have the same value in all packets of a packet stream. It encodes the type of the subsequent header. Only when extension headers are sometimes present and sometimes not, will the field change its value during the lifetime of the stream. The field is therefore classified as STATIC.

   The classification of STATIC is inherited from RFC 3095 [31]. However, it should be pointed out that the next header field is actually determined by the type of the following header. Thus, it might be more appropriate to view this as an inference, although this depends upon the specific implementation of the compression scheme.

o  Source and Destination addresses

   These fields are part of the definition of a stream and must thus be constant for all packets in the stream. The fields are therefore classified as STATIC-DEF.

   This might be considered as a slightly simplistic view. However for now the IP addresses are associated with the transport layer connection. More complex flow-separation could, of course, be considered.
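The per-class sizes for the IPv6 header can be tallied mechanically from the field widths listed in the IPv6 table above. The following Python sketch is illustrative only; the names IPV6_FIELDS and class_totals_octets are chosen here for the example and do not come from any specification.

```python
from collections import defaultdict

# (field, size in bits, class), copied from the IPv6 header field table.
IPV6_FIELDS = [
    ("Version", 4, "STATIC"),
    ("DSCP", 6, "CHANGING"),
    ("ECT flag", 1, "CHANGING"),
    ("CE flag", 1, "CHANGING"),
    ("Flow Label", 20, "STATIC-DEF"),
    ("Payload Length", 16, "INFERRED"),
    ("Next Header", 8, "STATIC"),
    ("Hop Limit", 8, "CHANGING"),
    ("Source Address", 128, "STATIC-DEF"),
    ("Destination Address", 128, "STATIC-DEF"),
]


def class_totals_octets(fields):
    """Sum the field sizes per class and return the totals in octets."""
    bits = defaultdict(int)
    for _name, size_bits, cls in fields:
        bits[cls] += size_bits
    return {cls: total / 8 for cls, total in bits.items()}


totals = class_totals_octets(IPV6_FIELDS)
```

The totals add up to the 40 octets of the fixed IPv6 header, which is a useful sanity check when a field is moved from one class to another.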
Total size of the fields in each class:

+--------------+--------------+
| Class        | Size (octets)|
+--------------+--------------+
| INFERRED     | 2            |
| STATIC       | 1.5          |
| STATIC-DEF   | 34.5         |
| STATIC-KNOWN | 0            |
| CHANGING     | 2            |
+--------------+--------------+

3.1.2 IPv4 header fields

+---------------------+-------------+----------------+
| Field               | Size (bits) | Class          |
+---------------------+-------------+----------------+
| Version             |      4      | STATIC         |
| Header Length       |      4      | STATIC-KNOWN   |
| DSCP*               |      6      | CHANGING       |
| ECT flag*           |      1      | CHANGING       |
| CE flag*            |      1      | CHANGING       |
| Packet Length       |     16      | INFERRED       |
| Identification      |     16      | CHANGING       |
| Reserved flag*      |      1      | CHANGING       |
| Don't Fragment flag*|      1      | CHANGING       |
| More Fragments flag |      1      | STATIC-KNOWN   |
| Fragment Offset     |     13      | STATIC-KNOWN   |
| Time To Live        |      8      | CHANGING       |
| Protocol            |      8      | STATIC         |
| Header Checksum     |     16      | INFERRED       |
| Source Address      |     32      | STATIC-DEF     |
| Destination Address |     32      | STATIC-DEF     |
+---------------------+-------------+----------------+
* differs from RFC 3095 [31]

[The DSCP, ECT and CE flags were amalgamated into the TOS octet in RFC 3095. The DF flag behavior is considered later. The reserved field is discussed below.]

o  Version

   The version field states which IP version is used. Packets with different values in this field must be handled by different IP stacks. All packets of a packet stream must therefore be of the same IP version. Accordingly, the field is classified as STATIC.

o  Header Length

   As long as no options are present in the IP header, the header length is constant and well known. If there are options, the field would be STATIC, but it is assumed here that there are no options. The field is therefore classified as STATIC-KNOWN.

o  Packet Length

   Information about packet length is expected to be provided by the link layer. The field is therefore classified as INFERRED.
o  Flags

   The Reserved flag must be set to zero, as defined in RFC 791 [1]. In RFC 3095 [31] the field is therefore classified as STATIC-KNOWN. However, it is expected that reserved fields may be used at some future point. It appears unwise to select an encoding that would preclude the use of a compression profile for a future change in the use of reserved fields. For this reason the alternative encoding of CHANGING is suggested. It would also be possible to have more than one compression profile, in one of which this field was considered to be STATIC-KNOWN.

   The More Fragments (MF) flag is expected to be zero because fragmentation is generally not expected. As discussed in the RTP case, only the first fragment will contain the transport layer protocol header; subsequent fragments would have to be compressed with a different profile. In terms of the effect of header overhead, if fragmentation does occur then the first fragment, by definition, should be relatively large, minimizing the header overhead. In the case of TCP, fragmentation should not be common due to a combination of initial MSS negotiation and subsequent use of path-MTU discovery. The More Fragments flag is therefore classified as STATIC-KNOWN. However, a profile could accept that this flag may be set in order to cope with fragmentation.

o  Fragment Offset

   Under the assumption that no fragmentation occurs, the fragment offset is always zero. The field is therefore classified as STATIC-KNOWN. Even if fragmentation were to be further considered, then only the first fragment would contain the TCP header and the fragment offset of this packet would still be zero.

o  Protocol

   This field will usually have the same value in all packets of a packet stream. It encodes the type of the subsequent header.
   Only when extension headers are sometimes present and sometimes not, will the field change its value during the lifetime of a stream. The field is therefore classified as STATIC.

o  Header Checksum

   The header checksum protects individual hops from processing a corrupted header. When almost all IP header information is compressed away, there is no point in having this additional checksum; instead it can be regenerated at the decompressor side. The field is therefore classified as INFERRED.

   We note that the TCP checksum does not protect the whole TCP/IP header, but only the TCP pseudo-header, the TCP header and the payload. Compare this with ROHC [31], which uses a CRC to verify the uncompressed header. Given the need to validate the complete TCP/IP header, the cost of computing the TCP checksum over the entire payload, and known weaknesses in the TCP checksum [37], an additional check is necessary. Therefore, it is expected that some additional checksum (such as a CRC) will be used to validate correct decompression.

o  Source and Destination addresses

   These fields are part of the definition of a stream and must thus be constant for all packets in the stream. The fields are therefore classified as STATIC-DEF.
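The INFERRED classification of the Header Checksum relies on the decompressor being able to recompute the field once the rest of the IPv4 header has been rebuilt. A minimal Python sketch of that recomputation, using the standard 16-bit one's-complement Internet checksum, is shown below. The function name and the sample header (hypothetical addresses 192.168.0.1 and 192.168.0.199) are assumptions made for illustration.

```python
def ipv4_header_checksum(header: bytes) -> int:
    """Return the 16-bit one's-complement checksum of an IPv4 header.

    The checksum field (octets 10-11) is treated as zero, as it would be
    when a decompressor regenerates the header.
    """
    if len(header) < 20 or len(header) % 4 != 0:
        raise ValueError("IPv4 header must be >= 20 octets and a multiple of 4")
    data = header[:10] + b"\x00\x00" + header[12:]
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]
    # Fold any carries back into the low 16 bits (end-around carry).
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


# 20-octet header without options, Protocol = 6 (TCP), checksum zeroed.
example = bytes.fromhex("450000730000400040060000c0a80001c0a800c7")
checksum = ipv4_header_checksum(example)
```

Because the function ignores whatever is stored in the checksum field, the same call can be used at the compressor to confirm that a received header is intact before the field is dropped.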
Total size of the fields in each class:

+--------------+--------------+
| Class        | Size (octets)|
+--------------+--------------+
| INFERRED     | 4            |
| STATIC*      | 1.5          |
| STATIC-DEF   | 8            |
| STATIC-KNOWN*| 2.25         |
| CHANGING*    | 4.25         |
+--------------+--------------+
* differs from RFC 3095 [31]

3.2 TCP header fields

+---------------------+-------------+----------------+
| Field               | Size (bits) | Class          |
+---------------------+-------------+----------------+
| Source Port         |     16      | STATIC-DEF     |
| Destination Port    |     16      | STATIC-DEF     |
| Sequence Number     |     32      | CHANGING       |
| Acknowledgement Num |     32      | CHANGING       |
| Data Offset         |      4      | INFERRED       |
| Reserved            |      4      | CHANGING       |
| CWR flag            |      1      | CHANGING       |
| ECE flag            |      1      | CHANGING       |
| URG flag            |      1      | CHANGING       |
| ACK flag            |      1      | CHANGING       |
| PSH flag            |      1      | CHANGING       |
| RST flag            |      1      | CHANGING       |
| SYN flag            |      1      | CHANGING       |
| FIN flag            |      1      | CHANGING       |
| Window              |     16      | CHANGING       |
| Checksum            |     16      | CHANGING       |
| Urgent Pointer      |     16      | CHANGING       |
| Options             |   0(-320)   | CHANGING       |
+---------------------+-------------+----------------+

o  Source and Destination ports

   These fields are part of the definition of a stream and must thus be constant for all packets in the stream. The fields are therefore classified as STATIC-DEF.

o  Data Offset

   The number of 4-octet words in the TCP header, thus indicating the start of the data. The TCP header length is therefore always a multiple of 4 octets. The field can be reconstructed from the length of any options and thus it is not necessary to carry it explicitly. The field is therefore classified as INFERRED.
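The inference of Data Offset from the option length can be sketched as follows. This is a minimal Python illustration, not part of any profile; the function and constant names are assumptions. It relies on the fixed TCP header being 20 octets and on the 4-bit Data Offset field limiting the whole header to 60 octets, which leaves at most 40 octets for options.

```python
TCP_BASE_HEADER_OCTETS = 20  # fixed part of the TCP header
MAX_OPTION_OCTETS = 40       # 15 words * 4 octets - 20 octets of fixed header


def infer_data_offset(options_len: int) -> int:
    """Return the Data Offset (in 4-octet words) implied by the options length."""
    if not 0 <= options_len <= MAX_OPTION_OCTETS:
        raise ValueError("TCP options must occupy 0..40 octets")
    # Options are padded so that the header ends on a 4-octet boundary.
    padded = (options_len + 3) // 4 * 4
    return (TCP_BASE_HEADER_OCTETS + padded) // 4
```

For example, a header carrying only a 10-octet Timestamp option preceded by two No-Operation octets has 12 octets of options, giving a Data Offset of 8.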
3.3 Summary for IP/TCP

Summarizing this for IP/TCP one obtains

+----------------+----------------+----------------+
| Class \ IP ver | IPv6 (octets)  | IPv4 (octets)  |
+----------------+----------------+----------------+
| INFERRED       | 2 + 4 bits     | 4 + 4 bits     |
| STATIC         | 1 + 4 bits     | 1 + 4 bits     |
| STATIC-DEF     | 38 + 4 bits    | 12             |
| STATIC-KNOWN   | -              | 2 + 2 bits     |
| CHANGING       | 17 + 4 bits    | 19 + 6 bits    |
+----------------+----------------+----------------+
| Totals         | 60             | 40             |
+----------------+----------------+----------------+
(excludes options, which are all classified as CHANGING)

4. Classification of replicable header fields

Where multiple flows either overlap in time or occur sequentially within a short space of time there can be a great deal of similarity in header field values. Such commonality of field values is reflected in the compression context. Thus, it should be possible to utilise links between fields across different flows to improve the compression ratio. In order to do this, it is important to understand the 'replicable' characteristics of the various header fields.

The key concept is that of 'replication', where an existing context is used as a baseline and replicated to initialise a new context. Those fields that are the same are then automatically initialised in the new context. Those that have changed will be updated or overwritten with values from the initialisation packet that triggered the replication.

This section considers the commonality between fields in different flows. It should be noted, however, that replication is based on contexts (rather than just field values) and so compressor created fields that are part of the context may also be included. These, of course, are dependent upon the nature of the compression protocol (ROHC profile) being applied.
A brief analysis of the relationship of TCP/IP fields among 'replicable' packet streams follows.

'N/A' -- The field need not be considered in the replication process as it is inferred or known 'a priori' (and, therefore, does not appear in the context).

'No' -- The field cannot be replicated since its change pattern between two packet flows is uncorrelated.

'Yes' -- The field may be replicated. This does not guarantee that the field value will be the same across two candidate streams, only that it might be possible to exploit replication to increase the compression ratio. Specific encoding methods can be used to improve the compression efficiency.

4.1 IPv4 Header (inner and/or outer)

+-----------------------+---------------+------------+
| Field                 | Class         | Replicable |
+-----------------------+---------------+------------+
| Version               | STATIC        | N/A        |
| Header Length         | STATIC-KNOWN  | N/A        |
| DSCP                  | CHANGING      | No (1)     |
| ECT flag              | CHANGING      | No (2)     |
| CE flag               | CHANGING      | No (2)     |
| Packet Length         | INFERRED      | N/A        |
| Identification        | CHANGING      | Yes (3)    |
| Reserved flag         | CHANGING      | No (4)     |
| Don't Fragment flag   | CHANGING      | No         |
| More Fragments flag   | STATIC-KNOWN  | N/A        |
| Fragment Offset       | STATIC-KNOWN  | N/A        |
| Time To Live          | CHANGING      | Yes        |
| Protocol              | STATIC        | N/A        |
| Header Checksum       | INFERRED      | N/A        |
| Source Address        | STATIC-DEF    | Yes        |
| Destination Address   | STATIC-DEF    | Yes        |
+-----------------------+---------------+------------+

(1) The DSCP is marked based on the application's requirements. If it can be assumed that replicable connections often carry the same type of traffic, the DSCP may be regarded as replicable. However, issues such as re-marking will need to be taken into account.

(2) It is not possible for the ECN bits to be replicated (note that use of the ECN nonce scheme [35] is anticipated).
However, since it seems likely that all TCP flows between ECN-capable hosts will use ECN, the use (or not) of ECN for flows between the same end-points might be considered replicable. See also note (4).

(3) The replicable context for this field includes the IP-ID, NBO, and RND flags (as described in ROHC RTP). This highlights that the replication is of the context, rather than just the header field values and, as such, needs to be considered based on the exact nature of compression applied to each field.

(4) Since the possible future behavior of the 'Reserved Flag' cannot be predicted, it is not considered as replicable. However, it might be expected that the behavior of the reserved flag between the same end-points will be similar. In this case, any selection of packet formats (for example) based on this behavior might carry across to the new flow. In the case of packet formats, this can probably be considered as a compressor-local decision.

4.2 IPv6 Header (inner and/or outer)

+-----------------------+---------------+------------+
| Field                 | Class         | Replicable |
+-----------------------+---------------+------------+
| Version               | STATIC        | N/A        |
| Traffic Class         | CHANGING      | Yes (1)    |
| ECT flag              | CHANGING      | No (2)     |
| CE flag               | CHANGING      | No (2)     |
| Flow Label            | STATIC-DEF    | N/A        |
| Payload Length        | INFERRED      | N/A        |
| Next Header           | STATIC        | N/A        |
| Hop Limit             | CHANGING      | Yes        |
| Source Address        | STATIC-DEF    | Yes        |
| Destination Address   | STATIC-DEF    | Yes        |
+-----------------------+---------------+------------+

(1) See comment about DSCP field for IPv4, above.

(2) See comment about ECT and CE flags for IPv4, above.
4.3 TCP Header

+-----------------------+---------------+------------+
| Field                 | Class         | Replicable |
+-----------------------+---------------+------------+
| Source Port           | STATIC-DEF    | Yes (1)    |
| Destination Port      | STATIC-DEF    | Yes (1)    |
| Sequence Number       | CHANGING      | No (2)     |
| Acknowledgement Number| CHANGING      | No         |
| Data Offset           | INFERRED      | N/A        |
| Reserved Bits         | CHANGING      | No (3)     |
| Flags                 |               |            |
|   CWR                 | CHANGING      | No (4)     |
|   ECE                 | CHANGING      | No (4)     |
|   URG                 | CHANGING      | No         |
|   ACK                 | CHANGING      | No         |
|   PSH                 | CHANGING      | No         |
|   RST                 | CHANGING      | No         |
|   SYN                 | CHANGING      | No         |
|   FIN                 | CHANGING      | No         |
| Window                | CHANGING      | Yes        |
| Checksum              | CHANGING      | No         |
| Urgent Pointer        | CHANGING      | Yes (5)    |
+-----------------------+---------------+------------+

(1) On the server side, the port number is likely to be a well-known value. On the client side, the port number is generally selected by the stack automatically. Whether the port number is replicable depends upon how the stack chooses the port number. However, most implementations use a simple scheme which sequentially picks the next available port number. This is clearly exploitable in a compression scheme.

(2) With the recommendation (and expected deployment) of TCP Initial Sequence Number randomization, defined in RFC 1948 [16], it will be impossible to share the sequence number. Thus, this field will not be regarded as replicable.

(3) See comment (4) for the IPv4 header, above.

(4) See comment (2) on ECN flags for the IPv4 header, above.

(5) The urgent pointer is very rarely used. This means that, in practice, the field may be considered replicable.
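Note (1) above observes that sequential selection of client port numbers is exploitable. One way this might be done, sketched in Python under purely hypothetical assumptions, is to send a new flow's port as a small offset from the port held in the base context and to fall back to the full value otherwise. The function names, the tuple encoding and the max_delta threshold of 15 are invented for this example and are not taken from any ROHC profile.

```python
def encode_replicated_port(base_port: int, new_port: int, max_delta: int = 15):
    """Encode new_port relative to the port stored in the base context.

    Returns ("delta", d) when new_port is at most max_delta above
    base_port (modulo 2**16); otherwise returns ("full", new_port).
    """
    delta = (new_port - base_port) & 0xFFFF
    if delta <= max_delta:
        return ("delta", delta)
    return ("full", new_port)


def decode_replicated_port(base_port: int, encoded) -> int:
    """Recover the port of the new flow from the base context and the encoding."""
    kind, value = encoded
    if kind == "delta":
        return (base_port + value) & 0xFFFF
    return value
```

A stack that picks ports at random would simply fall into the "full" case every time, so the scheme costs little when the assumption of sequential allocation does not hold.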
4.4 TCP Options

+---------------------------+--------------+------------+
| Option                    | SYN-only (1) | Replicable |
+---------------------------+--------------+------------+
| End of Option List        | No           | No (2)     |
| No-Operation              | No           | No (2)     |
| Maximum Segment Size      | Yes          | Yes        |
| Window Scale              | Yes          | Yes        |
| SACK-Permitted            | Yes          | Yes        |
| SACK                      | No           | No         |
| Timestamp                 | No           | No         |
+---------------------------+--------------+------------+

(1) This indicates whether or not the option appears only in SYN packets. Options that are not 'SYN-only' may appear in any packet. Many TCP options are used only in SYN packets. Some options, such as MSS, Window Scale, SACK-Permitted etc., will tend to have the same value among replicable packet streams. Thus, to support context sharing, the compressor should maintain such TCP options in the context (even though they only appear in the SYN segment).

(2) Since these options have fixed values, they could be regarded as replicable. However, the only interesting thing to convey about these options is their presence: if it is known that such an option exists, its value is defined.

4.5 Summary of replication

From the above analysis, it can be seen that there are reasonable grounds for exploiting redundancy between flows, as well as between packets within a flow. Simply consider the advantage of being able to elide the source and destination addresses for a repeated connection between two IPv6 endpoints. There will also be a cost (in terms of complexity and robustness) for replicating contexts, and this must be considered when deciding what constitutes an appropriate solution.

The final point to note for the use of replication is that it requires the compressor to have a suitable degree of confidence that the source data is present and correct at the decompressor.
This may place some restrictions on which of the 'changing' fields, in particular, can be utilised during replication.

5. Analysis of change patterns of header fields

To design suitable mechanisms for efficient compression of all header fields, their change patterns must be analyzed. For this reason, an extended classification is done based on the general classification in Section 3, considering the fields which were labeled CHANGING in that classification.

The CHANGING fields are separated into five different subclasses:

o  STATIC

   These are fields that were classified as CHANGING on a general basis, but are classified as STATIC here due to certain additional assumptions.

o  SEMISTATIC

   These fields are STATIC most of the time. However, occasionally the value changes but reverts to its original value after a known number of packets.

o  RARELY-CHANGING (RC)

   These are fields that change their values occasionally and then keep their new values.

o  ALTERNATING

   These fields alternate between a small number of different values.

o  IRREGULAR

   These, finally, are the fields for which no useful change pattern can be identified.

To further expand the classification possibilities without increasing complexity, the classification can be done either according to the values of the field and/or according to the values of the deltas for the field.

When the classification is done, other details are also stated regarding possible additional knowledge about the field values and/or field deltas, according to the classification. For fields classified as STATIC or SEMISTATIC, the case could be that the value of the field is not only STATIC but also well KNOWN a priori (two states for SEMISTATIC fields). For fields with non-irregular change behavior,
For fields with non-irregular change behavior, West & McCann Expires September 1, 2003 [Page 16] Internet-Draft TCP/IP Field Behavior March 2003 it could be known that changes usually are within a LIMITED range compared to the maximal change for the field. For other fields, the values are completely UNKNOWN. Table 1 classifies all the CHANGING fields on the basis of their expected change patterns. (4) refers to IPv4 fields and (6) refers to IPv6. +------------------------+-------------+-------------+-------------+ | Field | Value/Delta | Class | Knowledge | +========================+=============+=============+=============+ | IP TOS(4) / Tr.Class(6)| Value | RC | UNKNOWN | +------------------------+-------------+-------------+-------------+ | IP ECT flag(4) | Value | RC | UNKNOWN | +------------------------+-------------+-------------+-------------+ | IP CE flag(4) | Value | RC | UNKNOWN | +------------------------+-------------+-------------+-------------+ | Sequential | Delta | STATIC | KNOWN | | -----------+-------------+-------------+-------------+ | IP Id(4) Seq. 
jump | Delta | RC | LIMITED | | -----------+-------------+-------------+-------------+ | Random | Value | IRREGULAR | UNKNOWN | +------------------------+-------------+-------------+-------------+ | IP DF flag(4) | Value | RC | UNKNOWN | +------------------------+-------------+-------------+-------------+ | IP TTL(4) / Hop Lim(6) | Value | ALTERNATING | LIMITED | +------------------------+-------------+-------------+-------------+ | TCP Sequence Number | Delta | IRREGULAR | LIMITED | +------------------------+-------------+-------------+-------------+ | TCP Acknowledgement Num| Delta | IRREGULAR | LIMITED | +------------------------+-------------+-------------+-------------+ | TCP Reserved | Value | RC | UNKNOWN | +------------------------+-------------+-------------+-------------+ | TCP flags | | | | | ECN flags | Value | IRREGULAR | UNKNOWN | | CWR flag | Value | IRREGULAR | UNKNOWN | | ECE flag | Value | IRREGULAR | UNKNOWN | | URG flag | Value | IRREGULAR | UNKNOWN | | ACK flag | Value | SEMISTATIC | KNOWN | | PSH flag | Value | IRREGULAR | UNKNOWN | | RST flag | Value | IRREGULAR | UNKNOWN | | SYN flag | Value | SEMISTATIC | KNOWN | | FIN flag | Value | SEMISTATIC | KNOWN | +------------------------+-------------+-------------+-------------+ | TCP Window | Value | ALTERNATING | KNOWN | +------------------------+-------------+-------------+-------------+ | TCP Checksum | Value | IRREGULAR | UNKNOWN | West & McCann Expires September 1, 2003 [Page 17] Internet-Draft TCP/IP Field Behavior March 2003 +------------------------+-------------+-------------+-------------+ | TCP Urgent Pointer | Value | IRREGULAR | KNOWN | +------------------------+-------------+-------------+-------------+ | TCP Options | Value | IRREGULAR | UNKNOWN | +------------------------+-------------+-------------+-------------+ Table 1 : Classification of CHANGING header fields The following subsections discuss the various header fields in detail. 
   Note that Table 1 and the discussions below do not consider changes
   caused by loss or reordering before the compression point.

5.1 IP header

5.1.1 IP Traffic-Class / Type-Of-Service (TOS)

   The Traffic-Class (IPv6) or Type-Of-Service/DSCP (IPv4) field might
   be expected to change during the lifetime of a packet stream. This
   analysis considers several RFCs that describe modifications to the
   original RFC 791 [1].

   The TOS byte was initially described in RFC 791 [1] as 3 bits of
   precedence followed by 3 bits of TOS and 2 reserved bits (defined to
   be 0). RFC 1122 [5] extended this to specify 5 bits of TOS, although
   the meanings of the additional 2 bits were not defined. RFC 1349
   [10] defined the 4th bit of TOS to be 'minimize monetary cost'. The
   next significant change was in RFC 2474 [23], which reworked the TOS
   octet as 6 bits of DSCP (DiffServ Code Point) plus 2 unused bits.
   Most recently, RFC 2780 [29] identified the 2 reserved bits in the
   TOS or traffic class octet for experimental use with ECN.

   For the purposes of this classification, it is therefore proposed to
   classify the TOS (or traffic class) octet as 6 bits for the DSCP and
   2 additional bits. These 2 bits may be expected to be 0 or to
   contain ECN data. From a future-proofing perspective, it is
   preferable to assume the use of ECN, especially with respect to TCP.

   It is also considered important that the profile works with legacy
   stacks, since these will be in existence for some considerable time
   to come. For simplicity, the octet will be considered as 6 bits of
   TOS information and 2 bits of ECN data, so the fields are always
   considered to be structured the same way.

   The DSCP (as for TOS in ROHC RTP) is not expected to change
   frequently (although it could change mid-flow, for example as a
   result of a route change).

5.1.2 ECN Flags

   Initially we describe the ECN flags as specified in RFC 2481 [24].
   Subsequently, a suggested update is described which would alter the
   behavior of the flags.

   In RFC 2481 [24] there are 2 separate flags, the ECT (ECN Capable
   Transport) flag and the CE (Congestion Experienced) flag. The ECT
   flag, if negotiated by the TCP stack, will be '1' for all data
   packets and '0' for all 'pure acknowledgement' packets. This means
   that the behavior of the ECT flag is linked to behavior in the TCP
   stack. Whether this can be exploited for compression is not clear.

   The CE flag is only used if ECT is set to '1'. It is set to '0' by
   the sender and can be set to '1' by an ECN capable router in the
   network to indicate congestion. Thus the CE flag is expected to be
   randomly set to '1' with a probability dependent upon the congestion
   state of the network and the position of the compressor in the path.
   So, a compressor located close to the receiver in a congested
   network will see the CE bit set frequently, but a compressor located
   close to a sender will rarely, if ever, see the CE bit set to '1'.

   A recent draft [35] suggests an alternative view of these 2 bits.
   This considers the two bits together as having 4 possible
   codepoints. Meanings are then assigned to the codepoints:

      00   Not ECN capable

      01   ECN capable, no congestion [known as ECT(1)]

      10   ECN capable, no congestion [known as ECT(0)]

      11   Congestion experienced

   The use of 2 codepoints for signaling ECT allows the sender to
   detect when a receiver is not reliably echoing congestion
   information.

   For the purposes of compression, this update means that ECT(0) and
   ECT(1) are equally likely (for an ECN capable flow) and that '11'
   will be relatively rarely seen. The probability of seeing a
   congestion indication is discussed above in the description of the
   CE flag.

   It is suggested that, for the purposes of compression, ECN with
   nonces is assumed as the baseline, although the compression scheme
   must be able to transparently compress the original ECN scheme.
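
   As an illustration only, the structure assumed above (6 bits of DSCP
   followed by a 2-bit ECN field) can be sketched as follows. The
   function names split_tos_octet and ecn_state are hypothetical and
   are not part of any compression profile. The sketch assumes that the
   ECN field occupies the two least-significant bits of the octet, and
   it deliberately does not distinguish between the two 'ECN capable,
   no congestion' codepoints.

```python
def split_tos_octet(tos):
    """Split an 8-bit TOS / Traffic Class octet into (dscp, ecn).

    Assumption for this sketch: the DSCP is the upper 6 bits and the
    ECN field is the lower 2 bits of the octet.
    """
    if not 0 <= tos <= 0xFF:
        raise ValueError("TOS / Traffic Class is a single octet")
    return tos >> 2, tos & 0x03


def ecn_state(ecn):
    """Describe a 2-bit ECN field using the four-codepoint view."""
    if ecn == 0b00:
        return "not ECN capable"
    if ecn == 0b11:
        return "congestion experienced"
    # 01 and 10 both mean "ECN capable, no congestion".
    return "ECN capable, no congestion"


if __name__ == "__main__":
    dscp, ecn = split_tos_octet(0xB8)
    print(dscp, ecn_state(ecn))
```

   A compressor could treat the two parts separately, since the DSCP
   rarely changes while the ECN field may vary from packet to packet.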

5.1.3 IP Identification

   The Identification field (IP ID) of the IPv4 header is there to
   identify which fragments constitute a datagram when reassembling
   fragmented datagrams. The IPv4 specification does not specify
   exactly how this field is to be assigned values, only that each
   packet should get an IP ID that is unique for the source-destination
   pair and protocol for the time the datagram (or any of its
   fragments) could be alive in the network. This means that
   assignment of IP ID values can be done in various ways, which we
   have separated into three classes:

   o  Sequential jump

      This is the most common assignment policy in today's IP stacks.
      A single IP ID counter is used for all packet streams. When the
      sender is running more than one packet stream simultaneously, the
      IP ID can increase by more than one between packets in a stream.
      The IP ID values will be much more predictable and require fewer
      bits to transfer than random values, and the packet-to-packet
      increment (determined by the number of active outgoing packet
      streams and sending frequencies) will usually be limited.

   o  Random

      Some IP stacks assign IP ID values using a pseudo-random number
      generator. There is thus no correlation between the ID values of
      subsequent datagrams. Therefore there is no way to predict the
      IP ID value for the next datagram. For header compression
      purposes, this means that the IP ID field needs to be sent
      uncompressed with each datagram, resulting in two extra octets of
      header. IP stacks in cellular terminals that need optimum header
      compression efficiency should not use this IP ID assignment
      policy.

   o  Sequential

      This assignment policy keeps a separate counter for each outgoing
      packet stream and thus the IP ID value will increment by one for
      each packet in the stream, except at wrap-around. Therefore, the
      delta value of the field is constant and well known a priori.
      This assignment policy is the most desirable for header
      compression purposes. However, its usage is not as common as it
      perhaps should be.

      In order to avoid violating RFC 791 [1], packets sharing the same
      IP address pair and IP protocol number cannot use the same IP ID
      values. Therefore, implementations of sequential policies must
      make the ID number spaces disjoint for packet streams of the same
      IP protocol going between the same pair of nodes. This can be
      done in a number of ways, all of which introduce occasional
      jumps, and thus make the policy less than perfectly sequential.
      For header compression purposes, less frequent jumps are
      preferred.

   It should be noted that the ID is an IPv4 mechanism and is therefore
   not a problem for IPv6. For IPv4 the ID could be handled in three
   different ways. First, we have the inefficient but reliable
   solution where the ID field is sent as-is in all packets, increasing
   the compressed headers by two octets. This is the best way to
   handle the ID field if the sender uses random assignment of the ID
   field. Second, there can be solutions with more flexible mechanisms
   requiring fewer bits for the ID handling as long as sequential jump
   assignment is used. Such solutions will probably require even more
   bits if random assignment is used by the sender. Knowledge about
   the sender's assignment policy could therefore be useful when
   choosing between the two solutions above. Finally, even for IPv4,
   header compression could be designed without any additional
   information for the ID field included in compressed headers. To use
   such schemes, it must be known which assignment policy for the ID
   field is being used by the sender. That might not be possible to
   know, which implies that the applicability of such solutions is very
   uncertain. However, designers of IPv4 stacks for cellular terminals
   should use an assignment policy close to sequential.
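
   The three assignment classes above can be illustrated with a small
   sketch that looks at the packet-to-packet IP ID increments observed
   within one stream. This is not a mechanism defined by this memo:
   the function names and the jump_limit threshold (64) are assumptions
   chosen for illustration, and a real compressor would need to choose
   its own heuristics.

```python
def ip_id_deltas(ids):
    """Packet-to-packet IP ID increments, taken modulo 2**16 so that
    wrap-around of the 16-bit field is handled."""
    return [(b - a) & 0xFFFF for a, b in zip(ids, ids[1:])]


def classify_ip_id(ids, jump_limit=64):
    """Guess the assignment class from observed IP ID values.

    jump_limit is a hypothetical bound on what counts as a 'limited'
    increment; it is an assumption of this sketch.
    """
    deltas = ip_id_deltas(ids)
    if not deltas:
        return "unknown"            # fewer than two packets observed
    if all(d == 1 for d in deltas):
        return "sequential"
    if all(1 <= d <= jump_limit for d in deltas):
        return "sequential jump"
    return "random"


if __name__ == "__main__":
    print(classify_ip_id([65534, 65535, 0, 1]))
    print(classify_ip_id([10, 13, 14, 20]))
    print(classify_ip_id([10, 40000, 7, 1234]))
```

   Working on deltas rather than values is what makes the sequential
   and sequential jump cases cheap to encode; for the random case the
   deltas carry no more structure than the values themselves.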

   With regard to TCP compression, the behavior of the IP ID field is
   considered to be essentially the same. However, in RFC 3095 [31] it
   is noted that the IP ID is generally inferred from the RTP Sequence
   Number. There is no obvious candidate in the TCP case for a field
   to offer this 'master sequence number' role.

   Clearly, for a busy server, the observed behavior may well be quite
   erratic. This is a case where the ability to share the IP
   compression context between a number of flows (between the same
   end-points) could offer potential benefits. However, this would
   only have any real impact where there were a large number of flows
   between one machine and the server. If context sharing is being
   considered, then it is preferable to share the IP part of the
   context.

5.1.4 Don't Fragment (DF) flag

   Because of the use of path-MTU discovery (RFC 1191 [8]), the DF flag
   is more likely to be '1' than in UDP/RTP streams, since DF should be
   set to check for fragmentation in the end-to-end path. In practice
   it is hard to predict the behavior of this field. However, it is
   considered that the most likely case is that it will generally stay
   at either '0' or '1'.

   When using PMTU discovery [8] it is expected that DF will always be
   set to '1', although a host may end PMTU discovery by clearing the
   DF bit to '0'.

5.1.5 IP Hop-Limit / Time-To-Live (TTL)

   The Hop-Limit (IPv6) or Time-To-Live (IPv4) field is expected to be
   constant during the lifetime of a packet stream or to alternate
   between a limited number of values due to route changes.

5.2 TCP header

   Any discussion of compressibility of TCP fields borrows heavily from
   RFC 1144 [6]. However, the premise of how the compression is
   performed is slightly different and the protocol has evolved
   slightly in the intervening time.

5.2.1 Sequence number

   An understanding of the sequence and acknowledgement number behavior
   is essential for a TCP compression scheme.

   At the simplest level the behavior of the sequence number can be
   described relatively easily. However, there are a number of
   complicating factors that also need to be considered.

   For transferring in-sequence data packets, the sequence number will
   increment for each packet by between 0 and an upper limit defined by
   the MSS (Maximum Segment Size).

   Although there are common MSS values, these can be quite variable.
   Given this variability and the range of window sizes it is hard
   (compared with the RTP case, for example) to select a 'one size fits
   all' encoding for the sequence number. (The same argument applies
   equally to the acknowledgement number.)

   We note that the increment of the sequence number in a packet is the
   size of the data payload of that packet (including the SYN and FIN
   flags; see later). This is, of course, exactly the relationship
   that RFC 1144 [6] exploits to compress the sequence number in the
   most efficient case. This technique may not be directly applicable
   to a robust solution, but may be a useful relationship to consider.

   However, at any point on the path (i.e. wherever a compressor might
   be deployed), the sequence number can be anywhere within a range
   defined by the TCP window. This is a combination of a number of
   values (buffer space at the sender; advertised buffer size at the
   receiver; and TCP congestion control algorithms). Missing packets
   or retransmissions can cause the TCP sequence number to fluctuate
   within the limits of this window.

   It would also be desirable to be able to predict the sequence number
   from some regularity. However, this also appears to be difficult to
   do. For example, during bulk data transfer the sequence number will
   tend to go up by 1 MSS per packet (assuming no packet loss).
   Higher-level behavior has been seen to have an impact as well:
   sequence number behavior has been observed with an 8 kbyte repeating
   pattern -- 5 segments of 1460 bytes followed by 1 segment of 892
   bytes. It appears that the implementation and the way data is
   presented to the stack can affect the sequence number.

   It has been suggested that the TCP window can be tracked by the
   compressor, allowing it to bound the size of these jumps.

   For interactive flows (for example telnet), the sequence number will
   change by small irregular amounts. In this case the Nagle algorithm
   [3] commonly applies, coalescing small packets where possible to
   reduce the basic header overhead. This may also mean that it is
   less likely that predictable changes in the sequence number will
   occur. The Nagle algorithm is an optimisation and is not required
   to be used. Also, applications can disable the Nagle algorithm
   (which may be done to mitigate the delays associated with its use).

   It is also noted that the SYN and FIN flags (which have to be
   acknowledged) each consume 1 byte of sequence space.

5.2.2 Acknowledgement number

   Much of the information about the sequence number applies equally to
   the acknowledgement number. However, there are some important
   differences.

   For bulk data transfers there will tend to be 1 acknowledgement for
   every 2 data segments. The algorithm is specified in RFC 2581 [28].
   An ACK need not always be sent immediately on receipt of a data
   segment, but must be sent within 500ms and should be generated for
   at least every second full sized segment (MSS) of received data. It
   may be seen from this that the delta for the acknowledgement number
   is roughly twice that of the sequence number. This is not always
   the case and the discussion about sequence number irregularity
   should be applied.

   As an aside, a common implementation bug was 'stretch ACKs'
   (acknowledgements may be generated less frequently than every two
   full-size data segments).
   This pattern can also occur following loss on the return path.

   Since the acknowledgement number is cumulative, dropped packets in
   the forward path will result in the acknowledgement number remaining
   constant for a time in the reverse direction. Retransmission of a
   dropped segment can then cause a substantial jump in the
   acknowledgement number. These jumps in acknowledgement number are
   bounded by the TCP window, just as for the jumps in sequence number.

   In the acknowledgement case, information about the advertised
   receive window gives a bound to the size of any ACK jump.

5.2.3 Reserved

   This field is reserved and as such might be expected to be zero.
   For future-proofing, this can no longer be assumed, as it is only a
   matter of time before a suggestion for using the field is made.

5.2.4 Flags

   o  ECN-E (Explicit Congestion Notification)

      '1' to echo CE bit in IP header. Will be set in several
      consecutive headers (until 'acknowledged' by CWR).

      If ECN nonces get used, then there will be a 'nonce-sum' (NS) bit
      in the flags, as well. Again, transparency of the reserved bits
      is crucial for future-proofing this compression scheme. From an
      efficiency/compression standpoint, the NS bit will either be
      unused (always 0) or randomly changing. The nonce-sum is the
      1-bit sum of the ECT codepoints, as described in [35].

   o  CWR (Congestion Window Reduced)

      '1' to signal congestion window reduced on ECN. Will generally
      be set in individual packets. The flag will be set once per loss
      event. Thus, the probability of it being set is proportional to
      the degree of congestion in the network, but it is less likely to
      be set than the CE flag.

   o  ECE (Echo Congestion Experience)

      If 'congestion experienced' is signaled on a received IP header,
      this is echoed through the ECE bit in segments sent by the
      receiver until acknowledged by seeing the CWR bit set. Clearly
      in periods of high congestion and/or long RTT, this flag will be
      frequently set to '1'.

      During connection open (SYN and SYN/ACK packets) the ECN bits
      have special meaning:

         CWR + ECN-E are both set with SYN to indicate desire to use
         ECN

         ECN-E only is set in SYN-ACK to agree to ECN

      (The difference in bit-patterns for the negotiation is so that it
      will work with broken stacks that reflect the value of reserved
      bits)

   o  URG (Urgent Flag)

      '1' to indicate urgent data [unlikely with any flag other than
      ACK]

   o  ACK (Acknowledgement)

      '1' for all except the initial 'SYN' packet

   o  PSH (Push Function Field)

      Generally accepted to be randomly '0' or '1'. However, it may be
      biased more to one value than the other (this is largely down to
      the implementation of the stack)

   o  RST (Reset Connection)

      '1' to reset a connection [unlikely with any flag other than ACK]

   o  SYN (Synchronize Sequence Number)

      '1' for the SYN/SYN-ACK only at the start of a connection

   o  FIN (End of Data : FINished)

      '1' to indicate 'no more data' [unlikely with any flag other than
      ACK]

5.2.5 Checksum

   Carried as the end-to-end check for the TCP data. See RFC 1144 [6]
   for a discussion of why this should be carried. A header
   compression scheme should not rely upon the TCP checksum for
   robustness, though, and should apply appropriate error-detection
   mechanisms of its own.

   The TCP checksum has to be considered as randomly changing.

5.2.6 Window

   May oscillate randomly between 0 and the receiver's window limit
   (for the connection).

   In practice, the window will either not change, or may alternate
   between a relatively small number of values.
   Particularly when closing (the value is getting smaller), the change
   in window is likely to be related to the segment size, but it is not
   clear that this necessarily offers any compression advantage. When
   the window is opening, the effect of 'Silly-Window Syndrome'
   avoidance should be remembered. This prevents the window opening by
   small amounts that would encourage the sender to clock out small
   segments.

   When thinking about what fields might change in a sequence of TCP
   segments, it should be noted that the receiver can generate 'window
   update' segments in which only the window advertisement changes.

5.2.7 Urgent pointer

   From a compression point of view, the Urgent Pointer is interesting
   because it offers an example where 'semantically identical'
   compression is not the same as 'bitwise identical'. This is because
   the value of the Urgent Pointer is only valid if the URG flag is
   set.

   However, from the discussion of the TCP Checksum above, it should be
   realized that this enforces bitwise transparency of the scheme and
   so this argument is not particularly important.

   If the URG flag is set, then this pointer indicates the end of the
   urgent data and so can point to anywhere in the window. This may be
   set (and changing) over several segments. Note that urgent data is
   rarely used, since it is not a particularly clean way of managing
   out-of-band data.

5.3 Options

   Options occupy space at the end of the TCP header. All options are
   included in the checksum. An option may begin on any byte boundary.
   The TCP header must be padded with zeros to make the header length a
   multiple of 32 bits.

   Optional header fields are identified by an option kind field.
   Options 0 and 1 consist of exactly one octet, which is their kind
   field. All other options have their one octet kind field, followed
   by a one octet length field, followed by length-2 octets of option
   data.
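
   The kind/length/data layout just described can be sketched as a
   simple parser. This is a minimal illustration, assuming the option
   area of the TCP header has already been extracted as a byte string;
   the function name parse_tcp_options and the choice to stop at the
   first End of Option List octet are assumptions of the sketch.

```python
def parse_tcp_options(data):
    """Split a TCP option area into a list of (kind, payload) pairs.

    Kinds 0 and 1 are single octets; every other option is
    kind, length, then length-2 octets of option data.
    """
    options = []
    i = 0
    while i < len(data):
        kind = data[i]
        if kind == 0:               # End of Option List: rest is padding
            options.append((0, b""))
            break
        if kind == 1:               # No-Operation: one octet, no length
            options.append((1, b""))
            i += 1
            continue
        if i + 1 >= len(data):
            raise ValueError("truncated option: missing length octet")
        length = data[i + 1]        # counts the kind and length octets too
        if length < 2 or i + length > len(data):
            raise ValueError("invalid option length")
        options.append((kind, bytes(data[i + 2:i + length])))
        i += length
    return options


if __name__ == "__main__":
    # MSS = 1460, NOP, Window Scale = 7, then zero padding.
    area = bytes([2, 4, 0x05, 0xB4, 1, 3, 3, 7, 0, 0, 0, 0])
    print(parse_tcp_options(area))
```

   A compressor that must carry unknown options transparently only
   needs this generic structure; per-option knowledge is an
   optimization on top of it.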
5.3.1 Options overview Table 2 classifies the IANA known options together with their associated RFCs, if applicable, from IANA [38] West & McCann Expires September 1, 2003 [Page 26] Internet-Draft TCP/IP Field Behavior March 2003 +------+--------+------------------------------------+----------+-----+ | Kind | Length | Meaning | RFC | Use | | |(octets)| | | | +------+--------+------------------------------------+----------+-----+ | 0 | - | End of Option List | RFC 793 | * | | 1 | - | No-Operation | RFC 793 | * | | 2 | 4 | Maximum Segment Size | RFC 793 | * | | 3 | 3 | WSopt - Window Scale | RFC 1323 | * | | 4 | 2 | SACK Permitted | RFC 2018 | * | | 5 | N | SACK | RFC 2018 | * | | 6 | 6 | Echo (obsoleted by option 8) | RFC 1072 | | | 7 | 6 | Echo Reply (obsoleted by option 8) | RFC 1072 | | | 8 | 10 | TSopt - Time Stamp Option | RFC 1323 | * | | 9 | 2 | Partial Order Connection Permitted | RFC 1693 | | | 10 | 3 | Partial Order Service Profile | RFC 1693 | | | 11 | 6 | CC | RFC 1644 | | | 12 | 6 | CC.NEW | RFC 1644 | | | 13 | 6 | CC.ECHO | RFC 1644 | | | 14 | 3 | Alternate Checksum Request | RFC 1146 | | | 15 | N | Alternate Checksum Data | RFC 1146 | | | 16 | | Skeeter | | | | 17 | | Bubba | | | | 18 | 3 | Trailer Checksum Option | | | | 19 | 18 | MD5 Signature Option | RFC 2385 | | | 20 | | SCPS Capabilities | | | | 21 | | Selective Negative Acks | | | | 22 | | Record Boundaries | | | | 23 | | Corruption experienced | | | | 24 | | SNAP | | | | 25 | | Unassigned (released 12/18/00) | | | | 26 | | TCP Compression Filter | | | +------+--------+------------------------------------+----------+-----+ Table 2 Description of common TCP options The 'use' column is marked with '*' to indicate those options that are most likely to be seen in TCP flows. 5.3.2 Option field behavior Generally speaking all option fields have been classified as changing. This section describes the behavior of each option referenced within an RFC, listed by 'kind' indicator. 0. 
End of option list This option code indicates the end of the option list. This might not coincide with the end of the TCP header according to West & McCann Expires September 1, 2003 [Page 27] Internet-Draft TCP/IP Field Behavior March 2003 the Data Offset field. This is used at the end of all options, not the end of each option, and need only be used if the end of the options would not otherwise coincide with the end of the TCP header. Defined in RFC 793 [2]. There is no data associated with this option, a compression scheme must simply be able to encode its presence. 1. No-Operation This option code may be used between options, for example, to align the beginning of a subsequent option on a word boundary. There is no guarantee that senders will use this option, so receivers must be prepared to process options even if they do not begin on a word boundary RFC 793 [2]. There is no data associated with this option, a compression scheme must simply be able to encode its presence. This may be done by noting that the option simply maintains a certain alignment and that compression need only convey this alignment. In this way, padding can just be removed. 2. Maximum Segment Size If this option is present, then it communicates the maximum receive segment size at the TCP that sends this segment. This field must only be sent in the initial connection request (i.e., in segments with the SYN control bit set). If this option is not used, any segment size is allowed RFC 793 [2]. This option is very common. The segment size is a 16-bit quantity. Theoretically this could take any value, however there are a number of values that are more common. For example, 1460 bytes is very common for TCP/IPv4 over Ethernet (though with the increased prevalence of tunnels, for example, smaller values such as 1400 have become more popular). 536 bytes is the default MSS value. This may allow for common values to be encoded more efficiently. 3. 
Window Scale Option (WSopt) This option may be sent in a SYN segment by TCP : (1) to indicate that it is prepared to do both send and receive window scaling, and (2) to communicate a scale factor to be applied to its receive window. West & McCann Expires September 1, 2003 [Page 28] Internet-Draft TCP/IP Field Behavior March 2003 The scale factor is encoded logarithmically, as a power of 2 (presumably to be implemented by binary shifts). Note: the window in the SYN segment itself is never scaled RFC 1072 [4]. This option may be sent in an initial segment (i.e., a segment with the SYN bit on and the ACK bit off). It may also be sent in a segment, but only if a Window Scale option was received in the initial segment. A Window Scale option in a segment without a SYN bit should be ignored. The Window field in a SYN segment itself is never scaled RFC 1323 [9] The use of window scaling does not affect the encoding of any other field during the life-time of the flow. It is only the encoding of the window scaling option itself that is important. The window scale must be between 0 and 14 (inclusive). Generally smaller values would be expected (a window scale of 14 allows for a 1Gbyte window, which is extremely large). 4. SACK-Permitted This option may be sent in a SYN by a TCP that has been extended to receive (and presumably process) the SACK option once the connection has opened RFC 2018 [18]. There is no data in this option, all that is required is for the presence of the option to be encoded. 5. SACK This option is to be used to convey extended acknowledgment information over an established connection. Specifically, it is to be sent by a data receiver to inform the data transmitter of non- contiguous blocks of data that have been received and queued. The data receiver is awaiting the receipt of data in later retransmissions to fill the gaps in sequence space between these blocks. 
At that time, the data receiver will acknowledge the data normally by advancing the left window edge in the Acknowledgment Number field of the TCP header. It is important to understand that the SACK option will not change the meaning of the Acknowledgment Number field, whose value will still specify the left window edge, i.e., one byte beyond the last sequence number of fully-received data RFC 2018 [18]. If SACK has been negotiated (through an exchange of SACK- Permitted options), then this option may occur when dropped West & McCann Expires September 1, 2003 [Page 29] Internet-Draft TCP/IP Field Behavior March 2003 segments are noticed by the receiver. Because this identifies ranges of blocks within the receiver's window, this can be viewed as a base value with a number of offsets. The base value (left edge of the first block) can be viewed as offset from the TCP acknowledgement number. There can be up to 4 SACK blocks in a single option. SACK blocks may occur in a number of segments (if there is more out-of-order data 'on the wire') and this will typically extend the size of or add to the existing blocks. Alternative proposals such as DSACK RFC 2883 [30] do not fundamentally change the behavior of the SACK block, from the point of view of the information contained within it. 6. Echo This option carries information that the receiving TCP may send back in a subsequent TCP Echo Reply option (see below). A TCP may send the TCP Echo option in any segment, but only if a TCP Echo option was received in a SYN segment for the connection. When the TCP echo option is used for RTT measurement, it will be included in data segments, and the four information bytes will define the time at which the data segment was transmitted in any format convenient to the sender RFC 1072 [4]. The Echo option is generally not used in practice -- it is obsoleted by the Timestamp option. However, for transparency it is desirable that a compression scheme be able to transport it. 
(However, there is no benefit in attempting any more sophisticated treatment than viewing it as a generic 'option'). 7. Echo Reply A TCP that receives a TCP Echo option containing four information bytes will return these same bytes in a TCP Echo Reply option. This TCP Echo Reply option must be returned in the next segment (e.g., an ACK segment) that is sent. If more than one Echo option is received before a reply segment is sent, the TCP must choose only one of the options to echo, ignoring the others; specifically, it must choose the newest segment with the oldest sequence number (see RFC 1072 [4]). The Echo option is generally not used in practice -- it is obsoleted by the Timestamp option. However, for transparency it is desirable that a compression scheme be able to transport it. (However, there is no benefit in attempting any more sophisticated treatment than viewing it as a generic 'option'). West & McCann Expires September 1, 2003 [Page 30] Internet-Draft TCP/IP Field Behavior March 2003 8. Timestamps This option carries two four-byte timestamp fields. The Timestamp Value field (TSval) contains the current value of the timestamp clock of the TCP sending the option. The Timestamp Echo Reply field (TSecr) is only valid if the ACK bit is set in the TCP header; if it is valid, it echoes a timestamp value that was sent by the remote TCP in the TSval field of a Timestamps option. When TSecr is not valid, its value must be zero. The TSecr value will generally be from the most recent Timestamp option that was received; however, there are exceptions that are explained below. A TCP may send the Timestamps option (TSopt) in an initial segment (i.e., segment containing a SYN bit and no ACK bit), and may send a TSopt in other segments only if it received a TSopt in the initial segment for the connection RFC 1323 [9]. Timestamps are quite commonly used. 
If timestamp options are exchanged in the connection set-up phase, then they are expected to appear on all subsequent segments. If this exchange does not happen, then they will not appear for the remainder of the flow. Note that currently it is assumed that the negotiation of options such as timestamp occurs in the SYN packets. However, should this ever be allowed to change (allowing timestamps to be enabled during an existing connection, for example), the presence of the option should still be handled correctly. Because the value being carried is a timestamp, it is logical to expect that the entire value need not be carried. There is no obvious pattern of increments that might be expected, however. An important reason for using the timestamp option is to allow detection of sequence space wrap-around (Protection Against Wrapped Sequence-number, or PAWS RFC 1323 [9]). It is not expected that this is serious concern on the links that TCP header compression would be deployed on, but it is important that the integrity of this option is maintained. This issue is discussed in, for example, RFC 3150 [32]. However, the proposed Eifel algorithm [36] makes use of timestamps and so, currently, it is recommended that timestamps are used for cellular-type links [34]. With regard to compression, it is further noted that the range of resolutions for the timestamp suggested in RFC 1323 [9] is quite wide (1ms to 1s per 'tick'). This (along with the West & McCann Expires September 1, 2003 [Page 31] Internet-Draft TCP/IP Field Behavior March 2003 perhaps wide variation in RTT) makes it hard to select a set of encodings that will be optimal in all cases. 9. 
Partial Order Connection (POC) permitted This option represents a simple indicator communicated between the two peer transport entities to establish the operation of the POC protocol RFC 1693 [12] The Partial Order Connection option is in practice never seen, and so the only requirement is that the header compression scheme should be able to encode it. 10. POC service profile This option serves to communicate the information necessary to carry out the job of the protocol -- the type of information that is typically found in the header of a TCP segment. The Partial Order Connection option is in practice never seen, and so the only requirement is that the header compression scheme should be able to encode it. 11. Connection Count (CC) This option is part of the implementation of TCP Accelerated Open (TAO) that effectively bypasses the TCP Three-Way Handshake (3WHS). TAO introduces a 32-bit incarnation number, called a "connection count" (CC) that is carried in a TCP option in each segment. A distinct CC value is assigned to each direction of an open connection. The implementation assigns monotonically increasing CC values to successive connections that it opens actively or passively RFC 1644 [11]. This option is in practice never seen, and so the only requirement is that the header compression scheme should be able to encode it. 12. CC.NEW Correctness of the TAO mechanism requires that clients generate monotonically increasing CC values for successive connection initiations. Receiving a CC.NEW causes the server to invalidate its cache entry and do a 3WHS. RFC 1644 [11]. This option is in practice never seen, and so the only requirement is that the header compression scheme should be West & McCann Expires September 1, 2003 [Page 32] Internet-Draft TCP/IP Field Behavior March 2003 able to encode it. 13. 
CC.ECHO

When a server host sends a SYN segment, it echoes the connection count from the initial SYN in a CC.ECHO option, which is used by the client host to validate the SYN, ACK segment (RFC 1644 [11]).

This option is in practice never seen, and so the only requirement is that the header compression scheme should be able to encode it.

14. Alternate Checksum Request

This option may be sent in a SYN segment by a TCP to indicate that the TCP is prepared to both generate and receive checksums based on an alternate algorithm. During communication, the alternate checksum replaces the regular TCP checksum in the checksum field of the TCP header. Should the alternate checksum require more than 2 octets to transmit, the checksum may either be moved into a TCP Alternate Checksum Data Option and the checksum field of the TCP header be sent as 0, or the data may be split between the header field and the option. Alternate checksums are computed over the same data as the regular TCP checksum (RFC 1146 [7]).

This option is in practice never seen, and so the only requirement is that the header compression scheme should be able to encode it. It would only occur in connection set-up (SYN) packets. Even if this option were used, it would not affect the handling of the checksum, since this should be carried transparently in any case.

15. Alternate Checksum Data

This field is used only when the alternate checksum that is negotiated is longer than 16 bits. These checksums will not fit in the checksum field of the TCP header and thus at least part of them must be put in an option. Whether the checksum is split between the checksum field in the TCP header and the option or the entire checksum is placed in the option is determined on a checksum by checksum basis. The length of this option will depend on the choice of alternate checksum algorithm for this connection (RFC 1146 [7]).
If an alternative checksum were negotiated in the connection set-up, then this option may appear on all subsequent packets (if needed to carry the checksum data). However, this option is in practice never seen, and so the only requirement is that the header compression scheme should be able to encode it.

16. -- 18. Are non-RFC references and are not considered in this document.

19. MD5 Digest

Every segment sent on a TCP connection to be protected against spoofing will contain the 16-byte MD5 digest produced by applying the MD5 algorithm to a concatenated block of data.

Upon receiving a signed segment, the receiver must validate it by calculating its own digest from the same data (using its own key) and comparing the two digests. A failing comparison must result in the segment being dropped and must not produce any response back to the sender. Logging the failure is probably advisable.

Unlike other TCP extensions (e.g., the Window Scale option [9]), the absence of the option in the SYN, ACK segment must not cause the sender to disable its sending of signatures. This negotiation is typically done to prevent some TCP implementations from misbehaving upon receiving options in non-SYN segments. This is not a problem for this option, since the SYN, ACK sent during connection negotiation will not be signed and will thus be ignored. The connection will never be made, and non-SYN segments with options will never be sent. More importantly, the sending of signatures must be under the complete control of the application, not at the mercy of the remote host not understanding the option.

MD5 digest information should, like any cryptographically secure data, be incompressible. Therefore the compression scheme must simply carry this option transparently, if it occurs.

20. -- 26. Are non-RFC references and are not considered in this document.
This only means that their behavior is not described in detail, as a compression scheme is not expected to be optimised for these options. However, any unrecognised option must be transparently carried by a TCP compression scheme in order to work efficiently in the presence of new or rare options.

In the discussion above regarding timestamps it is pointed out that there is the possibility (at some time in the future) of negotiations being permitted more generally than in the SYN packets at connection set-up. Although there seems to be no compelling need to optimise for this, it is worth pointing out that the compression scheme should be able to cope with arbitrary options appearing at any point within the flow. There is also no guarantee that a compression scheme will see the SYN packets of a connection set-up.

6. Other observations

6.1 Implicit acknowledgements

There may be a small number of cues for 'implicit acknowledgements' in a TCP flow. Even if the compressor only sees the data transfer direction, for example, then seeing a packet without the SYN flag set implies that the SYN packet has been received.

There is a clear requirement for the deployment of compression to be topologically independent. This means that it is not actually possible to be sure that seeing a data packet at the compressor guarantees that the SYN packet has been correctly received by the decompressor (as the SYN packet may have taken an alternative path).

However, it may be that there are other such cues that may be used in certain circumstances to improve compression efficiency.

6.2 Shared data

It can be seen that there are two distinct deployments -- one where the forward and reverse paths share a link and one where they do not. In the former case a compressor and decompressor could be co-located.
It may then be possible for the compressor and decompressor at each end of the link to exchange information. This could lead to possible optimizations. For example, acknowledgement numbers are generally taken from the sequence numbers in the opposite direction. Since an acknowledgement cannot be generated for a packet that has not passed across the link, this offers an efficient way of encoding acknowledgements.

6.3 TCP header overhead

For a TCP bulk data-transfer the overhead is not that onerous, particularly compared to the typical RTP voice case. Although spectral efficiency is clearly an important goal, it does not seem critical to extract every last bit of compression gain.

However, in the acknowledgement direction (i.e. for 'pure' acknowledgement headers) the overhead could be said to be infinite (since there is no data being carried). This is why optimizations for the acknowledgement path may be considered useful.

There are a number of schemes for manipulating TCP acknowledgements to reduce the ACK bandwidth. Many of these are documented in [33] and [32]. Most of these schemes are entirely compatible with header compression, without requiring any particular support from either. While it is not expected that a compression scheme will support experimental options, it is useful that these be considered when developing header compression schemes, and vice versa.

6.4 Field independence and packet behavior

It should be apparent that direct comparisons with the highly 'packet' based view of RTP compression are hard. RTP header fields tend to change regularly per-packet and many fields (IPv4 IP ID, RTP sequence number and RTP timestamp, for example) typically change in a dependent manner. However, TCP fields, such as the sequence number, tend to change more unpredictably, partly because of the influence of external factors (size of TCP windows, application behavior, etc.).
Also, the field values tend to change independently. Overall, this makes compression more challenging and makes it harder to select a set of encodings that can successfully trade off efficiency and robustness.

6.5 Short-lived flows

It is hard to see what can be done to improve performance for a single, unpredictable, short-lived connection. However, there are commonly cases where there will be multiple TCP connections between the same pair of hosts. As a particular example, consider web browsing (this is more the case with HTTP/1.0 [15] than HTTP/1.1 [20]).

When a connection closes, it is either the last connection between that pair of hosts or it is likely that another connection will open within a relatively short space of time. In this case, the IP header part of the context will probably be almost identical. Certain aspects of the TCP context may also be similar.

Support for context replication is discussed in more detail in Section 4. Overall, support for sub-context sharing, or initializing one context from another, offers useful optimizations for a sequence of short-lived connections.

It is noted that although TCP is connection oriented, it is hard for a compressor to tell whether or not a TCP flow has finished. For example, even in the 'bi-directional' link case, seeing a FIN and the ACK of the FIN at the compressor/decompressor does not mean that the FIN cannot be retransmitted. Thus it may be more useful to think about initializing a new context from an existing one, rather than re-using an existing one.

As mentioned previously, in Section 5.1.3, the IP header can clearly be shared between any transport-layer flows between the same two end-points. There may be limited scope for initialisation of a new TCP header from an existing one. The port numbers are the most obvious starting point.
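As a rough illustration of initializing a new context from an existing one, the following Python sketch copies the IP part of a context unchanged and overrides only those TCP fields that differ for the new connection. All names used here (IpContext, TcpContext, FlowContext, replicate_context, changed_fields) and the choice of fields are assumptions made for this example; they are not taken from this memo or from any ROHC profile.

```python
from dataclasses import asdict, dataclass, replace


@dataclass(frozen=True)
class IpContext:
    src_addr: str
    dst_addr: str
    ttl: int
    tos: int


@dataclass(frozen=True)
class TcpContext:
    src_port: int
    dst_port: int
    seq: int
    ack: int
    window: int


@dataclass(frozen=True)
class FlowContext:
    ip: IpContext
    tcp: TcpContext


def replicate_context(base: FlowContext, **tcp_changes: int) -> FlowContext:
    """Derive a context for a new connection from an existing one.

    The base context is left untouched, because the old flow may still
    see a retransmitted FIN. The IP part is shared as-is (hypothetical
    policy for this sketch); only the named TCP fields are overridden.
    """
    return FlowContext(ip=base.ip, tcp=replace(base.tcp, **tcp_changes))


def changed_fields(base: FlowContext, new: FlowContext) -> list[str]:
    """Return the names of fields that differ from the base context,
    i.e. those that would have to be sent explicitly."""
    old, cur = asdict(base), asdict(new)
    return [
        f"{layer}.{name}"
        for layer in ("ip", "tcp")
        for name in old[layer]
        if old[layer][name] != cur[layer][name]
    ]


# An existing HTTP connection and a second one between the same hosts.
base = FlowContext(
    ip=IpContext("192.0.2.1", "198.51.100.7", ttl=64, tos=0),
    tcp=TcpContext(src_port=40001, dst_port=80, seq=1000, ack=5000,
                   window=65535),
)
new = replicate_context(base, src_port=40002, seq=77000, ack=0)
print(changed_fields(base, new))
```

In this example only the source port, sequence number and acknowledgement number differ from the base context; the IP fields, the destination port and the window are inherited. A real scheme would also need to decide how the base context is identified and how long it remains valid.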
6.6 Master Sequence Number

As pointed out earlier in Section 5.1.3 there is no obvious candidate for a 'master sequence number' in TCP. Moreover, it is noted that such a master sequence number is only required to allow a decompressor to acknowledge packets in bi-directional mode. It can also be seen that such a sequence number would not be required for every packet.

While the sequence number only needs to be 'sparse', it is clear that there is a requirement for an explicitly added sequence number. There are no obvious ways of guaranteeing the unique identity of a packet other than by adding such a sequence number (sequence and acknowledgement numbers can both remain the same, for example).

As a further note, support for re-ordering of compressed packets would require a sequence number external to the compressed packet. This is so that re-ordering could be identified prior to attempting decompression.

6.7 Size constraint for TCP options

It can be seen from the above analysis that most TCP options, such as MSS, WSopt and SACK-Permitted, may appear only on a SYN segment. Every implementation should (and we expect that most will) ignore unknown options on SYN segments. TCP options will be sent on non-SYN segments only when an exchange of options on the SYN segments has indicated that both sides understand the extension. Other TCP options, such as MD5 Digest and Timestamp, also tend to be sent when the connection is initiated (i.e. in the SYN packet).

The total header size is also an issue. The TCP header specifies where segment data starts with a 4-bit field which gives the total size of the header (including options) in 32-bit words. This means that the total size of the header plus options must be less than or equal to 60 bytes (15 words of 4 bytes each); with the 20-byte fixed header, this leaves 40 bytes for options.

7.
Security considerations

Since this document only describes TCP field behavior there are no direct security concerns raised by it.

This memo is intended to be used to aid the compression of TCP/IP headers. Where authentication mechanisms such as IPsec AH [13] are used, it is important that compression is transparent. Where encryption methods such as IPsec ESP [14] are used, the TCP fields may not be visible, preventing compression.

8. Acknowledgements

Many IP and TCP RFCs (hopefully all of which have been collated below) have been sources of ideas and knowledge, together with header compression schemes from RFC 1144, RFC 2509 and RFC 3095, and of course the detailed analysis of RTP/UDP/IP in RFC 3095.

This draft also benefited from discussion on the rohc mailing list and in various corridors (virtual or otherwise) about many key issues; special thanks to Qian Zhang, Carsten Bormann and Gorry Fairhurst.

Qian Zhang and Hongbin Liao contributed the extensive analysis of shareable header fields.

Any remaining misrepresentation or misinterpretation of information is entirely the fault of the authors of this draft.

References

[1] Postel, J., "Internet Protocol", STD 5, RFC 791, September 1981.

[2] Postel, J., "Transmission Control Protocol", STD 7, RFC 793, September 1981.

[3] Nagle, J., "Congestion control in IP/TCP internetworks", RFC 896, January 1984.

[4] Jacobson, V. and R. Braden, "TCP extensions for long-delay paths", RFC 1072, October 1988.

[5] Braden, R., "Requirements for Internet Hosts - Communication Layers", STD 3, RFC 1122, October 1989.

[6] Jacobson, V., "Compressing TCP/IP headers for low-speed serial links", RFC 1144, February 1990.

[7] Zweig, J. and C. Partridge, "TCP alternate checksum options", RFC 1146, March 1990.

[8] Mogul, J. and S. Deering, "Path MTU discovery", RFC 1191, November 1990.

[9] Jacobson, V., Braden, B. and D.
Borman, "TCP Extensions for High Performance", RFC 1323, May 1992.

[10] Almquist, P., "Type of Service in the Internet Protocol Suite", RFC 1349, July 1992.

[11] Braden, B., "T/TCP -- TCP Extensions for Transactions Functional Specification", RFC 1644, July 1994.

[12] Connolly, T., Amer, P. and P. Conrad, "An Extension to TCP : Partial Order Service", RFC 1693, November 1994.

[13] Atkinson, R., "IP Authentication Header", RFC 1826, August 1995.

[14] Atkinson, R., "IP Encapsulating Security Payload (ESP)", RFC 1827, August 1995.

[15] Berners-Lee, T., Fielding, R. and H. Nielsen, "Hypertext Transfer Protocol -- HTTP/1.0", RFC 1945, May 1996.

[16] Bellovin, S., "Defending Against Sequence Number Attacks", RFC 1948, May 1996.

[17] Stevens, W., "TCP Slow Start, Congestion Avoidance, Fast Retransmit, and Fast Recovery Algorithms", RFC 2001, January 1997.

[18] Mathis, M., Mahdavi, J., Floyd, S. and A. Romanow, "TCP Selective Acknowledgment Options", RFC 2018, October 1996.

[19] Bradner, S., "The Internet Standards Process -- Revision 3", BCP 9, RFC 2026, October 1996.

[20] Fielding, R., Gettys, J., Mogul, J., Nielsen, H. and T. Berners-Lee, "Hypertext Transfer Protocol -- HTTP/1.1", RFC 2068, January 1997.

[21] Bradner, S., "Key words for use in RFCs to Indicate Requirement Levels", BCP 14, RFC 2119, March 1997.

[22] Heffernan, A., "Protection of BGP Sessions via the TCP MD5 Signature Option", RFC 2385, August 1998.

[23] Nichols, K., Blake, S., Baker, F. and D. Black, "Definition of the Differentiated Services Field (DS Field) in the IPv4 and IPv6 Headers", RFC 2474, December 1998.

[24] Ramakrishnan, K. and S. Floyd, "A Proposal to add Explicit Congestion Notification (ECN) to IP", RFC 2481, January 1999.

[25] Degermark, M., Nordgren, B. and S. Pink, "IP Header Compression", RFC 2507, February 1999.

[26] Casner, S. and V.
Jacobson, "Compressing IP/UDP/RTP Headers for Low-Speed Serial Links", RFC 2508, February 1999.

[27] Engan, M., Casner, S. and C. Bormann, "IP Header Compression over PPP", RFC 2509, February 1999.

[28] Allman, M., Paxson, V. and W. Stevens, "TCP Congestion Control", RFC 2581, April 1999.

[29] Bradner, S. and V. Paxson, "IANA Allocation Guidelines For Values In the Internet Protocol and Related Headers", BCP 37, RFC 2780, March 2000.

[30] Floyd, S., Mahdavi, J., Mathis, M. and M. Podolsky, "An Extension to the Selective Acknowledgement (SACK) Option for TCP", RFC 2883, July 2000.

[31] Bormann, C., Burmeister, C., Degermark, M., Fukushima, H., Hannu, H., Jonsson, L-E., Hakenberg, R., Koren, T., Le, K., Liu, Z., Martensson, A., Miyazaki, A., Svanbro, K., Wiebke, T., Yoshimura, T. and H. Zheng, "RObust Header Compression (ROHC): Framework and four profiles: RTP, UDP, ESP, and uncompressed", RFC 3095, July 2001.

[32] Dawkins, S., Montenegro, G., Kojo, M. and V. Magret, "End-to-end Performance Implications of Slow Links", BCP 48, RFC 3150, July 2001.

[33] Balakrishnan, H., Padmanabhan, V., Fairhurst, G. and M. Sooriyabandara, "TCP Performance Implications of Network Path Asymmetry", RFC 3449, December 2002.

[34] Inamura, H., Montenegro, G., Ludwig, R., Gurtov, A. and F. Khafizov, "TCP over Second (2.5G) and Third (3G) Generation Wireless Networks", RFC 3481, February 2003.

[35] Spring, N., Wetherall, D. and D. Ely, "Robust ECN Signaling with Nonces", draft-ietf-tsvwg-tcp-nonce-04.txt (work in progress), October 2002.

[36] Ludwig, R. and M. Meyer, "The Eifel Detection Algorithm for TCP", draft-ietf-tsvwg-tcp-eifel-alg-07.txt (work in progress), December 2002.

[37] Karn, P., "Advice for Internet Subnetwork Designers", draft-ietf-pilc-link-design-13.txt (work in progress), February 2003.

[38]

Authors' Addresses

Mark A. West
Siemens/Roke Manor
Roke Manor Research Ltd.
Romsey, Hants SO51 0ZN
UK

Phone: +44 (0)1794 833311
EMail: mark.a.west@roke.co.uk
URI: http://www.roke.co.uk

Stephen McCann
Siemens/Roke Manor
Roke Manor Research Ltd.
Romsey, Hants SO51 0ZN
UK

Phone: +44 (0)1794 833341
EMail: stephen.mccann@roke.co.uk
URI: http://www.roke.co.uk

Full Copyright Statement

Copyright (C) The Internet Society (2003). All Rights Reserved.

This document and translations of it may be copied and furnished to others, and derivative works that comment on or otherwise explain it or assist in its implementation may be prepared, copied, published and distributed, in whole or in part, without restriction of any kind, provided that the above copyright notice and this paragraph are included on all such copies and derivative works. However, this document itself may not be modified in any way, such as by removing the copyright notice or references to the Internet Society or other Internet organizations, except as needed for the purpose of developing Internet standards in which case the procedures for copyrights defined in the Internet Standards process must be followed, or as required to translate it into languages other than English.

The limited permissions granted above are perpetual and will not be revoked by the Internet Society or its successors or assigns.

This document and the information contained herein is provided on an "AS IS" basis and THE INTERNET SOCIETY AND THE INTERNET ENGINEERING TASK FORCE DISCLAIMS ALL WARRANTIES, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.

Acknowledgement

Funding for the RFC Editor function is currently provided by the Internet Society.