IPsec Working Group                                           R. Housley
Internet Draft                                          RSA Laboratories
expires in six months                                          July 2002


                  Using AES Counter Mode With IPsec ESP


Status of this Memo

   This document is an Internet-Draft and is in full conformance with
   all provisions of Section 10 of RFC2026.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF), its areas, and its working groups. Note that other
   groups may also distribute working documents as Internet-Drafts.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time. It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   The list of current Internet-Drafts can be accessed at
   http://www.ietf.org/ietf/1id-abstracts.txt

   The list of Internet-Draft Shadow Directories can be accessed at
   http://www.ietf.org/shadow.html.

   This document is a submission to the IETF Internet Protocol Security
   (IPsec) Working Group. Please send comments on this document to the
   working group mailing list (ipsec@lists.tislabs.com).

   Distribution of this memo is unlimited.

Abstract

   This document describes the use of AES Counter Mode, with an explicit
   initialization vector, as an IPsec Encapsulating Security Payload
   confidentiality mechanism.

Housley                                                         [Page 1]

INTERNET DRAFT                                                 July 2002

1. Introduction

   The National Institute of Standards and Technology (NIST) recently
   selected the Advanced Encryption Standard (AES) [AES], also known as
   Rijndael. The AES is a block cipher, and it can be used in many
   different modes. This document describes the use of AES Counter Mode
   (AES-CTR), with an explicit initialization vector (IV), as an IPsec
   Encapsulating Security Payload (ESP) [ESP] confidentiality mechanism.

   This document does not provide an overview of IPsec.
   However, information about the various components of IPsec and the
   way in which they collectively provide security services is available
   in [ARCH] and [ROAD].

1.1. Conventions Used In This Document

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in [STDWORDS].

2. AES Block Cipher

   This section contains a brief description of the relevant
   characteristics of the AES block cipher. Implementation requirements
   are also discussed.

2.1. Counter Mode

   NIST has defined five modes of operation for AES and other
   FIPS-approved block ciphers [MODES]. Each of these modes has
   different characteristics. The five modes are: ECB (Electronic Code
   Book), CBC (Cipher Block Chaining), CFB (Cipher FeedBack), OFB
   (Output FeedBack), and CTR (Counter).

   In this specification, only AES-CTR is discussed. This mode requires
   the encryptor to generate a unique per-packet value, and communicate
   this value to the decryptor. This specification calls this
   per-packet value an initialization vector (IV). The same IV and key
   combination MUST NOT be used more than once. The encryptor can
   generate the IV in any manner that ensures uniqueness. Common
   approaches to IV generation include incrementing a counter for each
   packet and linear feedback shift registers (LFSRs).

   AES Counter mode (AES-CTR) has many properties that make it an
   attractive encryption algorithm for high-speed networking. AES-CTR
   uses the AES block cipher to create a stream cipher. It is easy to
   implement, and it is parallelizable. It can take advantage of
   pipelining. Further, it uses only the AES encrypt operation (for
   both encryption and decryption), making AES-CTR implementations
   smaller than implementations of many other AES modes.

   When used correctly, AES-CTR provides a high level of
   confidentiality. Unfortunately, AES-CTR is easy to use incorrectly.
   Because AES-CTR is a stream cipher, any reuse of the per-packet
   value, called the IV, with the same key is catastrophic. An IV
   collision immediately leaks information about the plaintext in both
   packets. For this reason, it is inappropriate to use this mode of
   operation with statically configured keys. Extraordinary measures
   would be needed to prevent reuse of an IV value with the static key
   across power cycles. To be safe, implementations MUST use fresh keys
   with AES-CTR. The Internet Key Exchange (IKE) [IKE] protocol can be
   used to establish fresh keys.

   With AES-CTR, it is trivial to use a valid ciphertext to forge other
   (valid to the decryptor) ciphertexts. Thus, it is equally
   catastrophic to use AES-CTR without a companion authentication
   function. To be safe, implementations MUST use AES-CTR in
   conjunction with an authentication function, such as HMAC-SHA-1-96
   [HMAC-SHA].

   To encrypt a payload with AES-CTR, the encryptor partitions the
   plaintext, PT, into 128-bit blocks. The final block need not be a
   full 128 bits.

      PT = PT[1] PT[2] ... PT[n]

   Each block of PT is then XORed with a block of the key stream to
   generate the ciphertext, CT. The AES encryption of each counter
   block results in 128 bits of key stream. Part of the 128-bit counter
   block is set to the per-packet IV value, and the least significant 32
   bits of the counter block are initially set to zero. This counter
   value is incremented by one to generate subsequent counter blocks,
   each resulting in another 128 bits of key stream. The encryption of
   n plaintext blocks can be summarized as:

      CTRBLK := IV || ZERO
      FOR i := 1 to n-1 DO
        CT[i] := PT[i] XOR AES(CTRBLK)
        CTRBLK := CTRBLK + 1
      END
      CT[n] := PT[n] XOR TRUNC(AES(CTRBLK))

   The TRUNC() function truncates the output of the AES encrypt
   operation to the same length as the final plaintext block, returning
   the most significant bits.

   Decryption is similar.
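
   As an illustration only, the encryption loop summarized above can be
   sketched in Python. The names ctr_xor, stand_in_block, and prefix are
   hypothetical, and prefix stands for the upper 96 bits of the counter
   block (everything above the 32-bit block counter). Because the
   Python standard library has no AES, the block function used here is a
   SHA-256 based placeholder and is NOT AES; a real implementation would
   pass the AES encrypt operation instead. Since decryption is the same
   XOR against the same key stream, one routine serves both directions.

```python
import hashlib
from typing import Callable

BLOCK = 16  # AES block size in octets


def ctr_xor(block_encrypt: Callable[[bytes], bytes],
            prefix: bytes, data: bytes) -> bytes:
    """XOR data with a counter-mode key stream (encrypts and decrypts)."""
    if len(prefix) != BLOCK - 4:
        raise ValueError("prefix must be the upper 96 bits of the counter block")
    out = bytearray()
    for offset in range(0, len(data), BLOCK):
        counter = offset // BLOCK                 # block counter starts at zero
        ctrblk = prefix + counter.to_bytes(4, "big")
        key_stream = block_encrypt(ctrblk)        # 128 bits of key stream
        chunk = data[offset:offset + BLOCK]
        # zip() stops at the shorter input, which plays the role of TRUNC()
        out.extend(p ^ k for p, k in zip(chunk, key_stream))
    return bytes(out)


def stand_in_block(key: bytes) -> Callable[[bytes], bytes]:
    """Placeholder block function (assumption): hash-based, NOT AES."""
    return lambda blk: hashlib.sha256(key + blk).digest()[:BLOCK]


enc = stand_in_block(b"k" * 16)                   # hypothetical key
prefix = bytes(12)                                # hypothetical counter block prefix
pt = b"a payload that is not a multiple of 16 octets"
ct = ctr_xor(enc, prefix, pt)
assert len(ct) == len(pt)                         # no padding is needed
assert ctr_xor(enc, prefix, ct) == pt             # the same routine decrypts
```

   Passing the block function as a parameter keeps the counter handling
   separate from the cipher, which reflects the fact that only the
   encrypt operation is ever needed.
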
   The decryption of n ciphertext blocks can be summarized as:

      CTRBLK := IV || ZERO
      FOR i := 1 to n-1 DO
        PT[i] := CT[i] XOR AES(CTRBLK)
        CTRBLK := CTRBLK + 1
      END
      PT[n] := CT[n] XOR TRUNC(AES(CTRBLK))

2.2. Key Size and Rounds

   AES supports three key sizes: 128 bits, 192 bits, and 256 bits. The
   default key size is 128 bits, and all implementations MUST support
   this key size. Implementations MAY also support key sizes of 192
   bits and 256 bits.

   AES uses a different number of rounds for each of the defined key
   sizes. When a 128-bit key is used, implementations MUST use 10
   rounds. When a 192-bit key is used, implementations MUST use 12
   rounds. When a 256-bit key is used, implementations MUST use 14
   rounds.

2.3. Block Size

   The AES has a block size of 128 bits (16 octets). As such, when
   using AES-CTR, each AES encrypt operation generates 128 bits of key
   stream. AES-CTR encryption is the XOR of the key stream with the
   plaintext. AES-CTR decryption is the XOR of the key stream with the
   ciphertext. If the generated key stream is longer than the plaintext
   or ciphertext, the extra key stream bits are simply discarded. For
   this reason, AES-CTR does not require the plaintext to be padded to a
   multiple of the block size. However, to provide limited traffic flow
   confidentiality, padding MAY be included, as specified in [ESP].

3. ESP Payload

   The ESP payload is comprised of the IV followed by the ciphertext.
   The payload field, as defined in [ESP], is structured as shown in
   Figure 1.

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                     Initialization Vector                     |
   |                          (8 octets)                           |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                                                               |
   ~                 Encrypted Payload (variable)                  ~
   |                                                               |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                                                               |
   ~                Authentication Data (variable)                 ~
   |                                                               |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

           Figure 1. ESP Payload Encrypted with AES-CTR

3.1. Initialization Vector

   The AES-CTR IV field MUST be eight octets. The IV MUST be chosen by
   the encryptor in a manner that ensures that the same IV value is used
   only once for a given key. The encryptor can generate the IV in any
   manner that ensures uniqueness. Common approaches to IV generation
   include incrementing a counter for each packet and linear feedback
   shift registers (LFSRs).

   Including the IV in each packet ensures that the decryptor can
   generate the key stream needed for decryption, even when some
   datagrams are lost or reordered.

3.2. Encrypted Payload

   The encrypted payload contains the ciphertext.

   AES-CTR mode does not require plaintext padding. However, ESP does
   require padding to 32-bit word-align the authentication data. The
   padding, Pad Length, and the Next Header MUST be concatenated with
   the plaintext before performing encryption, as described in [ESP].

3.3. Authentication Data

   Since it is trivial to construct a forged AES-CTR ciphertext from a
   valid AES-CTR ciphertext, AES-CTR implementations MUST employ a
   non-NULL ESP authentication method. HMAC-SHA-1-96 [HMAC-SHA] is a
   likely choice.

4. Counter Block Format

   Each packet conveys the IV that is necessary to construct the
   sequence of counter blocks used to generate the key stream necessary
   to decrypt the payload. The AES counter block is 128 bits. Figure 2
   shows the format of the counter block.

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |     Flags     |                 Truncated SPI                 |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                     Initialization Vector                     |
   |                                                               |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                         Block Counter                         |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

                  Figure 2. Counter Block Format

   The components of the counter block are as follows:

   Flags
      The Flags field is 8 bits. It MUST be set to zero. The Flags
      field provides compatibility with CCM mode [CCM].

   Truncated SPI
      The truncated SPI field is 24 bits. As the name implies, it
      contains the least significant 24 bits of the ESP SPI.

   Initialization Vector
      The IV field is 64 bits. As described in section 3, the IV MUST
      be chosen by the encryptor in a manner that ensures that the same
      IV value is used only once for a given key.

   Block Counter
      The block counter field is the least significant 32 bits of the
      counter block. The block counter begins with the value of zero,
      and it is incremented to generate subsequent portions of the key
      stream. The block counter is a 32-bit big-endian integer value.

   The first 128-bit block of the packet plaintext is encrypted by
   XORing the plaintext block with the AES encryption of the counter
   block (with the block counter set to zero), the second is encrypted
   by XORing the second block of plaintext with the AES encryption of
   the incremented counter block (with the block counter set to one),
   and so on.

   This construction permits each packet to consist of up to:

      2^32 blocks  =  4,294,967,296 blocks
                   =  68,719,476,736 octets

   This construction provides more key stream for each packet than is
   needed to handle any IPv6 Jumbogram.

5. Test Vectors

   To be supplied.

6. Security Considerations

   When used properly, AES-CTR mode provides strong confidentiality.
   Bellare, Desai, Jokipii, and Rogaway show in [BDJR] that the privacy
   guarantees provided by counter mode are at least as strong as those
   for CBC mode when using the same block cipher.

   Unfortunately, it is very easy to misuse counter mode. If a counter
   value is ever used for more than one packet with the same key, then
   the same key stream will be used to encrypt both packets, and the
   confidentiality guarantees are voided.
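
   The effect of such a collision can be checked with a short Python
   sketch. It is a minimal illustration under stated assumptions: the
   random octets stand in for the key stream that AES-CTR would produce
   for one counter block value, and the two plaintexts and the helper
   xor_bytes are hypothetical.

```python
import os


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """Octet-wise XOR of two equal-length strings."""
    if len(a) != len(b):
        raise ValueError("inputs must have the same length")
    return bytes(x ^ y for x, y in zip(a, b))


key_stream = os.urandom(16)      # stand-in for one block of AES-CTR key stream
p = b"attack at dawn!!"          # hypothetical plaintext, 16 octets
q = b"retreat at noon!"          # hypothetical plaintext, 16 octets

c1 = xor_bytes(p, key_stream)
c2 = xor_bytes(q, key_stream)    # the error: the same key stream is reused

# The key stream cancels, so an observer of c1 and c2 learns P XOR Q
# without knowing the key.
leak = xor_bytes(c1, c2)
assert leak == xor_bytes(p, q)

# If one plaintext is known or guessed, the other follows immediately.
assert xor_bytes(leak, p) == q
```

   Nothing in the sketch depends on how the key stream was produced,
   which is why the problem applies to any stream cipher.
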

   What happens if the encryptor XORs the same key stream with two
   different plaintexts? Suppose two plaintext byte sequences P1, P2,
   P3 and Q1, Q2, Q3 are both encrypted with key stream K1, K2, K3. The
   two corresponding ciphertexts are:

      (P1 XOR K1), (P2 XOR K2), (P3 XOR K3)

      (Q1 XOR K1), (Q2 XOR K2), (Q3 XOR K3)

   If both of these two ciphertext streams are exposed to an attacker,
   then a catastrophic failure of confidentiality results, since:

      (P1 XOR K1) XOR (Q1 XOR K1) = P1 XOR Q1
      (P2 XOR K2) XOR (Q2 XOR K2) = P2 XOR Q2
      (P3 XOR K3) XOR (Q3 XOR K3) = P3 XOR Q3

   Once the attacker obtains the two plaintexts XORed together, it is
   relatively straightforward to separate them. Thus, using any stream
   cipher, including AES-CTR, to encrypt two plaintexts under the same
   key stream leaks the plaintext.

   Therefore, stream ciphers, including AES-CTR, should not be used with
   statically configured keys. It is inappropriate to use AES-CTR with
   statically configured keys. Extraordinary measures would be needed
   to prevent reuse of a counter block value with the static key across
   power cycles. To be safe, implementations MUST use fresh keys with
   AES-CTR. The Internet Key Exchange (IKE) [IKE] protocol can be used
   to establish fresh keys.

   When IKE is used to establish fresh keys between two peer entities,
   separate keys are established for the two traffic flows. If a
   different mechanism is used to establish fresh keys, one that
   establishes only a single key to encrypt packets, then there is a
   high probability that the peers will select the same IV values for
   some packets. Thus, to avoid counter block collisions, ESP
   implementations that permit use of the same key for encrypting and
   decrypting packets with the same peer MUST ensure that the two peers
   assign different SPI values to the security association (SA).
   Further, since the counter block only contains the least significant
   24 bits of the SPI, such implementations MUST ensure that the two SPI
   values differ in their least significant 24 bits.

   Data forgery is trivial with CTR mode. The demonstration of this
   attack is very similar to the discussion above. If a known plaintext
   byte sequence P1, P2, P3 is encrypted with key stream K1, K2, K3,
   then the attacker can replace the plaintext with one of his own
   choosing. The ciphertext is:

      (P1 XOR K1), (P2 XOR K2), (P3 XOR K3)

   The attacker simply XORs a selected sequence Q1, Q2, Q3 with the
   ciphertext to obtain:

      (Q1 XOR (P1 XOR K1)), (Q2 XOR (P2 XOR K2)), (Q3 XOR (P3 XOR K3))

   Which is the same as:

      ((Q1 XOR P1) XOR K1), ((Q2 XOR P2) XOR K2), ((Q3 XOR P3) XOR K3)

   Decryption of the attacker-generated ciphertext will yield exactly
   what the attacker intended:

      (Q1 XOR P1), (Q2 XOR P2), (Q3 XOR P3)

   Accordingly, ESP implementations MUST NOT allow the use of AES-CTR
   without ESP authentication.

   Additionally, AES with a 128-bit key is vulnerable to the birthday
   attack after 2^64 blocks are encrypted with a single key, regardless
   of the mode used. Since ESP with Extended Sequence Numbers allows
   for up to 2^64 packets in a single security association (SA), there
   is real potential for more than 2^64 blocks to be encrypted with one
   key. Implementations SHOULD generate a fresh key before 2^64 blocks
   are encrypted with the same key, or implementations SHOULD make use
   of the longer AES key sizes. Note that ESP with 32-bit Sequence
   Numbers will not exceed 2^64 blocks even if all of the packets are
   maximum-length Jumbograms.

7. Design Rationale

   In the development of this specification, the use of the ESP sequence
   number field instead of an explicit IV field was considered. This
   section documents the rationale for the selection of an explicit IV.
   This selection is not a cryptographic security issue, as either
   approach will prevent counter block collisions.

   The use of the explicit IV does not dictate the manner that the
   encryptor uses to assign the per-packet value in the counter block.
   This is desirable for several reasons.

   1. Only the encryptor can ensure that the value is not used for more
      than one packet, so there is no advantage to selecting a mechanism
      that allows the decryptor to determine whether counter block
      values collide. Damage from the collision is done, whether the
      decryptor detects it or not.

   2. An explicit IV allows adders, LFSRs, and any other technique that
      meets the time budget of the encryptor, so long as the technique
      results in a unique value for each packet. Adders are simple and
      straightforward to implement, but due to carries, they do not
      execute in constant time. LFSRs offer an alternative that
      executes in constant time.

   3. Complexity is under the control of the implementer. Further, the
      decision made by the implementer of the encryptor does not make
      the decryptor more (or less) complex.

   4. The assignment of the per-packet counter block value needs to be
      inside the assurance boundary. Some implementations assign the
      sequence number inside the assurance boundary, but others do not.
      A sequence number collision does not have dire consequences, but,
      as described in section 6, a collision in counter block values has
      disastrous consequences.

   5. Coupling with the sequence number is possible in those
      architectures where the sequence number assignment is performed in
      the assurance boundary. In this situation, the sequence number
      and the IV field will contain the same value.

   6. Decoupling from the sequence number is possible in those
      architectures where the sequence number assignment is performed
      outside the assurance boundary.

   The use of an explicit IV field directly follows from the decoupling
   of the sequence number and the per-packet counter block value. The
   additional overhead (64 bits for the IV field) is acceptable.
   This overhead is significantly less than the overhead associated with
   Cipher Block Chaining (CBC) mode. As normally employed, CBC requires
   a full block for the IV and, on average, half of a block for padding.
   AES-CTR with an explicit IV has about one-third of the overhead of
   AES-CBC, and the overhead is constant for each packet.

8. IANA Considerations

   IANA has assigned three ESP transform numbers for use with AES
   Counter Mode with an explicit IV, one for each AES key size:

      for AES-CTR with a 128 bit key;
      for AES-CTR with a 192 bit key; and
      for AES-CTR with a 256 bit key.

9. Acknowledgements

   This document is the result of extensive discussions and compromises.
   While not all of the participants are completely satisfied with the
   outcome, the document is better for their contributions.

   The author thanks the members of the IPsec working group, with
   special mention of the efforts of (in alphabetical order): Steve
   Bellovin, Niels Ferguson, Steve Kent, David McGrew, Robert Moskowitz,
   Jesse Walker, and Doug Whiting.

10. References

   This section provides normative and informative references.

10.1. Normative References

   [AES]       NIST, FIPS PUB 197, "Advanced Encryption Standard (AES),"
               November 2001.

   [ESP]       Kent, S. and R. Atkinson, "IP Encapsulating Security
               Payload (ESP)," RFC 2406, November 1998.

   [MODES]     Dworkin, M., "Recommendation for Block Cipher Modes of
               Operation: Methods and Techniques," NIST Special
               Publication 800-38A, December 2001.

   [STDWORDS]  Bradner, S., "Key words for use in RFCs to Indicate
               Requirement Levels," RFC 2119, March 1997.

10.2. Informative References

   [ARCH]      Kent, S. and R. Atkinson, "Security Architecture for the
               Internet Protocol," RFC 2401, November 1998.

   [BDJR]      Bellare, M., Desai, A., Jokipii, E., and P. Rogaway, "A
               Concrete Security Treatment of Symmetric Encryption:
               Analysis of the DES Modes of Operation," Proceedings
               38th Annual Symposium on Foundations of Computer Science,
               1997.

   [CCM]       Whiting, D., Housley, R., and N. Ferguson, "AES
               Encryption & Authentication Using CTR Mode & CBC-MAC,"
               IEEE P802.11 doc 02/001r2, May 2002.

   [HMAC-SHA]  Madson, C. and R. Glenn, "The Use of HMAC-SHA-1-96
               within ESP and AH," RFC 2404, November 1998.

   [IKE]       Harkins, D. and D. Carrel, "The Internet Key Exchange
               (IKE)," RFC 2409, November 1998.

   [ROAD]      Thayer, R., Doraswamy, N., and R. Glenn, "IP Security
               Document Roadmap," RFC 2411, November 1998.

11. Author's Address

   Russell Housley
   RSA Laboratories
   918 Spring Knoll Drive
   Herndon, VA 20170
   USA

   rhousley@rsasecurity.com

12. Full Copyright Statement

   Copyright (C) The Internet Society 2002. All Rights Reserved.

   This document and translations of it may be copied and furnished to
   others, and derivative works that comment on or otherwise explain it
   or assist in its implementation may be prepared, copied, published
   and distributed, in whole or in part, without restriction of any
   kind, provided that the above copyright notice and this paragraph are
   included on all such copies and derivative works. However, this
   document itself may not be modified in any way, such as by removing
   the copyright notice or references to the Internet Society or other
   Internet organizations, except as needed for the purpose of
   developing Internet standards in which case the procedures for
   copyrights defined in the Internet Standards process must be
   followed, or as required to translate it into languages other than
   English.

   The limited permissions granted above are perpetual and will not be
   revoked by the Internet Society or its successors or assigns.

   This document and the information contained herein is provided on an
   "AS IS" basis and THE INTERNET SOCIETY AND THE INTERNET ENGINEERING
   TASK FORCE DISCLAIMS ALL WARRANTIES, EXPRESS OR IMPLIED, INCLUDING
   BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE INFORMATION
   HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED WARRANTIES OF
   MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.