<?xml version="1.0" encoding="US-ASCII"?>
<!DOCTYPE RFC SYSTEM "rfc2629.dtd" [
<!ENTITY RFC4648 SYSTEM
"http://xml2rfc.ietf.org/public/rfc/bibxml/reference.RFC.4648.xml">
<!ENTITY RFC2119 SYSTEM
"http://xml2rfc.ietf.org/public/rfc/bibxml/reference.RFC.2119.xml">
]>
<?xml-stylesheet type='text/xsl' href='RFC2629.xslt' ?>
<?rfc compact="yes"?>
<?rfc toc="yes"?>
<?rfc symrefs="yes"?>
<?rfc sortrefs="yes"?>
<!-- Expand crefs and put them inline -->
<?rfc comments='yes' ?>
<?rfc inline='yes' ?>  

<rfc docName="draft-faltstrom-base45-00" ipr="trust200902" category="std">
  <front>
    <title abbrev="Base45">
      The Base45 Data Encoding
    </title>
    <author fullname="Patrik Faltstrom" initials="P." surname="Faltstrom">
      <organization abbrev="Netnod">Netnod</organization>
      <address>
	<email>paf@netnod.se</email>
      </address>
    </author>
    <author fullname="Fredrik Ljunggren" initials="F." surname="Ljunggren">
      <organization abbrev="Kirei">Kirei</organization>
      <address>
	<email>fredrik@kirei.se</email>
      </address>
    </author>
    <author fullname="Dirk-Willem van Gulik" initials="D." surname="van Gulik">
      <organization abbrev="Webweaving">Webweaving</organization>
      <address>
	<email>dirkx@webweaving.org</email>
      </address>
    </author>
    <date month="March" year="2021" day="11"/>
    <area>Operations</area>
    <keyword>BASE45</keyword>
    <abstract>
      <t>
	This document describes the base 45 encoding scheme which is
	built upon the base 64, base 32 and base 16 encoding schemes.
      </t>
    </abstract>
  </front>
  <middle>
    <section anchor="intro" title="Introduction">
      <t>
	When using QR or Aztec codes a different encoding scheme is
	needed than the already established base 64, base 32 and base
	16 encoding schemes that are described in <xref
	target="RFC4648">RFC 4648</xref>. The difference from those and
	base 45 is the key table and that the padding with '=' is not
	required.
      </t>
    </section>
    <section title="Conventions Used in This Document">
      <t>
	The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL
	NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and
	"OPTIONAL" in this document are to be interpreted as described
	in <xref target="RFC2119">RFC 2119</xref>.
      </t>
    </section>
    <section title="Interpretation of Encoded Data">
      <t>
	Encoded data is to be interpreted as described in <xref
	target="RFC4648">RFC 4648</xref> with the exception that a
	different alphabet is selected.
      </t>
    </section>
    <section title="The Base 45 Encoding">
      <t>
	A 45-character subset of US-ASCII is used, the 45 characters
	that can be used in a QR or Aztec code. If we look at Base 64,
	it encodes 3 bytes in 4 characters. Base 45 encodes 2 bytes
	over 3 characters.
      </t>
      <t>
	The two bytes [A, B] are turned into [ C, D, E] where (A*256)
	+ B = (C*45*45) + (D*45) + E. The values C, D and E are then
	looked up in Table 1 to produce a three character string and
	the reverse when decoding.
      </t>
      <t>
	If the number of octets are not dividable by two, the last
	remaining byte is represented by two characters.
      </t>
      <t>
	<figure><artwork>
               Table 1: The Base 45 Alphabet

Value Encoding  Value Encoding  Value Encoding  Value Encoding
   00 0            12 C            24 O            36 Space
   01 1            13 D            25 P            37 $
   02 2            14 E            26 Q            38 %
   03 3            15 F            27 R            39 *
   04 4            16 G            28 S            40 +
   05 5            17 H            29 T            41 -
   06 6            18 I            30 U            42 .
   07 7            19 J            31 V            43 /
   08 8            20 K            32 W            44 :
   09 9            21 L            33 X
   10 A            22 M            34 Y
   11 B            23 N            35 Z
        </artwork></figure>
      </t>
      <section title="Encoding example">
	<t>
	  A series of bytes is turned into groups of two. Each such 16
	  bit value is turned into a series of three values calculated
	  by doing successive calculations modulo 45. The values are
	  in turned looked up in what is displayed in Table 1.
	</t>
	<t>
	  Example: The string "Hello!" is the byte sequence [ 72, 101,
	  108, 108, 111, 33 ]. If we look at each 16 bit value, it is
	  [ 18633, 27756, 28449]. When looking at the values modulo
	  45, we get [[ 9, 9, 3], [ 13, 30, 36], [14, 2, 9]]. By
	  looking up these values in the table we get the encoded
	  string "993DU E29".
	</t>
      </section>
    </section>
    <section title="IANA Considerations">
      <t>
	There are no considerations for IANA in this document.
      </t>
    </section>
    <section title="Security Considerations">
      <t>
	When implementing encoding and decoding it is important to be
	very careful so that buffer overflow does not take place, or
	anything similar. This includes of course the calculations of
	modulo 45 and lookup in the table of characters. Decoder also
	must be robust regarding input, including proper handling of
	the NUL character (ASCII 0).
      </t>
      <t>
	Specifically it should be noted that Base 64 (for example) pad
	the string so that the encoding has the correct number of
	characters. This is something that Base 45 does not do,
	i.e. Base 45 do not include padding.  Because of this, special
	care is to be taken when odd number of octets are to be
	encoded which results not in N*3 characters, but (N-1)*3+2
	characters in the encoded string and vice versa, when the
	number of encoded characters are not divisible by 3.
      </t>
    </section>
    <section title="Acknowledgements">
      <t>
	The authors thank everyone that have been working with Base64
	during the years that have proven the implementions are
	stable.
      </t>
    </section>
  </middle>
  <back>
    <references title='Normative References'>
      &RFC4648;
      &RFC2119;
    </references>
  </back>
</rfc>
