POP3 Support for UTF-8
Author(s): Chris Newman, Randall Gellens, Jiankang Yao, Kazunori Fujiwara
This specification extends the Post Office Protocol version 3 (POP3) to support UTF-8 encoded international string in user names, passwords, mail addresses, message headers, and protocol-level textual strings....
Network Working Group R. Gellens Internet-Draft QUALCOMM Incorporated Obsoletes: 5721 (if approved) C. Newman Intended status: Standards Track Oracle Expires: April 25, 2013 J. Yao CNNIC K. Fujiwara JPRS October 22, 2012 POP3 Support for UTF-8 draft-ietf-eai-rfc5721bis-08.txt Abstract This specification extends the Post Office Protocol version 3 (POP3) to support UTF-8 encoded international string in user names, passwords, mail addresses, message headers, and protocol-level textual strings. Status of This Memo This Internet-Draft is submitted in full conformance with the provisions of BCP 78 and BCP 79. Internet-Drafts are working documents of the Internet Engineering Task Force (IETF). Note that other groups may also distribute working documents as Internet-Drafts. The list of current Internet- Drafts is at http://datatracker.ietf.org/drafts/current/. Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress." This Internet-Draft will expire on April 25, 2013. Copyright Notice Copyright (c) 2012 IETF Trust and the persons identified as the document authors. All rights reserved. This document is subject to BCP 78 and the IETF Trust's Legal Provisions Relating to IETF Documents (http://trustee.ietf.org/license-info) in effect on the date of publication of this document. Please review these documents carefully, as they describe your rights and restrictions with respect Gellens, et al. Expires April 25, 2013 [Page 1] Internet-Draft POP3 Support for UTF-8 October 2012 to this document. Code Components extracted from this document must include Simplified BSD License text as described in Section 4.e of the Trust Legal Provisions and are provided without warranty as described in the Simplified BSD License. This document may contain material from IETF Documents or IETF Contributions published or made publicly available before November 10, 2008. The person(s) controlling the copyright in some of this material may not have granted the IETF Trust the right to allow modifications of such material outside the IETF Standards Process. Without obtaining an adequate license from the person(s) controlling the copyright in such materials, this document may not be modified outside the IETF Standards Process, and derivative works of it may not be created outside the IETF Standards Process, except to format it for publication as an RFC or to translate it into languages other than English. Table of Contents 1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . . 3 1.1. Conventions Used in This Document . . . . . . . . . . . . 3 2. UTF8 Capability . . . . . . . . . . . . . . . . . . . . . . . 4 2.1. The UTF8 Command . . . . . . . . . . . . . . . . . . . . . 4 2.2. USER Argument to UTF8 Capability . . . . . . . . . . . . . 6 3. LANG Capability . . . . . . . . . . . . . . . . . . . . . . . 6 4. Non-ASCII character Maildrops . . . . . . . . . . . . . . . . 9 5. UTF8 Response Code . . . . . . . . . . . . . . . . . . . . . . 9 6. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 10 7. Security Considerations . . . . . . . . . . . . . . . . . . . 10 8. Change History . . . . . . . . . . . . . . . . . . . . . . . . 10 8.1. draft-ietf-eai-rfc5721bis: Version 00 . . . . . . . . . . 10 8.2. draft-ietf-eai-rfc5721bis: Version 01 . . . . . . . . . . 10 8.3. draft-ietf-eai-rfc5721bis: Version 02 . . . . . . . . . . 11 8.4. draft-ietf-eai-rfc5721bis: Version 03 . . . . . . . . . . 11 8.5. draft-ietf-eai-rfc5721bis: Version 04 . . . . . . . . . . 11 8.6. draft-ietf-eai-rfc5721bis: Version 05 . . . . . . . . . . 11 8.7. draft-ietf-eai-rfc5721bis: Version 06 . . . . . . . . . . 11 8.8. draft-ietf-eai-rfc5721bis: Version 07 . . . . . . . . . . 11 8.9. draft-ietf-eai-rfc5721bis: Version 08 . . . . . . . . . . 11 9. References . . . . . . . . . . . . . . . . . . . . . . . . . . 11 9.1. Normative References . . . . . . . . . . . . . . . . . . . 11 9.2. Informative References . . . . . . . . . . . . . . . . . . 13 Appendix A. Design Rationale . . . . . . . . . . . . . . . . . . 13 Appendix B. Acknowledgments . . . . . . . . . . . . . . . . . . . 13 Gellens, et al. Expires April 25, 2013 [Page 2] Internet-Draft POP3 Support for UTF-8 October 2012 1. Introduction This document forms part of the Email Address Internationalization protocols described in the Email Address Internationalization Framework document [RFC6530]. As part of the overall Email Address Internationalization work, email messages could be transmitted and delivered containing Unicode string encoded in UTF-8 in the header and/or body, and maildrops that are accessed using POP3 [RFC1939] might natively store UTF-8. This specification extends POP3 [RFC1939] using the POP3 extension mechanism [RFC2449] to permit un-encoded UTF-8 [RFC3629] in headers, and bodies (e.g., transferred using 8-bit Content Transfer Encoding) as described in "Internationalized Email Headers" [RFC6532]. It also adds a mechanism to support login names and passwords containing UTF-8 string and a mechanism to support UTF-8 string in protocol level response strings as well as the ability to negotiate a language for such response strings. This specification also adds a new response code to indicate that a message was not delivered because it required UTF-8 mode discussed in section 2 and the server was unable or unwilling to create and deliver a variant form of the message as discussed in Section 7 of [I-D.ietf-eai-5738bis]. This specification replaces an earlier, experimental, approach to the same problem RFC 5721 [RFC 5721]. 1.1. Conventions Used in This Document The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in "Key words for use in RFCs to Indicate Requirement Levels" [RFC 2119]. The terms "UTF-8 string" or "UTF-8 character" are used to refer to Unicode characters, which may or may not be members of the ASCII subset, in UTF-8 RFC3629 [RFC3629], a standard Unicode Encoding Form. All other specialized terms used in this specification are defined in the Email Address Internationalization framework document. In examples, "C:" and "S:" indicate lines sent by the client and server, respectively. If a single "C:" or "S:" label applies to multiple lines, then the line breaks between those lines are for editorial clarity only and are not part of the actual protocol exchange. Note that examples always use ASCII characters due to limitations of Gellens, et al. Expires April 25, 2013 [Page 3] Internet-Draft POP3 Support for UTF-8 October 2012 this document format; otherwise, some examples for the "LANG" command may appear incorrectly. 2. UTF8 Capability This specification adds a new POP3 Extension [RFC2449] capability response tag and command to specify support for header field information in UTF-8 rather than only ASCII. The capability tag and new command and functionality are described below. CAPA tag: UTF8 Arguments with CAPA tag: USER Added Commands: UTF8 Standard commands affected: USER, PASS, APOP, LIST, TOP, RETR Announced states / possible differences: both / no Commands valid in states: AUTHORIZATION Specification reference: this document Discussion: This capability adds the "UTF8" command to POP3. The UTF8 command switches the session from the ASCII-only mode of RFC1939 to UTF-8 mode. The UTF-8 mode means that, all messages transmitted between servers and clients are UTF-8 strings, and both servers and clients can send and accept UTF-8 string. 2.1. The UTF8 Command The UTF8 command enables UTF-8 mode. The UTF8 command has no parameters. UTF-8 mode has no effect on messages in an ASCII-only maildrop. Messages in native UTF-8 maildrops can be encoded either in UTF-8 using internationalized headers [RFC6532] and/or 8bit content- transfer-encoding (see MIME Section 2.8 [RFC2045]), or in ASCII. The Gellens, et al. Expires April 25, 2013 [Page 4] Internet-Draft POP3 Support for UTF-8 October 2012 message at maildrops can be encoded in ASCII, UTF-8, or something else. In UTF-8 mode, if the character encoding format of maildrops is UTF-8 or ASCII, the messages are sent to the client as-is; if the character encoding format of maildrops is format other than UTF-8 or ASCII, the messages' encoding format SHOULD be converted to be UTF-8 before they are sent to the client. When UTF-8 mode has not been enabled, non-ASCII strings MUST NOT be sent to the client as-is. If a client requests a UTF-8 message when UTF-8 mode is not enabled, the server MUST either send the client a surrogate message that complies with unextended POP and Internet Mail Format without UTF-8 mode support, or fail the request with a -ERR response. See [I-D.ietf-eai-5738bis], Section 7, for information about creating a surrogate message, and for a discussion of potential issues. Section 5 of this document discusses UTF8 response codes. The server MAY respond to the UTF8 command with an -ERR response. Note that even in UTF-8 mode, MIME binary content-transfer-encoding as defined in MIME Section 6.2 [RFC2045] is still not permitted. MIME 8bit content-transfer-encoding (8BITMIME) [RFC6152] is obviously allowed. The octet count (size) of a message reported in a response to the LIST command SHOULD match the actual number of octets sent in a RETR response (not counting byte-stuffing). Sizes reported elsewhere, such as in STAT responses and non-standardized, free-form text in positive status indicators (following "+OK") need not be accurate, but it is preferable if they were. Normal operation for maildrops that natively support non-ASCII characters will be for both servers and clients to support the extension discussed in this specification. Upgrading of both clients and servers is the only fully satisfactory way to support the capabilities offered by the "UTF8" extension and SMTPUTF8 mail more generally. Servers must, however, anticipate the possibility of a client attempting to access a message that requires this extension without having issued the "UTF8" command. There are no completely satisfactory responses for that case other than upgrading the client to support this specification. One solution, unsatisfactory because the user may be confused by being able to access the message through some means and not others, is that a server MAY choose to reject the command to retrieve the message as discussed in Section 5. Other alternatives, including the possibility of creating and delivering variant form of the message, are discussed in Section 7 of [I-D.ietf-eai-5738bis]. Clients MUST NOT issue the STLS command [RFC2595] after issuing UTF8; servers MAY (but are not required to) enforce this by rejecting with an "-ERR" response an STLS command issued subsequent to a successful Gellens, et al. Expires April 25, 2013 [Page 5] Internet-Draft POP3 Support for UTF-8 October 2012 UTF8 command. (Because this is a protocol error as opposed to a failure based on conditions, an extended response code [RFC2449] is not specified.) 2.2. USER Argument to UTF8 Capability If the USER argument is included with this capability, it indicates that the server accepts UTF-8 user names and passwords. Servers that include the USER argument in the UTF8 capability response SHOULD apply SASLprep [RFC4013] or one of its standards- track successors to the arguments of the USER and PASS commands. A client or server that supports APOP and permits UTF-8 in user names or passwords MUST apply SASLprep [RFC4013] or one of its standards- track successors to the user name and password used to compute the APOP digest. When applying SASLprep [RFC4013], servers MUST reject UTF-8 user names or passwords that contain a UTF-8 character listed in Section 2.3 of SASLprep. When applying SASLprep to the USER argument, the PASS argument, or the APOP username argument, a compliant server or client MUST treat them as a query string [RFC3454]. When applying SASLprep to the APOP password argument, a compliant server or client MUST treat them as a stored string [RFC3454]. The client does not need to issue the UTF8 command prior to using UTF-8 in authentication. However, clients MUST NOT use UTF-8 string in USER, PASS, or APOP commands unless the USER argument is included in the UTF8 capability response. The server MUST reject UTF-8 user names or passwords that fail to comply with the formal syntax in UTF-8 [RFC3629]. Use of UTF-8 string in the AUTH command is governed by the POP3 SASL [RFC5034] mechanism. 3. LANG Capability This document adds a new POP3 Extension [RFC2449] capability response tag to indicate support for a new command: LANG. The capability tag and new command are described below. CAPA tag: LANG Gellens, et al. Expires April 25, 2013 [Page 6] Internet-Draft POP3 Support for UTF-8 October 2012 Arguments with CAPA tag: none Added Commands: LANG Standard commands affected: All Announced states / possible differences: both / no Commands valid in states: AUTHORIZATION, TRANSACTION Specification reference: this document Discussion: POP3 allows most +OK and -ERR server responses to include human- readable text that, in some cases, might be presented to the user. But that text is limited to ASCII by the POP3 specification [RFC1939]. The LANG capability and command permit a POP3 client to negotiate which language the server uses when sending human-readable text. The LANG command requests that human-readable text included in all subsequent +OK and -ERR responses be localized to a language matching the language range argument (the "Basic Language Range" as described by [RFC4647]). If the command succeeds, the server returns a +OK response followed by a single space, the exact language tag selected, another space. Human-readable text in the appropriate language then appears in the rest of the line. This and subsequent protocol-level human-readable text is encoded in the UTF-8 charset. If the command fails, the server returns an -ERR response and subsequent human-readable response text continues to use the language that was previously used. If the client issues a LANG command with the special "*" language range argument, it indicates a request to use a language designated as preferred by the server administrator. The preferred language MAY vary based on the currently active user. If no argument is given and the POP3 server issues a positive response, that response will usually consist of multi-lines. After the initial +OK, for each language tag the server supports, the POP3 Gellens, et al. Expires April 25, 2013 [Page 7] Internet-Draft POP3 Support for UTF-8 October 2012 server responds with a line for that language. This line is called a "language listing". In order to simplify parsing, all POP3 servers are required to use a certain format for language listings. A language listing consists of the language tag [RFC5646] of the message, optionally followed by a single space and a human-readable description of the language in the language itself, using the UTF-8 charset. There are no specific listing order of languages, which may depend on configuration or implementation. Examples: Note that some examples do not include the correct character accents due to limitations of this document format. C: USER karen S: +OK Hello, karen C: PASS password S: +OK karen's maildrop contains 2 messages (320 octets) Client requests deprecated MUL language. Server replies with -ERR response. C: LANG MUL S: -ERR invalid language MUL A LANG command with no parameters is a request for a language listing. C: LANG S: +OK Language listing follows: S: en English S: en-boont English Boontling dialect S: de Deutsch S: it Italiano S: es Espanol S: sv Svenska S: . A request for a language listing might fail. C: LANG S: -ERR Server is unable to list languages Once the client selects the language, all responses will be in that language, starting with the response to the LANG command. Gellens, et al. Expires April 25, 2013 [Page 8] Internet-Draft POP3 Support for UTF-8 October 2012 C: LANG es S: +OK es Idioma cambiado If a server returns an -ERR response to a LANG command that specifies a primary language, the current language for responses remains in effect. C: LANG uga S: -ERR es Idioma <<UGA>> no es conocido C: LANG sv S: +OK sv Kommandot "LANG" lyckades C: LANG * S: +OK es Idioma cambiado 4. Non-ASCII character Maildrops When a POP3 server uses a native non-ASCII character maildrop, it is the responsibility of the server to comply with the POP3 base specification [RFC1939] and Internet Message Format [RFC5322] when not in UTF-8 mode. When the server is not in UTF-8 mode and the message requires that mode, requests to download the message MAY be rejected (as specified in the next section) or the various other alternatives outlined in Section 2.1 above, including creation and delivery of variations on the original message, MAY be considered. 5. UTF8 Response Code Per "POP3 Extension Mechanism" [RFC2449], this document adds a new response code: UTF8, described below. Complete response code: UTF8 Valid for responses: -ERR Valid for commands: LIST, TOP, RETR Response code meaning and expected client behavior: The UTF8 response code indicates that a failure is due to a request when not in UTF-8 mode for message content containing UTF-8 string. The client MAY reissue the command after entering UTF-8 mode. Gellens, et al. Expires April 25, 2013 [Page 9] Internet-Draft POP3 Support for UTF-8 October 2012 6. IANA Considerations Section 2 and 3 of this specification update two capabilities ("UTF8" and "LANG") to the POP3 capability registry [RFC2449]. Section 5 of this specification also adds one new response code ("UTF8") to the POP3 response codes registry [RFC2449]. 7. Security Considerations The security considerations of UTF-8 [RFC3629], SASLprep [RFC4013] and Unicode Format for Network Interchange [RFC5198] apply to this specification, particularly with respect to use of UTF-8 in user names and passwords. The "LANG *" command might reveal the existence and preferred language of a user to an active attacker probing the system if the active language changes in response to the USER, PASS, or APOP commands prior to validating the user's credentials. Servers are strongly advised to implement a configuration to prevent this exposure. It is possible for a man-in-the-middle attacker to insert a LANG command in the command stream, thus making protocol-level diagnostic responses unintelligible to the user. A mechanism to protect the integrity of the session can be used to defeat such attacks. For example, a client can issue the STLS command [RFC2595] before issuing the LANG command. As with other internationalization upgrades, modifications to server authentication code (in this case, to support non-ASCII strings) needs to be done with care to avoid introducing vulnerabilities (for example, in string parsing or matching). This is particularly important if the native databases or mailstore of the operating system use some character set or encoding other than Unicode in UTF-8. 8. Change History 8.1. draft-ietf-eai-rfc5721bis: Version 00 following the new charter 8.2. draft-ietf-eai-rfc5721bis: Version 01 refine the texts Gellens, et al. Expires April 25, 2013 [Page 10] Internet-Draft POP3 Support for UTF-8 October 2012 8.3. draft-ietf-eai-rfc5721bis: Version 02 update the texts based on Joseph's comments 8.4. draft-ietf-eai-rfc5721bis: Version 03 improve the texts text instructing servers to either downconvert or reject new UTF-8 response code for use 8.5. draft-ietf-eai-rfc5721bis: Version 04 improve the texts 8.6. draft-ietf-eai-rfc5721bis: Version 05 updated according to jabber interim meeting result updated according to john and apparea's review comments 8.7. draft-ietf-eai-rfc5721bis: Version 06 improve the texts, updated section 3.2 to provide for SASL successor specs. 8.8. draft-ietf-eai-rfc5721bis: Version 07 updated according to John's comments 8.9. draft-ietf-eai-rfc5721bis: Version 08 improve the texts 9. References 9.1. Normative References [I-D.ietf-eai-5738bis] Resnick, P., Newman, C., and S. Shen, "IMAP Support for UTF-8", draft-ietf-eai-5738bis-03 (work in progress), December 2011. [RFC1939] Myers, J. and M. Rose, "Post Office Protocol - Version 3", STD 53, RFC1939, May 1996. [RFC2045] Freed, N. and N. Borenstein, "Multipurpose Internet Mail Extensions (MIME) Part One: Gellens, et al. Expires April 25, 2013 [Page 11] Internet-Draft POP3 Support for UTF-8 October 2012 Format of Internet Message Bodies", RFC2045, November 1996. [RFC2047] Moore, K., "MIME (Multipurpose Internet Mail Extensions) Part Three: Message Header Extensions for Non-ASCII Text", RFC2047, November 1996. [RFC 2119] Bradner, S., "Key words for use in RFCs to Indicate Requirement Levels", BCP 14, RFC 2119, March 1997. [RFC2449] Gellens, R., Newman, C., and L. Lundblade, "POP3 Extension Mechanism", RFC2449, November 1998. [RFC3454] Hoffman, P. and M. Blanchet, "Preparation of Internationalized Strings ("stringprep")", RFC3454, December 2002. [RFC3629] Yergeau, F., "UTF-8, a transformation format of ISO 10646", STD 63, RFC3629, November 2003. [RFC4013] Zeilenga, K., "SASLprep: Stringprep Profile for User Names and Passwords", RFC4013, February 2005. [RFC4647] Phillips, A. and M. Davis, "Matching of Language Tags", BCP 47, RFC4647, September 2006. [RFC5198] Klensin, J. and M. Padlipsky, "Unicode Format for Network Interchange", RFC5198, March 2008. [RFC5322] Resnick, P., Ed., "Internet Message Format", RFC5322, October 2008. [RFC5646] Phillips, A. and M. Davis, "Tags for Identifying Languages", BCP 47, RFC5646, September 2009. [RFC6152] Klensin, J., Freed, N., Rose, M., and D. Crocker, "SMTP Service Extension for 8-bit MIME Transport", STD 71, RFC6152, March 2011. Gellens, et al. Expires April 25, 2013 [Page 12] Internet-Draft POP3 Support for UTF-8 October 2012 [RFC6530] Klensin, J. and Y. Ko, "Overview and Framework for Internationalized Email", RFC6530, February 2012. [RFC6532] Yang, A., Steele, S., and N. Freed, "Internationalized Email Headers", RFC6532, February 2012. 9.2. Informative References [RFC2595] Newman, C., "Using TLS with IMAP, POP3 and ACAP", RFC2595, June 1999. [RFC5034] Siemborski, R. and A. Menon-Sen, "The Post Office Protocol (POP3) Simple Authentication and Security Layer (SASL) Authentication Mechanism", RFC5034, July 2007. [RFC 5721] Gellens, R. and C. Newman, "POP3 Support for UTF-8", RFC 5721, February 2010. Appendix A. Design Rationale This non-normative section discusses the reasons behind some of the design choices in the above specification. Due to interoperability problems with RFC2047 and limited deployment of RFC 2231, it is hoped these 7-bit encoding mechanisms can be deprecated in the future when UTF-8 header support becomes prevalent. The USER capability (Section 2.2) and hence the upgraded USER command and additional support for non-ASCII credentials, are optional because the implementation burden of SASLprep [RFC4013] is not well understood, and mandating such support in all cases could negatively impact deployment. Appendix B. Acknowledgments Thanks to John Klensin, Joseph Yee, Tony Hansen, Alexey Melnikov and other Email Address Internationalization working group participants who provided helpful suggestions and interesting debate that improved this specification. Gellens, et al. Expires April 25, 2013 [Page 13] Internet-Draft POP3 Support for UTF-8 October 2012 Authors' Addresses Randall Gellens QUALCOMM Incorporated 5775 Morehouse Drive San Diego, CA 92651 US EMail: email@example.com Chris Newman Oracle 800 Royal Oaks Monrovia, CA 91016-6347 US EMail: firstname.lastname@example.org Jiankang YAO CNNIC No.4 South 4th Street, Zhongguancun Beijing Phone: +86 10 58813007 EMail: email@example.com Kazunori Fujiwara Japan Registry Services Co., Ltd. Chiyoda First Bldg. East 13F, 3-8-1 Nishi-Kanda Tokyo Phone: +81 3 5215 8451 EMail: firstname.lastname@example.org Gellens, et al. Expires April 25, 2013 [Page 14]