SMAP syntax overview

The rest of this document defines the individual SMAP commands. All commands from the client, and all replies from the server, follow the same basic syntax. All SMAP commands and replies use the UTF-8 character set, regardless of the natural language used for error messages or prompts. An SMAP command generally consists of one or more words, separated by at least one whitespace character (U+0020, U+0009, or U+000D). An SMAP command is terminated by the newline character (U+000A).

Note

The server does not translate actual message contents to UTF-8. The server provides the contents of a message to the client as is. The UTF-8 character set is used for:

Names of folders
Error and status messages
Search strings

Other text-based entities already use an implicit character set (such as US-ASCII for names of E-mail headers) or are specified as opaque text strings (such as message unique identifiers, loginid and password) and have no explicitly defined character set.

Maximum limitations and timeouts

RFC 2822 sets the maximum length of a line in an E-mail message at 998 characters, not counting the terminating CRLF. This limitation is applicable to SMAP, since SMAP deals with E-mail. Additionally, SMAP commands may not exceed 8000 characters. SMAP servers should accept commands up to 8000 characters long. SMAP clients should not send commands that exceed 8000 characters in length.

Note

This, of course, does not apply to binary-formatted multi-line replies, which transfer binary MIME attachments.

SMAP servers should terminate inactive SMAP clients. SMAP servers must have a timeout of at least 30 minutes. SMAP clients must not wait more than 29 minutes before sending the next command to the server. SMAP clients that must remain idle for a prolonged period of time should periodically send the NOOP command to prevent themselves from being disconnected for inactivity.
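
One way a client could keep track of the 29 minute limit is sketched below. This is only an illustration: the IdleTracker class, its method names, and the one minute safety margin are assumptions made for this example and are not defined by SMAP.

```python
import time

# Clients must not wait more than 29 minutes between commands.
IDLE_LIMIT = 29 * 60  # seconds


class IdleTracker:
    """Hypothetical helper that tells a client when a NOOP is due."""

    def __init__(self, margin=60.0, clock=time.monotonic):
        # margin: assumed safety margin, in seconds, before the limit
        self._clock = clock
        self._margin = margin
        self._last_sent = clock()

    def mark_sent(self):
        """Call this after every command sent to the server."""
        self._last_sent = self._clock()

    def noop_due(self):
        """True when the client should send NOOP to stay connected."""
        idle = self._clock() - self._last_sent
        return idle >= IDLE_LIMIT - self._margin
```

A client's event loop would call noop_due() periodically, send NOOP when it returns true, and then call mark_sent().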

Words

A word can contain any character except for a control character (U+0000 through U+001F). If a word contains spaces or quotes (U+0020 or U+0022), the word is enclosed in quotes: a quote character is placed before the first character of the word, and a second quote character follows the last character of the word. Quotes that are part of the word are doubled. For example, "Learning the ""ABC""'s" is a single SMAP word. The word itself contains one quote character before the letter "A" and another quote character after the letter "C"; everything else is as it appears. A word that consists of a single quote character is represented as """". A word that is meant to be completely empty is represented by two quotes: "".
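
The quoting rules can be sketched in code as follows. This is a minimal illustration, not a reference implementation; the function names quote_word and split_words are made up for this example, as is the command name CMD in the usage note.

```python
def quote_word(word: str) -> str:
    """Encode one SMAP word, adding quotes only when they are required."""
    if any(ord(ch) < 0x20 for ch in word):
        raise ValueError("control characters are not allowed in a word")
    # Empty words, and words containing a space or a quote, must be quoted.
    if word == "" or " " in word or '"' in word:
        return '"' + word.replace('"', '""') + '"'
    return word


def split_words(line: str) -> list[str]:
    """Split a command or reply line into decoded SMAP words."""
    words = []
    i, n = 0, len(line)
    while i < n:
        if line[i] in " \t\r\n":          # whitespace separates words
            i += 1
            continue
        if line[i] == '"':                # start of a quoted word
            i += 1
            buf = []
            while True:
                if i >= n:
                    raise ValueError("unterminated quoted word")
                if line[i] == '"':
                    if i + 1 < n and line[i + 1] == '"':
                        buf.append('"')   # doubled quote is a literal quote
                        i += 2
                        continue
                    i += 1                # closing quote
                    break
                buf.append(line[i])
                i += 1
            words.append("".join(buf))
        else:                             # bare word runs to next whitespace
            start = i
            while i < n and line[i] not in " \t\r\n":
                i += 1
            words.append(line[start:i])
    return words
```

For instance, split_words applied to the line CMD "Learning the ""ABC""'s" """" "" yields four words: CMD, the phrase with its two embedded quotes, a single quote character, and an empty word.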

SMAP server replies

SMAP server replies are also (except where noted) lines of text terminated by U+000A. There are three general classes of server replies. All three generally consist either entirely of whitespace-delimited words (formatted like the words of a command), or of one or more whitespace-delimited words followed by an informative, free-form message.

SMAP servers are allowed to reply with lines of text terminated by the CR+LF sequence, U+000D U+000A. The ASCII CR character is interpreted by SMAP clients as whitespace filler, and is generally ignored. SMAP servers must be prepared to receive client commands that use either the CR+LF or the bare LF newline sequence. The only exception to this rule is the initial connection negotiation, which must use CR+LF in order to remain compatible with IMAP.

An SMAP client receives the server's response by reading an entire U+000A-terminated line, then parsing the first word, or character, to determine the reply's format.
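
As a rough sketch, the first-word dispatch could look like the following, using the three reply classes defined in the sections that follow. The function name classify_reply and the returned labels are assumptions chosen for this example.

```python
def classify_reply(line: str) -> str:
    """Classify one U+000A-terminated reply line by its first word."""
    # str.split() with no separator treats CR, LF, tab and space as
    # whitespace, so a trailing CR+LF is ignored here.
    parts = line.split(None, 1)
    first = parts[0] if parts else ""
    if first in ("+OK", "-ERR"):
        return "status"
    if first == "*":
        return "single"
    if first.startswith("{"):
        return "multi"
    return "unknown"
```

Replies received before logging in follow the IMAP syntax instead, so a real client would not pass them through a function like this.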

Status replies

A line whose first word is either “+OK” or “-ERR” is a “status reply”. “+OK” indicates that the client's request succeeded; “-ERR” indicates that it failed. The rest of the line is a free-form message, suitable for display as the acknowledgement of the original command. The status reply only reports success or failure; it does not carry any of the information requested by the original command. That information, if any, is sent before the status reply, using either the "single line" or the "multiple line" format (see below). A command can result in more than one single or multi-line reply. After sending an SMAP command, the client keeps reading single and multi-line replies until it reads the final status reply. After receiving the status reply, the client may proceed to send the next command to the server.

The SMAP client must wait until the status reply is received before sending the next command.

Note

In order to allow interoperability with IMAP, server replies prior to logging in are an exception to this reply format. They follow the general IMAP syntax.

Single line replies

A server reply where the first whitespace-delimited word is the “*” character is called a “single line reply”. Single line replies send information, requested by the original command, formatted as whitespace-delimited words. The information carried by the words depends on the reply. The actual format of a single line reply depends on the original command, but can usually be determined by the word that follows “*”. A single line reply does not indicate whether the client's command succeeded; the client must still wait to receive the final status reply. The only exceptions to this rule are the initial connection greeting and an unexpected connection termination, which are described later.

Note

The client must be prepared to receive multiple single line and multiple line replies, followed by an “-ERR” status reply. This happens when the server encounters an error in the middle of processing the client's request.
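
The sketch below shows one way a client might collect single line replies until the status reply arrives, including the case where data is followed by “-ERR”. It is deliberately incomplete: multiple line replies are not handled, and the function name run_command and its return shape are assumptions for this example. The keyword FOO in the usage note is likewise hypothetical.

```python
def run_command(lines):
    """Consume the reply lines of one command.

    `lines` is any iterable of reply lines. Returns (ok, message, info),
    where info holds the text of each single line reply seen so far,
    even when the final status reply is -ERR.
    """
    info = []
    for raw in lines:
        parts = raw.split(None, 1)
        first = parts[0] if parts else ""
        rest = parts[1].strip() if len(parts) > 1 else ""
        if first in ("+OK", "-ERR"):
            return first == "+OK", rest, info
        if first == "*":
            info.append(rest)
        elif first.startswith("{"):
            raise NotImplementedError(
                "multiple line replies are not handled in this sketch")
    raise ConnectionError("connection closed before a status reply arrived")
```

Given the lines "* FOO 1", "* FOO 2" and "-ERR something failed", the function returns a failed status together with the two single line replies that arrived before the error.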

Multiple line replies

Certain information returned by an SMAP server cannot be conveniently represented as a single line of text. An example would be the contents of a message. Obviously, a message contains many lines of text. A server reply whose first word starts with the '{' character is called a “multiple line reply”. This name is actually slightly misleading; this format may carry binary data that bears no resemblance to lines of text.

There are two separate multi-line reply formats. The first format is suitable for line-oriented textual content. It's called the “dot-stuffed format”. The first word of the server reply is “{.nnnn}”, where “nnnn” is a decimal number. This number is the server's estimate of the total size of the multi-line reply, in bytes. It is not an exact byte count, just a reasonable estimate. The server is not required to compute the exact byte count before sending the data, just provide a ballpark estimate.

The remaining words of the server's multi-line reply line contain other information, depending on the nature of the data. For example, a single SMAP command can request the server to return the contents of two or more messages. The server may process those messages in any order. The remainder of the server's multi-line reply line indicates which message this multi-line reply refers to. The actual data follows the server's multi-line reply line. The data is transmitted as lines of text, each line terminated by the U+000A character. Servers are also allowed to use the U+000D/U+000A end-of-line sequence. The end of the data is marked by a line that contains only a single period, U+002E, followed by optional whitespace. Lines of data that begin with U+002E have a second U+002E character prepended.
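
To illustrate the sending side, here is a minimal sketch of dot-stuffing a block of text. The function name dot_stuff is an assumption, and the sketch emits only the data and its terminating line, not the “{.nnnn}” reply line that precedes them.

```python
def dot_stuff(text: bytes) -> bytes:
    """Encode text as dot-stuffed lines followed by the terminating '.'."""
    lines = text.split(b"\n")
    if lines and lines[-1] == b"":
        lines.pop()                  # text ended with a newline, or is empty
    out = []
    for line in lines:
        if line.startswith(b"."):
            line = b"." + line       # prepend a second period
        out.append(line + b"\n")
    out.append(b".\n")               # end-of-data marker
    return b"".join(out)
```

Text that does not end with a newline gets one added here, since every transmitted line must be terminated by U+000A.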

An SMAP client, upon receiving a multi-line reply in this format, begins reading U+000A-terminated lines of text, until it reads a line containing a single U+002E, and optional whitespace. Other lines with a leading U+002E character have it removed, and the rest of the line gets saved as the returned data.
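
The reading procedure might be sketched as follows, assuming the reply line itself has already been read from a binary stream. The function name read_dot_stuffed is made up for this example. Any CR characters inside data lines are kept as received.

```python
import io


def read_dot_stuffed(stream) -> bytes:
    """Read dot-stuffed data from a binary stream, up to the '.' line."""
    data = []
    while True:
        line = stream.readline()
        if not line:
            raise EOFError("stream ended before the terminating '.' line")
        if line.startswith(b"."):
            if line[1:].strip() == b"":
                break                # single period plus optional whitespace
            line = line[1:]          # drop the extra leading period
        data.append(line)
    return b"".join(data)
```

An empty multi-line reply, where the terminating line follows the reply line immediately, decodes to zero bytes: read_dot_stuffed(io.BytesIO(b".\n")) returns an empty bytes object.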

The second multi-line server reply format carries binary data. It begins with the word “{xxxx/yyyy}”, where “xxxx” and “yyyy” are decimal numbers. “yyyy” gives the server's estimate of the total size of the binary data, and “xxxx” gives the byte count of the first part that the server is about to send. The server does not have to send everything as a single block. The server is allowed to break the binary data down into smaller, manageable chunks, which are sent one at a time.

The remaining words of the multi-line reply line are the same as in the dot-stuffed format. Immediately following the multi-line reply line's trailing U+000A, the server sends exactly the number of bytes given by xxxx. This is called a "binary chunk".

Each binary chunk is followed by another line of text terminated by the U+000A character. A trailing line that's empty, or contains only whitespace, indicates the end of the multi-line binary data. Otherwise the line contains a single word, xxxx (which can have leading or trailing whitespace filler), that gives the byte count of the next binary chunk, which immediately follows the U+000A character.

The SMAP client reads the initial multi-line reply line, and obtains the first chunk's byte count and the estimated total byte count. The SMAP client then reads the exact number of bytes indicated by the byte count. Afterwards, the SMAP client enters a loop where it first reads a newline-terminated line of text. If the line is empty or contains only whitespace filler, then this is the end of the binary data, and the client proceeds to read the next server reply. Otherwise the client extracts the next chunk's byte count, reads the indicated number of bytes, then repeats the process.
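
A minimal sketch of this loop, reading from a binary stream, might look like the following. The names read_binary_reply and _read_exact are assumptions for this example, and the remaining context words on the reply line are ignored for brevity.

```python
import io
import re

# First word of a binary multi-line reply: {xxxx/yyyy}
_HEADER = re.compile(rb"^\{(\d+)/(\d+)\}")


def _read_exact(stream, count: int) -> bytes:
    """Read exactly `count` bytes, looping over short reads."""
    buf = bytearray()
    while len(buf) < count:
        piece = stream.read(count - len(buf))
        if not piece:
            raise EOFError("stream ended inside a binary chunk")
        buf += piece
    return bytes(buf)


def read_binary_reply(stream):
    """Read one binary multi-line reply; return (estimate, data)."""
    header = stream.readline()
    match = _HEADER.match(header.lstrip())
    if match is None:
        raise ValueError("not a binary multiple line reply")
    count, estimate = int(match.group(1)), int(match.group(2))
    chunks = []
    while True:
        chunks.append(_read_exact(stream, count))
        line = stream.readline()     # line that follows every chunk
        if not line:
            raise EOFError("stream ended before the end-of-data line")
        if line.strip() == b"":
            break                    # empty line: end of binary data
        count = int(line.strip())    # byte count of the next chunk
    return estimate, b"".join(chunks)
```

Because the total size is only an estimate, the sketch relies on the per-chunk byte counts and never on the estimate to decide when the data ends.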

An empty multi-line reply

An expected multi-line reply may be empty. An example: the client requests specific E-mail headers, but the message does not have them. This is indicated by an empty multi-line reply. An empty multi-line reply is indicated by the word "{.0}" (followed by the remaining words that specify the context of this multi-line reply). The next line sent by the server contains a single "." character. This sequence is parsed by an SMAP client as a completely empty multi-line reply.