source: vendor/python/2.5/Doc/lib/emailheaders.tex

Last change on this file was 3225, checked in by bird, 18 years ago

Python 2.5

\declaremodule{standard}{email.header}
\modulesynopsis{Representing non-ASCII headers}

\rfc{2822} is the base standard that describes the format of email
messages. It derives from the older \rfc{822} standard, which came
into widespread use at a time when most email was composed of \ASCII{}
characters only. \rfc{2822} is a specification written assuming email
contains only 7-bit \ASCII{} characters.

Of course, as email has been deployed worldwide, it has become
internationalized, such that language-specific character sets can now
be used in email messages. The base standard still requires email
messages to be transferred using only 7-bit \ASCII{} characters, so a
number of RFCs have been written describing how to encode email
containing non-\ASCII{} characters into \rfc{2822}-compliant format.
These RFCs include \rfc{2045}, \rfc{2046}, \rfc{2047}, and \rfc{2231}.
The \module{email} package supports these standards in its
\module{email.header} and \module{email.charset} modules.

If you want to include non-\ASCII{} characters in your email headers,
say in the \mailheader{Subject} or \mailheader{To} fields, you should
use the \class{Header} class and assign the field in the
\class{Message} object to an instance of \class{Header} instead of
using a string for the header value. Import the \class{Header} class
from the \module{email.header} module. For example:

\begin{verbatim}
>>> from email.message import Message
>>> from email.header import Header
>>> msg = Message()
>>> h = Header('p\xf6stal', 'iso-8859-1')
>>> msg['Subject'] = h
>>> print msg.as_string()
Subject: =?iso-8859-1?q?p=F6stal?=


\end{verbatim}

Notice how the \mailheader{Subject} field came to contain a
non-\ASCII{} character: we created a \class{Header} instance and
passed in the character set that the byte string was encoded in.
When the \class{Message} instance was subsequently flattened, the
\mailheader{Subject} field was properly \rfc{2047} encoded.
MIME-aware mail readers would show this header using the embedded
ISO-8859-1 character.

\versionadded{2.2.2}

Here is the \class{Header} class description:

\begin{classdesc}{Header}{\optional{s\optional{, charset\optional{,
  maxlinelen\optional{, header_name\optional{, continuation_ws\optional{,
  errors}}}}}}}
Create a MIME-compliant header that can contain strings in different
character sets.

Optional \var{s} is the initial header value. If \code{None} (the
default), the initial header value is not set. You can later append
to the header with \method{append()} method calls. \var{s} may be a
byte string or a Unicode string, but see the \method{append()}
documentation for semantics.

Optional \var{charset} serves two purposes: it has the same meaning as
the \var{charset} argument to the \method{append()} method, and it
sets the default character set for all subsequent \method{append()}
calls that omit the \var{charset} argument. If \var{charset} is not
provided in the constructor (the default), the \code{us-ascii}
character set is used both as \var{s}'s initial charset and as the
default for subsequent \method{append()} calls.

The maximum line length can be specified explicitly via
\var{maxlinelen}. To split the first line at a shorter length (to
account for the field name, e.g. \mailheader{Subject}, which isn't
included in \var{s}), pass in the name of the field in
\var{header_name}. The default \var{maxlinelen} is 76, and the
default value for \var{header_name} is \code{None}, meaning it is not
taken into account for the first line of a long, split header.

Optional \var{continuation_ws} must be \rfc{2822}-compliant folding
whitespace, and is usually either a space or a hard tab character.
This character will be prepended to continuation lines.

Optional \var{errors} is passed straight through to the
\method{append()} method.
\end{classdesc}
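
As a minimal sketch of how the constructor's \var{charset} becomes the
default for later \method{append()} calls, consider the following. The
sample words and variable names are illustrative only. The Unicode
literals and the \code{print()} call form are chosen so that the
snippet should also run on later Python versions. The exact layout of
the encoded value may differ between versions of the package, so no
output is shown.

\begin{verbatim}
from email.header import Header

# The constructor charset applies to the initial value...
h = Header(u'p\xf6stal', 'iso-8859-1', header_name='Subject')
# ...and to any append() call that omits its own charset argument.
h.append(u'br\xfccke')

encoded = h.encode()
print(encoded)
\end{verbatim}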

\begin{methoddesc}[Header]{append}{s\optional{, charset\optional{, errors}}}
Append the string \var{s} to the MIME header.

Optional \var{charset}, if given, should be a \class{Charset} instance
(see \refmodule{email.charset}) or the name of a character set, which
will be converted to a \class{Charset} instance. A value of
\code{None} (the default) means that the \var{charset} given in the
constructor is used.

\var{s} may be a byte string or a Unicode string. If it is a byte
string (i.e. \code{isinstance(s, str)} is true), then
\var{charset} is the encoding of that byte string, and a
\exception{UnicodeError} will be raised if the string cannot be
decoded with that character set.

If \var{s} is a Unicode string, then \var{charset} is a hint
specifying the character set of the characters in the string. In this
case, when producing an \rfc{2822}-compliant header using \rfc{2047}
rules, the Unicode string will be encoded using the following charsets
in order: \code{us-ascii}, the \var{charset} hint, \code{utf-8}. The
first character set to not provoke a \exception{UnicodeError} is used.

Optional \var{errors} is passed through to any \function{unicode()} or
\function{ustr.encode()} call, and defaults to ``strict''.
\end{methoddesc}
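
The sketch below illustrates two of the cases just described: parts
appended with different character sets, and a Unicode string that
cannot be encoded as \code{us-ascii} and is given no \var{charset}
hint. The sample text and variable names are illustrative only. The
snippet is written so that it should also run on later Python
versions, and the exact encoded output is left to \code{print()}
rather than stated here.

\begin{verbatim}
from email.header import Header

h = Header()                          # default charset is us-ascii
h.append(u'Reply to')                 # plain ASCII text stays unencoded
h.append(u'p\xf6stal', 'iso-8859-1')  # explicit charset for this part
mixed = h.encode()
print(mixed)

# A non-ASCII Unicode string with no charset hint cannot be encoded
# as us-ascii, so utf-8 is used instead.
fallback = Header(u'p\xf6stal').encode()
print(fallback)
\end{verbatim}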

\begin{methoddesc}[Header]{encode}{\optional{splitchars}}
Encode a message header into an RFC-compliant format, possibly
wrapping long lines and encapsulating non-\ASCII{} parts in base64 or
quoted-printable encodings. Optional \var{splitchars} is a string
containing characters to split long \ASCII{} lines on, in rough support
of \rfc{2822}'s \emph{highest level syntactic breaks}. This doesn't
affect \rfc{2047} encoded lines.
\end{methoddesc}
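
To see the line wrapping in action, the following sketch encodes a long
\ASCII{} value with a small \var{maxlinelen}. The value, the limit of
40, and the variable names are arbitrary choices for illustration.
Where exactly the lines break depends on the implementation, so the
snippet only prints the result.

\begin{verbatim}
from email.header import Header

text = u' '.join([u'word'] * 30)   # a long run of plain ASCII words
h = Header(text, maxlinelen=40, header_name='Subject')

folded = h.encode()
print(folded)

# Each physical line of the folded value can then be inspected.
lines = folded.split('\n')
\end{verbatim}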

The \class{Header} class also provides a number of methods to support
standard operators and built-in functions.

\begin{methoddesc}[Header]{__str__}{}
A synonym for \method{Header.encode()}. Useful for
\code{str(aHeader)}.
\end{methoddesc}

\begin{methoddesc}[Header]{__unicode__}{}
A helper for the built-in \function{unicode()} function. Returns the
header as a Unicode string.
\end{methoddesc}

\begin{methoddesc}[Header]{__eq__}{other}
This method allows you to compare two \class{Header} instances for equality.
\end{methoddesc}

\begin{methoddesc}[Header]{__ne__}{other}
This method allows you to compare two \class{Header} instances for inequality.
\end{methoddesc}
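
A small sketch of the comparison operators follows. The three headers
and their names are made up for illustration.

\begin{verbatim}
from email.header import Header

a = Header(u'p\xf6stal', 'iso-8859-1')
b = Header(u'p\xf6stal', 'iso-8859-1')
c = Header(u'postal')

same = (a == b)        # both headers were built from the same value
different = (a != c)   # the values differ, so the headers do too
print(same)
print(different)
\end{verbatim}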

The \module{email.header} module also provides the following
convenience functions.

\begin{funcdesc}{decode_header}{header}
Decode a message header value without converting the character set.
The header value is in \var{header}.

This function returns a list of \code{(decoded_string, charset)} pairs
containing each of the decoded parts of the header. \var{charset} is
\code{None} for non-encoded parts of the header, otherwise a lower
case string containing the name of the character set specified in the
encoded string.

Here's an example:

\begin{verbatim}
>>> from email.header import decode_header
>>> decode_header('=?iso-8859-1?q?p=F6stal?=')
[('p\xf6stal', 'iso-8859-1')]
\end{verbatim}
\end{funcdesc}

\begin{funcdesc}{make_header}{decoded_seq\optional{, maxlinelen\optional{,
  header_name\optional{, continuation_ws}}}}
Create a \class{Header} instance from a sequence of pairs as returned
by \function{decode_header()}.

\function{decode_header()} takes a header value string and returns a
sequence of pairs of the format \code{(decoded_string, charset)} where
\var{charset} is the name of the character set.

This function takes one of those sequences of pairs and returns a
\class{Header} instance. Optional \var{maxlinelen},
\var{header_name}, and \var{continuation_ws} are as in the
\class{Header} constructor.
\end{funcdesc}
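
The two functions can be combined into a round trip, sketched below
with illustrative variable names. For a simple header such as this one,
re-encoding the rebuilt \class{Header} would be expected to give back
the original string. Headers that mix encoded and plain parts may be
laid out differently after such a round trip.

\begin{verbatim}
from email.header import decode_header, make_header

raw = '=?iso-8859-1?q?p=F6stal?='
pairs = decode_header(raw)      # list of (decoded_string, charset) pairs
rebuilt = make_header(pairs)    # a Header instance built from the pairs

round_trip = rebuilt.encode()
print(round_trip)
\end{verbatim}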