\section{\module{marshal} ---
         Internal Python object serialization}

\declaremodule{builtin}{marshal}
\modulesynopsis{Convert Python objects to streams of bytes and back
                (with different constraints).}

This module contains functions that can read and write Python
values in a binary format.  The format is specific to Python, but
independent of machine architecture issues (e.g., you can write a
Python value to a file on a PC, transport the file to a Sun, and read
it back there).  Details of the format are undocumented on purpose;
it may change between Python versions (although it rarely
does).\footnote{The name of this module stems from a bit of
  terminology used by the designers of Modula-3 (amongst others), who
  use the term ``marshalling'' for shipping of data around in a
  self-contained form.  Strictly speaking, ``to marshal'' means to
  convert some data from internal to external form (in an RPC buffer for
  instance), and ``unmarshalling'' refers to the reverse process.}

This is not a general ``persistence'' module.  For general persistence
and transfer of Python objects through RPC calls, see the modules
\refmodule{pickle} and \refmodule{shelve}.  The \module{marshal} module exists
mainly to support reading and writing the ``pseudo-compiled'' code for
Python modules in \file{.pyc} files.  Therefore, the Python
maintainers reserve the right to modify the marshal format in
backward-incompatible ways should the need arise.  If you're serializing and
de-serializing Python objects, use the \module{pickle} module instead.
\refstmodindex{pickle}
\refstmodindex{shelve}
\obindex{code}

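As a rough illustration of the use case described above, the following
sketch marshals a code object to a byte string and executes the copy
that is read back.  The source text, the \code{<example>} file name and
the variable names are made up for this example; real \file{.pyc} files
also carry a header that is not shown here.

```python
import marshal

# Compile a small piece of source into a code object (illustrative source).
source = "answer = 6 * 7"
code = compile(source, "<example>", "exec")

# Serialize the code object and read it back.
data = marshal.dumps(code)
restored = marshal.loads(data)

# Execute the restored code object in a fresh namespace.
namespace = {}
exec(restored, namespace)
print(namespace["answer"])
```

Because the format may change between Python versions, data produced
this way should only be read back by the same interpreter version that
wrote it.
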
\begin{notice}[warning]
  The \module{marshal} module is not intended to be secure against
  erroneous or maliciously constructed data.  Never unmarshal data
  received from an untrusted or unauthenticated source.
\end{notice}

Not all Python object types are supported; in general, only objects
whose value is independent of a particular invocation of Python can
be written and read by this module.  The following types are supported:
\code{None}, integers, long integers, floating point numbers,
strings, Unicode objects, tuples, lists, dictionaries, and code
objects.  Tuples, lists and dictionaries are supported only as long
as the values contained therein are themselves supported.  Recursive
lists and dictionaries should not be written (they will cause
infinite loops).

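The containment rule can be checked with a small helper.  This is only
a sketch: \code{Point} and \code{can_marshal} are hypothetical names
introduced here, and the helper relies on \function{dumps()} raising
\exception{ValueError} for unsupported values.

```python
import marshal


class Point(object):
    """A user-defined class; its instances are not a supported type."""

    def __init__(self, x, y):
        self.x = x
        self.y = y


def can_marshal(value):
    """Return True if marshal.dumps() accepts the value."""
    try:
        marshal.dumps(value)
    except ValueError:
        return False
    return True


# A container holding only supported values.
print(can_marshal([1, 2.5, "text", (None, {"key": [3]})]))

# An unsupported object, alone and nested inside a dictionary.
print(can_marshal(Point(1, 2)))
print(can_marshal({"origin": Point(0, 0)}))
```

A single unsupported object anywhere inside a container is enough to
make the whole value unmarshallable.
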
\strong{Caveat:} On machines where C's \code{long int} type has more than
32 bits (such as the DEC Alpha), it is possible to create plain Python
integers that are longer than 32 bits.
If such an integer is marshalled and read back in on a machine where
C's \code{long int} type has only 32 bits, a Python long integer object
is returned instead.  While of a different type, the numeric value is
the same.  (This behavior is new in Python 2.2.  In earlier versions,
all but the least-significant 32 bits of the value were lost, and a
warning message was printed.)

There are functions that read/write files as well as functions
operating on strings.

The module defines these functions:

\begin{funcdesc}{dump}{value, file\optional{, version}}
Write the value on the open file.  The value must be a supported
type.  The file must be an open file object such as
\code{sys.stdout} or one returned by \function{open()} or
\function{posix.popen()}.  It must be opened in binary mode
(\code{'wb'} or \code{'w+b'}).

If the value has (or contains an object that has) an unsupported type,
a \exception{ValueError} exception is raised, but garbage data
will also be written to the file.  The object will not be properly
read back by \function{load()}.

\versionadded[The \var{version} argument indicates the data
format that \code{dump} should use (see below)]{2.4}
\end{funcdesc}

\begin{funcdesc}{load}{file}
Read one value from the open file and return it.  If no valid value
is read, raise \exception{EOFError}, \exception{ValueError} or
\exception{TypeError}.  The file must be an open file object opened
in binary mode (\code{'rb'} or \code{'r+b'}).

\warning{If an object containing an unsupported type was
marshalled with \function{dump()}, \function{load()} will substitute
\code{None} for the unmarshallable type.}
\end{funcdesc}

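A minimal sketch of a file round trip with \function{dump()} and
\function{load()} might look as follows.  The temporary file, its
\code{.marshal} suffix and the sample value are assumptions made for
this example only.

```python
import marshal
import os
import tempfile

value = {"name": "spam", "sizes": [1, 2, 3], "ratio": 0.5}

# Reserve a temporary file name for the example and close the raw handle.
fd, path = tempfile.mkstemp(suffix=".marshal")
os.close(fd)

try:
    # Write in binary mode, as dump() requires.
    with open(path, "wb") as f:
        marshal.dump(value, f)

    # Read back in binary mode, as load() requires.
    with open(path, "rb") as f:
        restored = marshal.load(f)
finally:
    # Clean up the temporary file whether or not the round trip worked.
    os.remove(path)

print(restored == value)
```

Opening the file in text mode instead may corrupt the data on platforms
that translate line endings.
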
\begin{funcdesc}{dumps}{value\optional{, version}}
Return the string that would be written to a file by
\code{dump(\var{value}, \var{file})}.  The value must be a supported
type.  Raise a \exception{ValueError} exception if value has (or
contains an object that has) an unsupported type.

\versionadded[The \var{version} argument indicates the data
format that \code{dumps} should use (see below)]{2.4}
\end{funcdesc}

\begin{funcdesc}{loads}{string}
Convert the string to a value.  If no valid value is found, raise
\exception{EOFError}, \exception{ValueError} or
\exception{TypeError}.  Extra characters in the string are ignored.
\end{funcdesc}

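The in-memory functions can be sketched in the same way.  The example
below round-trips a value through \function{dumps()} and
\function{loads()}, shows that trailing data is ignored, and shows that
a truncated string is rejected.  The sample value and variable names
are illustrative.

```python
import marshal

value = ("spam", 42, [1.5, None])

# Serialize to a string of bytes and read the value back.
data = marshal.dumps(value)
restored = marshal.loads(data)
print(restored == value)

# Data following the first complete value is ignored by loads().
padded = data + marshal.dumps("ignored")
print(marshal.loads(padded) == value)

# A truncated string does not contain a valid value.
try:
    marshal.loads(data[:-1])
    truncated_rejected = False
except (EOFError, ValueError, TypeError):
    truncated_rejected = True
print(truncated_rejected)
```

Since \function{loads()} silently ignores extra data, it cannot be used
on its own to detect that a string holds more than one value.
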
In addition, the following constant is defined:

\begin{datadesc}{version}
Indicates the format that the module uses.  Version 0 is the
historical format, version 1 (added in Python 2.4) shares interned
strings, and version 2 (added in Python 2.5) uses a binary format for
floating point numbers.  The current version is 2.

\versionadded{2.4}
\end{datadesc}
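
The optional \var{version} argument can be exercised as in the sketch
below, which writes the same value in every format from 0 up to
\code{marshal.version} and checks that each one reads back unchanged.
The sample value is illustrative, and later Python releases may report
a higher \code{marshal.version} than the 2 mentioned above.

```python
import marshal

value = {"pi": 3.14159, "label": "spam"}

# marshal.version is the format the running interpreter uses by default.
print(marshal.version)

# Request each format from 0 up to the current one explicitly.
sizes = {}
for version in range(marshal.version + 1):
    data = marshal.dumps(value, version)
    # Every format must read back to an equal value.
    assert marshal.loads(data) == value
    sizes[version] = len(data)

# The encoded size may differ from one format to the next.
print(sizes)
```

Passing an explicit older version is mainly useful when the data has to
be read by an older interpreter that does not understand the newest
format.
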