Changeset 391 for python/trunk/Doc/library/struct.rst
- Timestamp: Mar 19, 2014, 11:31:01 PM (11 years ago)
- Location: python/trunk
- Files: 2 edited
python/trunk
-
Property svn:mergeinfo
set to
/python/vendor/Python-2.7.6 merged eligible /python/vendor/current merged eligible
-
Property svn:mergeinfo
set to
-
python/trunk/Doc/library/struct.rst
r2 r391 11 11 12 12 This module performs conversions between Python values and C structs represented 13 as Python strings. It uses :dfn:`format strings` (explained below) as compact 14 descriptions of the lay-out of the C structs and the intended conversion to/from 15 Python values. This can be used in handling binary data stored in files or from 16 network connections, among other sources. 13 as Python strings. This can be used in handling binary data stored in files or 14 from network connections, among other sources. It uses 15 :ref:`struct-format-strings` as compact descriptions of the layout of the C 16 structs and the intended conversion to/from Python values. 17 18 .. note:: 19 20 By default, the result of packing a given C struct includes pad bytes in 21 order to maintain proper alignment for the C types involved; similarly, 22 alignment is taken into account when unpacking. This behavior is chosen so 23 that the bytes of a packed struct correspond exactly to the layout in memory 24 of the corresponding C struct. To handle platform-independent data formats 25 or omit implicit pad bytes, use ``standard`` size and alignment instead of 26 ``native`` size and alignment: see :ref:`struct-alignment` for details. 27 28 Functions and Exceptions 29 ------------------------ 17 30 18 31 The module defines the following exception and functions: … … 21 34 .. exception:: error 22 35 23 Exception raised on various occasions; argument is a string describing what is24 wrong.36 Exception raised on various occasions; argument is a string describing what 37 is wrong. 25 38 26 39 … … 34 47 .. function:: pack_into(fmt, buffer, offset, v1, v2, ...) 35 48 36 Pack the values ``v1, v2, ...`` according to the given format, write the packed37 bytes into the writable *buffer* starting at *offset*. Note that the offset is38 a required argument.49 Pack the values ``v1, v2, ...`` according to the given format, write the 50 packed bytes into the writable *buffer* starting at *offset*. 
Note that the 51 offset is a required argument. 39 52 40 53 .. versionadded:: 2.5 … … 44 57 45 58 Unpack the string (presumably packed by ``pack(fmt, ...)``) according to the 46 given format. The result is a tuple even if it contains exactly one item. The47 string must contain exactly the amount of data required by the format59 given format. The result is a tuple even if it contains exactly one item. 60 The string must contain exactly the amount of data required by the format 48 61 (``len(string)`` must equal ``calcsize(fmt)``). 49 62 … … 52 65 53 66 Unpack the *buffer* according to the given format. The result is a tuple even 54 if it contains exactly one item. The *buffer* must contain at least the amount55 of data required by the format (``len(buffer[offset:])`` must be at least56 ``calcsize(fmt)``).67 if it contains exactly one item. The *buffer* must contain at least the 68 amount of data required by the format (``len(buffer[offset:])`` must be at 69 least ``calcsize(fmt)``). 57 70 58 71 .. versionadded:: 2.5 … … 64 77 given format. 65 78 79 .. _struct-format-strings: 80 81 Format Strings 82 -------------- 83 84 Format strings are the mechanism used to specify the expected layout when 85 packing and unpacking data. They are built up from :ref:`format-characters`, 86 which specify the type of data being packed/unpacked. In addition, there are 87 special characters for controlling the :ref:`struct-alignment`. 88 89 90 .. _struct-alignment: 91 92 Byte Order, Size, and Alignment 93 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ 94 95 By default, C types are represented in the machine's native format and byte 96 order, and properly aligned by skipping pad bytes if necessary (according to the 97 rules used by the C compiler). 
98
99 Alternatively, the first character of the format string can be used to indicate
100 the byte order, size and alignment of the packed data, according to the
101 following table:
102
103 +-----------+------------------------+----------+-----------+
104 | Character | Byte order             | Size     | Alignment |
105 +===========+========================+==========+===========+
106 | ``@``     | native                 | native   | native    |
107 +-----------+------------------------+----------+-----------+
108 | ``=``     | native                 | standard | none      |
109 +-----------+------------------------+----------+-----------+
110 | ``<``     | little-endian          | standard | none      |
111 +-----------+------------------------+----------+-----------+
112 | ``>``     | big-endian             | standard | none      |
113 +-----------+------------------------+----------+-----------+
114 | ``!``     | network (= big-endian) | standard | none      |
115 +-----------+------------------------+----------+-----------+
116
117 If the first character is not one of these, ``'@'`` is assumed.
118
119 Native byte order is big-endian or little-endian, depending on the host
120 system. For example, Intel x86 and AMD64 (x86-64) are little-endian;
121 Motorola 68000 and PowerPC G5 are big-endian; ARM and Intel Itanium feature
122 switchable endianness (bi-endian). Use ``sys.byteorder`` to check the
123 endianness of your system.
124
125 Native size and alignment are determined using the C compiler's
126 ``sizeof`` expression. This is always combined with native byte order.
127
128 Standard size depends only on the format character; see the table in
129 the :ref:`format-characters` section.
130
131 Note the difference between ``'@'`` and ``'='``: both use native byte order, but
132 the size and alignment of the latter is standardized.
133
134 The form ``'!'`` is available for those poor souls who claim they can't remember
135 whether network byte order is big-endian or little-endian.
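Reviewer sketch (not part of the changeset): the effect of the byte-order prefixes in the table above can be checked with a few lines of Python. This assumes Python 3, where packed data is ``bytes``; the documentation in this diff targets Python 2.7, where it is a ``str``. The value ``0x12131415`` is chosen only for illustration.

```python
import struct
import sys

value = 0x12131415

little = struct.pack('<I', value)   # explicit little-endian, standard size
big = struct.pack('>I', value)      # explicit big-endian, standard size
network = struct.pack('!I', value)  # network order, same as big-endian
native = struct.pack('=I', value)   # native byte order, standard size

# '=' follows the host byte order reported by sys.byteorder.
expected_native = little if sys.byteorder == 'little' else big

# Standard size and alignment add no pad bytes: 1 (char) + 4 (int).
standard_size = struct.calcsize('<ci')
```

Only the ``@`` (default) mode can insert pad bytes, so ``calcsize('ci')`` without a prefix may differ from the value computed here, depending on the platform.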
136 137 There is no way to indicate non-native byte order (force byte-swapping); use the 138 appropriate choice of ``'<'`` or ``'>'``. 139 140 Notes: 141 142 (1) Padding is only automatically added between successive structure members. 143 No padding is added at the beginning or the end of the encoded struct. 144 145 (2) No padding is added when using non-native size and alignment, e.g. 146 with '<', '>', '=', and '!'. 147 148 (3) To align the end of a structure to the alignment requirement of a 149 particular type, end the format with the code for that type with a repeat 150 count of zero. See :ref:`struct-examples`. 151 152 153 .. _format-characters: 154 155 Format Characters 156 ^^^^^^^^^^^^^^^^^ 157 66 158 Format characters have the following meaning; the conversion between C and 67 Python values should be obvious given their types: 68 69 +--------+-------------------------+--------------------+-------+ 70 | Format | C Type | Python | Notes | 71 +========+=========================+====================+=======+ 72 | ``x`` | pad byte | no value | | 73 +--------+-------------------------+--------------------+-------+ 74 | ``c`` | :ctype:`char` | string of length 1 | | 75 +--------+-------------------------+--------------------+-------+ 76 | ``b`` | :ctype:`signed char` | integer | | 77 +--------+-------------------------+--------------------+-------+ 78 | ``B`` | :ctype:`unsigned char` | integer | | 79 +--------+-------------------------+--------------------+-------+ 80 | ``?`` | :ctype:`_Bool` | bool | \(1) | 81 +--------+-------------------------+--------------------+-------+ 82 | ``h`` | :ctype:`short` | integer | | 83 +--------+-------------------------+--------------------+-------+ 84 | ``H`` | :ctype:`unsigned short` | integer | | 85 +--------+-------------------------+--------------------+-------+ 86 | ``i`` | :ctype:`int` | integer | | 87 +--------+-------------------------+--------------------+-------+ 88 | ``I`` | :ctype:`unsigned int` | integer or long 
| | 89 +--------+-------------------------+--------------------+-------+ 90 | ``l`` | :ctype:`long` | integer | | 91 +--------+-------------------------+--------------------+-------+ 92 | ``L`` | :ctype:`unsigned long` | long | | 93 +--------+-------------------------+--------------------+-------+ 94 | ``q`` | :ctype:`long long` | long | \(2) | 95 +--------+-------------------------+--------------------+-------+ 96 | ``Q`` | :ctype:`unsigned long | long | \(2) | 97 | | long` | | | 98 +--------+-------------------------+--------------------+-------+ 99 | ``f`` | :ctype:`float` | float | | 100 +--------+-------------------------+--------------------+-------+ 101 | ``d`` | :ctype:`double` | float | | 102 +--------+-------------------------+--------------------+-------+ 103 | ``s`` | :ctype:`char[]` | string | | 104 +--------+-------------------------+--------------------+-------+ 105 | ``p`` | :ctype:`char[]` | string | | 106 +--------+-------------------------+--------------------+-------+ 107 | ``P`` | :ctype:`void \*` | long | | 108 +--------+-------------------------+--------------------+-------+ 159 Python values should be obvious given their types. The 'Standard size' column 160 refers to the size of the packed value in bytes when using standard size; that 161 is, when the format string starts with one of ``'<'``, ``'>'``, ``'!'`` or 162 ``'='``. When using native size, the size of the packed value is 163 platform-dependent. 
164 165 +--------+--------------------------+--------------------+----------------+------------+ 166 | Format | C Type | Python type | Standard size | Notes | 167 +========+==========================+====================+================+============+ 168 | ``x`` | pad byte | no value | | | 169 +--------+--------------------------+--------------------+----------------+------------+ 170 | ``c`` | :c:type:`char` | string of length 1 | 1 | | 171 +--------+--------------------------+--------------------+----------------+------------+ 172 | ``b`` | :c:type:`signed char` | integer | 1 | \(3) | 173 +--------+--------------------------+--------------------+----------------+------------+ 174 | ``B`` | :c:type:`unsigned char` | integer | 1 | \(3) | 175 +--------+--------------------------+--------------------+----------------+------------+ 176 | ``?`` | :c:type:`_Bool` | bool | 1 | \(1) | 177 +--------+--------------------------+--------------------+----------------+------------+ 178 | ``h`` | :c:type:`short` | integer | 2 | \(3) | 179 +--------+--------------------------+--------------------+----------------+------------+ 180 | ``H`` | :c:type:`unsigned short` | integer | 2 | \(3) | 181 +--------+--------------------------+--------------------+----------------+------------+ 182 | ``i`` | :c:type:`int` | integer | 4 | \(3) | 183 +--------+--------------------------+--------------------+----------------+------------+ 184 | ``I`` | :c:type:`unsigned int` | integer | 4 | \(3) | 185 +--------+--------------------------+--------------------+----------------+------------+ 186 | ``l`` | :c:type:`long` | integer | 4 | \(3) | 187 +--------+--------------------------+--------------------+----------------+------------+ 188 | ``L`` | :c:type:`unsigned long` | integer | 4 | \(3) | 189 +--------+--------------------------+--------------------+----------------+------------+ 190 | ``q`` | :c:type:`long long` | integer | 8 | \(2), \(3) | 191 
+--------+--------------------------+--------------------+----------------+------------+ 192 | ``Q`` | :c:type:`unsigned long | integer | 8 | \(2), \(3) | 193 | | long` | | | | 194 +--------+--------------------------+--------------------+----------------+------------+ 195 | ``f`` | :c:type:`float` | float | 4 | \(4) | 196 +--------+--------------------------+--------------------+----------------+------------+ 197 | ``d`` | :c:type:`double` | float | 8 | \(4) | 198 +--------+--------------------------+--------------------+----------------+------------+ 199 | ``s`` | :c:type:`char[]` | string | | | 200 +--------+--------------------------+--------------------+----------------+------------+ 201 | ``p`` | :c:type:`char[]` | string | | | 202 +--------+--------------------------+--------------------+----------------+------------+ 203 | ``P`` | :c:type:`void \*` | integer | | \(5), \(3) | 204 +--------+--------------------------+--------------------+----------------+------------+ 109 205 110 206 Notes: 111 207 112 208 (1) 113 The ``'?'`` conversion code corresponds to the :c type:`_Bool` type defined by114 C99. If this type is not available, it is simulated using a :c type:`char`. In209 The ``'?'`` conversion code corresponds to the :c:type:`_Bool` type defined by 210 C99. If this type is not available, it is simulated using a :c:type:`char`. In 115 211 standard mode, it is always represented by one byte. 116 212 … … 119 215 (2) 120 216 The ``'q'`` and ``'Q'`` conversion codes are available in native mode only if 121 the platform C compiler supports C :c type:`long long`, or, on Windows,122 :c type:`__int64`. They are always available in standard modes.217 the platform C compiler supports C :c:type:`long long`, or, on Windows, 218 :c:type:`__int64`. They are always available in standard modes. 123 219 124 220 .. 
versionadded:: 2.2 221 222 (3) 223 When attempting to pack a non-integer using any of the integer conversion 224 codes, if the non-integer has a :meth:`__index__` method then that method is 225 called to convert the argument to an integer before packing. If no 226 :meth:`__index__` method exists, or the call to :meth:`__index__` raises 227 :exc:`TypeError`, then the :meth:`__int__` method is tried. However, the use 228 of :meth:`__int__` is deprecated, and will raise :exc:`DeprecationWarning`. 229 230 .. versionchanged:: 2.7 231 Use of the :meth:`__index__` method for non-integers is new in 2.7. 232 233 .. versionchanged:: 2.7 234 Prior to version 2.7, not all integer conversion codes would use the 235 :meth:`__int__` method to convert, and :exc:`DeprecationWarning` was 236 raised only for float arguments. 237 238 (4) 239 For the ``'f'`` and ``'d'`` conversion codes, the packed representation uses 240 the IEEE 754 binary32 (for ``'f'``) or binary64 (for ``'d'``) format, 241 regardless of the floating-point format used by the platform. 242 243 (5) 244 The ``'P'`` format character is only available for the native byte ordering 245 (selected as the default or with the ``'@'`` byte order character). The byte 246 order character ``'='`` chooses to use little- or big-endian ordering based 247 on the host system. The struct module does not interpret this as native 248 ordering, so the ``'P'`` format is not available. 249 125 250 126 251 A format character may be preceded by an integral repeat count. For example, … … 133 258 string, not a repeat count like for the other format characters; for example, 134 259 ``'10s'`` means a single 10-byte string, while ``'10c'`` means 10 characters. 135 For packing, the string is truncated or padded with null bytes as appropriate to 136 make it fit. For unpacking, the resulting string always has exactly the 137 specified number of bytes. As a special case, ``'0s'`` means a single, empty 138 string (while ``'0c'`` means 0 characters). 
260 If a count is not given, it defaults to 1. For packing, the string is 261 truncated or padded with null bytes as appropriate to make it fit. For 262 unpacking, the resulting string always has exactly the specified number of 263 bytes. As a special case, ``'0s'`` means a single, empty string (while 264 ``'0c'`` means 0 characters). 139 265 140 266 The ``'p'`` format character encodes a "Pascal string", meaning a short 141 variable-length string stored in a fixed number of bytes. The count is the total 142 number of bytes stored. The first byte stored is the length of the string, or 143 255, whichever is smaller. The bytes of the string follow. If the string 144 passed in to :func:`pack` is too long (longer than the count minus 1), only the 145 leading count-1 bytes of the string are stored. If the string is shorter than 146 count-1, it is padded with null bytes so that exactly count bytes in all are 147 used. Note that for :func:`unpack`, the ``'p'`` format character consumes count 148 bytes, but that the string returned can never contain more than 255 characters. 149 150 For the ``'I'``, ``'L'``, ``'q'`` and ``'Q'`` format characters, the return 151 value is a Python long integer. 267 variable-length string stored in a *fixed number of bytes*, given by the count. 268 The first byte stored is the length of the string, or 255, whichever is smaller. 269 The bytes of the string follow. If the string passed in to :func:`pack` is too 270 long (longer than the count minus 1), only the leading ``count-1`` bytes of the 271 string are stored. If the string is shorter than ``count-1``, it is padded with 272 null bytes so that exactly count bytes in all are used. Note that for 273 :func:`unpack`, the ``'p'`` format character consumes count bytes, but that the 274 string returned can never contain more than 255 characters. 
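Reviewer sketch (not part of the changeset): the count semantics described for ``'s'`` and ``'p'`` can be illustrated as follows. This assumes Python 3, so the strings are ``bytes`` objects rather than the Python 2.7 ``str`` the text refers to; the sample values are arbitrary.

```python
import struct

# 's': the count is the field width; short input is null-padded,
# long input is truncated.
padded = struct.pack('5s', b'ab')
truncated = struct.pack('3s', b'abcdef')

# 'p': the first byte holds the stored length, followed by the data,
# null-padded so that exactly `count` bytes are used.
pascal_short = struct.pack('5p', b'ab')
pascal_long = struct.pack('5p', b'abcdefgh')  # only count-1 bytes are kept

# Unpacking consumes all 5 bytes but returns only the stored string.
(unpacked,) = struct.unpack('5p', pascal_short)
```

Here ``pascal_long`` keeps four data bytes because one of the five bytes is taken by the length prefix.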
152 275 153 276 For the ``'P'`` format character, the return value is a Python integer or long … … 164 287 any non-zero value will be True when unpacking. 165 288 166 By default, C numbers are represented in the machine's native format and byte 167 order, and properly aligned by skipping pad bytes if necessary (according to the 168 rules used by the C compiler). 169 170 Alternatively, the first character of the format string can be used to indicate 171 the byte order, size and alignment of the packed data, according to the 172 following table: 173 174 +-----------+------------------------+--------------------+ 175 | Character | Byte order | Size and alignment | 176 +===========+========================+====================+ 177 | ``@`` | native | native | 178 +-----------+------------------------+--------------------+ 179 | ``=`` | native | standard | 180 +-----------+------------------------+--------------------+ 181 | ``<`` | little-endian | standard | 182 +-----------+------------------------+--------------------+ 183 | ``>`` | big-endian | standard | 184 +-----------+------------------------+--------------------+ 185 | ``!`` | network (= big-endian) | standard | 186 +-----------+------------------------+--------------------+ 187 188 If the first character is not one of these, ``'@'`` is assumed. 189 190 Native byte order is big-endian or little-endian, depending on the host system. 191 For example, Motorola and Sun processors are big-endian; Intel and DEC 192 processors are little-endian. 193 194 Native size and alignment are determined using the C compiler's 195 ``sizeof`` expression. This is always combined with native byte order. 
196 197 Standard size and alignment are as follows: no alignment is required for any 198 type (so you have to use pad bytes); :ctype:`short` is 2 bytes; :ctype:`int` and 199 :ctype:`long` are 4 bytes; :ctype:`long long` (:ctype:`__int64` on Windows) is 8 200 bytes; :ctype:`float` and :ctype:`double` are 32-bit and 64-bit IEEE floating 201 point numbers, respectively. :ctype:`_Bool` is 1 byte. 202 203 Note the difference between ``'@'`` and ``'='``: both use native byte order, but 204 the size and alignment of the latter is standardized. 205 206 The form ``'!'`` is available for those poor souls who claim they can't remember 207 whether network byte order is big-endian or little-endian. 208 209 There is no way to indicate non-native byte order (force byte-swapping); use the 210 appropriate choice of ``'<'`` or ``'>'``. 211 212 The ``'P'`` format character is only available for the native byte ordering 213 (selected as the default or with the ``'@'`` byte order character). The byte 214 order character ``'='`` chooses to use little- or big-endian ordering based on 215 the host system. The struct module does not interpret this as native ordering, 216 so the ``'P'`` format is not available. 217 218 Examples (all using native byte order, size and alignment, on a big-endian 219 machine):: 289 290 291 .. _struct-examples: 292 293 Examples 294 ^^^^^^^^ 295 296 .. note:: 297 All examples assume a native byte order, size, and alignment with a 298 big-endian machine. 299 300 A basic example of packing/unpacking three integers:: 220 301 221 302 >>> from struct import * … … 227 308 8 228 309 229 Hint: to align the end of a structure to the alignment requirement of a230 particular type, end the format with the code for that type with a repeat count231 of zero. For example, the format ``'llh0l'`` specifies two pad bytes at the232 end, assuming longs are aligned on 4-byte boundaries. 
This only works when233 native size and alignment are in effect; standard size and alignment does not234 enforce any alignment.235 236 310 Unpacked fields can be named by assigning them to variables or by wrapping 237 311 the result in a named tuple:: … … 242 316 >>> from collections import namedtuple 243 317 >>> Student = namedtuple('Student', 'name serialnum school gradelevel') 244 >>> Student._make(unpack('<10sHHb', s))318 >>> Student._make(unpack('<10sHHb', record)) 245 319 Student(name='raymond ', serialnum=4658, school=264, gradelevel=8) 320 321 The ordering of format characters may have an impact on size since the padding 322 needed to satisfy alignment requirements is different:: 323 324 >>> pack('ci', '*', 0x12131415) 325 '*\x00\x00\x00\x12\x13\x14\x15' 326 >>> pack('ic', 0x12131415, '*') 327 '\x12\x13\x14\x15*' 328 >>> calcsize('ci') 329 8 330 >>> calcsize('ic') 331 5 332 333 The following format ``'llh0l'`` specifies two pad bytes at the end, assuming 334 longs are aligned on 4-byte boundaries:: 335 336 >>> pack('llh0l', 1, 2, 3) 337 '\x00\x00\x00\x01\x00\x00\x00\x02\x00\x03\x00\x00' 338 339 This only works when native size and alignment are in effect; standard size and 340 alignment does not enforce any alignment. 341 246 342 247 343 .. seealso:: … … 256 352 .. _struct-objects: 257 353 258 Struct Objects259 ------- -------354 Classes 355 ------- 260 356 261 357 The :mod:`struct` module also defines the following type: … … 264 360 .. class:: Struct(format) 265 361 266 Return a new Struct object which writes and reads binary data according to the267 format string *format*. Creating a Struct object once and calling its methods268 is more efficient than calling the :mod:`struct` functions with the same format269 s ince the format string only needs to be compiled once.362 Return a new Struct object which writes and reads binary data according to 363 the format string *format*. 
Creating a Struct object once and calling its 364 methods is more efficient than calling the :mod:`struct` functions with the 365 same format since the format string only needs to be compiled once. 270 366 271 367 .. versionadded:: 2.5 … … 291 387 292 388 293 .. method:: unpack_from(buffer [, offset=0])389 .. method:: unpack_from(buffer, offset=0) 294 390 295 391 Identical to the :func:`unpack_from` function, using the compiled format.
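Reviewer sketch (not part of the changeset): the ``Struct`` methods whose signatures are touched here can be exercised as below. This assumes Python 3; the format ``'<HH'`` and the name ``record`` are illustrative choices, not taken from the documentation.

```python
import struct

# Compile the format once and reuse it for every record.
record = struct.Struct('<HH')  # two unsigned shorts, little-endian

buf = bytearray(8)
record.pack_into(buf, 4, 1, 2)       # offset is required when packing
first = record.unpack_from(buf)      # offset defaults to 0
second = record.unpack_from(buf, 4)  # explicit offset
```

The default ``offset=0`` in ``unpack_from`` matches the revised signature in the last hunk, while ``pack_into`` still requires the offset.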