Timestamp:
Mar 19, 2014, 11:31:01 PM (11 years ago)
Author:
dmik
Message:

python: Merge vendor 2.7.6 to trunk.

Location:
python/trunk
Files:
2 edited

  • python/trunk

  • python/trunk/Doc/library/struct.rst

    r2 r391  
    1111
    1212This module performs conversions between Python values and C structs represented
    13 as Python strings.  It uses :dfn:`format strings` (explained below) as compact
    14 descriptions of the lay-out of the C structs and the intended conversion to/from
    15 Python values.  This can be used in handling binary data stored in files or from
    16 network connections, among other sources.
     13as Python strings.  This can be used in handling binary data stored in files or
     14from network connections, among other sources.  It uses
     15:ref:`struct-format-strings` as compact descriptions of the layout of the C
     16structs and the intended conversion to/from Python values.
     17
     18.. note::
     19
     20   By default, the result of packing a given C struct includes pad bytes in
     21   order to maintain proper alignment for the C types involved; similarly,
     22   alignment is taken into account when unpacking.  This behavior is chosen so
     23   that the bytes of a packed struct correspond exactly to the layout in memory
     24   of the corresponding C struct.  To handle platform-independent data formats
     25   or omit implicit pad bytes, use ``standard`` size and alignment instead of
     26   ``native`` size and alignment: see :ref:`struct-alignment` for details.
     27
     28Functions and Exceptions
     29------------------------
    1730
    1831The module defines the following exception and functions:
     
    2134.. exception:: error
    2235
    23    Exception raised on various occasions; argument is a string describing what is
    24    wrong.
     36   Exception raised on various occasions; argument is a string describing what
     37   is wrong.
    2538
    2639
     
    3447.. function:: pack_into(fmt, buffer, offset, v1, v2, ...)
    3548
    36    Pack the values ``v1, v2, ...`` according to the given format, write the packed
    37    bytes into the writable *buffer* starting at *offset*. Note that the offset is
    38    a required argument.
     49   Pack the values ``v1, v2, ...`` according to the given format, write the
     50   packed bytes into the writable *buffer* starting at *offset*. Note that the
     51   offset is a required argument.
    3952
    4053   .. versionadded:: 2.5
     
    4457
    4558   Unpack the string (presumably packed by ``pack(fmt, ...)``) according to the
    46    given format.  The result is a tuple even if it contains exactly one item.  The
    47    string must contain exactly the amount of data required by the format
     59   given format.  The result is a tuple even if it contains exactly one item.
     60   The string must contain exactly the amount of data required by the format
    4861   (``len(string)`` must equal ``calcsize(fmt)``).
    4962
     
    5265
    5366   Unpack the *buffer* according to the given format. The result is a tuple even
    54    if it contains exactly one item. The *buffer* must contain at least the amount
    55    of data required by the format (``len(buffer[offset:])`` must be at least
    56    ``calcsize(fmt)``).
     67   if it contains exactly one item. The *buffer* must contain at least the
     68   amount of data required by the format (``len(buffer[offset:])`` must be at
     69   least ``calcsize(fmt)``).
    5770
    5871   .. versionadded:: 2.5
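
The length rules for ``unpack`` and ``unpack_from`` described above can be illustrated with a minimal sketch. The format ``'<HI'``, the 16-byte buffer and the offset of 4 are arbitrary choices made for illustration; the ``bytearray`` stands in for any writable buffer.

```python
import struct

fmt = '<HI'                      # little-endian: 2-byte + 4-byte unsigned
size = struct.calcsize(fmt)      # 6 bytes in standard mode

buf = bytearray(16)              # writable buffer, larger than needed
struct.pack_into(fmt, buf, 4, 0x1234, 0xDEADBEEF)

# unpack_from() only needs calcsize(fmt) bytes available from the offset.
values = struct.unpack_from(fmt, buf, 4)

# unpack() insists on an exact length match, so the 16-byte buffer fails.
try:
    struct.unpack(fmt, bytes(buf))
    exact_ok = True
except struct.error:
    exact_ok = False
```

Here ``values`` holds the two integers that were written, while ``exact_ok`` ends up false because the buffer is longer than ``calcsize(fmt)``.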
     
    6477   given format.
    6578
     79.. _struct-format-strings:
     80
     81Format Strings
     82--------------
     83
     84Format strings are the mechanism used to specify the expected layout when
     85packing and unpacking data.  They are built up from :ref:`format-characters`,
     86which specify the type of data being packed/unpacked.  In addition, there are
     87special characters for controlling the :ref:`struct-alignment`.
     88
     89
     90.. _struct-alignment:
     91
     92Byte Order, Size, and Alignment
     93^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
     94
     95By default, C types are represented in the machine's native format and byte
     96order, and properly aligned by skipping pad bytes if necessary (according to the
     97rules used by the C compiler).
     98
     99Alternatively, the first character of the format string can be used to indicate
     100the byte order, size and alignment of the packed data, according to the
     101following table:
     102
     103+-----------+------------------------+----------+-----------+
     104| Character | Byte order             | Size     | Alignment |
     105+===========+========================+==========+===========+
     106| ``@``     | native                 | native   | native    |
     107+-----------+------------------------+----------+-----------+
     108| ``=``     | native                 | standard | none      |
     109+-----------+------------------------+----------+-----------+
     110| ``<``     | little-endian          | standard | none      |
     111+-----------+------------------------+----------+-----------+
     112| ``>``     | big-endian             | standard | none      |
     113+-----------+------------------------+----------+-----------+
     114| ``!``     | network (= big-endian) | standard | none      |
     115+-----------+------------------------+----------+-----------+
     116
     117If the first character is not one of these, ``'@'`` is assumed.
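
As a rough illustration of the table above, the same 32-bit value can be packed under each byte order character. The value ``0x01020304`` is an arbitrary choice; the ``b'...'`` literals are ordinary strings under Python 2.

```python
import struct
import sys

value = 0x01020304

little = struct.pack('<I', value)    # least significant byte first
big = struct.pack('>I', value)       # most significant byte first
network = struct.pack('!I', value)   # network order, same as '>'

# '=' keeps the host byte order but uses standard size and no alignment.
native_std = struct.pack('=I', value)
expected = little if sys.byteorder == 'little' else big

# Omitting the prefix is the same as writing '@'.
default = struct.pack('I', value)
explicit = struct.pack('@I', value)
```
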
     118
     119Native byte order is big-endian or little-endian, depending on the host
     120system. For example, Intel x86 and AMD64 (x86-64) are little-endian;
     121Motorola 68000 and PowerPC G5 are big-endian; ARM and Intel Itanium feature
     122switchable endianness (bi-endian). Use ``sys.byteorder`` to check the
     123endianness of your system.
     124
     125Native size and alignment are determined using the C compiler's
     126``sizeof`` expression.  This is always combined with native byte order.
     127
     128Standard size depends only on the format character;  see the table in
     129the :ref:`format-characters` section.
     130
     131Note the difference between ``'@'`` and ``'='``: both use native byte order, but
     132the size and alignment of the latter is standardized.
     133
     134The form ``'!'`` is available for those poor souls who claim they can't remember
     135whether network byte order is big-endian or little-endian.
     136
     137There is no way to indicate non-native byte order (force byte-swapping); use the
     138appropriate choice of ``'<'`` or ``'>'``.
     139
     140Notes:
     141
     142(1) Padding is only automatically added between successive structure members.
     143    No padding is added at the beginning or the end of the encoded struct.
     144
     145(2) No padding is added when using non-native size and alignment, e.g.
     146    with '<', '>', '=', and '!'.
     147
     148(3) To align the end of a structure to the alignment requirement of a
     149    particular type, end the format with the code for that type with a repeat
     150    count of zero.  See :ref:`struct-examples`.
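
The padding notes above can be checked with ``calcsize``. This is only a sketch: native sizes are platform-dependent, so it derives the alignment of ``int`` from the format sizes instead of assuming a value.

```python
import struct

std = struct.calcsize('=ci')         # standard mode: 1 + 4, no padding

native = struct.calcsize('@ci')      # char, pad bytes, then an aligned int
int_size = struct.calcsize('@i')
int_align = native - int_size        # offset at which the int was placed

tail = struct.calcsize('@ic')        # no padding is added at the end
padded = struct.calcsize('@ic0i')    # zero repeat count pads the end
```

On a typical platform with a 4-byte, 4-byte-aligned ``int``, ``native`` is 8, ``tail`` is 5 and ``padded`` is 8.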
     151
     152
     153.. _format-characters:
     154
     155Format Characters
     156^^^^^^^^^^^^^^^^^
     157
    66158Format characters have the following meaning; the conversion between C and
    67 Python values should be obvious given their types:
    68 
    69 +--------+-------------------------+--------------------+-------+
    70 | Format | C Type                  | Python             | Notes |
    71 +========+=========================+====================+=======+
    72 | ``x``  | pad byte                | no value           |       |
    73 +--------+-------------------------+--------------------+-------+
    74 | ``c``  | :ctype:`char`           | string of length 1 |       |
    75 +--------+-------------------------+--------------------+-------+
    76 | ``b``  | :ctype:`signed char`    | integer            |       |
    77 +--------+-------------------------+--------------------+-------+
    78 | ``B``  | :ctype:`unsigned char`  | integer            |       |
    79 +--------+-------------------------+--------------------+-------+
    80 | ``?``  | :ctype:`_Bool`          | bool               | \(1)  |
    81 +--------+-------------------------+--------------------+-------+
    82 | ``h``  | :ctype:`short`          | integer            |       |
    83 +--------+-------------------------+--------------------+-------+
    84 | ``H``  | :ctype:`unsigned short` | integer            |       |
    85 +--------+-------------------------+--------------------+-------+
    86 | ``i``  | :ctype:`int`            | integer            |       |
    87 +--------+-------------------------+--------------------+-------+
    88 | ``I``  | :ctype:`unsigned int`   | integer or long    |       |
    89 +--------+-------------------------+--------------------+-------+
    90 | ``l``  | :ctype:`long`           | integer            |       |
    91 +--------+-------------------------+--------------------+-------+
    92 | ``L``  | :ctype:`unsigned long`  | long               |       |
    93 +--------+-------------------------+--------------------+-------+
    94 | ``q``  | :ctype:`long long`      | long               | \(2)  |
    95 +--------+-------------------------+--------------------+-------+
    96 | ``Q``  | :ctype:`unsigned long   | long               | \(2)  |
    97 |        | long`                   |                    |       |
    98 +--------+-------------------------+--------------------+-------+
    99 | ``f``  | :ctype:`float`          | float              |       |
    100 +--------+-------------------------+--------------------+-------+
    101 | ``d``  | :ctype:`double`         | float              |       |
    102 +--------+-------------------------+--------------------+-------+
    103 | ``s``  | :ctype:`char[]`         | string             |       |
    104 +--------+-------------------------+--------------------+-------+
    105 | ``p``  | :ctype:`char[]`         | string             |       |
    106 +--------+-------------------------+--------------------+-------+
    107 | ``P``  | :ctype:`void \*`        | long               |       |
    108 +--------+-------------------------+--------------------+-------+
     159Python values should be obvious given their types.  The 'Standard size' column
     160refers to the size of the packed value in bytes when using standard size; that
     161is, when the format string starts with one of ``'<'``, ``'>'``, ``'!'`` or
     162``'='``.  When using native size, the size of the packed value is
     163platform-dependent.
     164
     165+--------+--------------------------+--------------------+----------------+------------+
     166| Format | C Type                   | Python type        | Standard size  | Notes      |
     167+========+==========================+====================+================+============+
     168| ``x``  | pad byte                 | no value           |                |            |
     169+--------+--------------------------+--------------------+----------------+------------+
     170| ``c``  | :c:type:`char`           | string of length 1 | 1              |            |
     171+--------+--------------------------+--------------------+----------------+------------+
     172| ``b``  | :c:type:`signed char`    | integer            | 1              | \(3)       |
     173+--------+--------------------------+--------------------+----------------+------------+
     174| ``B``  | :c:type:`unsigned char`  | integer            | 1              | \(3)       |
     175+--------+--------------------------+--------------------+----------------+------------+
     176| ``?``  | :c:type:`_Bool`          | bool               | 1              | \(1)       |
     177+--------+--------------------------+--------------------+----------------+------------+
     178| ``h``  | :c:type:`short`          | integer            | 2              | \(3)       |
     179+--------+--------------------------+--------------------+----------------+------------+
     180| ``H``  | :c:type:`unsigned short` | integer            | 2              | \(3)       |
     181+--------+--------------------------+--------------------+----------------+------------+
     182| ``i``  | :c:type:`int`            | integer            | 4              | \(3)       |
     183+--------+--------------------------+--------------------+----------------+------------+
     184| ``I``  | :c:type:`unsigned int`   | integer            | 4              | \(3)       |
     185+--------+--------------------------+--------------------+----------------+------------+
     186| ``l``  | :c:type:`long`           | integer            | 4              | \(3)       |
     187+--------+--------------------------+--------------------+----------------+------------+
     188| ``L``  | :c:type:`unsigned long`  | integer            | 4              | \(3)       |
     189+--------+--------------------------+--------------------+----------------+------------+
     190| ``q``  | :c:type:`long long`      | integer            | 8              | \(2), \(3) |
     191+--------+--------------------------+--------------------+----------------+------------+
     192| ``Q``  | :c:type:`unsigned long   | integer            | 8              | \(2), \(3) |
     193|        | long`                    |                    |                |            |
     194+--------+--------------------------+--------------------+----------------+------------+
     195| ``f``  | :c:type:`float`          | float              | 4              | \(4)       |
     196+--------+--------------------------+--------------------+----------------+------------+
     197| ``d``  | :c:type:`double`         | float              | 8              | \(4)       |
     198+--------+--------------------------+--------------------+----------------+------------+
     199| ``s``  | :c:type:`char[]`         | string             |                |            |
     200+--------+--------------------------+--------------------+----------------+------------+
     201| ``p``  | :c:type:`char[]`         | string             |                |            |
     202+--------+--------------------------+--------------------+----------------+------------+
     203| ``P``  | :c:type:`void \*`        | integer            |                | \(5), \(3) |
     204+--------+--------------------------+--------------------+----------------+------------+
    109205
    110206Notes:
    111207
    112208(1)
    113    The ``'?'`` conversion code corresponds to the :ctype:`_Bool` type defined by
    114    C99. If this type is not available, it is simulated using a :ctype:`char`. In
     209   The ``'?'`` conversion code corresponds to the :c:type:`_Bool` type defined by
     210   C99. If this type is not available, it is simulated using a :c:type:`char`. In
    115211   standard mode, it is always represented by one byte.
    116212
     
    119215(2)
    120216   The ``'q'`` and ``'Q'`` conversion codes are available in native mode only if
    121    the platform C compiler supports C :ctype:`long long`, or, on Windows,
    122    :ctype:`__int64`.  They are always available in standard modes.
     217   the platform C compiler supports C :c:type:`long long`, or, on Windows,
     218   :c:type:`__int64`.  They are always available in standard modes.
    123219
    124220   .. versionadded:: 2.2
     221
     222(3)
     223   When attempting to pack a non-integer using any of the integer conversion
     224   codes, if the non-integer has a :meth:`__index__` method then that method is
     225   called to convert the argument to an integer before packing.  If no
     226   :meth:`__index__` method exists, or the call to :meth:`__index__` raises
     227   :exc:`TypeError`, then the :meth:`__int__` method is tried.  However, the use
     228   of :meth:`__int__` is deprecated, and will raise :exc:`DeprecationWarning`.
     229
     230   .. versionchanged:: 2.7
     231      Use of the :meth:`__index__` method for non-integers is new in 2.7.
     232
     233   .. versionchanged:: 2.7
     234      Prior to version 2.7, not all integer conversion codes would use the
     235      :meth:`__int__` method to convert, and :exc:`DeprecationWarning` was
     236      raised only for float arguments.
     237
     238(4)
     239   For the ``'f'`` and ``'d'`` conversion codes, the packed representation uses
     240   the IEEE 754 binary32 (for ``'f'``) or binary64 (for ``'d'``) format,
     241   regardless of the floating-point format used by the platform.
     242
     243(5)
     244   The ``'P'`` format character is only available for the native byte ordering
     245   (selected as the default or with the ``'@'`` byte order character). The byte
     246   order character ``'='`` chooses to use little- or big-endian ordering based
     247   on the host system. The struct module does not interpret this as native
     248   ordering, so the ``'P'`` format is not available.
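
Notes (4) and (5) can be illustrated with a short sketch. The values ``1.0`` and ``0.1`` are arbitrary, and the big-endian prefix is used only so that the byte strings are the same on every host.

```python
import struct

# IEEE 754 binary32 and binary64 encodings of 1.0, big-endian.
one_f = struct.pack('>f', 1.0)
one_d = struct.pack('>d', 1.0)

# binary32 cannot hold 0.1 exactly, so a round trip changes the value.
roundtrip = struct.unpack('>f', struct.pack('>f', 0.1))[0]

# 'P' is accepted in native mode only; '=' rejects it.
pointer_size = struct.calcsize('P')
try:
    struct.calcsize('=P')
    p_in_standard_mode = True
except struct.error:
    p_in_standard_mode = False
```
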
     249
    125250
    126251A format character may be preceded by an integral repeat count.  For example,
     
    133258string, not a repeat count like for the other format characters; for example,
    134259``'10s'`` means a single 10-byte string, while ``'10c'`` means 10 characters.
    135 For packing, the string is truncated or padded with null bytes as appropriate to
    136 make it fit. For unpacking, the resulting string always has exactly the
    137 specified number of bytes.  As a special case, ``'0s'`` means a single, empty
    138 string (while ``'0c'`` means 0 characters).
     260If a count is not given, it defaults to 1.  For packing, the string is
     261truncated or padded with null bytes as appropriate to make it fit. For
     262unpacking, the resulting string always has exactly the specified number of
     263bytes.  As a special case, ``'0s'`` means a single, empty string (while
     264``'0c'`` means 0 characters).
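
A minimal sketch of the ``'s'`` rules above, using arbitrary sample strings (written as ``b'...'`` literals, which are ordinary strings under Python 2):

```python
import struct

padded = struct.pack('10s', b'abc')       # padded with null bytes to 10
truncated = struct.pack('3s', b'abcdef')  # cut down to 3 bytes
single = struct.pack('s', b'xyz')         # count defaults to 1

# Unpacking returns exactly the specified number of bytes, padding included.
unpacked = struct.unpack('10s', padded)

# '10c' is ten separate one-character items, not one string.
chars = struct.unpack('10c', b'0123456789')

empty_string = struct.unpack('0s', b'')   # one empty string
no_chars = struct.unpack('0c', b'')       # no items at all
```
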
    139265
    140266The ``'p'`` format character encodes a "Pascal string", meaning a short
    141 variable-length string stored in a fixed number of bytes. The count is the total
    142 number of bytes stored.  The first byte stored is the length of the string, or
    143 255, whichever is smaller.  The bytes of the string follow.  If the string
    144 passed in to :func:`pack` is too long (longer than the count minus 1), only the
    145 leading count-1 bytes of the string are stored.  If the string is shorter than
    146 count-1, it is padded with null bytes so that exactly count bytes in all are
    147 used.  Note that for :func:`unpack`, the ``'p'`` format character consumes count
    148 bytes, but that the string returned can never contain more than 255 characters.
    149 
    150 For the ``'I'``, ``'L'``, ``'q'`` and ``'Q'`` format characters, the return
    151 value is a Python long integer.
     267variable-length string stored in a *fixed number of bytes*, given by the count.
     268The first byte stored is the length of the string, or 255, whichever is smaller.
     269The bytes of the string follow.  If the string passed in to :func:`pack` is too
     270long (longer than the count minus 1), only the leading ``count-1`` bytes of the
     271string are stored.  If the string is shorter than ``count-1``, it is padded with
     272null bytes so that exactly count bytes in all are used.  Note that for
     273:func:`unpack`, the ``'p'`` format character consumes count bytes, but that the
     274string returned can never contain more than 255 characters.
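
The ``'p'`` layout described above can be sketched as follows; the counts and sample strings are arbitrary.

```python
import struct

# Count 5: one length byte, the string, then null padding up to 5 bytes.
short = struct.pack('5p', b'ab')

# Count 3: only the leading count-1 bytes of the string are stored.
clipped = struct.pack('3p', b'abcdef')

# Unpacking consumes all 5 bytes but returns only the stored string.
restored = struct.unpack('5p', short)
```
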
    152275
    153276For the ``'P'`` format character, the return value is a Python integer or long
     
    164287any non-zero value will be True when unpacking.
    165288
    166 By default, C numbers are represented in the machine's native format and byte
    167 order, and properly aligned by skipping pad bytes if necessary (according to the
    168 rules used by the C compiler).
    169 
    170 Alternatively, the first character of the format string can be used to indicate
    171 the byte order, size and alignment of the packed data, according to the
    172 following table:
    173 
    174 +-----------+------------------------+--------------------+
    175 | Character | Byte order             | Size and alignment |
    176 +===========+========================+====================+
    177 | ``@``     | native                 | native             |
    178 +-----------+------------------------+--------------------+
    179 | ``=``     | native                 | standard           |
    180 +-----------+------------------------+--------------------+
    181 | ``<``     | little-endian          | standard           |
    182 +-----------+------------------------+--------------------+
    183 | ``>``     | big-endian             | standard           |
    184 +-----------+------------------------+--------------------+
    185 | ``!``     | network (= big-endian) | standard           |
    186 +-----------+------------------------+--------------------+
    187 
    188 If the first character is not one of these, ``'@'`` is assumed.
    189 
    190 Native byte order is big-endian or little-endian, depending on the host system.
    191 For example, Motorola and Sun processors are big-endian; Intel and DEC
    192 processors are little-endian.
    193 
    194 Native size and alignment are determined using the C compiler's
    195 ``sizeof`` expression.  This is always combined with native byte order.
    196 
    197 Standard size and alignment are as follows: no alignment is required for any
    198 type (so you have to use pad bytes); :ctype:`short` is 2 bytes; :ctype:`int` and
    199 :ctype:`long` are 4 bytes; :ctype:`long long` (:ctype:`__int64` on Windows) is 8
    200 bytes; :ctype:`float` and :ctype:`double` are 32-bit and 64-bit IEEE floating
    201 point numbers, respectively. :ctype:`_Bool` is 1 byte.
    202 
    203 Note the difference between ``'@'`` and ``'='``: both use native byte order, but
    204 the size and alignment of the latter is standardized.
    205 
    206 The form ``'!'`` is available for those poor souls who claim they can't remember
    207 whether network byte order is big-endian or little-endian.
    208 
    209 There is no way to indicate non-native byte order (force byte-swapping); use the
    210 appropriate choice of ``'<'`` or ``'>'``.
    211 
    212 The ``'P'`` format character is only available for the native byte ordering
    213 (selected as the default or with the ``'@'`` byte order character). The byte
    214 order character ``'='`` chooses to use little- or big-endian ordering based on
    215 the host system. The struct module does not interpret this as native ordering,
    216 so the ``'P'`` format is not available.
    217 
    218 Examples (all using native byte order, size and alignment, on a big-endian
    219 machine)::
     289
     290
     291.. _struct-examples:
     292
     293Examples
     294^^^^^^^^
     295
     296.. note::
     297   All examples assume a native byte order, size, and alignment with a
     298   big-endian machine.
     299
     300A basic example of packing/unpacking three integers::
    220301
    221302   >>> from struct import *
     
    227308   8
    228309
    229 Hint: to align the end of a structure to the alignment requirement of a
    230 particular type, end the format with the code for that type with a repeat count
    231 of zero.  For example, the format ``'llh0l'`` specifies two pad bytes at the
    232 end, assuming longs are aligned on 4-byte boundaries.  This only works when
    233 native size and alignment are in effect; standard size and alignment does not
    234 enforce any alignment.
    235 
    236310Unpacked fields can be named by assigning them to variables or by wrapping
    237311the result in a named tuple::
     
    242316    >>> from collections import namedtuple
    243317    >>> Student = namedtuple('Student', 'name serialnum school gradelevel')
    244     >>> Student._make(unpack('<10sHHb', s))
     318    >>> Student._make(unpack('<10sHHb', record))
    245319    Student(name='raymond   ', serialnum=4658, school=264, gradelevel=8)
     320
     321The ordering of format characters may have an impact on size since the padding
     322needed to satisfy alignment requirements is different::
     323
     324    >>> pack('ci', '*', 0x12131415)
     325    '*\x00\x00\x00\x12\x13\x14\x15'
     326    >>> pack('ic', 0x12131415, '*')
     327    '\x12\x13\x14\x15*'
     328    >>> calcsize('ci')
     329    8
     330    >>> calcsize('ic')
     331    5
     332
     333The following format ``'llh0l'`` specifies two pad bytes at the end, assuming
     334longs are aligned on 4-byte boundaries::
     335
     336    >>> pack('llh0l', 1, 2, 3)
     337    '\x00\x00\x00\x01\x00\x00\x00\x02\x00\x03\x00\x00'
     338
     339This only works when native size and alignment are in effect; standard size and
     340alignment does not enforce any alignment.
     341
    246342
    247343.. seealso::
     
    256352.. _struct-objects:
    257353
    258 Struct Objects
    259 --------------
     354Classes
     355-------
    260356
    261357The :mod:`struct` module also defines the following type:
     
    264360.. class:: Struct(format)
    265361
    266    Return a new Struct object which writes and reads binary data according to the
    267    format string *format*.  Creating a Struct object once and calling its methods
    268    is more efficient than calling the :mod:`struct` functions with the same format
    269    since the format string only needs to be compiled once.
     362   Return a new Struct object which writes and reads binary data according to
     363   the format string *format*.  Creating a Struct object once and calling its
     364   methods is more efficient than calling the :mod:`struct` functions with the
     365   same format since the format string only needs to be compiled once.
    270366
    271367   .. versionadded:: 2.5
     
    291387
    292388
    293    .. method:: unpack_from(buffer[, offset=0])
     389   .. method:: unpack_from(buffer, offset=0)
    294390
    295391      Identical to the :func:`unpack_from` function, using the compiled format.
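
A minimal sketch of reusing a compiled format through ``Struct``. The format ``'<10sHHb'`` and the field values echo the named-tuple example earlier in the document; the variable names and the two-byte prefix are made up for illustration.

```python
import struct

# Compile the format once and reuse it for every record.
student = struct.Struct('<10sHHb')

packed = student.pack(b'raymond   ', 4658, 264, 8)
fields = student.unpack(packed)

# unpack_from() reads the same record from the middle of a larger buffer.
framed = b'\xff\xff' + packed
same_fields = student.unpack_from(framed, 2)

record_size = student.size   # equals calcsize('<10sHHb')
```
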