.. highlightlang:: c

.. _stringobjects:

String/Bytes Objects
--------------------

These functions raise :exc:`TypeError` when expecting a string parameter and are
called with a non-string parameter.

.. note::

   These functions have been renamed to PyBytes_* in Python 3.x. Unless
   otherwise noted, the PyBytes functions available in 3.x are aliased to their
   PyString_* equivalents to help porting.

.. index:: object: string


.. c:type:: PyStringObject

   This subtype of :c:type:`PyObject` represents a Python string object.


.. c:var:: PyTypeObject PyString_Type

   .. index:: single: StringType (in module types)

   This instance of :c:type:`PyTypeObject` represents the Python string type; it is
   the same object as ``str`` and ``types.StringType`` in the Python layer.


.. c:function:: int PyString_Check(PyObject *o)

   Return true if the object *o* is a string object or an instance of a subtype of
   the string type.

   .. versionchanged:: 2.2
      Allowed subtypes to be accepted.


.. c:function:: int PyString_CheckExact(PyObject *o)

   Return true if the object *o* is a string object, but not an instance of a
   subtype of the string type.

   .. versionadded:: 2.2


.. c:function:: PyObject* PyString_FromString(const char *v)

   Return a new string object with a copy of the string *v* as value on success,
   and *NULL* on failure.  The parameter *v* must not be *NULL*; it will not be
   checked.


.. c:function:: PyObject* PyString_FromStringAndSize(const char *v, Py_ssize_t len)

   Return a new string object with a copy of the string *v* as value and length
   *len* on success, and *NULL* on failure.  If *v* is *NULL*, the contents of the
   string are uninitialized.

   .. versionchanged:: 2.5
      This function used an :c:type:`int` type for *len*. This might require
      changes in your code for properly supporting 64-bit systems.


.. c:function:: PyObject* PyString_FromFormat(const char *format, ...)

   Take a C :c:func:`printf`\ -style *format* string and a variable number of
   arguments, calculate the size of the resulting Python string and return a string
   with the values formatted into it.  The variable arguments must be C types and
   must correspond exactly to the format characters in the *format* string.  The
   following format characters are allowed:

   .. % This should be exactly the same as the table in PyErr_Format.
   .. % One should just refer to the other.
   .. % The descriptions for %zd and %zu are wrong, but the truth is complicated
   .. % because not all compilers support the %z width modifier -- we fake it
   .. % when necessary via interpolating PY_FORMAT_SIZE_T.
   .. % Similar comments apply to the %ll width modifier and
   .. % PY_FORMAT_LONG_LONG.
   .. % %u, %lu, %zu should have "new in Python 2.5" blurbs.

   +-------------------+---------------+--------------------------------+
   | Format Characters | Type          | Comment                        |
   +===================+===============+================================+
   | :attr:`%%`        | *n/a*         | The literal % character.       |
   +-------------------+---------------+--------------------------------+
   | :attr:`%c`        | int           | A single character,            |
   |                   |               | represented as a C int.        |
   +-------------------+---------------+--------------------------------+
   | :attr:`%d`        | int           | Exactly equivalent to          |
   |                   |               | ``printf("%d")``.              |
   +-------------------+---------------+--------------------------------+
   | :attr:`%u`        | unsigned int  | Exactly equivalent to          |
   |                   |               | ``printf("%u")``.              |
   +-------------------+---------------+--------------------------------+
   | :attr:`%ld`       | long          | Exactly equivalent to          |
   |                   |               | ``printf("%ld")``.             |
   +-------------------+---------------+--------------------------------+
   | :attr:`%lu`       | unsigned long | Exactly equivalent to          |
   |                   |               | ``printf("%lu")``.             |
   +-------------------+---------------+--------------------------------+
   | :attr:`%lld`      | long long     | Exactly equivalent to          |
   |                   |               | ``printf("%lld")``.            |
   +-------------------+---------------+--------------------------------+
   | :attr:`%llu`      | unsigned      | Exactly equivalent to          |
   |                   | long long     | ``printf("%llu")``.            |
   +-------------------+---------------+--------------------------------+
   | :attr:`%zd`       | Py_ssize_t    | Exactly equivalent to          |
   |                   |               | ``printf("%zd")``.             |
   +-------------------+---------------+--------------------------------+
   | :attr:`%zu`       | size_t        | Exactly equivalent to          |
   |                   |               | ``printf("%zu")``.             |
   +-------------------+---------------+--------------------------------+
   | :attr:`%i`        | int           | Exactly equivalent to          |
   |                   |               | ``printf("%i")``.              |
   +-------------------+---------------+--------------------------------+
   | :attr:`%x`        | int           | Exactly equivalent to          |
   |                   |               | ``printf("%x")``.              |
   +-------------------+---------------+--------------------------------+
   | :attr:`%s`        | char\*        | A null-terminated C character  |
   |                   |               | array.                         |
   +-------------------+---------------+--------------------------------+
   | :attr:`%p`        | void\*        | The hex representation of a C  |
   |                   |               | pointer. Mostly equivalent to  |
   |                   |               | ``printf("%p")`` except that   |
   |                   |               | it is guaranteed to start with |
   |                   |               | the literal ``0x`` regardless  |
   |                   |               | of what the platform's         |
   |                   |               | ``printf`` yields.             |
   +-------------------+---------------+--------------------------------+

   An unrecognized format character causes all the rest of the format string to be
   copied as-is to the result string, and any extra arguments discarded.

   .. note::

      The ``"%lld"`` and ``"%llu"`` format specifiers are only available
      when :const:`HAVE_LONG_LONG` is defined.

   .. versionchanged:: 2.7
      Support for ``"%lld"`` and ``"%llu"`` added.


.. c:function:: PyObject* PyString_FromFormatV(const char *format, va_list vargs)

   Identical to :c:func:`PyString_FromFormat` except that it takes exactly two
   arguments.


.. c:function:: Py_ssize_t PyString_Size(PyObject *string)

   Return the length of the string in string object *string*.

   .. versionchanged:: 2.5
      This function returned an :c:type:`int` type. This might require changes
      in your code for properly supporting 64-bit systems.


.. c:function:: Py_ssize_t PyString_GET_SIZE(PyObject *string)

   Macro form of :c:func:`PyString_Size` but without error checking.

   .. versionchanged:: 2.5
      This macro returned an :c:type:`int` type. This might require changes in
      your code for properly supporting 64-bit systems.


.. c:function:: char* PyString_AsString(PyObject *string)

   Return a NUL-terminated representation of the contents of *string*.  The pointer
   refers to the internal buffer of *string*, not a copy.  The data must not be
   modified in any way, unless the string was just created using
   ``PyString_FromStringAndSize(NULL, size)``. It must not be deallocated.  If
   *string* is a Unicode object, this function computes the default encoding of
   *string* and operates on that.  If *string* is not a string object at all,
   :c:func:`PyString_AsString` returns *NULL* and raises :exc:`TypeError`.


.. c:function:: char* PyString_AS_STRING(PyObject *string)

   Macro form of :c:func:`PyString_AsString` but without error checking.  Only
   string objects are supported; no Unicode objects should be passed.


.. c:function:: int PyString_AsStringAndSize(PyObject *obj, char **buffer, Py_ssize_t *length)

   Return a NUL-terminated representation of the contents of the object *obj*
   through the output variables *buffer* and *length*.

   The function accepts both string and Unicode objects as input. For Unicode
   objects it returns the default encoded version of the object.  If *length* is
   *NULL*, the resulting buffer must not contain NUL characters; if it does, the
   function returns ``-1`` and a :exc:`TypeError` is raised.

   The buffer refers to an internal string buffer of *obj*, not a copy. The data
   must not be modified in any way, unless the string was just created using
   ``PyString_FromStringAndSize(NULL, size)``.  It must not be deallocated.  If
   *obj* is a Unicode object, this function computes the default encoding of
   *obj* and operates on that.  If *obj* is not a string object at all,
   :c:func:`PyString_AsStringAndSize` returns ``-1`` and raises :exc:`TypeError`.

   .. versionchanged:: 2.5
      This function used an :c:type:`int *` type for *length*. This might
      require changes in your code for properly supporting 64-bit systems.


.. c:function:: void PyString_Concat(PyObject **string, PyObject *newpart)

   Create a new string object in *\*string* containing the contents of *newpart*
   appended to *string*; the caller will own the new reference.  The reference to
   the old value of *string* will be stolen.  If the new string cannot be created,
   the old reference to *string* will still be discarded and the value of
   *\*string* will be set to *NULL*; the appropriate exception will be set.


.. c:function:: void PyString_ConcatAndDel(PyObject **string, PyObject *newpart)

   Create a new string object in *\*string* containing the contents of *newpart*
   appended to *string*.  This version decrements the reference count of *newpart*.


.. c:function:: int _PyString_Resize(PyObject **string, Py_ssize_t newsize)

   A way to resize a string object even though it is "immutable". Only use this to
   build up a brand new string object; don't use this if the string may already be
   known in other parts of the code.  It is an error to call this function if the
   refcount on the input string object is not one. Pass the address of an existing
   string object as an lvalue (it may be written into), and the new size desired.
   On success, *\*string* holds the resized string object and ``0`` is returned;
   the address in *\*string* may differ from its input value.  If the reallocation
   fails, the original string object at *\*string* is deallocated, *\*string* is
   set to *NULL*, a memory exception is set, and ``-1`` is returned.

   .. versionchanged:: 2.5
      This function used an :c:type:`int` type for *newsize*. This might
      require changes in your code for properly supporting 64-bit systems.

.. c:function:: PyObject* PyString_Format(PyObject *format, PyObject *args)

   Return a new string object from *format* and *args*. Analogous to ``format %
   args``.  The *args* argument must be a tuple or dict.


.. c:function:: void PyString_InternInPlace(PyObject **string)

   Intern the argument *\*string* in place.  The argument must be the address of a
   pointer variable pointing to a Python string object.  If there is an existing
   interned string that is the same as *\*string*, it sets *\*string* to it
   (decrementing the reference count of the old string object and incrementing the
   reference count of the interned string object), otherwise it leaves *\*string*
   alone and interns it (incrementing its reference count).  (Clarification: even
   though there is a lot of talk about reference counts, think of this function as
   reference-count-neutral; you own the object after the call if and only if you
   owned it before the call.)

   .. note::

      This function is not available in 3.x and does not have a PyBytes alias.


.. c:function:: PyObject* PyString_InternFromString(const char *v)

   A combination of :c:func:`PyString_FromString` and
   :c:func:`PyString_InternInPlace`, returning either a new string object that has
   been interned, or a new ("owned") reference to an earlier interned string object
   with the same value.

   .. note::

      This function is not available in 3.x and does not have a PyBytes alias.


.. c:function:: PyObject* PyString_Decode(const char *s, Py_ssize_t size, const char *encoding, const char *errors)

   Create an object by decoding *size* bytes of the encoded buffer *s* using the
   codec registered for *encoding*.  *encoding* and *errors* have the same meaning
   as the parameters of the same name in the :func:`unicode` built-in function.
   The codec to be used is looked up using the Python codec registry.  Return
   *NULL* if an exception was raised by the codec.

   .. note::

      This function is not available in 3.x and does not have a PyBytes alias.

   .. versionchanged:: 2.5
      This function used an :c:type:`int` type for *size*. This might require
      changes in your code for properly supporting 64-bit systems.


.. c:function:: PyObject* PyString_AsDecodedObject(PyObject *str, const char *encoding, const char *errors)

   Decode a string object by passing it to the codec registered for *encoding* and
   return the result as a Python object. *encoding* and *errors* have the same
   meaning as the parameters of the same name in the string :meth:`encode` method.
   The codec to be used is looked up using the Python codec registry. Return *NULL*
   if an exception was raised by the codec.

   .. note::

      This function is not available in 3.x and does not have a PyBytes alias.


.. c:function:: PyObject* PyString_Encode(const char *s, Py_ssize_t size, const char *encoding, const char *errors)

   Encode the :c:type:`char` buffer of the given size by passing it to the codec
   registered for *encoding* and return a Python object. *encoding* and *errors*
   have the same meaning as the parameters of the same name in the string
   :meth:`encode` method. The codec to be used is looked up using the Python codec
   registry.  Return *NULL* if an exception was raised by the codec.

   .. note::

      This function is not available in 3.x and does not have a PyBytes alias.

   .. versionchanged:: 2.5
      This function used an :c:type:`int` type for *size*. This might require
      changes in your code for properly supporting 64-bit systems.


.. c:function:: PyObject* PyString_AsEncodedObject(PyObject *str, const char *encoding, const char *errors)

   Encode a string object using the codec registered for *encoding* and return the
   result as a Python object. *encoding* and *errors* have the same meaning as the
   parameters of the same name in the string :meth:`encode` method. The codec to be
   used is looked up using the Python codec registry. Return *NULL* if an exception
   was raised by the codec.

   .. note::

      This function is not available in 3.x and does not have a PyBytes alias.