unicode.rst

Last change on this file was 391, checked in by dmik, 11 years ago
python: Merge vendor 2.7.6 to trunk.
Property svn:eol-style set to `native`
File size: 44.0 KB

Rev	Line
[2]	1	.. highlightlang:: c
	2
	3	.. _unicodeobjects:
	4
	5	Unicode Objects and Codecs
	6	--------------------------
	7
	8	.. sectionauthor:: Marc-Andre Lemburg <mal@lemburg.com>
	9
	10	Unicode Objects
	11	^^^^^^^^^^^^^^^
	12
	13
[391]	14	Unicode Type
	15	""""""""""""
	16
[2]	17	These are the basic Unicode object types used for the Unicode implementation in
	18	Python:
	19
	20
[391]	21	.. c:type:: Py_UNICODE
[2]	22
	23	This type represents the storage type which is used by Python internally as
	24	basis for holding Unicode ordinals. Python's default builds use a 16-bit type
[391]	25	for :c:type:`Py_UNICODE` and store Unicode values internally as UCS2. It is also
[2]	26	possible to build a UCS4 version of Python (most recent Linux distributions come
	27	with UCS4 builds of Python). These builds then use a 32-bit type for
[391]	28	:c:type:`Py_UNICODE` and store Unicode data internally as UCS4. On platforms
	29	where :c:type:`wchar_t` is available and compatible with the chosen Python
	30	Unicode build variant, :c:type:`Py_UNICODE` is a typedef alias for
	31	:c:type:`wchar_t` to enhance native platform compatibility. On all other
	32	platforms, :c:type:`Py_UNICODE` is a typedef alias for either :c:type:`unsigned
	33	short` (UCS2) or :c:type:`unsigned long` (UCS4).
[2]	34
	35	Note that UCS2 and UCS4 Python builds are not binary compatible. Please keep
	36	this in mind when writing extensions or interfaces.
	37
	38
[391]	39	.. c:type:: PyUnicodeObject
[2]	40
[391]	41	This subtype of :c:type:`PyObject` represents a Python Unicode object.
[2]	42
	43
[391]	44	.. c:var:: PyTypeObject PyUnicode_Type
[2]	45
[391]	46	This instance of :c:type:`PyTypeObject` represents the Python Unicode type. It
[2]	47	is exposed to Python code as ``unicode`` and ``types.UnicodeType``.
	48
	49	The following APIs are really C macros and can be used to do fast checks and to
	50	access internal read-only data of Unicode objects:
	51
	52
[391]	53	.. c:function:: int PyUnicode_Check(PyObject *o)
[2]	54
	55	Return true if the object o is a Unicode object or an instance of a Unicode
	56	subtype.
	57
	58	.. versionchanged:: 2.2
	59	Allowed subtypes to be accepted.
	60
	61
[391]	62	.. c:function:: int PyUnicode_CheckExact(PyObject *o)
[2]	63
	64	Return true if the object o is a Unicode object, but not an instance of a
	65	subtype.
	66
	67	.. versionadded:: 2.2
	68
	69
[391]	70	.. c:function:: Py_ssize_t PyUnicode_GET_SIZE(PyObject *o)
[2]	71
[391]	72	Return the size of the object. o has to be a :c:type:`PyUnicodeObject` (not
[2]	73	checked).
	74
	75	.. versionchanged:: 2.5
[391]	76	This function returned an :c:type:`int` type. This might require changes
[2]	77	in your code for properly supporting 64-bit systems.
	78
	79
[391]	80	.. c:function:: Py_ssize_t PyUnicode_GET_DATA_SIZE(PyObject *o)
[2]	81
	82	Return the size of the object's internal buffer in bytes. o has to be a
[391]	83	:c:type:`PyUnicodeObject` (not checked).
[2]	84
	85	.. versionchanged:: 2.5
[391]	86	This function returned an :c:type:`int` type. This might require changes
[2]	87	in your code for properly supporting 64-bit systems.
	88
	89
[391]	90	.. c:function:: Py_UNICODE* PyUnicode_AS_UNICODE(PyObject *o)
[2]	91
[391]	92	Return a pointer to the internal :c:type:`Py_UNICODE` buffer of the object. o
	93	has to be a :c:type:`PyUnicodeObject` (not checked).
[2]	94
	95
[391]	96	.. c:function:: const char* PyUnicode_AS_DATA(PyObject *o)
[2]	97
	98	Return a pointer to the internal buffer of the object. o has to be a
[391]	99	:c:type:`PyUnicodeObject` (not checked).
[2]	100
	101
[391]	102	.. c:function:: int PyUnicode_ClearFreeList()
[2]	103
	104	Clear the free list. Return the total number of freed items.
	105
	106	.. versionadded:: 2.6
	107
	108
[391]	109	Unicode Character Properties
	110	""""""""""""""""""""""""""""
	111
[2]	112	Unicode provides many different character properties. The most often needed ones
	113	are available through these macros which are mapped to C functions depending on
	114	the Python configuration.
	115
	116
[391]	117	.. c:function:: int Py_UNICODE_ISSPACE(Py_UNICODE ch)
[2]	118
	119	Return 1 or 0 depending on whether ch is a whitespace character.
	120
	121
[391]	122	.. c:function:: int Py_UNICODE_ISLOWER(Py_UNICODE ch)
[2]	123
	124	Return 1 or 0 depending on whether ch is a lowercase character.
	125
	126
[391]	127	.. c:function:: int Py_UNICODE_ISUPPER(Py_UNICODE ch)
[2]	128
	129	Return 1 or 0 depending on whether ch is an uppercase character.
	130
	131
[391]	132	.. c:function:: int Py_UNICODE_ISTITLE(Py_UNICODE ch)
[2]	133
	134	Return 1 or 0 depending on whether ch is a titlecase character.
	135
	136
[391]	137	.. c:function:: int Py_UNICODE_ISLINEBREAK(Py_UNICODE ch)
[2]	138
	139	Return 1 or 0 depending on whether ch is a linebreak character.
	140
	141
[391]	142	.. c:function:: int Py_UNICODE_ISDECIMAL(Py_UNICODE ch)
[2]	143
	144	Return 1 or 0 depending on whether ch is a decimal character.
	145
	146
[391]	147	.. c:function:: int Py_UNICODE_ISDIGIT(Py_UNICODE ch)
[2]	148
	149	Return 1 or 0 depending on whether ch is a digit character.
	150
	151
[391]	152	.. c:function:: int Py_UNICODE_ISNUMERIC(Py_UNICODE ch)
[2]	153
	154	Return 1 or 0 depending on whether ch is a numeric character.
	155
	156
[391]	157	.. c:function:: int Py_UNICODE_ISALPHA(Py_UNICODE ch)
[2]	158
	159	Return 1 or 0 depending on whether ch is an alphabetic character.
	160
	161
[391]	162	.. c:function:: int Py_UNICODE_ISALNUM(Py_UNICODE ch)
[2]	163
	164	Return 1 or 0 depending on whether ch is an alphanumeric character.
	165
	166	These APIs can be used for fast direct character conversions:
	167
	168
[391]	169	.. c:function:: Py_UNICODE Py_UNICODE_TOLOWER(Py_UNICODE ch)
[2]	170
	171	Return the character ch converted to lower case.
	172
	173
[391]	174	.. c:function:: Py_UNICODE Py_UNICODE_TOUPPER(Py_UNICODE ch)
[2]	175
	176	Return the character ch converted to upper case.
	177
	178
[391]	179	.. c:function:: Py_UNICODE Py_UNICODE_TOTITLE(Py_UNICODE ch)
[2]	180
	181	Return the character ch converted to title case.
	182
	183
[391]	184	.. c:function:: int Py_UNICODE_TODECIMAL(Py_UNICODE ch)
[2]	185
	186	Return the character ch converted to a decimal positive integer. Return
	187	``-1`` if this is not possible. This macro does not raise exceptions.
	188
	189
[391]	190	.. c:function:: int Py_UNICODE_TODIGIT(Py_UNICODE ch)
[2]	191
	192	Return the character ch converted to a single digit integer. Return ``-1`` if
	193	this is not possible. This macro does not raise exceptions.
	194
	195
[391]	196	.. c:function:: double Py_UNICODE_TONUMERIC(Py_UNICODE ch)
[2]	197
	198	Return the character ch converted to a double. Return ``-1.0`` if this is not
	199	possible. This macro does not raise exceptions.
	200
[391]	201
	202	Plain Py_UNICODE
	203	""""""""""""""""
	204
[2]	205	To create Unicode objects and access their basic sequence properties, use these
	206	APIs:
	207
	208
[391]	209	.. c:function:: PyObject* PyUnicode_FromUnicode(const Py_UNICODE *u, Py_ssize_t size)
[2]	210
[391]	211	Create a Unicode object from the Py_UNICODE buffer u of the given size. u
[2]	212	may be NULL which causes the contents to be undefined. It is the user's
	213	responsibility to fill in the needed data. The buffer is copied into the new
	214	object. If the buffer is not NULL, the return value might be a shared object.
	215	Therefore, modification of the resulting Unicode object is only allowed when u
	216	is NULL.
	217
	218	.. versionchanged:: 2.5
[391]	219	This function used an :c:type:`int` type for size. This might require
[2]	220	changes in your code for properly supporting 64-bit systems.
	221
	222
[391]	223	.. c:function:: PyObject* PyUnicode_FromStringAndSize(const char *u, Py_ssize_t size)
[2]	224
[391]	225	Create a Unicode object from the char buffer u. The bytes will be interpreted
	226	as being UTF-8 encoded. u may also be NULL which
	227	causes the contents to be undefined. It is the user's responsibility to fill in
	228	the needed data. The buffer is copied into the new object. If the buffer is not
	229	NULL, the return value might be a shared object. Therefore, modification of
	230	the resulting Unicode object is only allowed when u is NULL.
[2]	231
[391]	232	.. versionadded:: 2.6
[2]	233
	234
.. c:function:: PyObject* PyUnicode_FromString(const char *u)

   Create a Unicode object from a UTF-8 encoded, null-terminated char buffer u.
	239
	240	.. versionadded:: 2.6
	241
	242
	243	.. c:function:: PyObject* PyUnicode_FromFormat(const char *format, ...)
	244
	245	Take a C :c:func:`printf`\ -style format string and a variable number of
	246	arguments, calculate the size of the resulting Python unicode string and return
	247	a string with the values formatted into it. The variable arguments must be C
	248	types and must correspond exactly to the format characters in the format
	249	string. The following format characters are allowed:
	250
	251	.. % The descriptions for %zd and %zu are wrong, but the truth is complicated
	252	.. % because not all compilers support the %z width modifier -- we fake it
	253	.. % when necessary via interpolating PY_FORMAT_SIZE_T.
	254
   .. tabularcolumns:: |l|l|L|

   +-------------------+---------------------+--------------------------------+
   | Format Characters | Type                | Comment                        |
   +===================+=====================+================================+
   | :attr:`%%`        | n/a                 | The literal % character.       |
   +-------------------+---------------------+--------------------------------+
   | :attr:`%c`        | int                 | A single character,            |
   |                   |                     | represented as a C int.        |
   +-------------------+---------------------+--------------------------------+
   | :attr:`%d`        | int                 | Exactly equivalent to          |
   |                   |                     | ``printf("%d")``.              |
   +-------------------+---------------------+--------------------------------+
   | :attr:`%u`        | unsigned int        | Exactly equivalent to          |
   |                   |                     | ``printf("%u")``.              |
   +-------------------+---------------------+--------------------------------+
   | :attr:`%ld`       | long                | Exactly equivalent to          |
   |                   |                     | ``printf("%ld")``.             |
   +-------------------+---------------------+--------------------------------+
   | :attr:`%lu`       | unsigned long       | Exactly equivalent to          |
   |                   |                     | ``printf("%lu")``.             |
   +-------------------+---------------------+--------------------------------+
   | :attr:`%zd`       | Py_ssize_t          | Exactly equivalent to          |
   |                   |                     | ``printf("%zd")``.             |
   +-------------------+---------------------+--------------------------------+
   | :attr:`%zu`       | size_t              | Exactly equivalent to          |
   |                   |                     | ``printf("%zu")``.             |
   +-------------------+---------------------+--------------------------------+
   | :attr:`%i`        | int                 | Exactly equivalent to          |
   |                   |                     | ``printf("%i")``.              |
   +-------------------+---------------------+--------------------------------+
   | :attr:`%x`        | int                 | Exactly equivalent to          |
   |                   |                     | ``printf("%x")``.              |
   +-------------------+---------------------+--------------------------------+
   | :attr:`%s`        | char\*              | A null-terminated C character  |
   |                   |                     | array.                         |
   +-------------------+---------------------+--------------------------------+
   | :attr:`%p`        | void\*              | The hex representation of a C  |
   |                   |                     | pointer. Mostly equivalent to  |
   |                   |                     | ``printf("%p")`` except that   |
   |                   |                     | it is guaranteed to start with |
   |                   |                     | the literal ``0x`` regardless  |
   |                   |                     | of what the platform's         |
   |                   |                     | ``printf`` yields.             |
   +-------------------+---------------------+--------------------------------+
   | :attr:`%U`        | PyObject\*          | A unicode object.              |
   +-------------------+---------------------+--------------------------------+
   | :attr:`%V`        | PyObject\*, char \* | A unicode object (which may be |
   |                   |                     | NULL) and a null-terminated    |
   |                   |                     | C character array as a second  |
   |                   |                     | parameter (which will be used, |
   |                   |                     | if the first parameter is      |
   |                   |                     | NULL).                         |
   +-------------------+---------------------+--------------------------------+
   | :attr:`%S`        | PyObject\*          | The result of calling          |
   |                   |                     | :func:`PyObject_Unicode`.      |
   +-------------------+---------------------+--------------------------------+
   | :attr:`%R`        | PyObject\*          | The result of calling          |
   |                   |                     | :func:`PyObject_Repr`.         |
   +-------------------+---------------------+--------------------------------+
	315
	316	An unrecognized format character causes all the rest of the format string to be
	317	copied as-is to the result string, and any extra arguments discarded.
	318
	319	.. versionadded:: 2.6
	320
	321
	322	.. c:function:: PyObject* PyUnicode_FromFormatV(const char *format, va_list vargs)
	323
	324	Identical to :func:`PyUnicode_FromFormat` except that it takes exactly two
	325	arguments.
	326
	327	.. versionadded:: 2.6
	328
	329
	330	.. c:function:: Py_UNICODE* PyUnicode_AsUnicode(PyObject *unicode)
	331
	332	Return a read-only pointer to the Unicode object's internal
	333	:c:type:`Py_UNICODE` buffer, NULL if unicode is not a Unicode object.
	334	Note that the resulting :c:type:`Py_UNICODE*` string may contain embedded
	335	null characters, which would cause the string to be truncated when used in
	336	most C functions.
	337
	338
	339	.. c:function:: Py_ssize_t PyUnicode_GetSize(PyObject *unicode)
	340
[2]	341	Return the length of the Unicode object.
	342
	343	.. versionchanged:: 2.5
[391]	344	This function returned an :c:type:`int` type. This might require changes
[2]	345	in your code for properly supporting 64-bit systems.
	346
	347
.. c:function:: PyObject* PyUnicode_FromEncodedObject(PyObject *obj, const char *encoding, const char *errors)

   Coerce an encoded object obj to a Unicode object and return a reference with
   incremented refcount.
	352
	353	String and other char buffer compatible objects are decoded according to the
	354	given encoding and using the error handling defined by errors. Both can be
	355	NULL to have the interface use the default values (see the next section for
	356	details).
	357
	358	All other objects, including Unicode objects, cause a :exc:`TypeError` to be
	359	set.
	360
	361	The API returns NULL if there was an error. The caller is responsible for
	362	decref'ing the returned objects.
	363
	364
[391]	365	.. c:function:: PyObject* PyUnicode_FromObject(PyObject *obj)
[2]	366
	367	Shortcut for ``PyUnicode_FromEncodedObject(obj, NULL, "strict")`` which is used
	368	throughout the interpreter whenever coercion to Unicode is needed.
	369
[391]	370	If the platform supports :c:type:`wchar_t` and provides a header file wchar.h,
[2]	371	Python can interface directly to this type using the following functions.
[391]	372	Support is optimized if Python's own :c:type:`Py_UNICODE` type is identical to
	373	the system's :c:type:`wchar_t`.
[2]	374
	375
[391]	376	wchar_t Support
	377	"""""""""""""""
[2]	378
[391]	379	:c:type:`wchar_t` support for platforms which support it:
[2]	380
[391]	381	.. c:function:: PyObject* PyUnicode_FromWideChar(const wchar_t *w, Py_ssize_t size)
	382
	383	Create a Unicode object from the :c:type:`wchar_t` buffer w of the given size.
[2]	384	Return NULL on failure.
	385
	386	.. versionchanged:: 2.5
[391]	387	This function used an :c:type:`int` type for size. This might require
[2]	388	changes in your code for properly supporting 64-bit systems.
	389
	390
.. c:function:: Py_ssize_t PyUnicode_AsWideChar(PyUnicodeObject *unicode, wchar_t *w, Py_ssize_t size)
[2]	392
[391]	393	Copy the Unicode object contents into the :c:type:`wchar_t` buffer w. At most
	394	size :c:type:`wchar_t` characters are copied (excluding a possibly trailing
	395	0-termination character). Return the number of :c:type:`wchar_t` characters
	396	copied or -1 in case of an error. Note that the resulting :c:type:`wchar_t`
[2]	397	string may or may not be 0-terminated. It is the responsibility of the caller
[391]	398	to make sure that the :c:type:`wchar_t` string is 0-terminated in case this is
	399	required by the application. Also, note that the :c:type:`wchar_t*` string
	400	might contain null characters, which would cause the string to be truncated
	401	when used with most C functions.
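
The caller-side pattern described above can be sketched without the Python headers. In this minimal sketch, ``copy_wide_sketch`` is a hypothetical stand-in for the copy step (it is not part of the Python API), and ``copy_and_terminate`` shows one way to reserve a slot for the terminator:

```c
#include <assert.h>
#include <stddef.h>
#include <wchar.h>

/* Hypothetical stand-in for the copy step: copies at most size units and
   appends no terminator, mirroring the behaviour described above. */
static size_t copy_wide_sketch(const wchar_t *src, size_t src_len,
                               wchar_t *dst, size_t size)
{
    size_t n = src_len < size ? src_len : size;
    wmemcpy(dst, src, n);
    return n;
}

/* Caller-side pattern: keep one slot free and terminate explicitly.
   dst must have room for dst_slots units; returns the units copied. */
size_t copy_and_terminate(const wchar_t *src, size_t src_len,
                          wchar_t *dst, size_t dst_slots)
{
    size_t n;
    assert(src != NULL && dst != NULL && dst_slots > 0);
    n = copy_wide_sketch(src, src_len, dst, dst_slots - 1);
    dst[n] = L'\0';
    return n;
}
```

With a four-slot buffer, copying ``L"hello"`` stores three characters plus the terminator. If the source holds an embedded null (for example ``a``, ``\0``, ``b``), all three units are copied, but ``wcslen`` on the result reports 1, which is the truncation the note above warns about.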
[2]	402
	403	.. versionchanged:: 2.5
[391]	404	This function returned an :c:type:`int` type and used an :c:type:`int`
[2]	405	type for size. This might require changes in your code for properly
	406	supporting 64-bit systems.
	407
	408
	409	.. _builtincodecs:
	410
	411	Built-in Codecs
	412	^^^^^^^^^^^^^^^
	413
	414	Python provides a set of built-in codecs which are written in C for speed. All of
	415	these codecs are directly usable via the following functions.
	416
[391]	417	Many of the following APIs take two arguments encoding and errors, and they
	418	have the same semantics as the ones of the built-in :func:`unicode` Unicode
	419	object constructor.
[2]	420
	421	Setting encoding to NULL causes the default encoding to be used which is
[391]	422	ASCII. The file system calls should use :c:data:`Py_FileSystemDefaultEncoding`
	423	as the encoding for file names. This variable should be treated as read-only: on
[2]	424	some systems, it will be a pointer to a static string, on others, it will change
	425	at run-time (such as when the application invokes setlocale).
	426
	427	Error handling is set by errors which may also be set to NULL meaning to use
	428	the default handling defined for the codec. Default error handling for all
	429	built-in codecs is "strict" (:exc:`ValueError` is raised).
	430
The codecs all use a similar interface. For simplicity, only deviations from the
following generic ones are documented.
	433
[391]	434
	435	Generic Codecs
	436	""""""""""""""
	437
[2]	438	These are the generic codec APIs:
	439
	440
.. c:function:: PyObject* PyUnicode_Decode(const char *s, Py_ssize_t size, const char *encoding, const char *errors)
[2]	442
	443	Create a Unicode object by decoding size bytes of the encoded string s.
	444	encoding and errors have the same meaning as the parameters of the same name
	445	in the :func:`unicode` built-in function. The codec to be used is looked up
	446	using the Python codec registry. Return NULL if an exception was raised by
	447	the codec.
	448
	449	.. versionchanged:: 2.5
[391]	450	This function used an :c:type:`int` type for size. This might require
[2]	451	changes in your code for properly supporting 64-bit systems.
	452
	453
.. c:function:: PyObject* PyUnicode_Encode(const Py_UNICODE *s, Py_ssize_t size, const char *encoding, const char *errors)
[2]	455
[391]	456	Encode the :c:type:`Py_UNICODE` buffer s of the given size and return a Python
[2]	457	string object. encoding and errors have the same meaning as the parameters
[391]	458	of the same name in the Unicode :meth:`~unicode.encode` method. The codec
	459	to be used is looked up using the Python codec registry. Return NULL if
	460	an exception was raised by the codec.
[2]	461
	462	.. versionchanged:: 2.5
[391]	463	This function used an :c:type:`int` type for size. This might require
[2]	464	changes in your code for properly supporting 64-bit systems.
	465
	466
.. c:function:: PyObject* PyUnicode_AsEncodedString(PyObject *unicode, const char *encoding, const char *errors)
[2]	468
	469	Encode a Unicode object and return the result as Python string object.
	470	encoding and errors have the same meaning as the parameters of the same name
	471	in the Unicode :meth:`encode` method. The codec to be used is looked up using
	472	the Python codec registry. Return NULL if an exception was raised by the
	473	codec.
	474
[391]	475
	476	UTF-8 Codecs
	477	""""""""""""
	478
[2]	479	These are the UTF-8 codec APIs:
	480
	481
.. c:function:: PyObject* PyUnicode_DecodeUTF8(const char *s, Py_ssize_t size, const char *errors)
[2]	483
	484	Create a Unicode object by decoding size bytes of the UTF-8 encoded string
	485	s. Return NULL if an exception was raised by the codec.
	486
	487	.. versionchanged:: 2.5
[391]	488	This function used an :c:type:`int` type for size. This might require
[2]	489	changes in your code for properly supporting 64-bit systems.
	490
	491
.. c:function:: PyObject* PyUnicode_DecodeUTF8Stateful(const char *s, Py_ssize_t size, const char *errors, Py_ssize_t *consumed)
[2]	493
[391]	494	If consumed is NULL, behave like :c:func:`PyUnicode_DecodeUTF8`. If
[2]	495	consumed is not NULL, trailing incomplete UTF-8 byte sequences will not be
	496	treated as an error. Those bytes will not be decoded and the number of bytes
	497	that have been decoded will be stored in consumed.
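
To make the meaning of the consumed count concrete, here is a minimal standalone sketch. It is an illustration, not CPython's decoder: the helper names ``utf8_sequence_length`` and ``utf8_complete_prefix`` are hypothetical, and continuation bytes are not validated. It only computes how many leading bytes form whole UTF-8 sequences, so that a truncated sequence at the end is held back:

```c
#include <assert.h>
#include <stddef.h>

/* Length implied by a UTF-8 lead byte, or 0 if b cannot start a sequence. */
static size_t utf8_sequence_length(unsigned char b)
{
    if (b < 0x80) return 1;
    if ((b & 0xE0) == 0xC0) return 2;
    if ((b & 0xF0) == 0xE0) return 3;
    if ((b & 0xF8) == 0xF0) return 4;
    return 0;
}

/* Number of leading bytes of s that form whole sequences; a truncated
   sequence at the very end is left for the next call. */
size_t utf8_complete_prefix(const unsigned char *s, size_t size)
{
    size_t i = 0;
    assert(s != NULL || size == 0);
    while (i < size) {
        size_t need = utf8_sequence_length(s[i]);
        if (need == 0)
            need = 1;   /* stray byte: a real decoder reports it via errors */
        if (need > size - i)
            break;      /* trailing sequence is incomplete: stop before it */
        i += need;
    }
    return i;
}
```

For the five bytes ``a b c E2 82`` (the last two being the start of a three-byte sequence), the function returns 3, so a caller would keep the final two bytes and prepend them to the next chunk of input.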
	498
	499	.. versionadded:: 2.4
	500
	501	.. versionchanged:: 2.5
[391]	502	This function used an :c:type:`int` type for size. This might require
[2]	503	changes in your code for properly supporting 64-bit systems.
	504
	505
.. c:function:: PyObject* PyUnicode_EncodeUTF8(const Py_UNICODE *s, Py_ssize_t size, const char *errors)
[2]	507
[391]	508	Encode the :c:type:`Py_UNICODE` buffer s of the given size using UTF-8 and return a
[2]	509	Python string object. Return NULL if an exception was raised by the codec.
	510
	511	.. versionchanged:: 2.5
[391]	512	This function used an :c:type:`int` type for size. This might require
[2]	513	changes in your code for properly supporting 64-bit systems.
	514
	515
[391]	516	.. c:function:: PyObject* PyUnicode_AsUTF8String(PyObject *unicode)
[2]	517
	518	Encode a Unicode object using UTF-8 and return the result as Python string
	519	object. Error handling is "strict". Return NULL if an exception was raised
	520	by the codec.
	521
[391]	522
	523	UTF-32 Codecs
	524	"""""""""""""
	525
[2]	526	These are the UTF-32 codec APIs:
	527
	528
.. c:function:: PyObject* PyUnicode_DecodeUTF32(const char *s, Py_ssize_t size, const char *errors, int *byteorder)
[2]	530
[391]	531	Decode size bytes from a UTF-32 encoded buffer string and return the
[2]	532	corresponding Unicode object. errors (if non-NULL) defines the error
	533	handling. It defaults to "strict".
	534
	535	If byteorder is non-NULL, the decoder starts decoding using the given byte
	536	order::
	537
	538	*byteorder == -1: little endian
	539	*byteorder == 0: native order
	540	*byteorder == 1: big endian
	541
	542	If ``*byteorder`` is zero, and the first four bytes of the input data are a
	543	byte order mark (BOM), the decoder switches to this byte order and the BOM is
	544	not copied into the resulting Unicode string. If ``*byteorder`` is ``-1`` or
	545	``1``, any byte order mark is copied to the output.
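
The BOM check described above can be sketched as a small standalone function. This is an illustration under stated assumptions, not the decoder's actual code: the name ``utf32_bom_order`` is hypothetical, and the return values simply reuse the ``-1`` / ``0`` / ``1`` convention of ``*byteorder``:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Return -1 for a little-endian BOM, 1 for a big-endian BOM, and 0 if the
   buffer does not start with a UTF-32 BOM. */
int utf32_bom_order(const unsigned char *s, size_t size)
{
    static const unsigned char le[4] = {0xFF, 0xFE, 0x00, 0x00};
    static const unsigned char be[4] = {0x00, 0x00, 0xFE, 0xFF};
    assert(s != NULL || size == 0);
    if (size < 4)
        return 0;
    if (memcmp(s, le, 4) == 0)
        return -1;
    if (memcmp(s, be, 4) == 0)
        return 1;
    return 0;
}
```

A decoder starting with ``*byteorder == 0`` would, on a nonzero result, adopt that order and skip the four BOM bytes; with a forced order it would decode them like any other character.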
	546
   After completion, ``*byteorder`` is set to the current byte order at the end
   of input data.
	549
	550	In a narrow build codepoints outside the BMP will be decoded as surrogate pairs.
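
As a worked illustration of what "decoded as surrogate pairs" means, the following standalone sketch applies the standard UTF-16 arithmetic to one code point. The function names are hypothetical and the code is independent of Python:

```c
#include <assert.h>
#include <stdint.h>

/* High (leading) surrogate for a code point outside the BMP. */
uint16_t high_surrogate(uint32_t cp)
{
    assert(cp >= 0x10000 && cp <= 0x10FFFF);
    return (uint16_t)(0xD800 + ((cp - 0x10000) >> 10));
}

/* Low (trailing) surrogate for a code point outside the BMP. */
uint16_t low_surrogate(uint32_t cp)
{
    assert(cp >= 0x10000 && cp <= 0x10FFFF);
    return (uint16_t)(0xDC00 + ((cp - 0x10000) & 0x3FF));
}
```

For U+1F600 this yields the pair ``0xD83D``, ``0xDE00``, so on a narrow build the decoded object holds two 16-bit units for that single code point.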
	551
	552	If byteorder is NULL, the codec starts in native order mode.
	553
	554	Return NULL if an exception was raised by the codec.
	555
	556	.. versionadded:: 2.6
	557
	558
.. c:function:: PyObject* PyUnicode_DecodeUTF32Stateful(const char *s, Py_ssize_t size, const char *errors, int *byteorder, Py_ssize_t *consumed)
[2]	560
[391]	561	If consumed is NULL, behave like :c:func:`PyUnicode_DecodeUTF32`. If
	562	consumed is not NULL, :c:func:`PyUnicode_DecodeUTF32Stateful` will not treat
[2]	563	trailing incomplete UTF-32 byte sequences (such as a number of bytes not divisible
	564	by four) as an error. Those bytes will not be decoded and the number of bytes
	565	that have been decoded will be stored in consumed.
	566
	567	.. versionadded:: 2.6
	568
	569
.. c:function:: PyObject* PyUnicode_EncodeUTF32(const Py_UNICODE *s, Py_ssize_t size, const char *errors, int byteorder)
[2]	571
	572	Return a Python bytes object holding the UTF-32 encoded value of the Unicode
	573	data in s. Output is written according to the following byte order::
	574
	575	byteorder == -1: little endian
	576	byteorder == 0: native byte order (writes a BOM mark)
	577	byteorder == 1: big endian
	578
	579	If byteorder is ``0``, the output string will always start with the Unicode BOM
	580	mark (U+FEFF). In the other two modes, no BOM mark is prepended.
	581
	582	If Py_UNICODE_WIDE is not defined, surrogate pairs will be output
	583	as a single codepoint.
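
The recombination step implied here, turning a surrogate pair back into the single code point that UTF-32 stores, can be sketched as follows. ``join_surrogates`` is a hypothetical helper name used only for illustration:

```c
#include <assert.h>
#include <stdint.h>

/* Combine a high/low surrogate pair into the code point it represents. */
uint32_t join_surrogates(uint16_t high, uint16_t low)
{
    assert(high >= 0xD800 && high <= 0xDBFF);
    assert(low >= 0xDC00 && low <= 0xDFFF);
    return 0x10000 + (((uint32_t)(high - 0xD800) << 10) |
                      (uint32_t)(low - 0xDC00));
}
```

The pair ``0xD83D``, ``0xDE00`` maps to U+1F600, which an encoder would then emit as one four-byte UTF-32 unit.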
	584
	585	Return NULL if an exception was raised by the codec.
	586
	587	.. versionadded:: 2.6
	588
	589
[391]	590	.. c:function:: PyObject* PyUnicode_AsUTF32String(PyObject *unicode)
[2]	591
	592	Return a Python string using the UTF-32 encoding in native byte order. The
	593	string always starts with a BOM mark. Error handling is "strict". Return
	594	NULL if an exception was raised by the codec.
	595
	596	.. versionadded:: 2.6
	597
	598
[391]	599	UTF-16 Codecs
	600	"""""""""""""
	601
[2]	602	These are the UTF-16 codec APIs:
	603
	604
.. c:function:: PyObject* PyUnicode_DecodeUTF16(const char *s, Py_ssize_t size, const char *errors, int *byteorder)
[2]	606
[391]	607	Decode size bytes from a UTF-16 encoded buffer string and return the
[2]	608	corresponding Unicode object. errors (if non-NULL) defines the error
	609	handling. It defaults to "strict".
	610
	611	If byteorder is non-NULL, the decoder starts decoding using the given byte
	612	order::
	613
	614	*byteorder == -1: little endian
	615	*byteorder == 0: native order
	616	*byteorder == 1: big endian
	617
	618	If ``*byteorder`` is zero, and the first two bytes of the input data are a
	619	byte order mark (BOM), the decoder switches to this byte order and the BOM is
	620	not copied into the resulting Unicode string. If ``*byteorder`` is ``-1`` or
	621	``1``, any byte order mark is copied to the output (where it will result in
	622	either a ``\ufeff`` or a ``\ufffe`` character).
	623
   After completion, ``*byteorder`` is set to the current byte order at the end
   of input data.
	626
	627	If byteorder is NULL, the codec starts in native order mode.
	628
	629	Return NULL if an exception was raised by the codec.
	630
	631	.. versionchanged:: 2.5
[391]	632	This function used an :c:type:`int` type for size. This might require
[2]	633	changes in your code for properly supporting 64-bit systems.
	634
	635
.. c:function:: PyObject* PyUnicode_DecodeUTF16Stateful(const char *s, Py_ssize_t size, const char *errors, int *byteorder, Py_ssize_t *consumed)
[2]	637
[391]	638	If consumed is NULL, behave like :c:func:`PyUnicode_DecodeUTF16`. If
	639	consumed is not NULL, :c:func:`PyUnicode_DecodeUTF16Stateful` will not treat
[2]	640	trailing incomplete UTF-16 byte sequences (such as an odd number of bytes or a
	641	split surrogate pair) as an error. Those bytes will not be decoded and the
	642	number of bytes that have been decoded will be stored in consumed.
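
The two incomplete-input cases named above (an odd trailing byte and a split surrogate pair) can be illustrated with a standalone sketch. The function name ``utf16_complete_prefix`` and its ``little_endian`` flag are assumptions made for this example, and it does not reproduce the decoder's error handling:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Bytes of s that can be decoded now; little_endian selects the byte order. */
size_t utf16_complete_prefix(const unsigned char *s, size_t size,
                             int little_endian)
{
    size_t usable = size - (size % 2);      /* drop a dangling odd byte */
    assert(s != NULL || size == 0);
    if (usable >= 2) {
        const unsigned char *p = s + usable - 2;
        uint16_t last = little_endian ? (uint16_t)(p[0] | (p[1] << 8))
                                      : (uint16_t)((p[0] << 8) | p[1]);
        if (last >= 0xD800 && last <= 0xDBFF)
            usable -= 2;                    /* high surrogate without its partner */
    }
    return usable;
}
```

The returned value plays the role of consumed: the caller keeps the remaining bytes and supplies them again together with the next chunk.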
	643
	644	.. versionadded:: 2.4
	645
	646	.. versionchanged:: 2.5
[391]	647	This function used an :c:type:`int` type for size and an :c:type:`int *`
[2]	648	type for consumed. This might require changes in your code for
	649	properly supporting 64-bit systems.
	650
	651
.. c:function:: PyObject* PyUnicode_EncodeUTF16(const Py_UNICODE *s, Py_ssize_t size, const char *errors, int byteorder)
[2]	653
	654	Return a Python string object holding the UTF-16 encoded value of the Unicode
	655	data in s. Output is written according to the following byte order::
	656
	657	byteorder == -1: little endian
	658	byteorder == 0: native byte order (writes a BOM mark)
	659	byteorder == 1: big endian
	660
	661	If byteorder is ``0``, the output string will always start with the Unicode BOM
	662	mark (U+FEFF). In the other two modes, no BOM mark is prepended.
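
The three byte-order modes can be sketched in standalone C as follows. This is a simplified illustration, assuming the input is already a sequence of 16-bit units; ``encode_utf16_sketch`` and its helpers are hypothetical names, not the codec's implementation:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Return 1 on a little-endian host, 0 on a big-endian one. */
static int host_is_little_endian(void)
{
    const uint16_t probe = 1;
    return *(const unsigned char *)&probe == 1;
}

/* Store one unit in the requested order: -1 little, 1 big, 0 native. */
static void put_utf16_unit(uint16_t unit, int byteorder, unsigned char *out)
{
    int little = (byteorder == -1) ||
                 (byteorder == 0 && host_is_little_endian());
    out[little ? 0 : 1] = (unsigned char)(unit & 0xFF);
    out[little ? 1 : 0] = (unsigned char)(unit >> 8);
}

/* Encode n units; out must hold 2 * (n + 1) bytes. Returns bytes written. */
size_t encode_utf16_sketch(const uint16_t *units, size_t n, int byteorder,
                           unsigned char *out)
{
    size_t pos = 0, i;
    assert(out != NULL && (units != NULL || n == 0));
    if (byteorder == 0) {               /* native order: prepend U+FEFF */
        put_utf16_unit(0xFEFF, 0, out);
        pos = 2;
    }
    for (i = 0; i < n; i++, pos += 2)
        put_utf16_unit(units[i], byteorder, out + pos);
    return pos;
}
```

Encoding the single unit ``0x0041`` produces ``41 00`` for ``-1``, ``00 41`` for ``1``, and four bytes for ``0``: the BOM in the host's order followed by the character.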
	663
   If Py_UNICODE_WIDE is defined, a single :c:type:`Py_UNICODE` value may get
   represented as a surrogate pair. If it is not defined, each :c:type:`Py_UNICODE`
   value is interpreted as a UCS-2 character.
	667
	668	Return NULL if an exception was raised by the codec.
	669
	670	.. versionchanged:: 2.5
[391]	671	This function used an :c:type:`int` type for size. This might require
[2]	672	changes in your code for properly supporting 64-bit systems.
	673
	674
[391]	675	.. c:function:: PyObject* PyUnicode_AsUTF16String(PyObject *unicode)
[2]	676
	677	Return a Python string using the UTF-16 encoding in native byte order. The
	678	string always starts with a BOM mark. Error handling is "strict". Return
	679	NULL if an exception was raised by the codec.
	680
[391]	681
	682	UTF-7 Codecs
	683	""""""""""""
	684
	685	These are the UTF-7 codec APIs:
	686
	687
.. c:function:: PyObject* PyUnicode_DecodeUTF7(const char *s, Py_ssize_t size, const char *errors)
	689
	690	Create a Unicode object by decoding size bytes of the UTF-7 encoded string
	691	s. Return NULL if an exception was raised by the codec.
	692
	693
.. c:function:: PyObject* PyUnicode_DecodeUTF7Stateful(const char *s, Py_ssize_t size, const char *errors, Py_ssize_t *consumed)
	695
	696	If consumed is NULL, behave like :c:func:`PyUnicode_DecodeUTF7`. If
	697	consumed is not NULL, trailing incomplete UTF-7 base-64 sections will not
	698	be treated as an error. Those bytes will not be decoded and the number of
	699	bytes that have been decoded will be stored in consumed.
	700
	701
.. c:function:: PyObject* PyUnicode_EncodeUTF7(const Py_UNICODE *s, Py_ssize_t size, int base64SetO, int base64WhiteSpace, const char *errors)
	703
	704	Encode the :c:type:`Py_UNICODE` buffer of the given size using UTF-7 and
	705	return a Python bytes object. Return NULL if an exception was raised by
	706	the codec.
	707
	708	If base64SetO is nonzero, "Set O" (punctuation that has no otherwise
	709	special meaning) will be encoded in base-64. If base64WhiteSpace is
	710	nonzero, whitespace will be encoded in base-64. Both are set to zero for the
	711	Python "utf-7" codec.
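
The effect of the two flags can be sketched with a classification function. This is an assumption-laden illustration: the character sets are taken from the Set D and Set O definitions in RFC 2152, the name ``utf7_is_direct`` is hypothetical, and the codec's real tables may differ in detail:

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Return 1 if ch may be written as itself, 0 if it must go through the
   shifted base-64 form ('+' is the shift character, so it is never direct). */
int utf7_is_direct(uint32_t ch, int base64SetO, int base64WhiteSpace)
{
    static const char set_d_punct[] = "'(),-./:?";
    static const char set_o[] = "!\"#$%&*;<=>@[]^_`{|}";

    assert(ch <= 0x10FFFF);
    if (ch == 0 || ch > 0x7F)
        return 0;                           /* NUL and non-ASCII: never direct */
    if ((ch >= 'A' && ch <= 'Z') || (ch >= 'a' && ch <= 'z') ||
        (ch >= '0' && ch <= '9') || strchr(set_d_punct, (int)ch) != NULL)
        return 1;                           /* Set D: always direct */
    if (ch == ' ' || ch == '\t' || ch == '\r' || ch == '\n')
        return !base64WhiteSpace;
    if (strchr(set_o, (int)ch) != NULL)
        return !base64SetO;
    return 0;                               /* e.g. '+', '\\', '~', controls */
}
```

With both flags zero, ``!`` and a space are written directly; setting base64SetO or base64WhiteSpace to a nonzero value moves them into the base-64 sections.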
	712
	713
	714	Unicode-Escape Codecs
	715	"""""""""""""""""""""
	716
[2]	717	These are the "Unicode Escape" codec APIs:
	718
	719
.. c:function:: PyObject* PyUnicode_DecodeUnicodeEscape(const char *s, Py_ssize_t size, const char *errors)
[2]	721
	722	Create a Unicode object by decoding size bytes of the Unicode-Escape encoded
	723	string s. Return NULL if an exception was raised by the codec.
	724
	725	.. versionchanged:: 2.5
[391]	726	This function used an :c:type:`int` type for size. This might require
[2]	727	changes in your code for properly supporting 64-bit systems.
	728
	729
[391]	730	.. c:function:: PyObject* PyUnicode_EncodeUnicodeEscape(const Py_UNICODE *s, Py_ssize_t size)
[2]	731
[391]	732	Encode the :c:type:`Py_UNICODE` buffer of the given size using Unicode-Escape and
[2]	733	return a Python string object. Return NULL if an exception was raised by the
	734	codec.
	735
	736	.. versionchanged:: 2.5
[391]	737	This function used an :c:type:`int` type for size. This might require
[2]	738	changes in your code for properly supporting 64-bit systems.
	739
	740
[391]	741	.. c:function:: PyObject* PyUnicode_AsUnicodeEscapeString(PyObject *unicode)
[2]	742
	743	Encode a Unicode object using Unicode-Escape and return the result as Python
	744	string object. Error handling is "strict". Return NULL if an exception was
	745	raised by the codec.
	746
[391]	747
	748	Raw-Unicode-Escape Codecs
	749	"""""""""""""""""""""""""
	750
[2]	751	These are the "Raw Unicode Escape" codec APIs:
	752
	753
.. c:function:: PyObject* PyUnicode_DecodeRawUnicodeEscape(const char *s, Py_ssize_t size, const char *errors)
[2]	755
	756	Create a Unicode object by decoding size bytes of the Raw-Unicode-Escape
	757	encoded string s. Return NULL if an exception was raised by the codec.
	758
	759	.. versionchanged:: 2.5
[391]	760	This function used an :c:type:`int` type for size. This might require
[2]	761	changes in your code for properly supporting 64-bit systems.
	762
	763
[391]	764	.. c:function:: PyObject* PyUnicode_EncodeRawUnicodeEscape(const Py_UNICODE *s, Py_ssize_t size, const char *errors)
[2]	765
[391]	766	Encode the :c:type:`Py_UNICODE` buffer of the given size using Raw-Unicode-Escape
[2]	767	and return a Python string object. Return NULL if an exception was raised by
	768	the codec.
	769
	770	.. versionchanged:: 2.5
[391]	771	This function used an :c:type:`int` type for size. This might require
[2]	772	changes in your code for properly supporting 64-bit systems.
	773
	774
[391]	775	.. c:function:: PyObject* PyUnicode_AsRawUnicodeEscapeString(PyObject *unicode)
[2]	776
	777	Encode a Unicode object using Raw-Unicode-Escape and return the result as
	778	Python string object. Error handling is "strict". Return NULL if an exception
	779	was raised by the codec.
	780
[391]	781
	782	Latin-1 Codecs
	783	""""""""""""""
	784
[2]	785	These are the Latin-1 codec APIs. Latin-1 corresponds to the first 256 Unicode
	786	ordinals, and only these are accepted by the codecs during encoding.
	787
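The encoding-side restriction can be illustrated with a small standalone
sketch. It does not call the Python C API: the name ``latin1_encode_sketch``
and the use of ``uint32_t`` for code points are assumptions made purely for
illustration. Each ordinal below 256 becomes one output byte, and the sketch
reports the position of the first ordinal that Latin-1 cannot represent, which
is the point at which a codec with "strict" error handling would fail.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, not part of the Python C API.
 * Encodes `size` code points from `s` into `out`, which must have room
 * for at least `size` bytes. Returns -1 on success, or the index of the
 * first code point outside the Latin-1 range 0..255. */
static ptrdiff_t latin1_encode_sketch(const uint32_t *s, size_t size,
                                      unsigned char *out)
{
    assert(s != NULL && out != NULL);
    for (size_t i = 0; i < size; i++) {
        if (s[i] > 0xFF) {
            return (ptrdiff_t)i;      /* not representable in Latin-1 */
        }
        out[i] = (unsigned char)s[i]; /* ordinal and byte value coincide */
    }
    return -1;
}
```

The ASCII codecs described further down follow the same pattern with the
limit lowered to 127.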
	788
[391]	789	.. c:function:: PyObject* PyUnicode_DecodeLatin1(const char *s, Py_ssize_t size, const char *errors)
[2]	790
	791	Create a Unicode object by decoding size bytes of the Latin-1 encoded string
	792	s. Return NULL if an exception was raised by the codec.
	793
	794	.. versionchanged:: 2.5
[391]	795	This function used an :c:type:`int` type for size. This might require
[2]	796	changes in your code for properly supporting 64-bit systems.
	797
	798
[391]	799	.. c:function:: PyObject* PyUnicode_EncodeLatin1(const Py_UNICODE *s, Py_ssize_t size, const char *errors)
[2]	800
[391]	801	Encode the :c:type:`Py_UNICODE` buffer of the given size using Latin-1 and return
[2]	802	a Python string object. Return NULL if an exception was raised by the codec.
	803
	804	.. versionchanged:: 2.5
[391]	805	This function used an :c:type:`int` type for size. This might require
[2]	806	changes in your code for properly supporting 64-bit systems.
	807
	808
[391]	809	.. c:function:: PyObject* PyUnicode_AsLatin1String(PyObject *unicode)
[2]	810
	811	Encode a Unicode object using Latin-1 and return the result as Python string
	812	object. Error handling is "strict". Return NULL if an exception was raised
	813	by the codec.
	814
[391]	815
	816	ASCII Codecs
	817	""""""""""""
	818
[2]	819	These are the ASCII codec APIs. Only 7-bit ASCII data is accepted. All other
	820	codes generate errors.
	821
	822
[391]	823	.. c:function:: PyObject* PyUnicode_DecodeASCII(const char *s, Py_ssize_t size, const char *errors)
[2]	824
	825	Create a Unicode object by decoding size bytes of the ASCII encoded string
	826	s. Return NULL if an exception was raised by the codec.
	827
	828	.. versionchanged:: 2.5
[391]	829	This function used an :c:type:`int` type for size. This might require
[2]	830	changes in your code for properly supporting 64-bit systems.
	831
	832
[391]	833	.. c:function:: PyObject* PyUnicode_EncodeASCII(const Py_UNICODE *s, Py_ssize_t size, const char *errors)
[2]	834
[391]	835	Encode the :c:type:`Py_UNICODE` buffer of the given size using ASCII and return a
[2]	836	Python string object. Return NULL if an exception was raised by the codec.
	837
	838	.. versionchanged:: 2.5
[391]	839	This function used an :c:type:`int` type for size. This might require
[2]	840	changes in your code for properly supporting 64-bit systems.
	841
	842
[391]	843	.. c:function:: PyObject* PyUnicode_AsASCIIString(PyObject *unicode)
[2]	844
	845	Encode a Unicode object using ASCII and return the result as Python string
	846	object. Error handling is "strict". Return NULL if an exception was raised
	847	by the codec.
	848
	849
[391]	850	Character Map Codecs
	851	""""""""""""""""""""
[2]	852
	853	This codec is special in that it can be used to implement many different codecs
	854	(and this is in fact what was done to obtain most of the standard codecs
	855	included in the :mod:`encodings` package). The codec uses mappings to encode and
	856	decode characters.
	857
	858	Decoding mappings must map single string characters to single Unicode
	859	characters, integers (which are then interpreted as Unicode ordinals) or None
	860	(meaning "undefined mapping" and causing an error).
	861
	862	Encoding mappings must map single Unicode characters to single string
	863	characters, integers (which are then interpreted as Latin-1 ordinals) or None
	864	(meaning "undefined mapping" and causing an error).
	865
	866	The mapping objects provided need only support the ``__getitem__`` mapping
	867	interface.
	868
	869	If a character lookup fails with a :exc:`LookupError`, the character is copied
	870	as-is: its ordinal value is interpreted as a Unicode ordinal when decoding and
	871	as a Latin-1 ordinal when encoding. Because of this, mappings only need to
	872	contain entries for characters that map to different code points.
	873
[391]	874	These are the mapping codec APIs:
[2]	875
[391]	876	.. c:function:: PyObject* PyUnicode_DecodeCharmap(const char *s, Py_ssize_t size, PyObject *mapping, const char *errors)
[2]	877
	878	Create a Unicode object by decoding size bytes of the encoded string s using
	879	the given mapping object. Return NULL if an exception was raised by the
	880	codec. If mapping is NULL, Latin-1 decoding will be done. Otherwise it can be a
	881	dictionary mapping bytes, or a unicode string that is treated as a lookup table.
	882	Byte values at or beyond the length of the string and U+FFFE "characters" are
	883	treated as "undefined mapping".
	884
	885	.. versionchanged:: 2.4
	886	Allowed unicode string as mapping argument.
	887
	888	.. versionchanged:: 2.5
[391]	889	This function used an :c:type:`int` type for size. This might require
[2]	890	changes in your code for properly supporting 64-bit systems.
	891
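The lookup-table form of decoding described for
:c:func:`PyUnicode_DecodeCharmap` can be pictured with the following
standalone sketch. It is not the codec's implementation and does not use the
Python C API: ``charmap_decode_sketch`` and the ``UNDEFINED_MAPPING`` macro
are hypothetical names, and the table is modelled as a plain array of code
points. A byte is "undefined" when it indexes past the end of the table or
when the table entry is U+FFFE.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define UNDEFINED_MAPPING 0xFFFEu  /* U+FFFE marks "no mapping" in the table */

/* Hypothetical helper, not part of the Python C API.
 * Decodes `size` bytes from `s` through `table` into `out`, which must
 * have room for at least `size` code points. Returns -1 on success, or
 * the index of the first byte that has no defined mapping. */
static ptrdiff_t charmap_decode_sketch(const unsigned char *s, size_t size,
                                       const uint32_t *table, size_t table_len,
                                       uint32_t *out)
{
    assert(s != NULL && table != NULL && out != NULL);
    for (size_t i = 0; i < size; i++) {
        unsigned char byte = s[i];
        if (byte >= table_len || table[byte] == UNDEFINED_MAPPING) {
            return (ptrdiff_t)i;   /* "undefined mapping" */
        }
        out[i] = table[byte];      /* the byte value is the table index */
    }
    return -1;
}
```

With a three-entry table such as ``{0x0391, 0x0392, 0xFFFE}``, bytes 0 and 1
decode to Greek capital letters, while byte 2 (explicitly undefined) and
byte 3 (outside the table) are both reported as undefined.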
	892
[391]	893	.. c:function:: PyObject* PyUnicode_EncodeCharmap(const Py_UNICODE *s, Py_ssize_t size, PyObject *mapping, const char *errors)
[2]	894
[391]	895	Encode the :c:type:`Py_UNICODE` buffer of the given size using the given
[2]	896	mapping object and return a Python string object. Return NULL if an
	897	exception was raised by the codec.
	898
	899	.. versionchanged:: 2.5
[391]	900	This function used an :c:type:`int` type for size. This might require
[2]	901	changes in your code for properly supporting 64-bit systems.
	902
	903
[391]	904	.. c:function:: PyObject* PyUnicode_AsCharmapString(PyObject *unicode, PyObject *mapping)
[2]	905
	906	Encode a Unicode object using the given mapping object and return the result
	907	as Python string object. Error handling is "strict". Return NULL if an
	908	exception was raised by the codec.
	909
	910	The following codec API is special in that it maps Unicode to Unicode.
	911
	912
[391]	913	.. c:function:: PyObject* PyUnicode_TranslateCharmap(const Py_UNICODE *s, Py_ssize_t size, PyObject *table, const char *errors)
[2]	914
[391]	915	Translate a :c:type:`Py_UNICODE` buffer of the given size by applying a
[2]	916	character mapping table to it and return the resulting Unicode object. Return
	917	NULL when an exception was raised by the codec.
	918
	919	The mapping table must map Unicode ordinal integers to Unicode ordinal
	920	integers or None (causing deletion of the character).
	921
	922	Mapping tables need only provide the :meth:`__getitem__` interface; dictionaries
	923	and sequences work well. Unmapped character ordinals (ones which cause a
	924	:exc:`LookupError`) are left untouched and are copied as-is.
	925
	926	.. versionchanged:: 2.5
[391]	927	This function used an :c:type:`int` type for size. This might require
[2]	928	changes in your code for properly supporting 64-bit systems.
	929
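The three outcomes of a table lookup during translation (mapped to another
ordinal, mapped to None and therefore deleted, or unmapped and copied as-is)
can be sketched in plain C. This is an illustration under stated assumptions
only: ``translate_entry`` and ``translate_sketch`` are hypothetical names, the
table is a small array searched linearly, and a ``drop`` flag stands in for a
mapping to None.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical table entry: map `from` to `to`, or delete `from` when
 * `drop` is true (the counterpart of a mapping to None). */
typedef struct {
    uint32_t from;
    uint32_t to;
    bool drop;
} translate_entry;

/* Hypothetical helper, not part of the Python C API.
 * Translates `size` code points from `s` into `out`, which must have
 * room for at least `size` code points. Returns the output length. */
static size_t translate_sketch(const uint32_t *s, size_t size,
                               const translate_entry *table, size_t table_len,
                               uint32_t *out)
{
    assert(s != NULL && out != NULL);
    assert(table != NULL || table_len == 0);
    size_t n = 0;
    for (size_t i = 0; i < size; i++) {
        const translate_entry *hit = NULL;
        for (size_t j = 0; j < table_len; j++) {
            if (table[j].from == s[i]) {
                hit = &table[j];
                break;
            }
        }
        if (hit == NULL) {
            out[n++] = s[i];       /* unmapped: copied as-is */
        } else if (!hit->drop) {
            out[n++] = hit->to;    /* mapped to another ordinal */
        }                          /* else: mapped to "None", deleted */
    }
    return n;
}
```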
[391]	930
	931	MBCS codecs for Windows
	932	"""""""""""""""""""""""
	933
[2]	934	These are the MBCS codec APIs. They are currently only available on Windows and
	935	use the Win32 MBCS converters to implement the conversions. Note that MBCS (or
	936	DBCS) is a class of encodings, not just one. The target encoding is defined by
	937	the user settings on the machine running the codec.
	938
	939
[391]	940	.. c:function:: PyObject* PyUnicode_DecodeMBCS(const char *s, Py_ssize_t size, const char *errors)
[2]	941
	942	Create a Unicode object by decoding size bytes of the MBCS encoded string s.
	943	Return NULL if an exception was raised by the codec.
	944
	945	.. versionchanged:: 2.5
[391]	946	This function used an :c:type:`int` type for size. This might require
[2]	947	changes in your code for properly supporting 64-bit systems.
	948
	949
[391]	950	.. c:function:: PyObject* PyUnicode_DecodeMBCSStateful(const char *s, int size, const char *errors, int *consumed)
[2]	951
[391]	952	If consumed is NULL, behave like :c:func:`PyUnicode_DecodeMBCS`. If
	953	consumed is not NULL, :c:func:`PyUnicode_DecodeMBCSStateful` will not decode a
[2]	954	trailing lead byte, and the number of bytes that have been decoded will be stored
	955	in consumed.
	956
	957	.. versionadded:: 2.5
	958
	959
[391]	960	.. c:function:: PyObject* PyUnicode_EncodeMBCS(const Py_UNICODE *s, Py_ssize_t size, const char *errors)
[2]	961
[391]	962	Encode the :c:type:`Py_UNICODE` buffer of the given size using MBCS and return a
[2]	963	Python string object. Return NULL if an exception was raised by the codec.
	964
	965	.. versionchanged:: 2.5
[391]	966	This function used an :c:type:`int` type for size. This might require
[2]	967	changes in your code for properly supporting 64-bit systems.
	968
	969
[391]	970	.. c:function:: PyObject* PyUnicode_AsMBCSString(PyObject *unicode)
[2]	971
	972	Encode a Unicode object using MBCS and return the result as Python string
	973	object. Error handling is "strict". Return NULL if an exception was raised
	974	by the codec.
	975
	976
[391]	977	Methods & Slots
	978	"""""""""""""""
[2]	979
	980	.. _unicodemethodsandslots:
	981
	982	Methods and Slot Functions
	983	^^^^^^^^^^^^^^^^^^^^^^^^^^
	984
	985	The following APIs are capable of handling Unicode objects and strings on input
	986	(we refer to them as strings in the descriptions) and return Unicode objects or
	987	integers as appropriate.
	988
	989	They all return NULL or ``-1`` if an exception occurs.
	990
	991
[391]	992	.. c:function:: PyObject* PyUnicode_Concat(PyObject *left, PyObject *right)
[2]	993
	994	Concatenate two strings, giving a new Unicode string.
	995
	996
[391]	997	.. c:function:: PyObject* PyUnicode_Split(PyObject *s, PyObject *sep, Py_ssize_t maxsplit)
[2]	998
[391]	999	Split a string giving a list of Unicode strings. If sep is NULL, splitting
[2]	1000	will be done at all whitespace substrings. Otherwise, splits occur at the given
	1001	separator. At most maxsplit splits will be done. If negative, no limit is
	1002	set. Separators are not included in the resulting list.
	1003
	1004	.. versionchanged:: 2.5
[391]	1005	This function used an :c:type:`int` type for maxsplit. This might require
[2]	1006	changes in your code for properly supporting 64-bit systems.
	1007
	1008
[391]	1009	.. c:function:: PyObject* PyUnicode_Splitlines(PyObject *s, int keepend)
[2]	1010
	1011	Split a Unicode string at line breaks, returning a list of Unicode strings.
	1012	CRLF is considered to be one line break. If keepend is 0, the line break
	1013	characters are not included in the resulting strings.
	1014
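The two rules stated for :c:func:`PyUnicode_Splitlines` (CRLF counts as a
single break, and keepend controls whether break characters stay in the
result) can be shown with a simplified standalone sketch. It works on a byte
string rather than a Unicode object and recognises only ``\n``, ``\r`` and
``\r\n``; which other characters the real function treats as line breaks is
not modelled here. ``line_span`` and ``splitlines_sketch`` are hypothetical
names.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical result type: one line as an offset and a length. */
typedef struct {
    size_t start;
    size_t length;
} line_span;

/* Hypothetical helper, not part of the Python C API.
 * Records up to `out_cap` line spans in `out` and returns the total
 * number of lines found, which may exceed `out_cap`. */
static size_t splitlines_sketch(const char *s, size_t size, int keepend,
                                line_span *out, size_t out_cap)
{
    assert(s != NULL && (out != NULL || out_cap == 0));
    size_t count = 0;
    size_t i = 0;
    while (i < size) {
        size_t start = i;
        while (i < size && s[i] != '\n' && s[i] != '\r') {
            i++;
        }
        size_t eol = i;                 /* end of the line's content */
        if (i < size) {
            if (s[i] == '\r' && i + 1 < size && s[i + 1] == '\n') {
                i += 2;                 /* CRLF is one line break */
            } else {
                i += 1;
            }
        }
        if (count < out_cap) {
            out[count].start = start;
            out[count].length = (keepend ? i : eol) - start;
        }
        count++;
    }
    return count;
}
```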
	1015
[391]	1016	.. c:function:: PyObject* PyUnicode_Translate(PyObject *str, PyObject *table, const char *errors)
[2]	1017
	1018	Translate a string by applying a character mapping table to it and return the
	1019	resulting Unicode object.
	1020
	1021	The mapping table must map Unicode ordinal integers to Unicode ordinal integers
	1022	or None (causing deletion of the character).
	1023
	1024	Mapping tables need only provide the :meth:`__getitem__` interface; dictionaries
	1025	and sequences work well. Unmapped character ordinals (ones which cause a
	1026	:exc:`LookupError`) are left untouched and are copied as-is.
	1027
	1028	errors has the usual meaning for codecs. It may be NULL which indicates to
	1029	use the default error handling.
	1030
	1031
[391]	1032	.. c:function:: PyObject* PyUnicode_Join(PyObject *separator, PyObject *seq)
[2]	1033
[391]	1034	Join a sequence of strings using the given separator and return the resulting
[2]	1035	Unicode string.
	1036
	1037
[391]	1038	.. c:function:: int PyUnicode_Tailmatch(PyObject *str, PyObject *substr, Py_ssize_t start, Py_ssize_t end, int direction)
[2]	1039
[391]	1040	Return 1 if substr matches ``str[start:end]`` at the given tail end
[2]	1041	(direction == -1 means to do a prefix match, direction == 1 a suffix match),
	1042	0 otherwise. Return ``-1`` if an error occurred.
	1043
	1044	.. versionchanged:: 2.5
[391]	1045	This function used an :c:type:`int` type for start and end. This
[2]	1046	might require changes in your code for properly supporting 64-bit
	1047	systems.
	1048
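The meaning of the direction argument of :c:func:`PyUnicode_Tailmatch` can be
sketched on plain byte strings. The code below is a model of the documented
behaviour, not the library's implementation: ``tailmatch_sketch`` is a
hypothetical name, and it assumes ``0 <= start <= end <= len`` instead of
adjusting out-of-range indices.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not part of the Python C API.
 * Returns 1 if `sub` matches the slice str[start:end] at its beginning
 * (direction == -1) or at its end (direction == 1), and 0 otherwise. */
static int tailmatch_sketch(const char *str, size_t len,
                            const char *sub, size_t sublen,
                            size_t start, size_t end, int direction)
{
    assert(str != NULL && sub != NULL);
    assert(start <= end && end <= len);
    if (sublen > end - start) {
        return 0;                       /* substring longer than the slice */
    }
    /* Suffix match compares the last `sublen` characters of the slice,
     * prefix match compares the first `sublen` characters. */
    size_t offset = (direction > 0) ? end - sublen : start;
    return memcmp(str + offset, sub, sublen) == 0;
}
```

For the string ``"unicode.rst"`` and the full slice, ``"uni"`` matches as a
prefix and ``".rst"`` matches as a suffix. Shrinking the slice by one
character at either end makes the corresponding match fail.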
	1049
[391]	1050	.. c:function:: Py_ssize_t PyUnicode_Find(PyObject *str, PyObject *substr, Py_ssize_t start, Py_ssize_t end, int direction)
[2]	1051
[391]	1052	Return the first position of substr in ``str[start:end]`` using the given
[2]	1053	direction (direction == 1 means to do a forward search, direction == -1 a
	1054	backward search). The return value is the index of the first match; a value of
	1055	``-1`` indicates that no match was found, and ``-2`` indicates that an error
	1056	occurred and an exception has been set.
	1057
	1058	.. versionchanged:: 2.5
[391]	1059	This function used an :c:type:`int` type for start and end. This
[2]	1060	might require changes in your code for properly supporting 64-bit
	1061	systems.
	1062
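A standalone model of the forward and backward search may help when reading
the return values of :c:func:`PyUnicode_Find`. The sketch below is
hypothetical (``find_sketch`` is not a Python C API function), operates on
byte strings, and assumes ``0 <= start <= end <= len``. It also assumes the
result is an index into the whole string rather than into the slice, as with
Python's ``find`` method. The ``-2`` error result of the real function has no
counterpart here because the sketch cannot raise exceptions.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not part of the Python C API.
 * Searches str[start:end] for `sub`. direction == 1 scans forward,
 * direction == -1 scans backward. Returns the index of the match in
 * `str`, or -1 if there is none. */
static ptrdiff_t find_sketch(const char *str, size_t len,
                             const char *sub, size_t sublen,
                             size_t start, size_t end, int direction)
{
    assert(str != NULL && sub != NULL);
    assert(start <= end && end <= len);
    if (sublen > end - start) {
        return -1;                      /* cannot fit inside the slice */
    }
    size_t last = end - sublen;         /* highest possible match index */
    if (direction > 0) {
        for (size_t i = start; i <= last; i++) {
            if (memcmp(str + i, sub, sublen) == 0) {
                return (ptrdiff_t)i;
            }
        }
    } else {
        /* Count down from `last` to `start` without unsigned underflow. */
        for (size_t i = last + 1; i-- > start; ) {
            if (memcmp(str + i, sub, sublen) == 0) {
                return (ptrdiff_t)i;
            }
        }
    }
    return -1;
}
```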
	1063
[391]	1064	.. c:function:: Py_ssize_t PyUnicode_Count(PyObject *str, PyObject *substr, Py_ssize_t start, Py_ssize_t end)
[2]	1065
	1066	Return the number of non-overlapping occurrences of substr in
	1067	``str[start:end]``. Return ``-1`` if an error occurred.
	1068
	1069	.. versionchanged:: 2.5
[391]	1070	This function returned an :c:type:`int` type and used an :c:type:`int`
[2]	1071	type for start and end. This might require changes in your code for
	1072	properly supporting 64-bit systems.
	1073
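"Non-overlapping" means that the scan resumes after the end of each match, so
``"aa"`` occurs twice in ``"aaaa"`` rather than three times. The following
standalone sketch models that rule on byte strings. ``count_sketch`` is a
hypothetical name, the indices are assumed to satisfy
``0 <= start <= end <= len``, and the empty-substring case is deliberately
left out.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not part of the Python C API.
 * Counts non-overlapping occurrences of `sub` in str[start:end]. */
static size_t count_sketch(const char *str, size_t len,
                           const char *sub, size_t sublen,
                           size_t start, size_t end)
{
    assert(str != NULL && sub != NULL);
    assert(start <= end && end <= len);
    assert(sublen > 0);                 /* empty substring is not modelled */
    size_t count = 0;
    size_t i = start;
    while (i + sublen <= end) {
        if (memcmp(str + i, sub, sublen) == 0) {
            count++;
            i += sublen;                /* skip past the match: no overlap */
        } else {
            i++;
        }
    }
    return count;
}
```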
	1074
[391]	1075	.. c:function:: PyObject* PyUnicode_Replace(PyObject *str, PyObject *substr, PyObject *replstr, Py_ssize_t maxcount)
[2]	1076
	1077	Replace at most maxcount occurrences of substr in str with replstr and
	1078	return the resulting Unicode object. maxcount == -1 means replace all
	1079	occurrences.
	1080
	1081	.. versionchanged:: 2.5
[391]	1082	This function used an :c:type:`int` type for maxcount. This might
[2]	1083	require changes in your code for properly supporting 64-bit systems.
	1084
	1085
[391]	1086	.. c:function:: int PyUnicode_Compare(PyObject *left, PyObject *right)
[2]	1087
	1088	Compare two strings and return -1, 0, 1 for less than, equal, and greater than,
	1089	respectively.
	1090
	1091
[391]	1092	.. c:function:: PyObject* PyUnicode_RichCompare(PyObject *left, PyObject *right, int op)
[2]	1093
	1094	Rich compare two unicode strings and return one of the following:
	1095
	1096	* ``NULL`` in case an exception was raised
	1097	* :const:`Py_True` or :const:`Py_False` for successful comparisons
	1098	* :const:`Py_NotImplemented` in case the type combination is unknown
	1099
	1100	Note that :const:`Py_EQ` and :const:`Py_NE` comparisons can cause a
	1101	:exc:`UnicodeWarning` in case the conversion of the arguments to Unicode fails
	1102	with a :exc:`UnicodeDecodeError`.
	1103
	1104	Possible values for op are :const:`Py_GT`, :const:`Py_GE`, :const:`Py_EQ`,
	1105	:const:`Py_NE`, :const:`Py_LT`, and :const:`Py_LE`.
	1106
	1107
[391]	1108	.. c:function:: PyObject* PyUnicode_Format(PyObject *format, PyObject *args)
[2]	1109
	1110	Return a new string object from format and args; this is analogous to
	1111	``format % args``. The args argument must be a tuple.
	1112
	1113
[391]	1114	.. c:function:: int PyUnicode_Contains(PyObject *container, PyObject *element)
[2]	1115
	1116	Check whether element is contained in container and return true or false
	1117	accordingly.
	1118
	1119	element has to coerce to a one element Unicode string. ``-1`` is returned if
	1120	there was an error.
