string.rst

Last change on this file was 391, checked in by dmik, 11 years ago
python: Merge vendor 2.7.6 to trunk.
Property svn:eol-style set to `native`
File size: 42.7 KB

Rev	Line
[2]	1	:mod:`string` --- Common string operations
	2	==========================================
	3
	4	.. module:: string
	5	:synopsis: Common string operations.
	6
	7
	8	.. index:: module: re
	9
[391]	10	Source code: :source:`Lib/string.py`
	11
	12	--------------
	13
[2]	14	The :mod:`string` module contains a number of useful constants and
	15	classes, as well as some deprecated legacy functions that are also
	16	available as methods on strings. In addition, Python's built-in string
	17	classes support the sequence type methods described in the
	18	:ref:`typesseq` section, and also the string-specific methods described
	19	in the :ref:`string-methods` section. To output formatted strings use
	20	template strings or the ``%`` operator described in the
	21	:ref:`string-formatting` section. Also, see the :mod:`re` module for
	22	string functions based on regular expressions.
	23
	24	String constants
	25	----------------
	26
	27	The constants defined in this module are:
	28
	29
	30	.. data:: ascii_letters
	31
	32	The concatenation of the :const:`ascii_lowercase` and :const:`ascii_uppercase`
	33	constants described below. This value is not locale-dependent.
	34
	35
	36	.. data:: ascii_lowercase
	37
	38	The lowercase letters ``'abcdefghijklmnopqrstuvwxyz'``. This value is not
	39	locale-dependent and will not change.
	40
	41
	42	.. data:: ascii_uppercase
	43
	44	The uppercase letters ``'ABCDEFGHIJKLMNOPQRSTUVWXYZ'``. This value is not
	45	locale-dependent and will not change.
	46
	47
	48	.. data:: digits
	49
	50	The string ``'0123456789'``.
	51
	52
	53	.. data:: hexdigits
	54
	55	The string ``'0123456789abcdefABCDEF'``.
	56
	57
	58	.. data:: letters
	59
	60	The concatenation of the strings :const:`lowercase` and :const:`uppercase`
	61	described below. The specific value is locale-dependent, and will be updated
	62	when :func:`locale.setlocale` is called.
	63
	64
	65	.. data:: lowercase
	66
	67	A string containing all the characters that are considered lowercase letters.
	68	On most systems this is the string ``'abcdefghijklmnopqrstuvwxyz'``. The
	69	specific value is locale-dependent, and will be updated when
	70	:func:`locale.setlocale` is called.
	71
	72
	73	.. data:: octdigits
	74
	75	The string ``'01234567'``.
	76
	77
	78	.. data:: punctuation
	79
	80	String of ASCII characters which are considered punctuation characters in the
	81	``C`` locale.
	82
	83
	84	.. data:: printable
	85
	86	String of characters which are considered printable. This is a combination of
	87	:const:`digits`, :const:`letters`, :const:`punctuation`, and
	88	:const:`whitespace`.
	89
	90
	91	.. data:: uppercase
	92
	93	A string containing all the characters that are considered uppercase letters.
	94	On most systems this is the string ``'ABCDEFGHIJKLMNOPQRSTUVWXYZ'``. The
	95	specific value is locale-dependent, and will be updated when
	96	:func:`locale.setlocale` is called.
	97
	98
	99	.. data:: whitespace
	100
	101	A string containing all characters that are considered whitespace. On most
	102	systems this includes the characters space, tab, linefeed, return, formfeed, and
	103	vertical tab.
	104
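As a minimal sketch of how the locale-independent constants might be used, the helper functions below (``is_hex_token`` and ``strip_punctuation`` are hypothetical names, not part of the module) test membership in :const:`hexdigits` and :const:`punctuation`:

```python
import string

def is_hex_token(text):
    """Return True if text is non-empty and made only of hexadecimal digits."""
    return bool(text) and all(ch in string.hexdigits for ch in text)

def strip_punctuation(text):
    """Drop every character that appears in string.punctuation."""
    return ''.join(ch for ch in text if ch not in string.punctuation)

# ascii_letters is documented as the two ASCII case constants joined together.
letters_match = string.ascii_letters == string.ascii_lowercase + string.ascii_uppercase
```

Only the ``ascii_*`` constants, :const:`digits`, :const:`hexdigits`, :const:`octdigits`, :const:`punctuation`, :const:`printable` and :const:`whitespace` are used here, so the sketch does not depend on the current locale.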
	105
	106	.. _new-string-formatting:
	107
	108	String Formatting
	109	-----------------
	110
[391]	111	.. versionadded:: 2.6
	112
	113	The built-in str and unicode classes provide the ability
[2]	114	to do complex variable substitutions and value formatting via the
	115	:meth:`str.format` method described in :pep:`3101`. The :class:`Formatter`
	116	class in the :mod:`string` module allows you to create and customize your own
	117	string formatting behaviors using the same implementation as the built-in
	118	:meth:`format` method.
	119
	120	.. class:: Formatter
	121
	122	The :class:`Formatter` class has the following public methods:
	123
[391]	124	.. method:: format(format_string, *args, **kwargs)
[2]	125
[391]	126	:meth:`format` is the primary API method. It takes a format string and
	127	an arbitrary set of positional and keyword arguments.
[2]	128	:meth:`format` is just a wrapper that calls :meth:`vformat`.
	129
	130	.. method:: vformat(format_string, args, kwargs)
	131
	132	This function does the actual work of formatting. It is exposed as a
	133	separate function for cases where you want to pass in a predefined
	134	dictionary of arguments, rather than unpacking and repacking the
[391]	135	dictionary as individual arguments using the ``*args`` and ``**kwargs``
	136	syntax. :meth:`vformat` does the work of breaking up the format string
	137	into character data and replacement fields. It calls the various
[2]	138	methods described below.
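The relationship between the two methods can be sketched as follows. The format string and values are only illustrative; the point is that :meth:`vformat` receives an existing tuple and dictionary, while :meth:`format` receives the same values as ordinary arguments:

```python
from string import Formatter

formatter = Formatter()

# format() takes the values as ordinary positional and keyword arguments.
via_format = formatter.format('{0} likes {what}', 'tim', what='kung pao')

# vformat() takes an existing tuple and dictionary, with no unpacking.
args = ('tim',)
kwargs = {'what': 'kung pao'}
via_vformat = formatter.vformat('{0} likes {what}', args, kwargs)
```

Both calls should produce ``'tim likes kung pao'``.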
	139
	140	In addition, the :class:`Formatter` defines a number of methods that are
	141	intended to be replaced by subclasses:
	142
	143	.. method:: parse(format_string)
	144
	145	Loop over the format_string and return an iterable of tuples
	146	(literal_text, field_name, format_spec, conversion). This is used
[391]	147	by :meth:`vformat` to break the string into either literal text, or
[2]	148	replacement fields.
	149
	150	The values in the tuple conceptually represent a span of literal text
	151	followed by a single replacement field. If there is no literal text
	152	(which can happen if two replacement fields occur consecutively), then
	153	literal_text will be a zero-length string. If there is no replacement
	154	field, then the values of field_name, format_spec and conversion
	155	will be ``None``.
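To make the tuple layout concrete, the following sketch parses an arbitrary example string and collects the tuples into a list:

```python
from string import Formatter

# Each tuple is (literal_text, field_name, format_spec, conversion).
pieces = list(Formatter().parse('Total: {amount!r:>8} for {0}{1}.'))

# The two adjacent fields give an empty literal_text for the second one,
# and the trailing '.' is reported with no replacement field at all.
literal_texts = [piece[0] for piece in pieces]
field_names = [piece[1] for piece in pieces]
```

Here ``literal_texts`` is ``['Total: ', ' for ', '', '.']`` and ``field_names`` is ``['amount', '0', '1', None]``.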
	156
	157	.. method:: get_field(field_name, args, kwargs)
	158
	159	Given field_name as returned by :meth:`parse` (see above), convert it to
	160	an object to be formatted. Returns a tuple (obj, used_key). The default
	161	version takes strings of the form defined in :pep:`3101`, such as
	162	"0[name]" or "label.title". args and kwargs are as passed in to
	163	:meth:`vformat`. The return value used_key has the same meaning as the
	164	key parameter to :meth:`get_value`.
	165
	166	.. method:: get_value(key, args, kwargs)
	167
	168	Retrieve a given field value. The key argument will be either an
	169	integer or a string. If it is an integer, it represents the index of the
	170	positional argument in args; if it is a string, then it represents a
	171	named argument in kwargs.
	172
	173	The args parameter is set to the list of positional arguments to
	174	:meth:`vformat`, and the kwargs parameter is set to the dictionary of
	175	keyword arguments.
	176
	177	For compound field names, these functions are only called for the first
	178	component of the field name; subsequent components are handled through
	179	normal attribute and indexing operations.
	180
	181	So for example, the field expression '0.name' would cause
	182	:meth:`get_value` to be called with a key argument of 0. The ``name``
	183	attribute will be looked up after :meth:`get_value` returns by calling the
	184	built-in :func:`getattr` function.
	185
	186	If the index or keyword refers to an item that does not exist, then an
	187	:exc:`IndexError` or :exc:`KeyError` should be raised.
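One possible use of this hook, sketched under the assumption that a missing keyword should render as a marker instead of raising :exc:`KeyError`, is shown below. ``DefaultingFormatter``, ``Person`` and the ``'<missing>'`` marker are hypothetical names chosen for illustration:

```python
from string import Formatter

class DefaultingFormatter(Formatter):
    """Hypothetical subclass: missing keyword fields render as a marker."""

    def get_value(self, key, args, kwargs):
        # Integer keys index the positional arguments, as in the base class.
        if isinstance(key, int):
            return Formatter.get_value(self, key, args, kwargs)
        # String keys name keyword arguments; fall back to a marker.
        return kwargs.get(key, '<missing>')

class Person(object):
    name = 'tim'

# '{0.name}' calls get_value() with key 0 only; '.name' is resolved afterwards.
result = DefaultingFormatter().format('{who} likes {what}; {0.name}',
                                      Person(), who='tim')
```

With these inputs ``result`` is ``'tim likes <missing>; tim'``.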
	188
	189	.. method:: check_unused_args(used_args, args, kwargs)
	190
	191	Implement checking for unused arguments if desired. The arguments to this
	192	function are the set of all argument keys that were actually referred to in
	193	the format string (integers for positional arguments, and strings for
	194	named arguments), and a reference to the args and kwargs that were
	195	passed to vformat. The set of unused args can be calculated from these
[391]	196	parameters. :meth:`check_unused_args` is assumed to raise an exception if
[2]	197	the check fails.
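A minimal sketch of such a check follows. ``StrictFormatter`` is a hypothetical subclass, and raising :exc:`ValueError` is an assumption made for the example, since the text above only says that some exception is expected:

```python
from string import Formatter

class StrictFormatter(Formatter):
    """Hypothetical subclass that rejects arguments the format string ignores."""

    def check_unused_args(self, used_args, args, kwargs):
        # Positional arguments are recorded by index, keyword ones by name.
        unused = [i for i in range(len(args)) if i not in used_args]
        unused += [k for k in kwargs if k not in used_args]
        if unused:
            raise ValueError('unused arguments: {0!r}'.format(unused))

strict = StrictFormatter()
ok = strict.format('{0} likes {what}', 'tim', what='kung pao')

try:
    strict.format('{0}', 'tim', what='kung pao')
    error = None
except ValueError as exc:
    error = str(exc)
```

The first call succeeds; the second leaves ``what`` unreferenced, so ``error`` holds the message from the raised exception.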
	198
	199	.. method:: format_field(value, format_spec)
	200
	201	:meth:`format_field` simply calls the global :func:`format` built-in. The
	202	method is provided so that subclasses can override it.
	203
	204	.. method:: convert_field(value, conversion)
	205
	206	Converts the value (returned by :meth:`get_field`) given a conversion type
[391]	207	(as in the tuple returned by the :meth:`parse` method). The default
	208	version understands the 's' (str) and 'r' (repr) conversion
	209	types.
[2]	210
	211
	212	.. _formatstrings:
	213
	214	Format String Syntax
	215	--------------------
	216
	217	The :meth:`str.format` method and the :class:`Formatter` class share the same
	218	syntax for format strings (although in the case of :class:`Formatter`,
[391]	219	subclasses can define their own format string syntax).
[2]	220
	221	Format strings contain "replacement fields" surrounded by curly braces ``{}``.
	222	Anything that is not contained in braces is considered literal text, which is
	223	copied unchanged to the output. If you need to include a brace character in the
	224	literal text, it can be escaped by doubling: ``{{`` and ``}}``.
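A short sketch of the escaping rule, using arbitrary values:

```python
# Doubled braces are copied to the output as single literal braces.
literal = '{{0}}'.format('ignored')

# A replacement field can sit between escaped braces.
wrapped = '{{{0}}}'.format('value')
```

Here ``literal`` is ``'{0}'`` (the argument is never used) and ``wrapped`` is ``'{value}'``.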
	225
	226	The grammar for a replacement field is as follows:
	227
	228	.. productionlist:: sf
[391]	229	replacement_field: "{" [`field_name`] ["!" `conversion`] [":" `format_spec`] "}"
	230	field_name: arg_name ("." `attribute_name` | "[" `element_index` "]")*
	231	arg_name: [`identifier` | `integer`]
[2]	232	attribute_name: `identifier`
	233	element_index: `integer` | `index_string`
	234	index_string: <any source character except "]"> +
	235	conversion: "r" | "s"
	236	format_spec: <described in the next section>
	237
[391]	238	In less formal terms, the replacement field can start with a field_name that specifies
	239	the object whose value is to be formatted and inserted
	240	into the output instead of the replacement field.
	241	The field_name is optionally followed by a conversion field, which is
[2]	242	preceded by an exclamation point ``'!'``, and a format_spec, which is preceded
[391]	243	by a colon ``':'``. These specify a non-default format for the replacement value.
[2]	244
[391]	245	See also the :ref:`formatspec` section.
	246
	247	The field_name itself begins with an arg_name that is either a number or a
	248	keyword. If it's a number, it refers to a positional argument, and if it's a keyword,
	249	it refers to a named keyword argument. If the numerical arg_names in a format string
	250	are 0, 1, 2, ... in sequence, they can all be omitted (not just some)
	251	and the numbers 0, 1, 2, ... will be automatically inserted in that order.
	252	Because arg_name is not quote-delimited, it is not possible to specify arbitrary
	253	dictionary keys (e.g., the strings ``'10'`` or ``':-]'``) within a format string.
	254	The arg_name can be followed by any number of index or
[2]	255	attribute expressions. An expression of the form ``'.name'`` selects the named
	256	attribute using :func:`getattr`, while an expression of the form ``'[index]'``
	257	does an index lookup using :func:`__getitem__`.
	258
[391]	259	.. versionchanged:: 2.7
	260	The positional argument specifiers can be omitted, so ``'{} {}'`` is
	261	equivalent to ``'{0} {1}'``.
	262
[2]	263	Some simple format string examples::
	264
	265	"First, thou shalt count to {0}" # References first positional argument
[391]	266	"Bring me a {}" # Implicitly references the first positional argument
	267	"From {} to {}" # Same as "From {0} to {1}"
[2]	268	"My quest is {name}" # References keyword argument 'name'
	269	"Weight in tons {0.weight}" # 'weight' attribute of first positional arg
	270	"Units destroyed: {players[0]}" # First element of keyword argument 'players'.
	271
	272	The conversion field causes a type coercion before formatting. Normally, the
	273	job of formatting a value is done by the :meth:`__format__` method of the value
	274	itself. However, in some cases it is desirable to force a type to be formatted
	275	as a string, overriding its own definition of formatting. By converting the
	276	value to a string before calling :meth:`__format__`, the normal formatting logic
	277	is bypassed.
	278
	279	Two conversion flags are currently supported: ``'!s'`` which calls :func:`str`
	280	on the value, and ``'!r'`` which calls :func:`repr`.
	281
	282	Some examples::
	283
	284	"Harold's a clever {0!s}" # Calls str() on the argument first
	285	"Bring out the holy {name!r}" # Calls repr() on the argument first
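The effect of a conversion on the format_spec can be sketched with :class:`datetime.date`, whose own :meth:`__format__` interprets the spec as :func:`strftime` codes. The date is arbitrary:

```python
import datetime

day = datetime.date(2010, 7, 4)

# No conversion: the spec goes to date.__format__ and is read as strftime codes.
by_type = '{0:%d/%m/%Y}'.format(day)

# With !s the value becomes a string first, so the spec is a string spec
# (right-align in a field of width 12).
as_text = '{0!s:>12}'.format(day)

# !r inserts the repr() of the value.
as_repr = '{0!r}'.format('spam')
```

The results are ``'04/07/2010'``, ``'  2010-07-04'`` and ``"'spam'"`` respectively.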
	286
	287	The format_spec field contains a specification of how the value should be
	288	presented, including such details as field width, alignment, padding, decimal
	289	precision and so on. Each value type can define its own "formatting
	290	mini-language" or interpretation of the format_spec.
	291
	292	Most built-in types support a common formatting mini-language, which is
	293	described in the next section.
	294
	295	A format_spec field can also include nested replacement fields within it.
	296	These nested replacement fields can contain only a field name; conversion flags
	297	and format specifications are not allowed. The replacement fields within the
	298	format_spec are substituted before the format_spec string is interpreted.
	299	This allows the formatting of a value to be dynamically specified.
	300
[391]	301	See the :ref:`formatexamples` section for some examples.
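As a small sketch of nested fields, the hypothetical helper ``render`` below takes the width and precision as arguments instead of hard-coding them in the format string:

```python
def render(value, width, precision):
    """Right-align value using a width and precision supplied at call time."""
    # {1} and {2} are substituted first, giving e.g. '>8.2f' as the format_spec.
    return '{0:>{1}.{2}f}'.format(value, width, precision)

# Nested fields can also be named.
centered = '{0:{fill}{align}{width}}'.format('x', fill='*', align='^', width=5)
```

For example, ``render(3.14159, 8, 2)`` gives ``'    3.14'`` and ``centered`` is ``'**x**'``.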
[2]	302
	303
	304	.. _formatspec:
	305
	306	Format Specification Mini-Language
	307	^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
	308
	309	"Format specifications" are used within replacement fields contained within a
	310	format string to define how individual values are presented (see
[391]	311	:ref:`formatstrings`). They can also be passed directly to the built-in
[2]	312	:func:`format` function. Each formattable type may define how the format
	313	specification is to be interpreted.
	314
	315	Most built-in types implement the following options for format specifications,
	316	although some of the formatting options are only supported by the numeric types.
	317
	318	A general convention is that an empty format string (``""``) produces
	319	the same result as if you had called :func:`str` on the value. A
	320	non-empty format string typically modifies the result.
	321
	322	The general form of a standard format specifier is:
	323
	324	.. productionlist:: sf
[391]	325	format_spec: [[`fill`]`align`][`sign`][#][0][`width`][,][.`precision`][`type`]
	326	fill: <any character>
[2]	327	align: "<" | ">" | "=" | "^"
	328	sign: "+" | "-" | " "
	329	width: `integer`
	330	precision: `integer`
	331	type: "b" | "c" | "d" | "e" | "E" | "f" | "F" | "g" | "G" | "n" | "o" | "s" | "x" | "X" | "%"
	332
[391]	333	If a valid align value is specified, it can be preceded by a fill
	334	character that can be any character and defaults to a space if omitted.
	335	Note that it is not possible to use ``{`` and ``}`` as fill char while
	336	using the :meth:`str.format` method; this limitation however doesn't
	337	affect the :func:`format` function.
[2]	338
	339	The meaning of the various alignment options is as follows:
	340
	341	+---------+----------------------------------------------------------+
	342	| Option  | Meaning                                                  |
	343	+=========+==========================================================+
	344	| ``'<'`` | Forces the field to be left-aligned within the available |
[391]	345	|         | space (this is the default for most objects).            |
[2]	346	+---------+----------------------------------------------------------+
	347	| ``'>'`` | Forces the field to be right-aligned within the          |
[391]	348	|         | available space (this is the default for numbers).       |
[2]	349	+---------+----------------------------------------------------------+
	350	| ``'='`` | Forces the padding to be placed after the sign (if any)  |
	351	|         | but before the digits. This is used for printing fields  |
	352	|         | in the form '+000000120'. This alignment option is only  |
	353	|         | valid for numeric types.                                 |
	354	+---------+----------------------------------------------------------+
	355	| ``'^'`` | Forces the field to be centered within the available     |
	356	|         | space.                                                   |
	357	+---------+----------------------------------------------------------+
	358
	359	Note that unless a minimum field width is defined, the field width will always
	360	be the same size as the data to fill it, so that the alignment option has no
	361	meaning in this case.
	362
	363	The sign option is only valid for number types, and can be one of the
	364	following:
	365
	366	+---------+----------------------------------------------------------+
	367	| Option  | Meaning                                                  |
	368	+=========+==========================================================+
	369	| ``'+'`` | indicates that a sign should be used for both            |
	370	|         | positive as well as negative numbers.                    |
	371	+---------+----------------------------------------------------------+
	372	| ``'-'`` | indicates that a sign should be used only for negative   |
	373	|         | numbers (this is the default behavior).                  |
	374	+---------+----------------------------------------------------------+
	375	| space   | indicates that a leading space should be used on         |
	376	|         | positive numbers, and a minus sign on negative numbers.  |
	377	+---------+----------------------------------------------------------+
	378
	379	The ``'#'`` option is only valid for integers, and only for binary, octal, or
	380	hexadecimal output. If present, it specifies that the output will be prefixed
	381	by ``'0b'``, ``'0o'``, or ``'0x'``, respectively.
	382
[391]	383	The ``','`` option signals the use of a comma for a thousands separator.
	384	For a locale-aware separator, use the ``'n'`` integer presentation type
	385	instead.
	386
	387	.. versionchanged:: 2.7
	388	Added the ``','`` option (see also :pep:`378`).
	389
[2]	390	width is a decimal integer defining the minimum field width. If not
	391	specified, then the field width will be determined by the content.
	392
[391]	393	Preceding the width field by a zero (``'0'``) character enables
	394	sign-aware zero-padding for numeric types. This is equivalent to a fill
	395	character of ``'0'`` with an alignment type of ``'='``.
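The equivalence between a leading ``'0'`` and an explicit ``'0'`` fill with ``'='`` alignment can be checked with a short sketch (values chosen arbitrarily):

```python
# A leading zero before the width pads after the sign ...
padded = format(-42, '06d')

# ... which is the same as spelling out fill '0' with '=' alignment.
explicit = format(-42, '0=6d')

# The same option applies to floats; the width counts the decimal point too.
padded_float = format(3.5, '08.2f')
```

Both integer results are ``'-00042'``, and ``padded_float`` is ``'00003.50'``.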
[2]	396
	397	The precision is a decimal number indicating how many digits should be
	398	displayed after the decimal point for a floating point value formatted with
	399	``'f'`` and ``'F'``, or before and after the decimal point for a floating point
	400	value formatted with ``'g'`` or ``'G'``. For non-number types the field
	401	indicates the maximum field size - in other words, how many characters will be
	402	used from the field content. The precision is not allowed for integer values.
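The different meanings of the precision can be sketched as follows; the values are arbitrary:

```python
fixed = format(3.14159, '.2f')      # two digits after the decimal point
general = format(3.14159, '.3g')    # three significant digits in total
clipped = format('abcdef', '.3')    # at most three characters of the string

# A precision combined with an integer presentation type is rejected.
try:
    format(42, '.2d')
    int_error = None
except ValueError as exc:
    int_error = exc
```

Here ``fixed`` and ``general`` are both ``'3.14'``, ``clipped`` is ``'abc'``, and ``int_error`` holds the :exc:`ValueError` that was raised.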
	403
	404	Finally, the type determines how the data should be presented.
	405
	406	The available string presentation types are:
	407
	408	+---------+----------------------------------------------------------+
	409	| Type    | Meaning                                                  |
	410	+=========+==========================================================+
	411	| ``'s'`` | String format. This is the default type for strings and  |
	412	|         | may be omitted.                                          |
	413	+---------+----------------------------------------------------------+
	414	| None    | The same as ``'s'``.                                     |
	415	+---------+----------------------------------------------------------+
	416
	417	The available integer presentation types are:
	418
	419	+---------+----------------------------------------------------------+
	420	| Type    | Meaning                                                  |
	421	+=========+==========================================================+
	422	| ``'b'`` | Binary format. Outputs the number in base 2.             |
	423	+---------+----------------------------------------------------------+
	424	| ``'c'`` | Character. Converts the integer to the corresponding     |
	425	|         | unicode character before printing.                       |
	426	+---------+----------------------------------------------------------+
	427	| ``'d'`` | Decimal Integer. Outputs the number in base 10.          |
	428	+---------+----------------------------------------------------------+
	429	| ``'o'`` | Octal format. Outputs the number in base 8.              |
	430	+---------+----------------------------------------------------------+
	431	| ``'x'`` | Hex format. Outputs the number in base 16, using lower-  |
	432	|         | case letters for the digits above 9.                     |
	433	+---------+----------------------------------------------------------+
	434	| ``'X'`` | Hex format. Outputs the number in base 16, using upper-  |
	435	|         | case letters for the digits above 9.                     |
	436	+---------+----------------------------------------------------------+
	437	| ``'n'`` | Number. This is the same as ``'d'``, except that it uses |
	438	|         | the current locale setting to insert the appropriate     |
	439	|         | number separator characters.                             |
	440	+---------+----------------------------------------------------------+
	441	| None    | The same as ``'d'``.                                     |
	442	+---------+----------------------------------------------------------+
	443
	444	In addition to the above presentation types, integers can be formatted
	445	with the floating point presentation types listed below (except
	446	``'n'`` and None). When doing so, :func:`float` is used to convert the
	447	integer to a floating point number before formatting.
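A brief sketch of the integer presentation types, including an integer formatted with a floating point type (values arbitrary):

```python
number = 255

as_binary = format(number, 'b')
as_octal = format(number, 'o')
as_hex_lower = format(number, 'x')
as_hex_upper = format(number, 'X')
as_char = format(65, 'c')

# An integer given a floating point type is converted with float() first.
as_fixed = format(3, '.2f')
```

The results are ``'11111111'``, ``'377'``, ``'ff'``, ``'FF'``, ``'A'`` and ``'3.00'``.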
	448
	449	The available presentation types for floating point and decimal values are:
	450
	451	+---------+----------------------------------------------------------+
	452	| Type    | Meaning                                                  |
	453	+=========+==========================================================+
	454	| ``'e'`` | Exponent notation. Prints the number in scientific       |
	455	|         | notation using the letter 'e' to indicate the exponent.  |
[391]	456	|         | The default precision is ``6``.                          |
[2]	457	+---------+----------------------------------------------------------+
	458	| ``'E'`` | Exponent notation. Same as ``'e'`` except it uses an     |
	459	|         | upper case 'E' as the separator character.               |
	460	+---------+----------------------------------------------------------+
	461	| ``'f'`` | Fixed point. Displays the number as a fixed-point        |
[391]	462	|         | number. The default precision is ``6``.                  |
[2]	463	+---------+----------------------------------------------------------+
	464	| ``'F'`` | Fixed point. Same as ``'f'``.                            |
	465	+---------+----------------------------------------------------------+
	466	| ``'g'`` | General format. For a given precision ``p >= 1``,        |
	467	|         | this rounds the number to ``p`` significant digits and   |
	468	|         | then formats the result in either fixed-point format     |
	469	|         | or in scientific notation, depending on its magnitude.   |
	470	|         |                                                          |
	471	|         | The precise rules are as follows: suppose that the       |
	472	|         | result formatted with presentation type ``'e'`` and      |
	473	|         | precision ``p-1`` would have exponent ``exp``. Then      |
	474	|         | if ``-4 <= exp < p``, the number is formatted            |
	475	|         | with presentation type ``'f'`` and precision             |
	476	|         | ``p-1-exp``. Otherwise, the number is formatted          |
	477	|         | with presentation type ``'e'`` and precision ``p-1``.    |
	478	|         | In both cases insignificant trailing zeros are removed   |
	479	|         | from the significand, and the decimal point is also      |
	480	|         | removed if there are no remaining digits following it.   |
	481	|         |                                                          |
[391]	482	|         | Positive and negative infinity, positive and negative    |
[2]	483	|         | zero, and nans, are formatted as ``inf``, ``-inf``,      |
	484	|         | ``0``, ``-0`` and ``nan`` respectively, regardless of    |
	485	|         | the precision.                                           |
	486	|         |                                                          |
	487	|         | A precision of ``0`` is treated as equivalent to a       |
[391]	488	|         | precision of ``1``. The default precision is ``6``.      |
[2]	489	+---------+----------------------------------------------------------+
	490	| ``'G'`` | General format. Same as ``'g'`` except switches to       |
	491	|         | ``'E'`` if the number gets too large. The                |
	492	|         | representations of infinity and NaN are uppercased, too. |
	493	+---------+----------------------------------------------------------+
	494	| ``'n'`` | Number. This is the same as ``'g'``, except that it uses |
	495	|         | the current locale setting to insert the appropriate     |
	496	|         | number separator characters.                             |
	497	+---------+----------------------------------------------------------+
	498	| ``'%'`` | Percentage. Multiplies the number by 100 and displays    |
	499	|         | in fixed (``'f'``) format, followed by a percent sign.   |
	500	+---------+----------------------------------------------------------+
	501	| None    | The same as ``'g'``.                                     |
	502	+---------+----------------------------------------------------------+
	503
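The rules given for ``'g'`` in the table above can be re-derived from the ``'e'`` and ``'f'`` types. The function below is an illustrative sketch only (``general_format`` is a hypothetical name, and it handles finite values but not infinities or NaN); real code should simply call ``format(value, 'g')``:

```python
def general_format(value, precision=6):
    """Rebuild the 'g' presentation type from the 'e' and 'f' types."""
    p = max(precision, 1)                      # a precision of 0 is treated as 1
    sci = format(value, '.{0}e'.format(p - 1))
    exp = int(sci.split('e')[1])               # exponent of the rounded result
    if -4 <= exp < p:
        mantissa = format(value, '.{0}f'.format(p - 1 - exp))
        suffix = ''
    else:
        mantissa, exponent = sci.split('e')
        suffix = 'e' + exponent
    if '.' in mantissa:
        # Remove insignificant trailing zeros, then a dangling decimal point.
        mantissa = mantissa.rstrip('0').rstrip('.')
    return mantissa + suffix
```

For instance, ``general_format(1234567.0, 3)`` gives ``'1.23e+06'`` because the exponent 6 is not below the precision 3, while ``general_format(0.0001234, 3)`` gives ``'0.000123'`` because the exponent -4 is still inside the fixed-point range.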
	504
[391]	505
	506	.. _formatexamples:
	507
	508	Format examples
	509	^^^^^^^^^^^^^^^
	510
	511	This section contains examples of the new format syntax and comparison with
	512	the old ``%``-formatting.
	513
	514	In most of the cases the syntax is similar to the old ``%``-formatting, with the
	515	addition of the ``{}`` and with ``:`` used instead of ``%``.
	516	For example, ``'%03.2f'`` can be translated to ``'{:03.2f}'``.
	517
	518	The new format syntax also supports new and different options, shown in the
	519	following examples.
	520
	521	Accessing arguments by position::
	522
	523	>>> '{0}, {1}, {2}'.format('a', 'b', 'c')
	524	'a, b, c'
	525	>>> '{}, {}, {}'.format('a', 'b', 'c') # 2.7+ only
	526	'a, b, c'
	527	>>> '{2}, {1}, {0}'.format('a', 'b', 'c')
	528	'c, b, a'
	529	>>> '{2}, {1}, {0}'.format(*'abc') # unpacking argument sequence
	530	'c, b, a'
	531	>>> '{0}{1}{0}'.format('abra', 'cad') # arguments' indices can be repeated
	532	'abracadabra'
	533
	534	Accessing arguments by name::
	535
	536	>>> 'Coordinates: {latitude}, {longitude}'.format(latitude='37.24N', longitude='-115.81W')
	537	'Coordinates: 37.24N, -115.81W'
	538	>>> coord = {'latitude': '37.24N', 'longitude': '-115.81W'}
	539	>>> 'Coordinates: {latitude}, {longitude}'.format(**coord)
	540	'Coordinates: 37.24N, -115.81W'
	541
	542	Accessing arguments' attributes::
	543
	544	>>> c = 3-5j
	545	>>> ('The complex number {0} is formed from the real part {0.real} '
	546	... 'and the imaginary part {0.imag}.').format(c)
	547	'The complex number (3-5j) is formed from the real part 3.0 and the imaginary part -5.0.'
	548	>>> class Point(object):
	549	...     def __init__(self, x, y):
	550	...         self.x, self.y = x, y
	551	...     def __str__(self):
	552	...         return 'Point({self.x}, {self.y})'.format(self=self)
	553	...
	554	>>> str(Point(4, 2))
	555	'Point(4, 2)'
	556
	557
	558	Accessing arguments' items::
	559
	560	>>> coord = (3, 5)
	561	>>> 'X: {0[0]}; Y: {0[1]}'.format(coord)
	562	'X: 3; Y: 5'
	563
	564	Replacing ``%s`` and ``%r``::
	565
	566	>>> "repr() shows quotes: {!r}; str() doesn't: {!s}".format('test1', 'test2')
	567	"repr() shows quotes: 'test1'; str() doesn't: test2"
	568
	569	Aligning the text and specifying a width::
	570
	571	>>> '{:<30}'.format('left aligned')
	572	'left aligned                  '
	573	>>> '{:>30}'.format('right aligned')
	574	'                 right aligned'
	575	>>> '{:^30}'.format('centered')
	576	'           centered           '
	577	>>> '{:*^30}'.format('centered')  # use '*' as a fill char
	578	'***********centered***********'
	579
	580	Replacing ``%+f``, ``%-f``, and ``% f`` and specifying a sign::
	581
	582	>>> '{:+f}; {:+f}'.format(3.14, -3.14) # show it always
	583	'+3.140000; -3.140000'
	584	>>> '{: f}; {: f}'.format(3.14, -3.14) # show a space for positive numbers
	585	' 3.140000; -3.140000'
	586	>>> '{:-f}; {:-f}'.format(3.14, -3.14) # show only the minus -- same as '{:f}; {:f}'
	587	'3.140000; -3.140000'
	588
	589	Replacing ``%x`` and ``%o`` and converting the value to different bases::
	590
	591	>>> # format also supports binary numbers
	592	>>> "int: {0:d}; hex: {0:x}; oct: {0:o}; bin: {0:b}".format(42)
	593	'int: 42; hex: 2a; oct: 52; bin: 101010'
	594	>>> # with 0x, 0o, or 0b as prefix:
	595	>>> "int: {0:d}; hex: {0:#x}; oct: {0:#o}; bin: {0:#b}".format(42)
	596	'int: 42; hex: 0x2a; oct: 0o52; bin: 0b101010'
	597
	598	Using the comma as a thousands separator::
	599
	600	>>> '{:,}'.format(1234567890)
	601	'1,234,567,890'
	602
	603	Expressing a percentage::
	604
	605	>>> points = 19.5
	606	>>> total = 22
	607	>>> 'Correct answers: {:.2%}'.format(points/total)
	608	'Correct answers: 88.64%'
	609
	610	Using type-specific formatting::
	611
	612	>>> import datetime
	613	>>> d = datetime.datetime(2010, 7, 4, 12, 15, 58)
	614	>>> '{:%Y-%m-%d %H:%M:%S}'.format(d)
	615	'2010-07-04 12:15:58'
	616
	617	Nesting arguments and more complex examples::
	618
	619	>>> for align, text in zip('<^>', ['left', 'center', 'right']):
	620	...     '{0:{fill}{align}16}'.format(text, fill=align, align=align)
	621	...
	622	'left<<<<<<<<<<<<'
	623	'^^^^^center^^^^^'
	624	'>>>>>>>>>>>right'
	625	>>>
	626	>>> octets = [192, 168, 0, 1]
	627	>>> '{:02X}{:02X}{:02X}{:02X}'.format(*octets)
	628	'C0A80001'
	629	>>> int(_, 16)
	630	3232235521
	631	>>>
	632	>>> width = 5
	633	>>> for num in range(5,12):
	634	...     for base in 'dXob':
	635	...         print '{0:{width}{base}}'.format(num, base=base, width=width),
	636	...     print
	637	...
	638	    5     5     5   101
	639	    6     6     6   110
	640	    7     7     7   111
	641	    8     8    10  1000
	642	    9     9    11  1001
	643	   10     A    12  1010
	644	   11     B    13  1011
	645
	646
	647
[2]	648	Template strings
	649	----------------
	650
[391]	651	.. versionadded:: 2.4
	652
[2]	653	Templates provide simpler string substitutions as described in :pep:`292`.
	654	Instead of the normal ``%``\ -based substitutions, Templates support ``$``\
	655	-based substitutions, using the following rules:
	656
	657	* ``$$`` is an escape; it is replaced with a single ``$``.
	658
	659	* ``$identifier`` names a substitution placeholder matching a mapping key of
	660	``"identifier"``. By default, ``"identifier"`` must spell a Python
	661	identifier. The first non-identifier character after the ``$`` character
	662	terminates this placeholder specification.
	663
	664	* ``${identifier}`` is equivalent to ``$identifier``. It is required when valid
	665	identifier characters follow the placeholder but are not part of the
	666	placeholder, such as ``"${noun}ification"``.
	667
	668	Any other appearance of ``$`` in the string will result in a :exc:`ValueError`
	669	being raised.
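The three rules can be seen together in a short sketch; the template text and values are arbitrary:

```python
from string import Template

# '$$' is the escape, '${noun}' needs braces because 'ification' follows it,
# and '$thing' ends at the first non-identifier character (the full stop).
template = Template('Pay $$5 for the ${noun}ification of $thing.')
result = template.substitute(noun='class', thing='data')
```

Here ``result`` is ``'Pay $5 for the classification of data.'``.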
	670
	671	The :mod:`string` module provides a :class:`Template` class that implements
	672	these rules. The methods of :class:`Template` are:
	673
	674
	675	.. class:: Template(template)
	676
	677	The constructor takes a single argument which is the template string.
	678
	679
	680	.. method:: substitute(mapping[, **kws])
	681
	682	Performs the template substitution, returning a new string. mapping is
	683	any dictionary-like object with keys that match the placeholders in the
	684	template. Alternatively, you can provide keyword arguments, where the
	685	keywords are the placeholders. When both mapping and kws are given
	686	and there are duplicates, the placeholders from kws take precedence.
	687
	688
	689	.. method:: safe_substitute(mapping[, **kws])
	690
	691	Like :meth:`substitute`, except that if placeholders are missing from
	692	mapping and kws, instead of raising a :exc:`KeyError` exception, the
	693	original placeholder will appear in the resulting string intact. Also,
	694	unlike with :meth:`substitute`, any other appearances of the ``$`` will
	695	simply return ``$`` instead of raising :exc:`ValueError`.
	696
	697	While other exceptions may still occur, this method is called "safe"
	698	because it always tries to return a usable string instead of
	699	raising an exception. In another sense, :meth:`safe_substitute` may be
	700	anything other than safe, since it will silently ignore malformed
	701	templates containing dangling delimiters, unmatched braces, or
	702	placeholders that are not valid Python identifiers.
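The trade-off described above can be sketched with a deliberately malformed template (the text is arbitrary):

```python
from string import Template

# '$100' is not a valid placeholder and '${oops' has no closing brace.
malformed = Template('Give $who $100 and ${oops')

# safe_substitute() fills in what it can and leaves the rest untouched.
lenient = malformed.safe_substitute(who='tim')

# substitute() refuses the same template.
try:
    malformed.substitute(who='tim')
    strict_error = None
except ValueError as exc:
    strict_error = exc
```

``lenient`` is ``'Give tim $100 and ${oops'``, so the malformed parts pass through silently, while ``strict_error`` holds the :exc:`ValueError` raised by :meth:`substitute`.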
	703
[391]	704	:class:`Template` instances also provide one public data attribute:
[2]	705
[391]	706	.. attribute:: template
[2]	707
[391]	708	This is the object passed to the constructor's template argument. In
	709	general, you shouldn't change it, but read-only access is not enforced.
[2]	710
[391]	711	Here is an example of how to use a Template::
[2]	712
	713	>>> from string import Template
	714	>>> s = Template('$who likes $what')
	715	>>> s.substitute(who='tim', what='kung pao')
	716	'tim likes kung pao'
	717	>>> d = dict(who='tim')
	718	>>> Template('Give $who $100').substitute(d)
	719	Traceback (most recent call last):
[391]	720	...
	721	ValueError: Invalid placeholder in string: line 1, col 11
[2]	722	>>> Template('$who likes $what').substitute(d)
	723	Traceback (most recent call last):
[391]	724	...
[2]	725	KeyError: 'what'
	726	>>> Template('$who likes $what').safe_substitute(d)
	727	'tim likes $what'
	728
	729	Advanced usage: you can derive subclasses of :class:`Template` to customize the
	730	placeholder syntax, delimiter character, or the entire regular expression used
	731	to parse template strings. To do this, you can override these class attributes:
	732
	733	* delimiter -- This is the literal string describing a placeholder-introducing
[391]	734	delimiter. The default value is ``$``. Note that this should not be a
	735	regular expression, as the implementation will call :func:`re.escape` on this
	736	string as needed.
[2]	737
	738	* idpattern -- This is the regular expression describing the pattern for
	739	non-braced placeholders (the braces will be added automatically as
	740	appropriate). The default value is the regular expression
	741	``[_a-z][_a-z0-9]*``.
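
For instance, a subclass that only changes the delimiter might look like the sketch below. ``PercentTemplate`` is a hypothetical name chosen for this example::

```python
from string import Template

class PercentTemplate(Template):
    # Use '%' instead of the default '$' to introduce placeholders.
    delimiter = '%'

t = PercentTemplate('%who owes $5 (100%%)')

# '$' is now an ordinary character, and '%%' is the escape for a literal '%'.
out = t.substitute(who='tim')
print(out)  # tim owes $5 (100%)
```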
	742
	743	Alternatively, you can provide the entire regular expression pattern by
	744	overriding the class attribute pattern. If you do this, the value must be a
	745	regular expression object with four named capturing groups. The capturing
	746	groups correspond to the rules given above, along with the invalid placeholder
	747	rule:
	748
	749	* escaped -- This group matches the escape sequence, e.g. ``$$``, in the
	750	default pattern.
	751
	752	* named -- This group matches the unbraced placeholder name; it should not
	753	include the delimiter in the capturing group.
	754
	755	* braced -- This group matches the brace-enclosed placeholder name; it should
	756	not include either the delimiter or braces in the capturing group.
	757
	758	* invalid -- This group matches any other delimiter pattern (usually a single
	759	delimiter), and it should appear last in the regular expression.
	760
	761
	762	String functions
	763	----------------
	764
	765	The following functions are available to operate on string and Unicode objects.
	766	They are not available as string methods.
	767
	768
	769	.. function:: capwords(s[, sep])
	770
	771	Split the argument into words using :meth:`str.split`, capitalize each word
	772	using :meth:`str.capitalize`, and join the capitalized words using
	773	:meth:`str.join`. If the optional second argument sep is absent
	774	or ``None``, runs of whitespace characters are replaced by a single space
	775	and leading and trailing whitespace are removed, otherwise sep is used to
	776	split and join the words.
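
A short sketch of both modes, with and without sep. The input strings are arbitrary::

```python
import string

# Without sep: whitespace runs collapse and the ends are trimmed.
a = string.capwords('  hello   wORLD  ')
print(a)  # Hello World

# With sep: the same separator is used to split and to rejoin.
b = string.capwords('alpha-beta-gamma', '-')
print(b)  # Alpha-Beta-Gamma
```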
	777
	778
	779	.. function:: maketrans(from, to)
	780
	781	Return a translation table suitable for passing to :func:`translate`, that will
	782	map each character in from into the character at the same position in to;
	783	from and to must have the same length.
	784
	785	.. note::
	786
	787	Don't use strings derived from :const:`lowercase` and :const:`uppercase` as
	788	arguments; in some locales, these don't have the same length. For case
	789	conversions, always use :meth:`str.lower` and :meth:`str.upper`.
	790
	791
	792	Deprecated string functions
	793	---------------------------
	794
	795	The following functions are also defined as methods of string and
	796	Unicode objects; see section :ref:`string-methods` for more information on
	797	those. You should consider these functions as deprecated, although they will
[391]	798	not be removed until Python 3. The functions defined in this module are:
[2]	799
	800
	801	.. function:: atof(s)
	802
	803	.. deprecated:: 2.0
	804	Use the :func:`float` built-in function.
	805
	806	.. index:: builtin: float
	807
	808	Convert a string to a floating point number. The string must have the standard
	809	syntax for a floating point literal in Python, optionally preceded by a sign
	810	(``+`` or ``-``). Note that this behaves identically to the built-in function
	811	:func:`float` when passed a string.
	812
	813	.. note::
	814
	815	.. index::
	816	single: NaN
	817	single: Infinity
	818
	819	When passing in a string, values for NaN and Infinity may be returned, depending
	820	on the underlying C library. The specific set of strings accepted which cause
	821	these values to be returned depends entirely on the C library and is known to
	822	vary.
	823
	824
	825	.. function:: atoi(s[, base])
	826
	827	.. deprecated:: 2.0
	828	Use the :func:`int` built-in function.
	829
	830	.. index:: builtin: eval
	831
	832	Convert string s to an integer in the given base. The string must consist
	833	of one or more digits, optionally preceded by a sign (``+`` or ``-``). The
	834	base defaults to 10. If it is 0, a default base is chosen depending on the
	835	leading characters of the string (after stripping the sign): ``0x`` or ``0X``
	836	means 16, ``0`` means 8, anything else means 10. If base is 16, a leading
	837	``0x`` or ``0X`` is always accepted, though not required. This behaves
	838	identically to the built-in function :func:`int` when passed a string. (Also
	839	note: for a more flexible interpretation of numeric literals, use the built-in
	840	function :func:`eval`.)
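
Because the built-in :func:`int` is the recommended replacement, the sketch below uses it rather than the deprecated function to illustrate the base handling. It assumes :func:`int` accepts the same inputs for these cases::

```python
# Uses the built-in int(), as the deprecation note recommends.
plain = int('+17')          # base defaults to 10
hex_bare = int('1f', 16)    # explicit base 16, no prefix
hex_pref = int('0x1f', 16)  # the '0x' prefix is accepted with base 16
guessed = int('0x1f', 0)    # base 0: the prefix selects base 16

print(plain, hex_bare, hex_pref, guessed)
```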
	841
	842
	843	.. function:: atol(s[, base])
	844
	845	.. deprecated:: 2.0
	846	Use the :func:`long` built-in function.
	847
	848	.. index:: builtin: long
	849
	850	Convert string s to a long integer in the given base. The string must
	851	consist of one or more digits, optionally preceded by a sign (``+`` or ``-``).
	852	The base argument has the same meaning as for :func:`atoi`. A trailing ``l``
	853	or ``L`` is not allowed, except if the base is 0. Note that when invoked
	854	without base or with base set to 10, this behaves identically to the built-in
	855	function :func:`long` when passed a string.
	856
	857
	858	.. function:: capitalize(word)
	859
	860	Return a copy of word with only its first character capitalized.
	861
	862
	863	.. function:: expandtabs(s[, tabsize])
	864
	865	Expand tabs in a string replacing them by one or more spaces, depending on the
	866	current column and the given tab size. The column number is reset to zero after
	867	each newline occurring in the string. This doesn't understand other non-printing
	868	characters or escape sequences. The tab size defaults to 8.
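
The column arithmetic can be seen in a small sketch. It uses the equivalent string method, on the assumption that it behaves like the module function::

```python
# Uses str.expandtabs(), the method counterpart of this function.
default = 'a\tb'.expandtabs()        # pads to column 8
narrow = 'ab\tc'.expandtabs(4)       # pads to column 4
reset = 'ab\ncd\te'.expandtabs(4)    # column count restarts after '\n'

print(repr(default), repr(narrow), repr(reset))
```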
	869
	870
	871	.. function:: find(s, sub[, start[, end]])
	872
	873	Return the lowest index in s where the substring sub is found such that
	874	sub is wholly contained in ``s[start:end]``. Return ``-1`` on failure.
	875	Defaults for start and end and interpretation of negative values are the same
	876	as for slices.
	877
	878
	879	.. function:: rfind(s, sub[, start[, end]])
	880
	881	Like :func:`find` but find the highest index.
	882
	883
	884	.. function:: index(s, sub[, start[, end]])
	885
	886	Like :func:`find` but raise :exc:`ValueError` when the substring is not found.
	887
	888
	889	.. function:: rindex(s, sub[, start[, end]])
	890
	891	Like :func:`rfind` but raise :exc:`ValueError` when the substring is not found.
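
The four search functions differ only in search direction and in how failure is reported. A sketch using the equivalent string methods (assumed to match the module functions)::

```python
# Uses the str methods that correspond to find, rfind, index and rindex.
s = 'hello world'

first = s.find('o')         # lowest index
last = s.rfind('o')         # highest index
after = s.find('o', 5)      # search restricted to s[5:]
missing = s.find('z')       # -1 signals "not found"

try:
    s.index('z')            # same search, but failure raises
    raised = False
except ValueError:
    raised = True

print(first, last, after, missing, raised)
```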
	892
	893
	894	.. function:: count(s, sub[, start[, end]])
	895
	896	Return the number of (non-overlapping) occurrences of substring sub in string
	897	``s[start:end]``. Defaults for start and end and interpretation of negative
	898	values are the same as for slices.
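
"Non-overlapping" matters when the substring can overlap itself, as this sketch with the equivalent string method suggests::

```python
# Uses str.count(), the method counterpart of this function.
overlapping = 'aaaa'.count('aa')   # matches at 0 and 2 only
whole = 'banana'.count('a')
sliced = 'banana'.count('a', 2)    # counts within 'nana'

print(overlapping, whole, sliced)
```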
	899
	900
	901	.. function:: lower(s)
	902
	903	Return a copy of s, but with upper case letters converted to lower case.
	904
	905
	906	.. function:: split(s[, sep[, maxsplit]])
	907
	908	Return a list of the words of the string s. If the optional second argument
	909	sep is absent or ``None``, the words are separated by arbitrary strings of
[391]	910	whitespace characters (space, tab, newline, return, formfeed). If the second
[2]	911	argument sep is present and not ``None``, it specifies a string to be used as
	912	the word separator. The returned list will then have one more item than the
[391]	913	number of non-overlapping occurrences of the separator in the string.
	914	If maxsplit is given, at most maxsplit number of splits occur, and the
	915	remainder of the string is returned as the final element of the list (thus,
	916	the list will have at most ``maxsplit+1`` elements). If maxsplit is not
	917	specified or ``-1``, then there is no limit on the number of splits (all
	918	possible splits are made).
[2]	919
	920	The behavior of split on an empty string depends on the value of sep. If sep
	921	is not specified, or specified as ``None``, the result will be an empty list.
	922	If sep is specified as any string, the result will be a list containing one
	923	element which is an empty string.
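
The cases above can be compared side by side in a sketch that uses the equivalent string method. The sample strings are arbitrary::

```python
# Uses str.split(), the method counterpart of this function.
words = 'a  b \t c\n'.split()        # whitespace runs act as one separator
fields = 'a,b,,c'.split(',')         # explicit sep keeps empty fields
limited = 'a,b,c,d'.split(',', 2)    # at most 2 splits, 3 elements

empty_default = ''.split()           # no sep: empty list
empty_sep = ''.split(',')            # explicit sep: one empty string

print(words, fields, limited, empty_default, empty_sep)
```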
	924
	925
	926	.. function:: rsplit(s[, sep[, maxsplit]])
	927
	928	Return a list of the words of the string s, scanning s from the end. To all
	929	intents and purposes, the resulting list of words is the same as returned by
	930	:func:`split`, except when the optional third argument maxsplit is explicitly
[391]	931	specified and nonzero. If maxsplit is given, at most maxsplit number of
[2]	932	splits -- the rightmost ones -- occur, and the remainder of the string is
	933	returned as the first element of the list (thus, the list will have at most
	934	``maxsplit+1`` elements).
	935
	936	.. versionadded:: 2.4
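
The difference from :func:`split` only shows up when maxsplit limits the work, as in this sketch using the equivalent string methods::

```python
# Uses str.split() and str.rsplit(), the method counterparts.
s = 'a,b,c'

from_left = s.split(',', 1)    # remainder ends up last
from_right = s.rsplit(',', 1)  # remainder ends up first

print(from_left, from_right)
```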
	937
	938
	939	.. function:: splitfields(s[, sep[, maxsplit]])
	940
	941	This function behaves identically to :func:`split`. (In the past, :func:`split`
	942	was only used with one argument, while :func:`splitfields` was only used with
	943	two arguments.)
	944
	945
	946	.. function:: join(words[, sep])
	947
	948	Concatenate a list or tuple of words with intervening occurrences of sep.
	949	The default value for sep is a single space character. It is always true that
	950	``string.join(string.split(s, sep), sep)`` equals s.
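
A sketch of that round trip, written with the equivalent string methods. Note that the identity relies on the same explicit sep being used for both steps::

```python
# Uses str.split() and str.join(), the method counterparts.
s = 'a,b,,c'
sep = ','

round_trip = sep.join(s.split(sep))
print(round_trip == s)  # True

# Splitting on whitespace (no sep) collapses runs, so the text can change.
collapsed = ' '.join('a  b'.split())
print(repr(collapsed))
```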
	951
	952
	953	.. function:: joinfields(words[, sep])
	954
	955	This function behaves identically to :func:`join`. (In the past, :func:`join`
	956	was only used with one argument, while :func:`joinfields` was only used with two
	957	arguments.) Note that there is no :meth:`joinfields` method on string objects;
	958	use the :meth:`join` method instead.
	959
	960
	961	.. function:: lstrip(s[, chars])
	962
	963	Return a copy of the string with leading characters removed. If chars is
	964	omitted or ``None``, whitespace characters are removed. If given and not
	965	``None``, chars must be a string; the characters in the string will be
	966	stripped from the beginning of the string s.
	967
	968	.. versionchanged:: 2.2.3
	969	The chars parameter was added. The chars parameter cannot be passed in
	970	earlier 2.2 versions.
	971
	972
	973	.. function:: rstrip(s[, chars])
	974
	975	Return a copy of the string with trailing characters removed. If chars is
	976	omitted or ``None``, whitespace characters are removed. If given and not
	977	``None``, chars must be a string; the characters in the string will be
	978	stripped from the end of the string s.
	979
	980	.. versionchanged:: 2.2.3
	981	The chars parameter was added. The chars parameter cannot be passed in
	982	earlier 2.2 versions.
	983
	984
	985	.. function:: strip(s[, chars])
	986
	987	Return a copy of the string with leading and trailing characters removed. If
	988	chars is omitted or ``None``, whitespace characters are removed. If given and
	989	not ``None``, chars must be a string; the characters in the string will be
	990	stripped from both ends of the string s.
	991
	992	.. versionchanged:: 2.2.3
	993	The chars parameter was added. The chars parameter cannot be passed in
	994	earlier 2.2 versions.
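
The three stripping functions can be compared in one sketch, here using the equivalent string methods. Note that chars is treated as a set of characters, not as a prefix or suffix::

```python
# Uses str.strip(), str.lstrip() and str.rstrip(), the method counterparts.
both = '  hi  '.strip()                   # whitespace by default
left = 'xxhixx'.lstrip('x')
right = 'xxhixx'.rstrip('x')
charset = 'abcba-hello-abc'.strip('abc')  # any of 'a', 'b', 'c' is removed

print(repr(both), left, right, charset)
```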
	995
	996
	997	.. function:: swapcase(s)
	998
	999	Return a copy of s, but with lower case letters converted to upper case and
	1000	vice versa.
	1001
	1002
	1003	.. function:: translate(s, table[, deletechars])
	1004
	1005	Delete all characters from s that are in deletechars (if present), and then
	1006	translate the characters using table, which must be a 256-character string
	1007	giving the translation for each character value, indexed by its ordinal. If
	1008	table is ``None``, then only the character deletion step is performed.
	1009
	1010
	1011	.. function:: upper(s)
	1012
	1013	Return a copy of s, but with lower case letters converted to upper case.
	1014
	1015
	1016	.. function:: ljust(s, width[, fillchar])
	1017	rjust(s, width[, fillchar])
	1018	center(s, width[, fillchar])
	1019
	1020	These functions respectively left-justify, right-justify and center a string in
	1021	a field of given width. They return a string that is at least width
	1022	characters wide, created by padding the string s with the character fillchar
	1023	(default is a space) until the given width on the right, left or both sides.
	1024	The string is never truncated.
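
A sketch of the three justifications, using the equivalent string methods with a visible fill character::

```python
# Uses str.ljust(), str.rjust() and str.center(), the method counterparts.
left = 'ab'.ljust(5, '.')
right = 'ab'.rjust(5, '.')
middle = 'ab'.center(6, '.')
too_long = 'abcdef'.ljust(3)   # wider than the field: returned unchanged

print(left, right, middle, too_long)
```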
	1025
	1026
	1027	.. function:: zfill(s, width)
	1028
[391]	1029	Pad a numeric string s on the left with zero digits until the
	1030	given width is reached. Strings starting with a sign are handled
	1031	correctly.
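
For example, with the equivalent string method, the zeros go after a leading sign rather than before it::

```python
# Uses str.zfill(), the method counterpart of this function.
unsigned = '42'.zfill(5)
signed = '-42'.zfill(5)   # the sign stays in front of the padding

print(unsigned, signed)
```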
[2]	1032
	1033
[391]	1034	.. function:: replace(s, old, new[, maxreplace])
[2]	1035
[391]	1036	Return a copy of string s with all occurrences of substring old replaced
[2]	1037	by new. If the optional argument maxreplace is given, the first
	1038	maxreplace occurrences are replaced.
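
A sketch of the effect of maxreplace, using the equivalent string method::

```python
# Uses str.replace(), the method counterpart of this function.
s = 'a-b-c'

every = s.replace('-', '+')       # all occurrences
first_only = s.replace('-', '+', 1)  # only the first occurrence

print(every, first_only)
```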
	1039
