\section{\module{locale} ---
Internationalization services}

\declaremodule{standard}{locale}
\modulesynopsis{Internationalization services.}
\moduleauthor{Martin von L\"owis}{martin@v.loewis.de}
\sectionauthor{Martin von L\"owis}{martin@v.loewis.de}


The \module{locale} module opens access to the \POSIX{} locale
database and functionality. The \POSIX{} locale mechanism allows
programmers to deal with certain cultural issues in an application,
without requiring the programmer to know all the specifics of each
country where the software is executed.

The \module{locale} module is implemented on top of the
\module{_locale}\refbimodindex{_locale} module, which in turn uses an
ANSI C locale implementation if available.

The \module{locale} module defines the following exception and
functions:


\begin{excdesc}{Error}
Exception raised when \function{setlocale()} fails.
\end{excdesc}

\begin{funcdesc}{setlocale}{category\optional{, locale}}
If \var{locale} is specified, it may be a string, a tuple of the
form \code{(\var{language code}, \var{encoding})}, or \code{None}.
If it is a tuple, it is converted to a string using the locale
aliasing engine. If \var{locale} is given and not \code{None},
\function{setlocale()} modifies the locale setting for the
\var{category}. The available categories are listed in the data
description below. The value is the name of a locale. An empty
string specifies the user's default settings. If the modification of
the locale fails, the exception \exception{Error} is raised. If
successful, the new locale setting is returned.

If \var{locale} is omitted or \code{None}, the current setting for
\var{category} is returned.

\function{setlocale()} is not thread safe on most systems.
Applications typically start with a call of

\begin{verbatim}
import locale
locale.setlocale(locale.LC_ALL, '')
\end{verbatim}

This sets the locale for all categories to the user's default
setting (typically specified in the \envvar{LANG} environment
variable). If the locale is not changed thereafter, using
multithreading should not cause problems.

\versionchanged[Added support for tuple values of the \var{locale}
parameter]{2.0}
\end{funcdesc}
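The two uses of \function{setlocale()} described above (changing a
setting and merely querying it) can be combined into a small start-up
helper. The following is a minimal sketch, not part of the module: the
name \code{use_user_locale} is hypothetical, and falling back to the
current setting when the user's locale is unavailable is just one
possible policy.

```python
import locale

def use_user_locale(category=locale.LC_ALL):
    """Try to switch to the user's default locale.

    Hypothetical helper: if the switch fails, the current setting is
    returned unchanged instead of propagating the error.
    """
    try:
        # An empty string selects the user's default settings.
        return locale.setlocale(category, "")
    except locale.Error:
        # The environment names a locale this system does not provide.
        # Calling setlocale() without a locale only queries the setting.
        return locale.setlocale(category)

current = use_user_locale()
```

Because \function{setlocale()} is not thread safe on most systems, a
helper like this is best called once, before any threads are started.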

\begin{funcdesc}{localeconv}{}
Returns the database of the local conventions as a dictionary.
This dictionary has the following strings as keys:

\begin{tableiii}{l|l|p{3in}}{constant}{Category}{Key}{Meaning}
  \lineiii{LC_NUMERIC}{\code{'decimal_point'}}
          {Decimal point character.}
  \lineiii{}{\code{'grouping'}}
          {Sequence of numbers specifying the relative positions at
           which the \code{'thousands_sep'} is expected. If the
           sequence is terminated with \constant{CHAR_MAX}, no further
           grouping is performed. If the sequence terminates with a
           \code{0}, the last group size is repeatedly used.}
  \lineiii{}{\code{'thousands_sep'}}
          {Character used between groups.}\hline
  \lineiii{LC_MONETARY}{\code{'int_curr_symbol'}}
          {International currency symbol.}
  \lineiii{}{\code{'currency_symbol'}}
          {Local currency symbol.}
  \lineiii{}{\code{'p_cs_precedes/n_cs_precedes'}}
          {Whether the currency symbol precedes the value (for
           positive and negative values, respectively).}
  \lineiii{}{\code{'p_sep_by_space/n_sep_by_space'}}
          {Whether the currency symbol is separated from the value
           by a space (for positive and negative values,
           respectively).}
  \lineiii{}{\code{'mon_decimal_point'}}
          {Decimal point used for monetary values.}
  \lineiii{}{\code{'frac_digits'}}
          {Number of fractional digits used in local formatting
           of monetary values.}
  \lineiii{}{\code{'int_frac_digits'}}
          {Number of fractional digits used in international
           formatting of monetary values.}
  \lineiii{}{\code{'mon_thousands_sep'}}
          {Group separator used for monetary values.}
  \lineiii{}{\code{'mon_grouping'}}
          {Equivalent to \code{'grouping'}, used for monetary
           values.}
  \lineiii{}{\code{'positive_sign'}}
          {Symbol used to annotate a positive monetary value.}
  \lineiii{}{\code{'negative_sign'}}
          {Symbol used to annotate a negative monetary value.}
  \lineiii{}{\code{'p_sign_posn/n_sign_posn'}}
          {The position of the sign (for positive and negative
           values, respectively), see below.}
\end{tableiii}

All numeric values can be set to \constant{CHAR_MAX} to indicate that
there is no value specified in this locale.

The possible values for \code{'p_sign_posn'} and
\code{'n_sign_posn'} are given below.

\begin{tableii}{c|l}{code}{Value}{Explanation}
  \lineii{0}{Currency and value are surrounded by parentheses.}
  \lineii{1}{The sign should precede the value and currency symbol.}
  \lineii{2}{The sign should follow the value and currency symbol.}
  \lineii{3}{The sign should immediately precede the value.}
  \lineii{4}{The sign should immediately follow the value.}
  \lineii{\constant{CHAR_MAX}}{Nothing is specified in this locale.}
\end{tableii}
\end{funcdesc}
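The rules for the \code{'grouping'} sequence are easier to follow with
a worked example. The sketch below is an illustration only, not the
module's own implementation: \code{group_digits} is a hypothetical
helper that applies a grouping list to a string of digits, treating
\code{0} as ``repeat the last group size'' and \constant{CHAR_MAX} as
``stop grouping''. As an assumption of this sketch, a list that ends
without either marker also stops grouping.

```python
import locale

def group_digits(digits, grouping, sep):
    """Insert sep into a string of digits according to a grouping list.

    Hypothetical helper; grouping has the shape of the 'grouping'
    entry returned by locale.localeconv().
    """
    if not grouping or not sep:
        return digits
    groups = []      # groups are collected from right to left
    rest = digits
    size = 0
    for entry in grouping:
        if entry == locale.CHAR_MAX:
            break    # no further grouping is performed
        if entry != 0:
            size = entry
        if size == 0:
            break    # a list starting with 0 gives no usable group size
        if entry == 0:
            # Repeat the last group size until the digits run out.
            while len(rest) > size:
                groups.append(rest[-size:])
                rest = rest[:-size]
            break
        if len(rest) <= size:
            break
        groups.append(rest[-size:])
        rest = rest[:-size]
    groups.append(rest)
    return sep.join(reversed(groups))
```

With this helper, \code{group_digits("1234567", [3, 0], ",")} yields
\code{'1,234,567'}, while \code{group_digits("1234567", [3, 2, 0], ",")}
yields \code{'12,34,567'}.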

\begin{funcdesc}{nl_langinfo}{option}

Return some locale-specific information as a string. This function is
not available on all systems, and the set of possible options might
also vary across platforms. The possible argument values are numbers,
for which symbolic constants are available in the locale module.

\end{funcdesc}
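Since neither \function{nl_langinfo()} nor every option constant is
guaranteed to exist on a given platform, portable code may want to
guard the call. A minimal sketch follows; the wrapper name
\code{langinfo} and its \code{default} parameter are hypothetical and
not part of the module.

```python
import locale

def langinfo(name, default=None):
    """Look up an nl_langinfo() item by constant name, if available.

    Hypothetical wrapper: returns default when either the function or
    the named constant is missing on this platform.
    """
    query = getattr(locale, "nl_langinfo", None)
    item = getattr(locale, name, None)
    if query is None or item is None:
        return default
    return query(item)

# Name of the character encoding, or '' where it cannot be queried.
codeset = langinfo("CODESET", "")
```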

\begin{funcdesc}{getdefaultlocale}{\optional{envvars}}
Tries to determine the default locale settings and returns
them as a tuple of the form \code{(\var{language code},
\var{encoding})}.

According to \POSIX, a program which has not called
\code{setlocale(LC_ALL, '')} runs using the portable \code{'C'}
locale. Calling \code{setlocale(LC_ALL, '')} lets it use the
default locale as defined by the \envvar{LANG} variable. Since we
do not want to interfere with the current locale setting we thus
emulate the behavior in the way described above.

To maintain compatibility with other platforms, the function tests
not only the \envvar{LANG} variable but a list of variables given as
the \var{envvars} parameter. The first found to be defined will be
used. \var{envvars} defaults to the search path used in GNU gettext;
it must always contain the variable name \samp{LANG}. The GNU
gettext search path contains \code{'LANGUAGE'}, \code{'LC_ALL'},
\code{'LC_CTYPE'}, and \code{'LANG'}, in that order.

Except for the code \code{'C'}, the language code corresponds to
\rfc{1766}. \var{language code} and \var{encoding} may be
\code{None} if their values cannot be determined.
\versionadded{2.0}
\end{funcdesc}
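The search order over environment variables can be pictured with a
few lines of code. This is a sketch of the lookup order only, not the
function's actual implementation: \code{first_defined} and
\code{GETTEXT_SEARCH_PATH} are hypothetical names, and treating an
empty value as undefined is an assumption of the sketch.

```python
import os

# The GNU gettext search path, in the order listed above.
GETTEXT_SEARCH_PATH = ("LANGUAGE", "LC_ALL", "LC_CTYPE", "LANG")

def first_defined(envvars=GETTEXT_SEARCH_PATH, environ=None):
    """Return (name, value) of the first variable that is set.

    Hypothetical helper; empty values are skipped (an assumption).
    Returns None when no variable in envvars is set.
    """
    if environ is None:
        environ = os.environ
    for name in envvars:
        value = environ.get(name)
        if value:
            return name, value
    return None
```

For example, with both \envvar{LC_ALL} and \envvar{LANG} set, the
value of \envvar{LC_ALL} wins because it comes earlier in the path.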

\begin{funcdesc}{getlocale}{\optional{category}}
Returns the current setting for the given locale category as a
sequence containing \var{language code}, \var{encoding}.
\var{category} may be one of the \constant{LC_*} values except
\constant{LC_ALL}. It defaults to \constant{LC_CTYPE}.

Except for the code \code{'C'}, the language code corresponds to
\rfc{1766}. \var{language code} and \var{encoding} may be
\code{None} if their values cannot be determined.
\versionadded{2.0}
\end{funcdesc}

\begin{funcdesc}{getpreferredencoding}{\optional{do_setlocale}}
Return the encoding used for text data, according to user
preferences. User preferences are expressed differently on
different systems, and might not be available programmatically on
some systems, so this function only returns a guess.

On some systems, it is necessary to invoke \function{setlocale()}
to obtain the user preferences, so this function is not thread-safe.
If invoking \function{setlocale()} is not necessary or desired,
\var{do_setlocale} should be set to \code{False}.

\versionadded{2.3}
\end{funcdesc}

\begin{funcdesc}{normalize}{localename}
Returns a normalized locale code for the given locale name. The
returned locale code is formatted for use with
\function{setlocale()}. If normalization fails, the original name
is returned unchanged.

If the given encoding is not known, the function defaults to
the default encoding for the locale code just like
\function{setlocale()}.
\versionadded{2.0}
\end{funcdesc}

\begin{funcdesc}{resetlocale}{\optional{category}}
Sets the locale for \var{category} to the default setting.

The default setting is determined by calling
\function{getdefaultlocale()}. \var{category} defaults to
\constant{LC_ALL}.
\versionadded{2.0}
\end{funcdesc}

\begin{funcdesc}{strcoll}{string1, string2}
Compares two strings according to the current
\constant{LC_COLLATE} setting. Like any other comparison function,
it returns a negative value, a positive value, or \code{0}, depending
on whether \var{string1} collates before or after \var{string2} or is
equal to it.
\end{funcdesc}

\begin{funcdesc}{strxfrm}{string}
Transforms a string to one that can be used for the built-in
function \function{cmp()}\bifuncindex{cmp}, and still returns
locale-aware results. This function can be used when the same
string is compared repeatedly, e.g. when collating a sequence of
strings.
\end{funcdesc}
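Collating a sequence of strings is the typical use of
\function{strxfrm()}: each string is transformed once and the
transformed values are then compared directly. The sketch below
assumes a Python version whose \function{sorted()} accepts a
\code{key} function; \code{locale_sorted} is a hypothetical helper
name.

```python
import locale

def locale_sorted(strings):
    """Sort strings according to the current LC_COLLATE setting.

    Hypothetical helper: each string is transformed once by
    locale.strxfrm() rather than compared pairwise with strcoll().
    """
    return sorted(strings, key=locale.strxfrm)

names = ["pear", "apple", "fig"]
ordered = locale_sorted(names)
```

The result depends on the \constant{LC_COLLATE} setting in effect when
the helper is called, so any \function{setlocale()} call has to happen
beforehand.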

\begin{funcdesc}{format}{format, val\optional{, grouping\optional{, monetary}}}
Formats a number \var{val} according to the current
\constant{LC_NUMERIC} setting. The format follows the conventions
of the \code{\%} operator. For floating point values, the decimal
point is modified if appropriate. If \var{grouping} is true, also
takes the grouping into account.

If \var{monetary} is true, the conversion uses monetary thousands
separator and grouping strings.

Please note that this function will only work for exactly one \%char
specifier. For whole format strings, use \function{format_string()}.

\versionchanged[Added the \var{monetary} parameter]{2.5}
\end{funcdesc}

\begin{funcdesc}{format_string}{format, val\optional{, grouping}}
Processes formatting specifiers as in \code{format \% val},
but takes the current locale settings into account.

\versionadded{2.5}
\end{funcdesc}
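A format string with more than one specifier is the case that
\function{format_string()} is meant for. The following sketch passes
the values as a tuple, as with the \code{\%} operator; the function
name \code{describe} and the message text are made up for the
illustration.

```python
import locale

def describe(count, size_mb):
    """Format two numbers in one message using the current locale.

    Hypothetical example function; grouping=True asks for the
    locale's digit grouping, if the locale defines any.
    """
    return locale.format_string("%d files, %.1f MB",
                                (count, size_mb), grouping=True)

text = describe(1234567, 3.5)
```

The separators that appear in the result are those of the current
\constant{LC_NUMERIC} setting; in the \code{'C'} locale no grouping is
applied.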

\begin{funcdesc}{currency}{val\optional{, symbol\optional{, grouping\optional{, international}}}}
Formats a number \var{val} according to the current \constant{LC_MONETARY}
settings.

The returned string includes the currency symbol if \var{symbol} is true,
which is the default.
If \var{grouping} is true (which is not the default), the value is
formatted with digit grouping.
If \var{international} is true (which is not the default), the international
currency symbol is used.

Note that this function will not work with the \code{'C'} locale, so you
have to set a locale via \function{setlocale()} first.

\versionadded{2.5}
\end{funcdesc}
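Code that cannot be sure a monetary locale has been set may want a
fallback for the \code{'C'} locale case. The sketch below assumes that
\function{currency()} signals this case by raising
\exception{ValueError}, which is the behavior of current Python
versions; \code{format_price} and its plain two-decimal fallback are
hypothetical.

```python
import locale

def format_price(amount):
    """Format amount as currency, with a plain fallback.

    Hypothetical helper: assumes locale.currency() raises ValueError
    when the current locale defines no monetary conventions.
    """
    try:
        return locale.currency(amount, grouping=True)
    except ValueError:
        # No monetary conventions available (for example the 'C' locale).
        return "%.2f" % amount

price = format_price(1234.5)
```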

\begin{funcdesc}{str}{float}
Formats a floating point number using the same format as the
built-in function \code{str(\var{float})}, but takes the decimal
point into account.
\end{funcdesc}

\begin{funcdesc}{atof}{string}
Converts a string to a floating point number, following the
\constant{LC_NUMERIC} settings.
\end{funcdesc}

\begin{funcdesc}{atoi}{string}
Converts a string to an integer, following the
\constant{LC_NUMERIC} conventions.
\end{funcdesc}

\begin{datadesc}{LC_CTYPE}
\refstmodindex{string}
Locale category for the character type functions. Depending on the
settings of this category, the functions of module
\refmodule{string} dealing with case change their behaviour.
\end{datadesc}

\begin{datadesc}{LC_COLLATE}
Locale category for sorting strings. The functions
\function{strcoll()} and \function{strxfrm()} of the
\module{locale} module are affected.
\end{datadesc}

\begin{datadesc}{LC_TIME}
Locale category for the formatting of time. The function
\function{time.strftime()} follows these conventions.
\end{datadesc}

\begin{datadesc}{LC_MONETARY}
Locale category for formatting of monetary values. The available
options can be obtained from the \function{localeconv()} function.
\end{datadesc}

\begin{datadesc}{LC_MESSAGES}
Locale category for message display. Python currently does not
support application specific locale-aware messages. Messages
displayed by the operating system, like those returned by
\function{os.strerror()}, might be affected by this category.
\end{datadesc}

\begin{datadesc}{LC_NUMERIC}
Locale category for formatting numbers. The functions
\function{format()}, \function{atoi()}, \function{atof()} and
\function{str()} of the \module{locale} module are affected by that
category. All other numeric formatting operations are not
affected.
\end{datadesc}

\begin{datadesc}{LC_ALL}
Combination of all locale settings. If this flag is used when the
locale is changed, setting the locale for all categories is
attempted. If that fails for any category, no category is changed at
all. When the locale is retrieved using this flag, a string
indicating the setting for all categories is returned. This string
can be later used to restore the settings.
\end{datadesc}

\begin{datadesc}{CHAR_MAX}
This is a symbolic constant used for different values returned by
\function{localeconv()}.
\end{datadesc}

The \function{nl_langinfo()} function accepts one of the following keys.
Most descriptions are taken from the corresponding description in the
GNU C library.

\begin{datadesc}{CODESET}
Return a string with the name of the character encoding used in the
selected locale.
\end{datadesc}

\begin{datadesc}{D_T_FMT}
Return a string that can be used as a format string for strftime(3) to
represent time and date in a locale-specific way.
\end{datadesc}

\begin{datadesc}{D_FMT}
Return a string that can be used as a format string for strftime(3) to
represent a date in a locale-specific way.
\end{datadesc}

\begin{datadesc}{T_FMT}
Return a string that can be used as a format string for strftime(3) to
represent a time in a locale-specific way.
\end{datadesc}

\begin{datadesc}{T_FMT_AMPM}
Return a string that can be used as a format string for strftime(3) to
represent time in the am/pm format.
\end{datadesc}

\begin{datadesc}{DAY_1 ... DAY_7}
Return name of the n-th day of the week. \warning{This
follows the US convention of \constant{DAY_1} being Sunday, not the
international convention (ISO 8601) that Monday is the first day of
the week.}
\end{datadesc}

\begin{datadesc}{ABDAY_1 ... ABDAY_7}
Return abbreviated name of the n-th day of the week.
\end{datadesc}

\begin{datadesc}{MON_1 ... MON_12}
Return name of the n-th month.
\end{datadesc}

\begin{datadesc}{ABMON_1 ... ABMON_12}
Return abbreviated name of the n-th month.
\end{datadesc}

\begin{datadesc}{RADIXCHAR}
Return radix character (decimal dot, decimal comma, etc.).
\end{datadesc}

\begin{datadesc}{THOUSEP}
Return separator character for thousands (groups of three digits).
\end{datadesc}

\begin{datadesc}{YESEXPR}
Return a regular expression that can be used with the regex(3)
function to recognize a positive response to a yes/no question.
\warning{The expression is in the syntax suitable for the
\cfunction{regex()} function from the C library, which might differ
from the syntax used in \refmodule{re}.}
\end{datadesc}

\begin{datadesc}{NOEXPR}
Return a regular expression that can be used with the regex(3)
function to recognize a negative response to a yes/no question.
\end{datadesc}

\begin{datadesc}{CRNCYSTR}
Return the currency symbol, preceded by "-" if the symbol should
appear before the value, "+" if the symbol should appear after the
value, or "." if the symbol should replace the radix character.
\end{datadesc}

\begin{datadesc}{ERA}
The return value represents the era used in the current locale.

Most locales do not define this value. An example of a locale which
does define this value is the Japanese one. In Japan, the traditional
representation of dates includes the name of the era corresponding to
the then-emperor's reign.

Normally it should not be necessary to use this value directly.
Specifying the \code{E} modifier in a format string causes the
\function{strftime} function to use this information. The format of the
returned string is not specified, and therefore you should not assume
knowledge of it on different systems.
\end{datadesc}

\begin{datadesc}{ERA_YEAR}
The return value gives the year in the relevant era of the locale.
\end{datadesc}

\begin{datadesc}{ERA_D_T_FMT}
This return value can be used as a format string for
\function{strftime} to represent dates and times in a locale-specific
era-based way.
\end{datadesc}

\begin{datadesc}{ERA_D_FMT}
This return value can be used as a format string for
\function{strftime} to represent a date in a locale-specific era-based
way.
\end{datadesc}

\begin{datadesc}{ALT_DIGITS}
The return value is a representation of up to 100 values used to
represent the values 0 to 99.
\end{datadesc}

Example:

\begin{verbatim}
>>> import locale
>>> loc = locale.setlocale(locale.LC_ALL) # get current locale
>>> locale.setlocale(locale.LC_ALL, 'de_DE') # use German locale; name might vary with platform
>>> locale.strcoll('f\xe4n', 'foo') # compare a string containing an umlaut
>>> locale.setlocale(locale.LC_ALL, '') # use user's preferred locale
>>> locale.setlocale(locale.LC_ALL, 'C') # use default (C) locale
>>> locale.setlocale(locale.LC_ALL, loc) # restore saved locale
\end{verbatim}


\subsection{Background, details, hints, tips and caveats}

The C standard defines the locale as a program-wide property that may
be relatively expensive to change. On top of that, some
implementations are broken in such a way that frequent locale changes
may cause core dumps. This makes the locale somewhat painful to use
correctly.

Initially, when a program is started, the locale is the \samp{C} locale, no
matter what the user's preferred locale is. The program must
explicitly say that it wants the user's preferred locale settings by
calling \code{setlocale(LC_ALL, '')}.

It is generally a bad idea to call \function{setlocale()} in some library
routine, since as a side effect it affects the entire program. Saving
and restoring it is almost as bad: it is expensive and affects other
threads that happen to run before the settings have been restored.

If, when coding a module for general use, you need a locale
independent version of an operation that is affected by the locale
(such as \function{string.lower()}, or certain formats used with
\function{time.strftime()}), you will have to find a way to do it
without using the standard library routine. Even better is convincing
yourself that using locale settings is okay. Only as a last resort
should you document that your module is not compatible with
non-\samp{C} locale settings.

The case conversion functions in the
\refmodule{string}\refstmodindex{string} module are affected by the
locale settings. When a call to the \function{setlocale()} function
changes the \constant{LC_CTYPE} settings, the variables
\code{string.lowercase}, \code{string.uppercase} and
\code{string.letters} are recalculated. Note that code that uses
these variables through `\keyword{from} ... \keyword{import} ...',
e.g.\ \code{from string import letters}, is not affected by subsequent
\function{setlocale()} calls.

The only way to perform numeric operations according to the locale
is to use the special functions defined by this module:
\function{atof()}, \function{atoi()}, \function{format()},
\function{str()}.

\subsection{For extension writers and programs that embed Python
            \label{embedding-locale}}

Extension modules should never call \function{setlocale()}, except to
find out what the current locale is. But since the return value can
only be used portably to restore it, that is not very useful (except
perhaps to find out whether or not the locale is \samp{C}).

When Python code uses the \module{locale} module to change the locale,
this also affects the embedding application. If the embedding
application doesn't want this to happen, it should remove the
\module{_locale} extension module (which does all the work) from the
table of built-in modules in the \file{config.c} file, and make sure
that the \module{_locale} module is not accessible as a shared library.


\subsection{Access to message catalogs \label{locale-gettext}}

The locale module exposes the C library's gettext interface on systems
that provide this interface. It consists of the functions
\function{gettext()}, \function{dgettext()}, \function{dcgettext()},
\function{textdomain()}, \function{bindtextdomain()}, and
\function{bind_textdomain_codeset()}. These are similar to the same
functions in the \refmodule{gettext} module, but use the C library's
binary format for message catalogs, and the C library's search
algorithms for locating message catalogs.

Python applications should normally find no need to invoke these
functions, and should use \refmodule{gettext} instead. A known
exception to this rule is applications that link with additional C
libraries which internally invoke \cfunction{gettext()} or
\cfunction{dcgettext()}. For these applications, it may be necessary to
bind the text domain, so that the libraries can properly locate their
message catalogs.
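Binding a text domain for such a C library might look like the sketch
below. It is an illustration under stated assumptions:
\code{bind_c_catalogs} is a hypothetical helper, and the domain name
and catalog directory passed to it are placeholders that a real
application would take from the library's own documentation. The
\function{getattr()} guard covers systems where the C library's
gettext interface is not exposed.

```python
import locale

def bind_c_catalogs(domain, localedir):
    """Bind a C-level text domain to a message catalog directory.

    Hypothetical helper: returns the bound directory, or None when
    locale.bindtextdomain() is not available on this system.
    """
    bind = getattr(locale, "bindtextdomain", None)
    if bind is None:
        return None
    return bind(domain, localedir)

# Placeholder values; a real C library documents its own domain name.
bound = bind_c_catalogs("example_domain", "/tmp")
```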