Unicode Utilities: Unicode Language Identifiers and BCP 47

Unmarked properties are from Unicode V15.1.0; the beta properties are from Unicode V16.0.0β. For more information, see Unicode Utilities Beta.

Status

Source: fr-CA

Type      Code  Name          Replacement
Language  fr    French
Region    CA    Canada

Source: gsw-Arab-AQ

Type      Code  Name          Replacement
Language  gsw   Swiss German
Script    Arab  Arabic
Region    AQ    Antarctica
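
The Type and Code columns above can be derived from the shape of each subtag alone. The following is a minimal sketch, assuming the input has the form language[-script][-region] and ignoring variants and extensions; the function name `split_language_id` is hypothetical and is not part of the Unicode Utilities.

```python
import re

# Subtag shapes from the Unicode language identifier syntax.
# Variants, extensions, and private-use subtags are ignored in this sketch.
_LANGUAGE = re.compile(r"^(?:[A-Za-z]{2,3}|[A-Za-z]{5,8})$")
_SCRIPT = re.compile(r"^[A-Za-z]{4}$")
_REGION = re.compile(r"^(?:[A-Za-z]{2}|[0-9]{3})$")


def split_language_id(tag):
    """Return (type, code) rows for the language, script, and region subtags."""
    subtags = re.split(r"[-_]", tag)
    if not _LANGUAGE.match(subtags[0]):
        raise ValueError(f"no language subtag in {tag!r}")
    rows = [("Language", subtags[0].lower())]
    rest = subtags[1:]
    # Script, if present, comes before region and is written in title case.
    if rest and _SCRIPT.match(rest[0]):
        rows.append(("Script", rest.pop(0).title()))
    # Region is two letters (upper case) or three digits.
    if rest and _REGION.match(rest[0]):
        rows.append(("Region", rest.pop(0).upper()))
    if rest:
        raise ValueError(f"unsupported subtags in {tag!r}: {rest}")
    return rows


for source in ("fr-CA", "gsw-Arab-AQ"):
    print(source, split_language_id(source))
```

This only checks that each subtag is well formed. Whether a code is valid, and what its name is, requires lookup data that the sketch does not include.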

Source: eng-Latn-840

Canonical Form: en-Latn-US

Minimal Form: en

Type      Code  Name          Replacement
Language  eng   invalid code  en
Script    Latn  Latin
Region    840   invalid code  US
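
The canonical and minimal forms for eng-Latn-840 can be reproduced with a small sketch. It assumes three tiny hand-written tables that stand in for the CLDR alias and likely-subtags data, and a simplified likely-subtags lookup keyed on the language only. All names here (`LANGUAGE_ALIASES`, `REGION_ALIASES`, `LIKELY_SUBTAGS`, `canonicalize`, `minimize`) are hypothetical and are not the tool's actual implementation.

```python
# Illustrative subsets only; the full data is published in CLDR.
LANGUAGE_ALIASES = {"eng": "en", "fra": "fr", "deu": "de"}
REGION_ALIASES = {"840": "US", "124": "CA", "250": "FR"}
LIKELY_SUBTAGS = {"en": ("Latn", "US"), "fr": ("Latn", "FR"), "gsw": ("Latn", "CH")}


def parse(tag):
    """Split language[-script][-region] into a (language, script, region) tuple."""
    parts = tag.split("-")
    lang, script, region = parts[0].lower(), None, None
    for part in parts[1:]:
        if len(part) == 4 and part.isalpha():
            script = part.title()
        elif (len(part) == 2 and part.isalpha()) or (len(part) == 3 and part.isdigit()):
            region = part.upper()
        else:
            raise ValueError(f"unsupported subtag {part!r}")
    return lang, script, region


def canonicalize(tag):
    """Replace aliased language and region codes with their preferred values."""
    lang, script, region = parse(tag)
    return LANGUAGE_ALIASES.get(lang, lang), script, REGION_ALIASES.get(region, region)


def maximize(lang, script, region):
    """Fill in a missing script or region from the (simplified) likely-subtags table."""
    likely = LIKELY_SUBTAGS.get(lang)
    if likely is None:
        return lang, script, region
    return lang, script or likely[0], region or likely[1]


def minimize(lang, script, region):
    """Drop subtags that maximization would add back unchanged."""
    full = maximize(lang, script, region)
    for trial in ((lang, None, None), (lang, None, full[2]), (lang, full[1], None)):
        if maximize(*trial) == full:
            return trial
    return full


def to_tag(parts):
    return "-".join(p for p in parts if p)


canonical = canonicalize("eng-Latn-840")
print(to_tag(canonical))             # en-Latn-US
print(to_tag(minimize(*canonical)))  # en
```

With the same tables, fr-CA and gsw-Arab-AQ come back unchanged, because neither the region CA nor the combination Arab and AQ is what the likely-subtags data would supply for those languages.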

Notes


Fonts and Display. If you don't have a good set of Unicode fonts and a modern browser, you may not be able to read some of the characters. Suggested fonts for broader coverage include the Noto Fonts site, Unicode Fonts for Ancient Scripts, and Large, multi-script Unicode fonts. See also: Unicode Display Problems.

Version 3.9; ICU version: 74.1; Unicode/Emoji version: 15.1.0; Unicodeβ version: 16.0.0;