Unicode Utilities: Confusables

Unmarked properties are from Unicode V15.1.0; the beta properties are from Unicode V16.0.0β. For more information, see Unicode Utilities Beta.


With this demo, you can supply an Input string and see the combinations that are confusable with it, using data collected by the Unicode Consortium. You can also try different restrictions, using characters valid in different approaches to international domain names. For more information, see Data below.
  

Confusable Characters

Each group below corresponds to one position of the input string and lists the code point, the character, and the character name. The first entry in each group is the input character.

U+0078  x  LATIN SMALL LETTER X
U+00D7  ×  MULTIPLICATION SIGN
U+0445  х  CYRILLIC SMALL LETTER HA
U+1541  ᕁ  CANADIAN SYLLABICS SAYISI YI
U+157D  ᕽ  CANADIAN SYLLABICS HK
U+166E  ᙮  CANADIAN SYLLABICS FULL STOP
U+292B  ⤫  RISING DIAGONAL CROSSING FALLING DIAGONAL
U+292C  ⤬  FALLING DIAGONAL CROSSING RISING DIAGONAL
U+2A2F  ⨯  VECTOR OR CROSS PRODUCT

U+006E  n  LATIN SMALL LETTER N
U+0578  ո  ARMENIAN SMALL LETTER VO
U+057C  ռ  ARMENIAN SMALL LETTER RA

U+002D  -  HYPHEN-MINUS
U+02D7  ˗  MODIFIER LETTER MINUS SIGN
U+06D4  ۔  ARABIC FULL STOP
U+2010  ‐  HYPHEN
U+2012  ‒  FIGURE DASH
U+2013  –  EN DASH
U+2043  ⁃  HYPHEN BULLET
U+2212  −  MINUS SIGN

U+002D  -  HYPHEN-MINUS
U+02D7  ˗  MODIFIER LETTER MINUS SIGN
U+06D4  ۔  ARABIC FULL STOP
U+2010  ‐  HYPHEN
U+2012  ‒  FIGURE DASH
U+2013  –  EN DASH
U+2043  ⁃  HYPHEN BULLET
U+2212  −  MINUS SIGN

U+0061  a  LATIN SMALL LETTER A
U+0251  ɑ  LATIN SMALL LETTER ALPHA
U+03B1  α  GREEK SMALL LETTER ALPHA
U+0430  а  CYRILLIC SMALL LETTER A
U+237A  ⍺  APL FUNCTIONAL SYMBOL ALPHA

U+0062  b  LATIN SMALL LETTER B
U+13CF  Ꮟ  CHEROKEE LETTER SI
U+1472  ᑲ  CANADIAN SYLLABICS KA
U+15AF  ᖯ  CANADIAN SYLLABICS AIVILIK B

U+002D  -  HYPHEN-MINUS
U+02D7  ˗  MODIFIER LETTER MINUS SIGN
U+06D4  ۔  ARABIC FULL STOP
U+2010  ‐  HYPHEN
U+2012  ‒  FIGURE DASH
U+2013  –  EN DASH
U+2043  ⁃  HYPHEN BULLET
U+2212  −  MINUS SIGN

U+006A  j  LATIN SMALL LETTER J
U+03F3  ϳ  GREEK LETTER YOT
U+0458  ј  CYRILLIC SMALL LETTER JE

U+0031   1  DIGIT ONE
U+006C   l  LATIN SMALL LETTER L
U+01C0   ǀ  LATIN LETTER DENTAL CLICK
U+04C0   Ӏ  CYRILLIC LETTER PALOCHKA
U+05C0   ׀  HEBREW PUNCTUATION PASEQ
U+05D5   ו  HEBREW LETTER VAV
U+05DF   ן  HEBREW LETTER FINAL NUN
U+0627   ا  ARABIC LETTER ALEF
U+0661   ١  ARABIC-INDIC DIGIT ONE
U+06F1   ۱  EXTENDED ARABIC-INDIC DIGIT ONE
U+16C1   ᛁ  RUNIC LETTER ISAZ IS ISS I
U+2223   ∣  DIVIDES
U+10309  𐌉  OLD ITALIC LETTER I
U+10320  𐌠  OLD ITALIC NUMERAL ONE

U+0074  t  LATIN SMALL LETTER T

Total raw values: 11,612,160

Too many raw items to process.
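
The page does not say how the total is computed, but the figure matches the product of the number of confusables listed for each input position. The Python sketch below checks that arithmetic and lazily produces the first few combinations. The names ROWS, total_combinations, and first_variants are hypothetical, and the code points are copied from the table above, whose first entries suggest an input of "xn--ab-j1t".

```python
import math
from itertools import islice, product

# Confusable code points per input position, copied from the table above.
HYPHEN = [0x002D, 0x02D7, 0x06D4, 0x2010, 0x2012, 0x2013, 0x2043, 0x2212]
ROWS = [
    [0x0078, 0x00D7, 0x0445, 0x1541, 0x157D, 0x166E, 0x292B, 0x292C, 0x2A2F],  # x
    [0x006E, 0x0578, 0x057C],                                                  # n
    HYPHEN,                                                                    # -
    HYPHEN,                                                                    # -
    [0x0061, 0x0251, 0x03B1, 0x0430, 0x237A],                                  # a
    [0x0062, 0x13CF, 0x1472, 0x15AF],                                          # b
    HYPHEN,                                                                    # -
    [0x006A, 0x03F3, 0x0458],                                                  # j
    [0x0031, 0x006C, 0x01C0, 0x04C0, 0x05C0, 0x05D5, 0x05DF,
     0x0627, 0x0661, 0x06F1, 0x16C1, 0x2223, 0x10309, 0x10320],                # 1
    [0x0074],                                                                  # t
]


def total_combinations(rows):
    """Number of strings obtained by picking one character per position."""
    return math.prod(len(row) for row in rows)


def first_variants(rows, limit):
    """Return the first `limit` combinations without building the full list."""
    combos = product(*rows)
    return ["".join(map(chr, combo)) for combo in islice(combos, limit)]


if __name__ == "__main__":
    print(total_combinations(ROWS))  # 11612160
    print(first_variants(ROWS, 3))
```

Because itertools.product is lazy, taking a handful of variants is cheap even though materializing all of them would not be.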


Data

Confusable characters are those that may be confused with others (in some common UI fonts), such as the Latin letter "o" and the Greek letter omicron "ο". Fonts make a difference: for example, the Hebrew character "ס" looks confusingly similar to "o" in some fonts (such as Arial Hebrew), but not in others. See also unaccented Latin characters.

The data for confusables and restrictions is from UTS39. You can suggest additions or changes to the Unicode data for future versions of that standard.

For more information on the use of the data, see the proposed updates to Unicode Security Mechanisms and Unicode Security Considerations.

The restrictions are purely on a character level. For a more detailed view, see idna.

Caveats

The Unicode data is designed for testing whether two strings are confusable, not for enumerating every confusable string, so this demo does not generate all combinations. In particular, where a character is confusable with a sequence, not all combinations are generated.
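
To illustrate what "testing" means here, the following Python sketch compares two strings by reducing each to a common form, loosely modeled on the skeleton comparison described in UTS #39. It is a simplified illustration, not the demo's implementation: the PROTOTYPES table is a hand-picked subset of the characters shown on this page, each mapped to the first character of its group, whereas the real data file is far larger and may choose different prototypes. The names PROTOTYPES, skeleton, and is_confusable are hypothetical.

```python
import unicodedata

# Illustrative subset only (an assumption, not the real UTS #39 data):
# a few confusables from this page mapped to the first character of their group.
PROTOTYPES = {
    "\u00D7": "x",  # MULTIPLICATION SIGN
    "\u0445": "x",  # CYRILLIC SMALL LETTER HA
    "\u0251": "a",  # LATIN SMALL LETTER ALPHA
    "\u03B1": "a",  # GREEK SMALL LETTER ALPHA
    "\u0430": "a",  # CYRILLIC SMALL LETTER A
    "\u03F3": "j",  # GREEK LETTER YOT
    "\u0458": "j",  # CYRILLIC SMALL LETTER JE
    "\u2010": "-",  # HYPHEN
    "\u2013": "-",  # EN DASH
    "\u2212": "-",  # MINUS SIGN
}


def skeleton(text):
    """Normalize to NFD, map each character to its prototype, normalize again."""
    decomposed = unicodedata.normalize("NFD", text)
    mapped = "".join(PROTOTYPES.get(ch, ch) for ch in decomposed)
    return unicodedata.normalize("NFD", mapped)


def is_confusable(a, b):
    """Two strings are treated as confusable when their skeletons are equal."""
    return skeleton(a) == skeleton(b)


if __name__ == "__main__":
    spoof = "\u0445n-\u2010\u0430b\u2212\u04581t"  # mixes Cyrillic and dash look-alikes
    print(is_confusable("xn--ab-j1t", spoof))
    print(is_confusable("xn--ab-j1t", "xm--ab-j1t"))
```

A test of this kind costs one pass over each string, which is why the data suits checking a given pair better than listing every possible look-alike.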



Fonts and Display. If you don't have a good set of Unicode fonts (and a modern browser), you may not be able to read some of the characters. Some suggested sources of fonts that you can add for coverage are: Noto Fonts site; Unicode Fonts for Ancient Scripts; Large, multi-script Unicode fonts. See also: Unicode Display Problems.

Version 3.9; ICU version: 74.1; Unicode/Emoji version: 15.1.0; Unicodeβ version: 16.0.0;