UCD: Derived Character Properties

This document describes a number of data files in the Unicode Character Database. These are the derived data files: they contain information that can be completely derived from other data files, but present it in a different format for ease of use.

The files themselves are informative, although they may contain normative properties. For more information, see UnicodeCharacterDatabase.html.

Derived Core Properties

The following are important derived properties of Unicode characters, and are contained in DerivedCoreProperties.txt.

Property Value	N/I (Normative/Informative)	Definition and Generation
Math	I	Characters with the Math property. For more information, see Chapter 4, Character Properties. Generated from: Sm + Other_Math
Alphabetic	I	Characters with the Alphabetic property. For more information, see Chapter 4, Character Properties. Generated from: Lu+Ll+Lt+Lm+Lo + Other_Alphabetic
Lowercase	I	Characters with the Lowercase property. For more information, see Chapter 4, Character Properties and UTR #21: Case Mappings. Generated from: Ll + Other_Lowercase
Uppercase	I	Characters with the Uppercase property. For more information, see Chapter 4, Character Properties and UTR #21: Case Mappings. Generated from: Lu + Other_Uppercase
ID_Start	I	Characters that can start an identifier. Generated from: Lu+Ll+Lt+Lm+Lo+Nl
ID_Continue	I	Characters that can continue an identifier. See Cf Note. Generated from: ID_Start + Mn+Mc+Nd+Pc
XID_Start	I	Same as ID_Start, except for modifications to allow closure under normalization forms NFKC and NFKD. Generated from: ID_Start; see Closure Note
XID_Continue	I	Same as ID_Continue, except for modifications to allow closure under normalization forms NFKC and NFKD. See Closure Note and Cf Note. Generated from: ID_Continue; see Closure Note
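
The ID_Start and ID_Continue rows above can be sketched directly from General_Category values. The following Python is a minimal illustration, not the generator used for DerivedCoreProperties.txt: it relies on the standard unicodedata module, which reflects whatever Unicode version ships with the interpreter rather than 3.1.0, it ignores the Cf Note and the closure adjustments behind XID_Start and XID_Continue, and the helper is_identifier is a hypothetical name introduced only for this example.

```python
import unicodedata

# General_Category values copied from the "Generated from" column above.
ID_START_CATEGORIES = {"Lu", "Ll", "Lt", "Lm", "Lo", "Nl"}
ID_CONTINUE_EXTRA = {"Mn", "Mc", "Nd", "Pc"}


def is_id_start(ch: str) -> bool:
    """Approximate ID_Start: Lu+Ll+Lt+Lm+Lo+Nl."""
    return unicodedata.category(ch) in ID_START_CATEGORIES


def is_id_continue(ch: str) -> bool:
    """Approximate ID_Continue: ID_Start + Mn+Mc+Nd+Pc."""
    return is_id_start(ch) or unicodedata.category(ch) in ID_CONTINUE_EXTRA


def is_identifier(text: str) -> bool:
    """Hypothetical helper: one ID_Start character, then ID_Continue characters."""
    if not text:
        return False
    return is_id_start(text[0]) and all(is_id_continue(c) for c in text[1:])
```

Under these assumptions, is_identifier("x1_y") holds because "1" is Nd and "_" is Pc, while is_identifier("1x") does not because a decimal digit is not in ID_Start.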

Derived Extracted Properties

The following files contain other properties of the UCD that are simply separated out and listed in range format. These files are provided purely as a reformatting of existing data, with certain exceptions listed below.
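
As an illustration of what "listed in range format" means, the sketch below collapses the code points that share one General_Category value into contiguous ranges. It is a rough approximation under stated assumptions: the function names are hypothetical, the exact field layout of the real extracted files is not reproduced here (the "first..last ; value" shape is an assumption), and the data comes from Python's bundled unicodedata module rather than the 3.1.0 files.

```python
import unicodedata


def to_ranges(code_points):
    """Collapse code points into sorted, inclusive (first, last) runs."""
    ranges = []
    for cp in sorted(set(code_points)):
        if ranges and cp == ranges[-1][1] + 1:
            # Extend the current run by one code point.
            ranges[-1] = (ranges[-1][0], cp)
        else:
            # Start a new run.
            ranges.append((cp, cp))
    return ranges


def format_range(first, last, value):
    """Render one line; this field layout is an assumption for illustration."""
    span = f"{first:04X}" if first == last else f"{first:04X}..{last:04X}"
    return f"{span} ; {value}"


def extract_category(value, limit=0x250):
    """Range lines for one General_Category value below an illustrative limit."""
    cps = [cp for cp in range(limit) if unicodedata.category(chr(cp)) == value]
    return [format_range(a, b, value) for a, b in to_ranges(cps)]
```

For example, extract_category("Nd", 0x80) yields a single line covering U+0030 through U+0039, since those are the only decimal digits in the ASCII range.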

Revision	3.1.0
Authors	Mark Davis
Date	2001-02-28
This Version	http://www.unicode.org/Public/3.1-Update/DerivedProperties-3.1.0.html
Previous Version	n/a
Latest Version	http://www.unicode.org/Public/UNIDATA/DerivedProperties.html

Derived Normalization Properties

The properties in DerivedNormalizationProperties.txt are useful in dealing with normalization forms. In the following table, NF* refers to one of NFD, NFC, NFKC, or NFKD.

Property Value	N/I (Normative/Informative)	Definition and Generation
FNC	N	Characters that require extra mappings for closure under Case Folding plus Normalization Form KC. Characters marked with this property have a third field with the mapping in it. Generated with the following: b = NFKC(Fold(a)); c = NFKC(Fold(b)); if (c != b) add mapping from a to c
Comp_Ex	N	Characters that are excluded from composition: those listed explicitly in CompositionExclusions.txt, plus the two classes that file describes but does not list explicitly: (3) Singleton Decompositions and (4) Non-Starter Decompositions.
NF*_NO	N	Characters that cannot ever occur in NF*. See QuickCheck Note.
NF*_MAYBE	N	Characters that may occur in valid NF*, depending on the context. See QuickCheck Note.
NF*_Expands	N	Characters that expand to more than one character in the specified normalization form.
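
The generation rule for FNC and the definitions of NF*_NO and NF*_Expands in the table above can be sketched in Python. This is a minimal illustration under several assumptions: str.casefold() stands in for Fold, unicodedata.normalize stands in for the normalization forms, both use the Unicode version bundled with the interpreter rather than 3.1.0, and is_nf_no only approximates NF*_NO by testing whether normalizing the single character changes it. The function names are hypothetical.

```python
import unicodedata


def fnc_mapping(a: str):
    """Apply the FNC rule: b = NFKC(Fold(a)); c = NFKC(Fold(b)).

    Returns c when c != b (an extra mapping is needed), otherwise None.
    """
    b = unicodedata.normalize("NFKC", a.casefold())
    c = unicodedata.normalize("NFKC", b.casefold())
    return c if c != b else None


def is_nf_no(form: str, ch: str) -> bool:
    """Approximate NF*_NO: normalizing the lone character rewrites it."""
    return unicodedata.normalize(form, ch) != ch


def nf_expands(form: str, ch: str) -> bool:
    """NF*_Expands: the character becomes more than one character in `form`."""
    return len(unicodedata.normalize(form, ch)) > 1
```

As a usage example, U+03D2 has no case folding of its own but normalizes under NFKC to the capital letter U+03A5, which then folds to U+03C5, so fnc_mapping("\u03d2") reports an extra mapping. A plain letter such as "A" stabilizes after the first step and needs none.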
