Unicode Information

class jkUnicode.UniInfo(uni: int | None = None)

The main Unicode Info object. It gets its Unicode information from the submodules aglfn, uniCase, uniCat, uniDecomposition, uniName, and uniRangesBits, which are generated from the official Unicode data. Tools to download and regenerate the data are in the tools subfolder.

The Unicode Info object is meant to be instantiated once and then reused to get information about different codepoints. Avoid instantiating it repeatedly, because construction is rather expensive.

Initialize the Info object with None, for example before a loop, and then inside the loop assign each codepoint you want information about to the unicode instance variable. This automatically updates the other instance variables with the correct information from the Unicode standard.

Parameters: uni (int | None) – The codepoint, or None if it will be assigned later.

property block: str | None

The name of the block for the current codepoint.

property category: str | None

The name of the category for the current codepoint.

property category_short: str | None

The short name of the category for the current codepoint.

property char: str | None

The character for the current codepoint.

property decomposition_mapping: list[int]

The decomposition mapping for the current codepoint.

property glyphname: str | None

The AGLFN glyph name for the current codepoint.

property lc_mapping: int | None

The lowercase mapping for the current codepoint.

property name: str | None

The Unicode name for the current codepoint.

property nice_name: str | None

A more human-readable Unicode name for the current codepoint.

property uc_mapping: int | None

The uppercase mapping for the current codepoint.

property unicode: int | None

The Unicode codepoint. Setting this value will look up and fill the other pieces of information, like category, range, decomposition mapping, and case mapping.
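
The instantiate-once pattern described for UniInfo might look like the following sketch. It assumes jkUnicode is importable and uses only the unicode, name, and category_short attributes documented above. The helper describe_codepoints is hypothetical and not part of the library.

```python
try:
    from jkUnicode import UniInfo  # the class documented above
except ImportError:  # keeps the sketch loadable where jkUnicode is absent
    UniInfo = None


def describe_codepoints(codepoints, ui):
    """Hypothetical helper: reuse one info object for many codepoints."""
    rows = []
    for cp in codepoints:
        # Assigning to `unicode` refreshes the other properties.
        ui.unicode = cp
        rows.append((cp, ui.name, ui.category_short))
    return rows


if UniInfo is not None:
    ui = UniInfo(None)  # instantiate once, outside the loop
    for cp, name, cat in describe_codepoints([0x41, 0xE4, 0x20AC], ui):
        print(f"U+{cp:04X}  {cat}  {name}")
```

Passing the info object in as an argument keeps the expensive construction in one place, so the same instance can serve several loops.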

jkUnicode.getUnicodeChar(code: int) → str

Return the Unicode character for a Unicode codepoint.

Parameters: code (int) – The codepoint.
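
For illustration, the conversion can be mimicked with Python's built-in chr. Whether getUnicodeChar is implemented this way is an assumption. The function codepoint_to_char below is a hypothetical stand-in, not the library function.

```python
def codepoint_to_char(code: int) -> str:
    """Hypothetical stand-in: map a codepoint to a one-character string."""
    if not 0 <= code <= 0x10FFFF:
        raise ValueError(f"codepoint out of range: {code:#x}")
    return chr(code)


print(codepoint_to_char(0x41))
# Codepoints above U+FFFF also yield a single character in Python 3.
print(len(codepoint_to_char(0x1F600)))
```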

jkUnicode.get_expanded_glyph_list(unicodes: list[int], ui: UniInfo | None = None) → list[tuple[int, str | None]]

“Expand” or annotate a list of codepoints.

For codepoints that have a case mapping (UC or LC), the target codepoint of the case mapping will be added to the list. AGLFN glyph names are added to the list too, so the returned list contains tuples of (codepoint, glyphname), sorted by the codepoint value.

Parameters:

unicodes (list[int]) – A list of codepoints.
ui (UniInfo | None) – The UniInfo instance to use. If None, a new one is instantiated.
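
The expansion described above can be approximated with the standard library alone. This sketch is not the library's implementation: expand_codepoints is a hypothetical function, the case mappings come from Python's str.upper and str.lower rather than uniCase, duplicates are assumed to be collapsed, and the glyph names are a small hand-written excerpt standing in for the AGLFN data.

```python
def expand_codepoints(codepoints, glyph_names):
    """Hypothetical sketch: add case-mapped partners, then attach glyph names."""
    expanded = set(codepoints)
    for cp in codepoints:
        ch = chr(cp)
        for mapped in (ch.upper(), ch.lower()):
            # Keep only one-to-one mappings (e.g. skip U+00DF -> "SS").
            if len(mapped) == 1:
                expanded.add(ord(mapped))
    # Sort by codepoint; codepoints without a known name get None.
    return [(cp, glyph_names.get(cp)) for cp in sorted(expanded)]


# Tiny hand-written excerpt standing in for the AGLFN data.
names = {0x41: "A", 0x61: "a", 0xC4: "Adieresis", 0xE4: "adieresis"}
print(expand_codepoints([0x61, 0xC4, 0x31], names))
```

With the real function, passing an existing UniInfo instance as ui avoids constructing a new one on every call.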
