Unicode Alias Names And Abbreviations

In Unicode, characters can have a unique name.

A character can also have one or more alias names. An alias can be an abbreviation, a C0 or C1 control name, a correction, an alternate name, or a figment. Like a name, an alias is unique across all names and aliases, so it identifies exactly one code point.

Background

The formal, primary Unicode name is unique over all names, uses only a restricted set of characters in a fixed format, and is guaranteed never to change. The formal name consists of the characters A–Z (uppercase), 0–9, " " (space), and "-" (hyphen). In addition to this name, a character can have one or more formal (normative) alias names. An alias follows the same rules as a name: it uses A–Z, 0–9, space, and hyphen, and never characters such as a–z, %, or $. Aliases share a single namespace with names: every name and alias is unique within the combined set. Alias names are formally described in the Unicode Standard. In this sense, an abbreviation is also considered a Unicode name.
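The character repertoire described above can be checked mechanically. The following is a minimal sketch in Python; the function name `uses_name_repertoire` is hypothetical, and the check covers only the allowed characters (A–Z, 0–9, space, hyphen), not any further formatting rules the Standard may impose.

```python
import re

# Repertoire stated in the text: uppercase A-Z, digits 0-9, space, hyphen.
# Assumption: this tests the character set only, not the full naming rules.
_NAME_CHARS = re.compile(r"[A-Z0-9 -]+")


def uses_name_repertoire(candidate: str) -> bool:
    """Return True if every character of `candidate` is allowed in a name or alias."""
    return _NAME_CHARS.fullmatch(candidate) is not None


if __name__ == "__main__":
    for text in ("NO-BREAK SPACE", "NBSP", "VS256", "nbsp", "100%"):
        print(f"{text!r}: {uses_name_repertoire(text)}")
```

A string that passes this check is not necessarily an assigned name or alias; it only uses the permitted characters.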

Reason to add an alias

There are five possible reasons to assign an alias name to a code point. A character can have multiple aliases: for example U+0008 has control alias BACKSPACE and abbreviation alias BS.

    1. Abbreviation
    Commonly occurring abbreviations (or acronyms) for control codes, format characters, spaces, and variation selectors.
    There are 354 such aliases, including 256 aliases for variation selectors (VS1 ... VS256).
    For example, U+00A0   NO-BREAK SPACE has alias NBSP.
    Presentation: in the code charts, the abbreviation is shown in a dashed box, for example NBSP.
    2. Control
    ISO 6429 names for C0 and C1 control functions, and similar commonly occurring names, are added as aliases to the character.
    There are 84 such aliases.
    For example, U+0008 has alias BACKSPACE.
    Presentation: Control characters do not have a primary name; they are labeled <control>. An alias such as BACKSPACE is used in the chart documentation, but never as a primary name. This prevents unintended (automated) replacement of the name by the actual, disruptive control character. For example, if the name BEL were replaced in line by U+0007, it would trigger the bell sound.
    3. Correction
    This is a correction for a "serious problem" in the primary character name, usually an error.
    There are 31 such aliases.
    For example, U+2118 SCRIPT CAPITAL P is actually a lowercase p, and so is given alias name ※ WEIERSTRASS ELLIPTIC FUNCTION: "actually this has the form of a lowercase calligraphic p, despite its name, and through the alias the correct spelling is added."
    Presentation: A corrected name is preceded by symbol ※ (the reference mark).
    4. Alternate
    A widely used alternate name for a character.
    There is 1 such alias.
    Example: U+FEFF ZERO WIDTH NO-BREAK SPACE has alternate BYTE ORDER MARK.
    Presentation: listed in character charts description.
    5. Figment
    Several documented labels for C1 control code points which were never actually approved in any standard (figment = feigned, in fiction).
    There are 3 such aliases.
    For example, U+0099 has figment alias SINGLE GRAPHIC CHARACTER INTRODUCER. This name is an architectural concept from early drafts of ISO/IEC 10646-1, but it was never approved and standardized.
    Presentation: These figment abbreviations are not published in the Standard; the chart informally shows "XXX" for each, which is not a unique or identifying abbreviation.
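The alias types listed above can be tried out with Python's standard `unicodedata` module, whose `lookup()` function has accepted name aliases since Python 3.3. The sketch below is illustrative only: the helper `resolve` is hypothetical, and which aliases actually resolve depends on the Unicode database version bundled with the interpreter, so unresolved names are reported rather than assumed.

```python
import unicodedata


def resolve(name):
    """Return the character for a formal name or alias, or None if unknown."""
    try:
        return unicodedata.lookup(name)
    except KeyError:
        return None


# One alias per reason, taken from the examples in the list above.
examples = {
    "BACKSPACE": "control",
    "NBSP": "abbreviation",
    "WEIERSTRASS ELLIPTIC FUNCTION": "correction",
    "BYTE ORDER MARK": "alternate",
}

if __name__ == "__main__":
    for alias, reason in examples.items():
        ch = resolve(alias)
        code = f"U+{ord(ch):04X}" if ch is not None else "not resolved by this build"
        print(f"{alias!r:35} {reason:13} {code}")

    # Control characters have no primary name, so name() needs a default value.
    print(unicodedata.name("\x08", "<no primary name>"))
```

Returning `None` instead of raising keeps the loop running when an alias is missing from an older database.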

List of aliases

Columns: code point, HTML decimal, name or alias (abbreviation, name), reason, chart, note
U+0000
NUL
NULL Control C0 Controls and Basic Latin (pdf)
U+0001 
SOH
START OF HEADING Control C0 Controls and Basic Latin (pdf)
U+0002 
STX
START OF TEXT Control C0 Controls and Basic Latin (pdf)
U+0003 
ETX
END OF TEXT Control C0 Controls and Basic Latin (pdf)
U+0004 
EOT
END OF TRANSMISSION Control C0 Controls and Basic Latin (pdf)
U+0005 
ENQ
ENQUIRY Control C0 Controls and Basic Latin (pdf)
U+0006 
ACK
ACKNOWLEDGE Control C0 Controls and Basic Latin (pdf)
U+0007 
BEL
ALERT Control C0 Controls and Basic Latin (pdf)
U+0008 
BS
BACKSPACE Control C0 Controls and Basic Latin (pdf)
U+0009
TAB
CHARACTER TABULATION Control C0 Controls and Basic Latin (pdf)
HT
HORIZONTAL TABULATION Control
U+000A
LF
LINE FEED Control C0 Controls and Basic Latin (pdf)
NL
NEW LINE Control
EOL
END OF LINE Control
U+000B LINE TABULATION Control C0 Controls and Basic Latin (pdf)
VT
VERTICAL TABULATION Control
U+000C
FF
FORM FEED Control C0 Controls and Basic Latin (pdf)
U+000D
CR
CARRIAGE RETURN Control C0 Controls and Basic Latin (pdf)
U+000E 
SO
SHIFT OUT Control C0 Controls and Basic Latin (pdf)
LOCKING-SHIFT ONE Control
U+000F 
SI
SHIFT IN Control C0 Controls and Basic Latin (pdf)
LOCKING-SHIFT ZERO Control
U+0010 
DLE
DATA LINK ESCAPE Control C0 Controls and Basic Latin (pdf)
U+0011 
DC1
DEVICE CONTROL ONE Control C0 Controls and Basic Latin (pdf)
U+0012 
DC2
DEVICE CONTROL TWO Control C0 Controls and Basic Latin (pdf)
U+0013 
DC3
DEVICE CONTROL THREE Control C0 Controls and Basic Latin (pdf)
U+0014 
DC4
DEVICE CONTROL FOUR Control C0 Controls and Basic Latin (pdf)
U+0015 
NAK
NEGATIVE ACKNOWLEDGE Control C0 Controls and Basic Latin (pdf)
U+0016 
SYN
SYNCHRONOUS IDLE Control C0 Controls and Basic Latin (pdf)
U+0017 
ETB
END OF TRANSMISSION BLOCK Control C0 Controls and Basic Latin (pdf)
U+0018 
CAN
CANCEL Control C0 Controls and Basic Latin (pdf)
U+0019 
EOM
END OF MEDIUM Control C0 Controls and Basic Latin (pdf)
EM
Abbreviation added in version 15.0
U+001A 
SUB
SUBSTITUTE Control C0 Controls and Basic Latin (pdf)
U+001B 
ESC
ESCAPE Control C0 Controls and Basic Latin (pdf)
U+001C  INFORMATION SEPARATOR FOUR Control C0 Controls and Basic Latin (pdf)
FS
FILE SEPARATOR Control
U+001D  INFORMATION SEPARATOR THREE Control C0 Controls and Basic Latin (pdf)
GS
GROUP SEPARATOR Control
U+001E  INFORMATION SEPARATOR TWO Control C0 Controls and Basic Latin (pdf)
RS
RECORD SEPARATOR Control
U+001F  INFORMATION SEPARATOR ONE Control C0 Controls and Basic Latin (pdf)
US
UNIT SEPARATOR Control
U+0020 SPACE
SP
Abbreviation C0 Controls and Basic Latin (pdf)
U+007F 
DEL
DELETE Control C0 Controls and Basic Latin (pdf)
U+0080
PAD
PADDING CHARACTER Figment C1 Controls and Latin-1 Supplement (pdf) Aliases are not widely published by Unicode; chart shows non-unique XXX
U+0081 
HOP
HIGH OCTET PRESET Figment C1 Controls and Latin-1 Supplement (pdf) Aliases are not widely published by Unicode; chart shows non-unique XXX
U+0082
BPH
BREAK PERMITTED HERE Control C1 Controls and Latin-1 Supplement (pdf)
U+0083
NBH
NO BREAK HERE Control C1 Controls and Latin-1 Supplement (pdf)
U+0084
IND
INDEX Control C1 Controls and Latin-1 Supplement (pdf)
U+0085
NEL
NEXT LINE Control C1 Controls and Latin-1 Supplement (pdf)
U+0086
SSA
START OF SELECTED AREA Control C1 Controls and Latin-1 Supplement (pdf)
U+0087
ESA
END OF SELECTED AREA Control C1 Controls and Latin-1 Supplement (pdf)
U+0088 CHARACTER TABULATION SET Control C1 Controls and Latin-1 Supplement (pdf)
HTS
HORIZONTAL TABULATION SET Control
U+0089 CHARACTER TABULATION WITH JUSTIFICATION Control C1 Controls and Latin-1 Supplement (pdf)
HTJ
HORIZONTAL TABULATION WITH JUSTIFICATION Control
U+008A LINE TABULATION SET Control C1 Controls and Latin-1 Supplement (pdf)
VTS
VERTICAL TABULATION SET Control
U+008B PARTIAL LINE FORWARD Control C1 Controls and Latin-1 Supplement (pdf)
PLD
PARTIAL LINE DOWN Control
U+008C PARTIAL LINE BACKWARD Control C1 Controls and Latin-1 Supplement (pdf)
PLU
PARTIAL LINE UP Control
U+008D  REVERSE LINE FEED Control C1 Controls and Latin-1 Supplement (pdf)
RI
REVERSE INDEX Control
U+008E SINGLE SHIFT TWO Control C1 Controls and Latin-1 Supplement (pdf)
SS2
SINGLE-SHIFT-2 Control
U+008F  SINGLE SHIFT THREE Control C1 Controls and Latin-1 Supplement (pdf)
SS3
SINGLE-SHIFT-3 Control
U+0090 
DCS
DEVICE CONTROL STRING Control C1 Controls and Latin-1 Supplement (pdf)
U+0091 PRIVATE USE ONE Control C1 Controls and Latin-1 Supplement (pdf)
PU1
PRIVATE USE-1 Control
U+0092 PRIVATE USE TWO Control C1 Controls and Latin-1 Supplement (pdf)
PU2
PRIVATE USE-2 Control
U+0093
STS
SET TRANSMIT STATE Control C1 Controls and Latin-1 Supplement (pdf)
U+0094
CCH
CANCEL CHARACTER Control C1 Controls and Latin-1 Supplement (pdf)
U+0095
MW
MESSAGE WAITING Control C1 Controls and Latin-1 Supplement (pdf)
U+0096 START OF GUARDED AREA Control C1 Controls and Latin-1 Supplement (pdf)
SPA
START OF PROTECTED AREA Control
U+0097 END OF GUARDED AREA Control C1 Controls and Latin-1 Supplement (pdf)
EPA
END OF PROTECTED AREA Control
U+0098
SOS
START OF STRING Control C1 Controls and Latin-1 Supplement (pdf)
U+0099
SGC
SINGLE GRAPHIC CHARACTER INTRODUCER Figment C1 Controls and Latin-1 Supplement (pdf) Aliases are not widely published by Unicode; chart shows non-unique XXX
U+009A
SCI
SINGLE CHARACTER INTRODUCER Control C1 Controls and Latin-1 Supplement (pdf)
U+009B
CSI
CONTROL SEQUENCE INTRODUCER Control C1 Controls and Latin-1 Supplement (pdf)
U+009C
ST
STRING TERMINATOR Control C1 Controls and Latin-1 Supplement (pdf)
U+009D 
OSC
OPERATING SYSTEM COMMAND Control C1 Controls and Latin-1 Supplement (pdf)
U+009E
PM
PRIVACY MESSAGE Control C1 Controls and Latin-1 Supplement (pdf)
U+009F
APC
APPLICATION PROGRAM COMMAND Control C1 Controls and Latin-1 Supplement (pdf)
U+00A0  
 
NO-BREAK SPACE
NBSP
Abbreviation C1 Controls and Latin-1 Supplement (pdf)
U+00AD ­
­
SOFT HYPHEN
SHY
Abbreviation C1 Controls and Latin-1 Supplement (pdf)
U+01A2 Ƣ LATIN CAPITAL LETTER OI LATIN CAPITAL LETTER GHA ※ Correction Latin Extended-B (pdf)
U+01A3 ƣ LATIN SMALL LETTER OI LATIN SMALL LETTER GHA ※ Correction Latin Extended-B (pdf)
U+034F ͏ COMBINING GRAPHEME JOINER
CGJ
Abbreviation Combining Diacritical Marks (pdf) The name of this character is misleading; it does not actually join graphemes
U+0616 ؖ ARABIC SMALL HIGH LIGATURE ALEF WITH LAM WITH YEH ARABIC SMALL HIGH LIGATURE ALEF WITH YEH BARREE ※ Correction Arabic  added in version 15.0
U+061C ؜ ARABIC LETTER MARK
ALM
Abbreviation Arabic (pdf) See RLM
U+0709 ܉ SYRIAC SUBLINEAR COLON SKEWED RIGHT SYRIAC SUBLINEAR COLON SKEWED LEFT ※ Correction Syriac (pdf)
U+0CDE KANNADA LETTER FA KANNADA LETTER LLLA ※ Correction Kannada (pdf)
U+0E9D LAO LETTER FO TAM LAO LETTER FO FON ※ Correction Lao (pdf)
U+0E9F LAO LETTER FO SUNG LAO LETTER FO FAY ※ Correction Lao (pdf)
U+0EA3 LAO LETTER LO LING LAO LETTER RO ※ Correction Lao (pdf)
U+0EA5 LAO LETTER LO LOOT LAO LETTER LO ※ Correction Lao (pdf)
U+0FD0 TIBETAN MARK BSKA- SHOG GI MGO RGYAN TIBETAN MARK BKA- SHOG GI MGO RGYAN ※ Correction Tibetan (pdf)
U+11EC HANGUL JONGSEONG IEUNG-KIYEOK HANGUL JONGSEONG YESIEUNG-KIYEOK ※ Correction Hangul Jamo (pdf)
U+11ED HANGUL JONGSEONG IEUNG-SSANGKIYEOK HANGUL JONGSEONG YESIEUNG-SSANGKIYEOK ※ Correction Hangul Jamo (pdf)
U+11EE HANGUL JONGSEONG SSANGIEUNG HANGUL JONGSEONG SSANGYESIEUNG ※ Correction Hangul Jamo (pdf)
U+11EF HANGUL JONGSEONG IEUNG-KHIEUKH HANGUL JONGSEONG YESIEUNG-KHIEUKH ※ Correction Hangul Jamo (pdf)
U+180B MONGOLIAN FREE VARIATION SELECTOR ONE
FVS1
Abbreviation Mongolian (pdf)
U+180C MONGOLIAN FREE VARIATION SELECTOR TWO
FVS2
Abbreviation Mongolian (pdf)
U+180D MONGOLIAN FREE VARIATION SELECTOR THREE
FVS3
Abbreviation Mongolian (pdf)
U+180E MONGOLIAN VOWEL SEPARATOR
MVS
Abbreviation Mongolian (pdf)
U+180F MONGOLIAN FREE VARIATION SELECTOR FOUR
FVS4
Abbreviation Mongolian (pdf)
U+1BBD SUNDANESE LETTER BHA SUNDANESE LETTER ARCHAIC I ※ Correction Sundanese (pdf) added in version 15.0
U+200B
ZERO WIDTH SPACE
ZWSP
Abbreviation General Punctuation (pdf)
U+200C
ZERO WIDTH NON-JOINER
ZWNJ
Abbreviation General Punctuation (pdf)
U+200D
ZERO WIDTH JOINER
ZWJ
Abbreviation General Punctuation (pdf)
U+200E
LEFT-TO-RIGHT MARK
LRM
Abbreviation General Punctuation (pdf)
U+200F
RIGHT-TO-LEFT MARK
RLM
Abbreviation General Punctuation (pdf)
U+202A LEFT-TO-RIGHT EMBEDDING
LRE
Abbreviation General Punctuation (pdf)
U+202B RIGHT-TO-LEFT EMBEDDING
RLE
Abbreviation General Punctuation (pdf)
U+202C POP DIRECTIONAL FORMATTING
PDF
Abbreviation General Punctuation (pdf)
U+202D LEFT-TO-RIGHT OVERRIDE
LRO
Abbreviation General Punctuation (pdf)
U+202E RIGHT-TO-LEFT OVERRIDE
RLO
Abbreviation General Punctuation (pdf)
U+202F NARROW NO-BREAK SPACE
NNBSP
Abbreviation General Punctuation (pdf)
U+205F
MEDIUM MATHEMATICAL SPACE
MMSP
Abbreviation General Punctuation (pdf)
U+2060
WORD JOINER
WJ
Abbreviation General Punctuation (pdf)
U+2066 LEFT-TO-RIGHT ISOLATE
LRI
Abbreviation General Punctuation (pdf)
U+2067 RIGHT-TO-LEFT ISOLATE
RLI
Abbreviation General Punctuation (pdf)
U+2068 FIRST STRONG ISOLATE
FSI
Abbreviation General Punctuation (pdf)
U+2069 POP DIRECTIONAL ISOLATE
PDI
Abbreviation General Punctuation (pdf)
U+2118
SCRIPT CAPITAL P WEIERSTRASS ELLIPTIC FUNCTION ※ Correction Letterlike Symbols (pdf)
U+2448 OCR DASH MICR ON US SYMBOL ※ Correction Optical Character Recognition (pdf)
U+2449 OCR CUSTOMER ACCOUNT NUMBER MICR DASH SYMBOL ※ Correction Optical Character Recognition (pdf)
U+2B7A LEFTWARDS TRIANGLE-HEADED ARROW WITH DOUBLE HORIZONTAL STROKE LEFTWARDS TRIANGLE-HEADED ARROW WITH DOUBLE VERTICAL STROKE ※ Correction Miscellaneous Symbols and Arrows (pdf)
U+2B7C RIGHTWARDS TRIANGLE-HEADED ARROW WITH DOUBLE HORIZONTAL STROKE RIGHTWARDS TRIANGLE-HEADED ARROW WITH DOUBLE VERTICAL STROKE ※ Correction Miscellaneous Symbols and Arrows (pdf)
U+A015 YI SYLLABLE WU YI SYLLABLE ITERATION MARK ※ Correction Yi Syllables (pdf)
U+AA6E MYANMAR LETTER KHAMTI HHA MYANMAR LETTER KHAMTI LLA ※ Correction Myanmar Extended-A (pdf)
U+FE00 ... U+FE0F (16 code points)
VARIATION SELECTOR-1 ... VARIATION SELECTOR-16
VS1 ... VS16
Abbreviation Variation Selectors (pdf)
U+FE18 PRESENTATION FORM FOR VERTICAL RIGHT WHITE LENTICULAR BRAKCET PRESENTATION FORM FOR VERTICAL RIGHT WHITE LENTICULAR BRACKET ※ Correction Vertical Forms (pdf)
U+FEFF  ZERO WIDTH NO-BREAK SPACE
BOM
BYTE ORDER MARK Alternate Arabic Presentation Forms-B (pdf)
ZWNBSP
Abbreviation
U+122D4 𒋔 CUNEIFORM SIGN SHIR TENU CUNEIFORM SIGN NU11 TENU ※ Correction Cuneiform (pdf)
U+122D5 𒋕 CUNEIFORM SIGN SHIR OVER SHIR BUR OVER BUR CUNEIFORM SIGN NU11 OVER NU11 BUR OVER BUR ※ Correction Cuneiform (pdf)
U+16E56 𖹖 MEDEFAIDRIN CAPITAL LETTER HP MEDEFAIDRIN CAPITAL LETTER H ※ Correction Medefaidrin (pdf)
U+16E57 𖹗 MEDEFAIDRIN CAPITAL LETTER NY MEDEFAIDRIN CAPITAL LETTER NG ※ Correction Medefaidrin (pdf)
U+16E76 𖹶 MEDEFAIDRIN SMALL LETTER HP MEDEFAIDRIN SMALL LETTER H ※ Correction Medefaidrin (pdf)
U+16E77 𖹷 MEDEFAIDRIN SMALL LETTER NY MEDEFAIDRIN SMALL LETTER NG ※ Correction Medefaidrin (pdf)
U+1B001 𛀁 HIRAGANA LETTER ARCHAIC YE HENTAIGANA LETTER E-1 ※ Correction Kana Supplement (pdf)
U+1D0C5 𝃅 BYZANTINE MUSICAL SYMBOL FHTORA SKLIRON CHROMA VASIS BYZANTINE MUSICAL SYMBOL FTHORA SKLIRON CHROMA VASIS ※ Correction Byzantine Musical Symbols (pdf)
U+E0100 ... U+E01EF (240 code points)
VARIATION SELECTOR-17 ... VARIATION SELECTOR-256
VS17 ... VS256
Abbreviation Variation Selectors Supplement (pdf)
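The two variation selector ranges in the list (16 code points from U+FE00 and 240 code points from U+E0100) together account for the 256 abbreviations VS1 to VS256. As a small sketch of that arithmetic, the hypothetical helper below derives the abbreviation from a code point; it is an illustration of the numbering, not an official algorithm.

```python
def variation_selector_abbreviation(code_point: int):
    """Return 'VSn' for a variation selector code point, or None otherwise."""
    # U+FE00..U+FE0F carry VARIATION SELECTOR-1 .. VARIATION SELECTOR-16.
    if 0xFE00 <= code_point <= 0xFE0F:
        return f"VS{code_point - 0xFE00 + 1}"
    # U+E0100..U+E01EF carry VARIATION SELECTOR-17 .. VARIATION SELECTOR-256.
    if 0xE0100 <= code_point <= 0xE01EF:
        return f"VS{code_point - 0xE0100 + 17}"
    return None


if __name__ == "__main__":
    for cp in (0xFE00, 0xFE0F, 0xE0100, 0xE01EF, 0x0041):
        print(f"U+{cp:04X} -> {variation_selector_abbreviation(cp)}")
```

The Mongolian free variation selectors (FVS1 to FVS4) have their own abbreviations and are deliberately not covered here.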

Informal alternative names

The Unicode Standard also uses and publishes alternative names that are not formal and are not listed as normative aliases. These labels are not necessarily unique and may use characters outside the name repertoire. They appear in the Unicode code charts, for example SAM for U+070F SYRIAC ABBREVIATION MARK.
