Add support for 4-character script abbreviations

This commit is contained in:
Philip Hazel
2021-12-28 15:10:12 +00:00
parent af2637ee5e
commit 7713f33e46
15 changed files with 1552 additions and 970 deletions

View File

@@ -207,171 +207,172 @@ at release 5.18.
</P>
<br><a name="SEC7" href="#TOC1">SCRIPT MATCHING WITH \p AND \P</a><br>
<P>
The following script names are recognized in \p{sc:...} or \p{scx:...} items,
or on their own with \p (and also \P of course):
The following script names and their 4-letter abbreviations are recognized in
\p{sc:...} or \p{scx:...} items, or on their own with \p (and also \P of
course):
</P>
<P>
Adlam,
Ahom,
Anatolian_Hieroglyphs,
Arabic,
Armenian,
Avestan,
Balinese,
Bamum,
Bassa_Vah,
Batak,
Bengali,
Bhaiksuki,
Bopomofo,
Brahmi,
Braille,
Buginese,
Buhid,
Canadian_Aboriginal,
Carian,
Caucasian_Albanian,
Chakma,
Cham,
Cherokee,
Chorasmian,
Common,
Coptic,
Cuneiform,
Cypriot,
Cypro_Minoan,
Cyrillic,
Deseret,
Devanagari,
Dives_Akuru,
Dogra,
Duployan,
Egyptian_Hieroglyphs,
Elbasan,
Elymaic,
Ethiopic,
Georgian,
Glagolitic,
Gothic,
Grantha,
Greek,
Gujarati,
Gunjala_Gondi,
Gurmukhi,
Han,
Hangul,
Hanifi_Rohingya,
Hanunoo,
Hatran,
Hebrew,
Hiragana,
Imperial_Aramaic,
Inherited,
Inscriptional_Pahlavi,
Inscriptional_Parthian,
Javanese,
Kaithi,
Kannada,
Katakana,
Kayah_Li,
Kharoshthi,
Khitan_Small_Script,
Khmer,
Khojki,
Khudawadi,
Lao,
Latin,
Lepcha,
Limbu,
Linear_A,
Linear_B,
Lisu,
Lycian,
Lydian,
Mahajani,
Makasar,
Malayalam,
Mandaic,
Manichaean,
Marchen,
Masaram_Gondi,
Medefaidrin,
Meetei_Mayek,
Mende_Kikakui,
Meroitic_Cursive,
Meroitic_Hieroglyphs,
Miao,
Modi,
Mongolian,
Mro,
Multani,
Myanmar,
Nabataean,
Nandinagari,
New_Tai_Lue,
Newa,
Nko,
Nushu,
Nyakeng_Puachue_Hmong,
Ogham,
Ol_Chiki,
Old_Hungarian,
Old_Italic,
Old_North_Arabian,
Old_Permic,
Old_Persian,
Old_Sogdian,
Old_South_Arabian,
Old_Turkic,
Old_Uyghur,
Oriya,
Osage,
Osmanya,
Pahawh_Hmong,
Palmyrene,
Pau_Cin_Hau,
Phags_Pa,
Phoenician,
Psalter_Pahlavi,
Rejang,
Runic,
Samaritan,
Saurashtra,
Sharada,
Shavian,
Siddham,
SignWriting,
Sinhala,
Sogdian,
Sora_Sompeng,
Soyombo,
Sundanese,
Syloti_Nagri,
Syriac,
Tagalog,
Tagbanwa,
Tai_Le,
Tai_Tham,
Tai_Viet,
Takri,
Tamil,
Tangsa,
Tangut,
Telugu,
Thaana,
Thai,
Tibetan,
Tifinagh,
Tirhuta,
Toto,
Ugaritic,
Vai,
Vithkuqi,
Wancho,
Warang_Citi,
Yezidi,
Yi,
Zanabazar_Square.
Adlam (Adlm),
Ahom (Ahom),
Anatolian_Hieroglyphs (Hluw),
Arabic (Arab),
Armenian (Armn),
Avestan (Avst),
Balinese (Bali),
Bamum (Bamu),
Bassa_Vah (Bass),
Batak (Batk),
Bengali (Beng),
Bhaiksuki (Bhks),
Bopomofo (Bopo),
Brahmi (Brah),
Braille (Brai),
Buginese (Bugi),
Buhid (Buhd),
Canadian_Aboriginal (Cans),
Carian (Cari),
Caucasian_Albanian (Aghb),
Chakma (Cakm),
Cham (Cham),
Cherokee (Cher),
Chorasmian (Chrs),
Common (Zyyy),
Coptic (Copt),
Cuneiform (Xsux),
Cypriot (Cprt),
Cypro_Minoan (Cpmn),
Cyrillic (Cyrl),
Deseret (Dsrt),
Devanagari (Deva),
Dives_Akuru (Diak),
Dogra (Dogr),
Duployan (Dupl),
Egyptian_Hieroglyphs (Egyp),
Elbasan (Elba),
Elymaic (Elym),
Ethiopic (Ethi),
Georgian (Geor),
Glagolitic (Glag),
Gothic (Goth),
Grantha (Gran),
Greek (Grek),
Gujarati (Gujr),
Gunjala_Gondi (Gong),
Gurmukhi (Guru),
Han (Hani),
Hangul (Hang),
Hanifi_Rohingya (Rohg),
Hanunoo (Hano),
Hatran (Hatr),
Hebrew (Hebr),
Hiragana (Hira),
Imperial_Aramaic (Armi),
Inherited (Zinh),
Inscriptional_Pahlavi (Phli),
Inscriptional_Parthian (Prti),
Javanese (Java),
Kaithi (Kthi),
Kannada (Knda),
Katakana (Kana),
Kayah_Li (Kali),
Kharoshthi (Khar),
Khitan_Small_Script (Kits),
Khmer (Khmr),
Khojki (Khoj),
Khudawadi (Sind),
Lao (Laoo),
Latin (Latn),
Lepcha (Lepc),
Limbu (Limb),
Linear_A (Lina),
Linear_B (Linb),
Lisu (Lisu),
Lycian (Lyci),
Lydian (Lydi),
Mahajani (Majh),
Makasar (Maka),
Malayalam (Mlym),
Mandaic (Mand),
Manichaean (Mani),
Marchen (Marc),
Masaram_Gondi (Gonm),
Medefaidrin (Medf),
Meetei_Mayek (Mtei),
Mende_Kikakui (Mend),
Meroitic_Cursive (Merc),
Meroitic_Hieroglyphs (Mero),
Miao (Miao),
Modi (Modi),
Mongolian (Mong),
Mro (Mroo),
Multani (Mult),
Myanmar (Mymr),
Nabataean (Nbar),
Nandinagari (Nand),
New_Tai_Lue (Talu),
Newa (Newa),
Nko (Nkoo),
Nushu (Nshu),
Nyiakeng_Puachue_Hmong (Hmnp),
Ogham (Ogam),
Ol_Chiki (Olck),
Old_Hungarian (Hung),
Old_Italic (Olck),
Old_North_Arabian (Narb),
Old_Permic (Perm),
Old_Persian (Orkh),
Old_Sogdian (Sogo),
Old_South_Arabian (Sarb),
Old_Turkic (Orkh),
Old_Uyghur (Ougr),
Oriya (Orya),
Osage (Osge),
Osmanya (Osma),
Pahawh_Hmong (Hmng),
Palmyrene (Palm),
Pau_Cin_Hau (Pauc),
Phags_Pa (Phag),
Phoenician (Phnx),
Psalter_Pahlavi (Phli),
Rejang (Rjng),
Runic (Runr),
Samaritan (Samr),
Saurashtra (Saur),
Sharada (Shrd),
Shavian (Shaw),
Siddham (Sidd),
SignWriting (Sgnw),
Sinhala (Sinh),
Sogdian (Sogd),
Sora_Sompeng (Sora),
Soyombo (Soyo),
Sundanese (Sund),
Syloti_Nagri (Sylo),
Syriac (Syrc),
Tagalog (Tglg),
Tagbanwa (Tagb),
Tai_Le (Tale),
Tai_Tham (Lana),
Tai_Viet (Tavt),
Takri (Takr),
Tamil (Taml),
Tangsa (Tngs),
Tangut (Tang),
Telugu (Telu),
Thaana (Thaa),
Thai (Thai),
Tibetan (Tibt),
Tifinagh (Tfng),
Tirhuta (Tirh),
Toto (Toto),
Ugaritic (Ugar),
Vai (Vaii),
Vithkuqi (Vith),
Wancho (Wcho),
Warang_Citi (Wara),
Yezidi (Yezi),
Yi (Yiii),
Zanabazar_Square (Zanb).
</P>
<br><a name="SEC8" href="#TOC1">BIDI_PROPERTIES FOR \p AND \P</a><br>
<P>
@@ -743,7 +744,7 @@ Cambridge, England.
</P>
<br><a name="SEC30" href="#TOC1">REVISION</a><br>
<P>
Last updated: 22 December 2021
Last updated: 28 December 2021
<br>
Copyright &copy; 1997-2021 University of Cambridge.
<br>