total update

2026-07-24 10:43:07 -04:00 · 2019-06-26 17:54:16 +03:00
parent 5ff4c39b7c
commit 335df6d2d1
11731 changed files with 2582581 additions and 181485 deletions
@@ -0,0 +1 @@
+Aegean numbers was the numeral system used by the Minoan and Mycenaean civilizations. They are attested in several Aegean scripts ([BLOCK:linear-a Linear A], [BLOCK:linear-b-syllabary Linear B]). They may have survived in the Cypro-Minoan script where a single sign with "100" value is attested so far on a large clay tablet from Enkomi.
@@ -0,0 +1 @@
+Alchemical symbols, originally devised as part of alchemy, were used to denote some elements and some compounds until the 18th century. Note that while notation like this was mostly standardized, style and symbol varied between alchemists.
@@ -0,0 +1 @@
+Alphabetic Presentation Forms is a Unicode block containing standard ligatures for the [BLOCK:basic-latin Latin], [BLOCK:armenian Armenian], and [BLOCK:hebrew Hebrew] scripts.
@@ -0,0 +1,3 @@
+The musical system of ancient Greece evolved over a period of more than 500 years from simple scales of tetrachords, or divisions of the perfect fourth, to The Perfect Immutable System, encompassing a span of fifteen pitch keys.
+ 
+Any discussion of ancient Greek music, theoretical, philosophical or aesthetic, is fraught with two problems: there are few examples of written music, and there are many, sometimes fragmentary theoretical and philosophical accounts. This article provides an overview which includes examples of different kinds of classification while also trying to show the broader form evolving from the simple tetrachord to system as a whole.
@@ -0,0 +1,3 @@
+Ancient Greek Numbers is a Unicode block containing acrophonic [https://unicode-table.com/en/numerals/ numerals] used in ancient Greece.
+
+Attic numerals were used by the ancient Greeks, possibly from the 7th century BC. They were also known as Herodianic numerals because they were first described in a 2nd-century [https://unicode-table.com/en/1F4DC/ manuscript] by Herodian. They are also known as acrophonic numerals because the symbols derive from the first letters of the words that the symbols represent: five, ten, [https://unicode-table.com/en/1F4AF/ hundred], thousand and ten thousand. See Greek numerals and acrophony, click the numbers to find out their English names.
@@ -0,0 +1 @@
+Ancient Symbols is a Unicode block containing Roman characters for [https://unicode-table.com/en/sets/currency-symbols/ currency], [https://unicode-table.com/en/2696/ weights], and [https://unicode-table.com/en/1F4CF/ measures].
@@ -0,0 +1 @@
+[BLOCK:arabic Arabic] Extended-A is a Unicode block encoding Qur'anic annotations and letter variants used for various non-Arabic languages.
@@ -0,0 +1 @@
+[BLOCK:arabic Arabic] Mathematical Alphabetic Symbols is a Unicode block encoding characters used in Arabic mathematical expressions.
@@ -0,0 +1 @@
+[BLOCK:arabic Arabic] Presentation Forms-A is a Unicode block encoding contextual forms and ligatures of letter variants needed for Persian, Urdu, Sindhi and Central Asian languages. The presentation forms are present only for compatibility with older standards, and are not currently needed for coding text.
@@ -0,0 +1 @@
+[BLOCK:arabic Arabic] Presentation Forms-B is a Unicode block encoding spacing forms of Arabic diacritics, and contextual letter forms. The presentation forms are present only for compatibility with older standards, and are not currently needed for coding text.
@@ -0,0 +1 @@
+Arabic Supplement is a Unicode block that encodes [BLOCK:arabic Arabic letter] variants mostly used for writing African (non-Arabic) languages.
@@ -0,0 +1,5 @@
+Arabic is a Unicode block, containing the standard letters and the most common diacritics of the Arabic script, and the Arabic-Indic digits.
+
+The Arabic script is a writing system used for writing several languages of Asia and Africa, such as Arabic, the Sorani and Luri dialects of Kurdish, Persian, Pashto, and Urdu. Even until the 16th century, it was used to write some texts in Spanish. After the [BLOCK:basic-latin Latin] script, [BLOCK:cjk-unified-ideographs Chinese characters], and [BLOCK:devanagari Devanagari], it is the fourth-most widely used writing system in the world.
+The Arabic script is written from right to left in a cursive style. In most cases the letters transcribe consonants, or consonants and a few vowels, so most Arabic alphabets are abjads.
+The script was first used to write texts in Arabic, most notably the Qurʼān, the holy book of Islam. With the spread of Islam, it came to be used to write languages of many language families, leading to the addition of new letters and other symbols, with some versions, such as Kurdish, Uyghur, and old Bosnian being abugidas or true alphabets. It is also the basis for a rich tradition of Arabic calligraphy.
@@ -0,0 +1,5 @@
+Armenian is a Unicode block containing characters for writing the Armenian language, both the traditional Western Armenian and reformed Eastern Armenian orthographies. Five Armenian ligatures are encoded in the [BLOCK:alphabetic-presentation-forms Alphabetic Presentation Forms block].
+
+The Armenian language (classical: հայերէն; reformed: հայերեն [hɑjɛˈɾɛn] hayeren) is an Indo-European language spoken by the Armenians. It is the official language of the Republic of Armenia and the self-proclaimed Nagorno-Karabakh Republic. It has historically been spoken throughout the Armenian Highlands and today is widely spoken in the Armenian diaspora. Armenian has its own unique script, the Armenian alphabet, invented in 405 AD by Mesrop Mashtots.
+Linguists classify Armenian as an independent branch of the Indo-European language family. It is of interest to linguists for its distinctive phonological developments within the Indo-European languages. Armenian shares a number of major innovations with [BLOCK:greek-coptic Greek], and some linguists group these two languages with Phrygian and the Indo-Iranian family into a higher-level subgroup of Indo-European, which is defined by such shared innovations as the augment. More recently, others have proposed a Balkan grouping including Greek, Phrygian, [BLOCK:armenian Armenian], and [BLOCK:caucasian-albanian Albanian].
+Armenia was a monolingual country no later than by the second century BC. Its language has long literary history, with a fifth-century Bible translation as its oldest surviving text. There are two standardized modern literary forms, Eastern Armenian and Western Armenian, with which most contemporary dialects are mutually intelligible.
@@ -0,0 +1 @@
+Arrows is a Unicode block containing line, curve, and semicircle symbols terminating in barbs or arrows.
@@ -0,0 +1,3 @@
+The Avestan alphabet is a writing system developed during Iran's Sassanid era (AD 226–651) to render the Avestan language.
+ 
+As a side effect of its development, the script was also used for Pazend, a method of writing Middle Persian that was used primarily for the Zend commentaries on the texts of the Avesta. In the texts of Zoroastrian tradition, the alphabet is referred to as din dabireh or din dabiri, Middle Persian for "the religion's script".
@@ -0,0 +1,4 @@
+Balinese is a Unicode block containing characters for the basa Bali language.
+
+The Balinese script, natively known as Aksara Bali and Hanacaraka, is an abugida used in the island of Bali, Indonesia, commonly for writing the Austronesian Balinese language, Old Javanese, and the liturgical language Sanskrit. With some modifications, the script is also used to write the Sasak language, used in the neighboring island of Lombok. The script is a descendant of the [BLOCK:brahmi Brahmi script], and so has many similarities with the modern scripts of South and Southeast Asia. The Balinese script, along with the [BLOCK:javanese-alphabet Javanese script], is considered the most elaborate and ornate among Brahmic scripts of Southeast Asia.
+Though everyday use of the script has largely been supplanted by the [BLOCK:basic-latin Latin alphabet], the Balinese script has significant prevalence in many of the island's traditional ceremonies and is strongly associated with the Hindu religion. The script is mainly used today for copying lontar or palm leaf manuscripts containing religious texts.
@@ -0,0 +1 @@
+Bamum Supplement is a Unicode block containing the characters of the historic stage A-F of the Bamum script, used for writing the Bamum language of western Cameroon. The modern stage G characters, which include many characters used for stage A-F orthographies, are included in the [BLOCK:bamum Bamum] block.
@@ -0,0 +1,3 @@
+Bamum is a Unicode block containing the characters of stage-G Bamum script, used for modern writing of the Bamum language of western Cameroon. Characters for writing earlier orthographies (stages A–F) are contained in a Bamum Supplement block.
+
+The Bamum scripts are an evolutionary series of six scripts created for the Bamum language by King Njoya of Cameroon at the turn of the 20th century. They are notable for evolving from a pictographic system to a partially alphabetic syllabic script in the space of 14 years, from 1896 to 1910. Bamum type was cast in 1918, but the script fell into disuse around 1931.
@@ -0,0 +1,9 @@
+The Basic Latin (or C0 Controls and Basic Latin) Unicode block is the first block of the Unicode standard, and the only block which is encoded in one byte in UTF-8. The block contains all the letters and control codes of the ASCII encoding.
+
+The Basic Latin block was included in its present from version 1.0.0 of the Unicode Standard, without addition or alteration of the character repertoire.
+
+The classical Latin alphabet, also known as the Roman alphabet, is a writing system that evolved from the visually similar Cumaean Greek version of the [BLOCK:greek-coptic Greek alphabet]. The Greek alphabet, including the Cumaean version, descended from the [BLOCK:phoenician Phoenician] abjad. The Etruscans who ruled early Rome adopted and modified the Cumaean Greek alphabet. The [BLOCK:old-italic Etruscan alphabet] was in turn adopted and further modified by the ancient Romans to write the Latin language.
+ 
+During the Middle Ages scribes adapted the Latin alphabet for writing Romance languages, direct descendants of Latin, as well as Celtic, Germanic, Baltic, and some Slavic languages. With the age of colonialism and Christian evangelism, the Latin script spread beyond Europe, coming into use for writing indigenous American, Australian, Austronesian, Austroasiatic, and African languages. More recently, linguists have also tended to prefer the Latin script or the [BLOCK:ipa-extensions International Phonetic Alphabet] (itself largely based on Latin script) when transcribing or creating written standards for non-European languages, such as the African reference alphabet.
+ 
+The term Latin alphabet may refer to either the alphabet used to write Latin (as described in this article), or other alphabets based on the Latin script, which is the basic set of letters common to the various alphabets descended from the classical Latin one, such as the English alphabet. These Latin alphabets may discard letters, like the Rotokas alphabet, or add new letters, like the Danish and Norwegian alphabets. Letter shapes have evolved over the centuries, including the creation for Medieval Latin of lower-case forms which did not exist in the Classical period.
@@ -0,0 +1,2 @@
+The Bassa script, known as Bassa vah or simply vah ('throwing a sign' in Bassa) was an alphabet designed by, or with the help of, Liberian missionaries in the 1920s. It is not clear what connection it may have had with neighboring scripts, or how much it was actually used, but type was cast for it, and an association for its promotion was formed in Liberia in 1959. It is not used and has been classified as a failed script.
+Vah is a true alphabet, with 23 consonant letters, 7 vowel letters, and 5 tone diacritics, which are placed inside the vowels.
@@ -0,0 +1,3 @@
+Batak is a Unicode block containing characters for writing the Batak dialects of Karo, Mandailing, Pakpak, Simalungun, and Toba.
+
+The Batak script, natively known as surat Batak, surat na sapulu sia (the nineteen letters), or si-sia-sia, is an abugida used to write the Austronesian Batak languages spoken by several million people on the Indonesian island of Sumatra. The script may derived from the Kawi and Pallava script, ultimately derived from the [BLOCK:brahmi Brahmi] script of India, or from the hypothetical Proto-Sumatran script influenced by Pallava.
@@ -0,0 +1,4 @@
+Bengali is a Unicode block containing characters for the Bengali, Assamese, Bishnupriya Manipuri, Daphla, Garo, Hallam, Khasi, Mizo, Munda, Naga, Rian, and Santali languages. In its original incarnation, the code points U+0981..U+09CD were a direct copy of the Bengali characters A1-ED from the 1988 ISCII standard, as well as several Assamese ISCII characters in the U+09F0 column. The [BLOCK:devanagari Devanagari], [BLOCK:gurmukhi Gurmukhi], [BLOCK:gujarati Gujarati], [BLOCK:oriya Oriya], [BLOCK:tamil], [BLOCK:telugu], [BLOCK:kannada], and [BLOCK:malayalam] blocks were similarly all based on ISCII encodings.
+
+The Bengali alphabet or Bangla alphabet is the writing system for the Bengali language and is the 6th most widely used writing system in the world. The script is shared by Assamese with minor variations, and is the basis for the other writing systems like Meithei and Bishnupriya Manipuri. Historically, the script has also been used to write the Sanskrit language in the region of Bengal.
+From a classificatory point of view, the Bengali script is an abugida, i.e. its vowel graphemes are mainly realized not as independent letters, but as diacritics attached to its consonant letters. It is written from left to right and lacks distinct letter cases. It is recognizable, as other [BLOCK:brahmi Brahmic] scripts, by a distinctive horizontal line running along the tops of the letters that links them together which is known as matra. The Bengali script is however less blocky and presents a more sinuous shape.
@@ -0,0 +1 @@
+Block Elements is a Unicode block containing square block symbols of various fill and shading.
@@ -0,0 +1 @@
+Bopomofo Extended is a Unicode block containing additional Bopomofo characters for writing phonetic Minnan, Hakka, Hmu, and Ge languages of China. The basic set of Bopomofo characters can be found in the [BLOCK:bopomofo] block.
@@ -0,0 +1,3 @@
+Bopomofo is a Unicode block containing phonetic characters for Chinese. It is based on the Chinese standard GB 2312. Additional Bopomofo characters can be found in the [BLOCK:bopomofo-extended Bopomofo Extended] block.
+
+Zhuyin fuhao, Zhuyin or Bopomofo is a system of phonetic notation for the transcription of spoken Chinese, particularly the Mandarin dialect. The first two are traditional terms, whereas Bopomofo is the colloquial term, also used by the ISO and Unicode. Consisting of 37 characters and four tone marks, it transcribes all possible sounds in Mandarin. Zhuyin was introduced in China by the Republican Government in the 1910s and used alongside the Wade-Giles system, which used a modified Latin alphabet. The Wade system was replaced by Hanyu Pinyin in 1958 by the Government of the People's Republic of China, and at the International Organization for Standardization (ISO) in 1982. Although Taiwan officially abandoned Wade-Giles in 2009, Bopomofo remains widely used as an educational tool and electronic input method in Taiwan.
@@ -0,0 +1,3 @@
+Box Drawing is a Unicode block containing characters for compatibility with legacy graphics standards that contained characters for making bordered charts and tables, i.e. box-drawing characters.
+
+Box-drawing characters, also known as line-drawing characters, are a form of semigraphics widely used in text user interfaces to draw various geometric frames and boxes. In graphical user interfaces, these characters are much less useful as it is much simpler to draw lines and rectangles directly with graphical APIs. Box-drawing characters work only with monospaced fonts; however, they are still useful for plaintext comments on websites.
@@ -0,0 +1,3 @@
+Brāhmī is the modern name given to one of the oldest writing systems used in the Indian subcontinent and in Central Asia during the final centuries BCE and the early centuries CE. Like its contemporary, [BLOCK:kharoshthi Kharoṣṭhī], which was used in what is now Afghanistan and Pakistan, was an abugida.
+The best-known Brahmi inscriptions are the rock-cut edicts of Ashoka in north-central India, dated to 250–232 BCE. The script was deciphered in 1837 by James Prinsep, an archaeologist, philologist, and official of the East India Company. The origin of the script is still much debated, with current Western academic opinion generally agreeing (with some exceptions) that Brahmi was derived from or at least influenced by one or more contemporary Semitic scripts, but a current of opinion in India favors the idea that it is connected to the much older and as-yet undeciphered Indus script. Brahmi was at one time referred to in English as the "pin-man" script, that is "stick figure" script.
+The Gupta script of the 5th century is sometimes called "Late Brahmi". The Brahmi script diversified into numerous local variants, classified together as the Brahmic scripts. Dozens of modern scripts used across South Asia have descended from Brahmi, making it one of the world's most influential writing traditions. One survey found 198 scripts that ultimately derive from it.
@@ -0,0 +1,7 @@
+In Unicode, braille is represented in a block called Braille Patterns (U+2800..U+28FF). The block contains all 256 possible patterns of an 8-dot braille cell, thereby including the complete 6-dot cell range.
+
+Braille /ˈbreɪl/ is a tactile writing system used by the blind and the visually impaired. It is traditionally written with embossed paper. Braille-users can read computer screens and other electronic supports thanks to refreshable braille displays. They can write braille with the original slate and stylus or type it on a braille writer, such as a portable braille note-taker, or on a computer that prints with a braille embosser.
+Braille is named after its creator, Frenchman Louis Braille, who lost his eyesight due to a childhood accident. In 1824, at the age of 15, Braille developed his code for the French alphabet as an improvement on night writing. He published his system, which subsequently included musical notation, in 1829. The second revision, published in 1837, was the first binary form of writing developed in the modern era.
+Braille characters are small rectangular blocks called cells that contain tiny palpable bumps called raised dots. The number and arrangement of these dots distinguish one character from another. Since the various braille alphabets originated as transcription codes of printed writing systems, the mappings (sets of character designations) vary from language to language. Furthermore, in English Braille there are three levels of encoding: Grade 1, a letter-by-letter transcription used for basic literacy; Grade 2, an addition of abbreviations and contractions; and Grade 3, various non-standardized personal shorthands.
+Braille cells are not the only thing to appear in embossed text. There may be embossed illustrations and graphs, with the lines either solid or made of series of dots, arrows, bullets that are larger than braille dots, etc.
+In the face of screen-reader software, braille usage has declined. However, braille education remains important for developing reading skills among blind and visually impaired children, and braille literacy correlates with higher employment rates.
@@ -0,0 +1,3 @@
+Buginese is a Unicode block containing characters for writing the Buginese language of Sulawesi.
+
+The Lontara script is a [BLOCK:brahmi Brahmic] script traditionally used for the Bugis, Makassarese, and Mandar languages of Sulawesi in Indonesia. It is also known as the Buginese script, as Lontara documents written in this language are the most numerous. It was largely replaced by the [BLOCK:basic-latin Latin alphabet] during the period of Dutch colonization, though it is still used today to a limited extent. The term Lontara is derived from the Malay name for palmyra palm, lontar, whose leaves are traditionally used for manuscripts. In Buginese, this script is called urupu sulapa eppa which means "four-cornered letters", referencing the Bugis-Makasar belief of the four elements that shaped the universe: fire, water, air, and earth.
@@ -0,0 +1,3 @@
+Buhid is a Unicode block containing characters for writing the Buhid language of the Philippines.
+
+Buhid is a [BLOCK:brahmi Brahmic] script of the Philippines, closely related to Baybayin, and is used today by the Mangyans to write their language, Buhid.
@@ -0,0 +1 @@
+Byzantine Musical Symbols is a Unicode block containing characters for representing Byzantine-era musical notation.
@@ -0,0 +1 @@
+The Carian alphabets are a number of regional scripts used to write the Carian language of western Anatolia. They consisted of some 30 alphabetic letters, with several geographic variants in Caria and a homogeneous variant attested from the Nile delta, where Carian mercenaries fought for the Egyptian pharaohs. They were written left-to-right in Caria (apart from the Carian–Lydian city of Tralleis) and right-to-left in Egypt. Carian was deciphered primarily through Egyptian–Carian bilingual tomb inscriptions, starting with John Ray in 1981; previously only a few sound values and the alphabetic nature of the script had been demonstrated. The readings of Ray and subsequent scholars were largely confirmed with a Carian–Greek bilingual inscription discovered in Kaunos in 1996, which for the first time verified personal names, but the identification of many letters remains provisional and debated, and a few are wholly unknown.
@@ -0,0 +1 @@
+The Caucasian Albanian alphabet, or the alphabet for the Gargareans, was an alphabet used by the Caucasian Albanians, one of the ancient and indigenous Northeast Caucasian peoples whose territory comprised parts of present-day Azerbaijan and Daghestan. It was one of only two indigenous alphabets ever developed for speakers of indigenous Caucasian languages (i.e. Caucasian languages that are not a part of larger groupings like the Turkic and Indo-European languages families) to represent any of their languages, the other being the [BLOCK:georgian Georgian alphabet]. The Armenian language, the third indigenous language of Caucasus with its own alphabet, is an independent branch of the Indo-European language family.
@@ -0,0 +1 @@
+The Chakma alphabet (Ajhā pāṭh), also called Ojhapath, Ojhopath, Aaojhapath, is an abugida used for the Chakma language and which is being adapted for the Tanchangya language. The forms of the letters are quite similar to those of the Burmese script.
@@ -0,0 +1,2 @@
+Cham is a Unicode block containing characters for writing the Cham language, primarily used for the Eastern dialect in Cambodia.
+The Cham alphabet is an abugida used to write Cham, an Austronesian language spoken by some 230,000 Cham people in Vietnam and Cambodia. It is written horizontally left to right, as is English.
@@ -0,0 +1,3 @@
+Cherokee is a Unicode block containing the syllabic characters for writing the Cherokee language.
+
+The Cherokee syllabary is a syllabary invented by Sequoyah to write the Cherokee language in the late 1810s and early 1820s. His creation of the syllabary is particularly noteworthy in that he could not previously read any script. He first experimented with logograms, but his system later developed into a syllabary. In his system, each symbol represents a syllable rather than a single phoneme; the 85 (originally 86) characters in the Cherokee syllabary provide a suitable method to write Cherokee. Some symbols do resemble the Latin, Greek and even the Cyrillic scripts' letters, but the sounds are completely different (for example, the sound /a/ is written with a letter that resembles Latin D).
@@ -0,0 +1 @@
+CJK Compatibility Forms is a Unicode block containing vertical glyph variants for east Asian compatibility.
@@ -0,0 +1 @@
+CJK Compatibility Ideographs is a Unicode block containing Han Ideographs that contained duplicate characters in the South Korean KS X 1001:1998 (U+F900-U+FA0B), Taiwanese Big5 (U+FA0C-U+FA0D), Japanese IBM 32 (CP932 variant; U+FA0E-U+FA2D), JIS X 0213 (U+FA30-U+FA6A), ARIB STD-B24 (U+FA6B-U+FA6D) and the North Korean KPS 10721-2000 (U+FA70-U+FAD9) source standards for CJK characters. In order to retain round-trip compatibility with that standard, the CJK Compatibility Ideographs block was created to hold those extra characters. In subsequent versions of the standard, more compatibility ideographs, and even a few regular ideographs that do not have duplicates, have been added to the block.
@@ -0,0 +1 @@
+CJK Compatibility is a Unicode block containing square symbols (both CJK and [BLOCK:enclosed-alphanumerics Latin alphanumeric]) encoded for compatibility with east Asian character sets.
@@ -0,0 +1 @@
+CJK Radicals Supplement is a Unicode block containing alternative, often positional, forms of the [BLOCK:kangxi-radicals Kangxi radicals]. They are used headers in dictionary indices and other [BLOCK:cjk-unified-ideographs CJK ideograph] collections organized by radical-stroke.
@@ -0,0 +1 @@
+CJKV strokes are the calligraphic strokes needed to write the [BLOCK:cjk-unified-ideographs Chinese characters] in regular script used in East Asia. CJK strokes are the classified set of line patterns that may be arranged and combined to form Chinese characters (also known as Hanzi) in use in China, Japan, Korea, and to a lesser extent in Vietnam.
@@ -0,0 +1,3 @@
+CJK Symbols and Punctuation is a Unicode block containing symbols and punctuation in the [BLOCK:cjk-unified-ideographs unified Chinese, Japanese and Korean script].
+
+Chinese punctuation uses a different set of punctuation marks from European languages, although the concept of punctuation was adapted in the written language during the 20th century from Western punctuation marks. Before that, the concept of punctuation in Eastern Asian cultures did not exist at all. The first book to be printed with modern punctuation was Outline of the History of Chinese Philosophy (中國哲學史大綱) by Hu Shi (胡適), published in 1919. Scholars did, however, annotate texts with symbols resembling the modern '。' and '、' (see below) to indicate full-stops and pauses, respectively. Traditional poetry and calligraphy maintains the punctuation-free style. The usage of punctuation is regulated by the Chinese national standard GB/T 15834–2011 “General rules for punctuation” Chinese: 标点符号用法; pinyin: biāodiǎn fúhào yòngfǎ.
@@ -0,0 +1 @@
+[BLOCK:cjk-unified-ideographs CJK Unified Ideographs] Extension-A is a Unicode block containing rare Han ideographs.
@@ -0,0 +1,5 @@
+CJK Unified Ideographs is a Unicode block containing the most common CJK ideographs used in modern Chinese and Japanese.
+
+The Chinese, Japanese and Korean (CJK) scripts share a common background. In the process called Han unification the common (shared) characters were identified, and named "CJK Unified Ideographs". Unicode defines a total of 74,617 CJK Unified Ideographs.
+The terms ideographs or ideograms may be misleading, since the Chinese script is not strictly a picture writing system.
+Historically, Vietnam used Chinese ideographs too, so sometimes the abbreviation "CJKV" is used. This system was replaced by the Latin-based Vietnamese alphabet in the 1920s.
@@ -0,0 +1 @@
+Combining Diacritical Marks Extended is a Unicode block containing [BLOCK:combining-diacritical-marks diactritical marks] used in German dialectology.
@@ -0,0 +1 @@
+Combining Diacritical Marks for Symbols is a Unicode block containing [BLOCK:arrows], dots, enclosures, and overlays for modifying symbol characters.
@@ -0,0 +1 @@
+Combining Diacritical Marks Supplement is a Unicode block containing combining characters for the Uralic Phonetic Alphabet and Medievalist notations. It is an extension of the diacritic characters found in the [BLOCK:combining-diacritical-marks] block.
@@ -0,0 +1,11 @@
+Combining Diacritical Marks is a Unicode block containing the most common combining characters. It also contains the Combining Grapheme Joiner, which prevents canonical reordering of combining characters, and despite the name, actually separates characters that would otherwise be considered a single grapheme in a given context.
+
+A diacritic /daɪ.əˈkrɪtɨk/ – also diacritical mark, diacritical point, or diacritical sign – is a glyph added to a letter, or basic glyph. The term derives from the Greek διακριτικός (diakritikós, "distinguishing", from ancient Greek διά (diá, through) and κρίνω (krínein, to separate)). Diacritic is primarily an adjective, though sometimes used as a noun, whereas diacritical is only ever an adjective. Some diacritical marks, such as the acute (´) and grave (`), are often called accents. Diacritical marks may appear above or below a letter, or in some other position such as within the letter or between two letters.
+ 
+The main use of diacritical marks in the Latin script is to change the sound-values of the letters to which they are added. Examples from English are the diaereses in naïve and Noël, which show that the vowel with the diaeresis mark is pronounced separately from the preceding vowel; the acute and grave accents, which can indicate that a final vowel is to be pronounced, as in saké and poetic breathèd; and the cedilla under the "c" in the borrowed French word façade, which shows it is pronounced /s/ rather than /k/. In other Latin alphabets, they may distinguish between homonyms, such as the French là ("there") versus la ("the"), which are both pronounced [la]. In Gaelic type, a dot over a consonant indicates lenition of the consonant in question.
+ 
+In other alphabetic systems, diacritical marks may perform other functions. Vowel pointing systems, namely the Arabic harakat ( ـَ, ـُ, ـُ, etc.) and the Hebrew niqqud ( ַ, ֶ, ִ, ֹ , ֻ, etc.) systems, indicate sounds (vowels and tones) that are not conveyed by the basic alphabet. The Indic virama ( ् etc.) and the Arabic sukūn ( ـْـ ) mark the absence of a vowel. Cantillation marks indicate prosody. Other uses include the Early Cyrillic titlo ( ◌҃ ) and the Hebrew gershayim ( ״ ), which, respectively, mark abbreviations or acronyms, and Greek diacritical marks, which showed that letters of the alphabet were being used as numerals. In the Hanyu Pinyin official romanization system for Chinese, diacritics are used to mark the tones of the syllables in which the marked vowels occur.
+ 
+In orthography and collation, a letter modified by a diacritic may be treated either as a new, distinct letter or as a letter–diacritic combination. This varies from language to language, and may vary from case to case within a language.
+ 
+In some cases, letters are used as "in-line diacritics" in place of ancillary glyphs, because they modify the sound of the letter preceding them, as in the case of the "h" in English "sh" and "th".
@@ -0,0 +1 @@
+Combining Half Marks is a Unicode block containing [BLOCK:combining-diacritical-marks diacritic mark] parts for spanning multiple characters.
@@ -0,0 +1 @@
+Common Indic Number Forms is a Unicode block containing characters for representing fractions in north India, Pakistan, and Nepal.
@@ -0,0 +1 @@
+Part of The [BLOCK:basic-latin] (or C0 Controls and Basic Latin) Unicode block.
@@ -0,0 +1,5 @@
+Control Pictures is a Unicode block containing graphic characters for representing the C0 control codes, and other control characters.
+
+The C0 and C1 control code or control character sets define control codes for use in text by computer systems that use the ISO/IEC 2022 system of specifying control and graphic characters. Most character encodings, in addition to representing printable characters, also have characters such as these that represent additional information about the text, such as the position of a cursor, an instruction to start a new line, or a message that the text has been received.
+
+The C0 set defines codes in the range 00HEX–1FHEX and the C1 set defines codes in the range 80HEX–9FHEX. The default C0 set was originally defined in ISO 646 (ASCII), while the default C1 set was originally defined in ECMA-48 (harmonized later with ISO 6429). While other C0 and C1 sets are available for specialized applications, they are rarely used
@@ -0,0 +1 @@
+[BLOCK:coptic Coptic] Epact Numbers is a Unicode block containing old Coptic number forms.
@@ -0,0 +1,3 @@
+Coptic is a Unicode block used with the [BLOCK:greek-coptic Greek and Coptic] block to write the Coptic language. Prior to version 4.1 of the Unicode Standard, Greek and Coptic was used exclusively to write Coptic text, but Greek and Coptic letter forms are contrastive in many scholarly works, necessitating their disunification. Any specifically Coptic letters in the Greek and Coptic block are not reproduced in the Coptic Unicode block.
+
+The Coptic alphabet is the script used for writing the Coptic language. The repertoire of glyphs is based on the Greek alphabet augmented by letters borrowed from the Egyptian Demotic and is the first alphabetic script used for the Egyptian language. There are several Coptic alphabets, as the Coptic writing system may vary greatly among the various dialects and subdialects of the Coptic language.
@@ -0,0 +1,3 @@
+Counting rods (simplified Chinese: 筹; traditional Chinese: 籌; pinyin: chóu; Japanese: 算木, sangi) are small bars, typically 3–14 cm long, that were used by mathematicians for calculation in ancient China, Japan, Korea, and Vietnam. They are placed either horizontally or vertically to represent any integer or rational number.
+ 
+The written forms based on them are called rod numerals. They are a true positional numeral system with digits for 1–9 and a blank for 0, from the Warring states period (circa 475 BCE) to the 16th century.
@@ -0,0 +1,6 @@
+In Unicode, the Sumero-Akkadian Cuneiform script is covered in two blocks:
+U+12000–U+1237F [BLOCK:cuneiform]
+U+12400–U+1247F Cuneiform Numbers and Punctuation
+These blocks, in version 6.0, are in the Supplementary Multilingual Plane (SMP).
+The sample glyphs in the chart file published by the Unicode Consortium show the characters in their Classical Sumerian form (Early Dynastic period, mid 3rd millennium BCE). The characters as written during the 2nd and 1st millennia BCE, the era during which the vast majority of cuneiform texts were written, are considered font variants of the same characters.
+The character set as published in version 5.2 has been criticized, mostly because of its treatment of a number of common characters as ligatures, omitting them from the encoding standard.
@@ -0,0 +1,4 @@
+Cuneiform script is one of the earliest known systems of writing, distinguished by its wedge-shaped marks on clay tablets, made by means of a blunt reed for a stylus. The name cuneiform itself simply means "wedge shaped", from the Latin cuneus "wedge" and forma "shape," and came into English usage probably from Old French cunéiforme.
+Emerging in Sumer in the late 4th millennium B.C.E. (the Uruk IV period), cuneiform writing began as a system of pictographs. In the third millennium, the pictorial representations became simplified and more abstract as the number of characters in use grew smaller, from about 1,000 in the Early Bronze Age to about 400 in Late Bronze Age (Hittite cuneiform). The system consists of a combination of logophonetic, consonantal alphabetic and syllabic signs.
+The original Sumerian script was adapted for the writing of the Akkadian, Eblaite, Elamite, Hittite, Luwian, Hattic, Hurrian, and Urartian languages, and it inspired the [BLOCK:ugaritic] and [BLOCK:old-persian] alphabets. Cuneiform writing was gradually replaced by the [BLOCK:phoenician] alphabet during the Neo-Assyrian Empire. By the 2nd century C.E., the script had become extinct, and all knowledge of how to read it was lost until it began to be deciphered in the 19th century.
+Between half a million and two million cuneiform tablets are estimated to have been excavated in modern times, of which only approximately 30,000 – 100,000 have been read or published. The British Museum holds the largest collection, c. 130,000, followed by the Vorderasiatisches Museum Berlin, the Louvre, the Istanbul Archaeology Museums, the National Museum of Iraq, the Yale Babylonian Collection (c.40,000) and Penn Museum. Most of these have "lain in these collections for a century without being translated, studied or published," as there are only a few hundred qualified cuneiformists in the world.
@@ -0,0 +1 @@
+This is a Unicode block containing characters for representing unique monetary signs. Many currency signs can be found in other unicode blocks, especially when the [b][https://unicode-table.com/en/sets/currency-symbols/ currency symbols][/b] is unique to a country that uses a script not generally used outside that country.
@@ -0,0 +1 @@
+The Cypriot or Cypriote syllabary is a syllabic script used in Iron Age Cyprus, from about the 11th to the 4th centuries BCE, when it was replaced by the [BLOCK:greek-coptic Greek alphabet]. A pioneer of that change was king Evagoras of Salamis. It is descended from the Cypro-Minoan syllabary, in turn a variant or derivative of [BLOCK:linear-a Linear A]. Most texts using the script are in the Arcadocypriot dialect of Greek, but some bilingual (Greek and Eteocypriot) inscriptions were found in Amathus.
@@ -0,0 +1 @@
+Cyrillic Extended-A is a Unicode block containing combining [BLOCK:cyrillic Cyrillic] letters used in Old Church Slavonic texts.
@@ -0,0 +1 @@
+Cyrillic Extended-B is a Unicode block containing [BLOCK:cyrillic Cyrillic] characters for writing Old Cyrillic and Old Abkhazian, and combining numeric signs.
@@ -0,0 +1 @@
+Cyrillic Supplement is a Unicode block containing [BLOCK:cyrillic Cyrillic] letters for writing several minority languages, including Abkhaz, Kurdish, Komi, Mordvin, Aleut, Azerbaijani, and Jakovlev's Chuvash orthography.
@@ -0,0 +1,5 @@
+Cyrillic is a Unicode block containing the characters used to write the most widely used languages with a Cyrillic orthography. The core of the block is based on the ISO 8859-5 standard, with additions for minority languages and historic orthographies.
+
+The Cyrillic script /sɨˈrɪlɪk/ is an alphabetic writing system employed across Eastern Europe, North and Central Asian countries. It is based on the Early Cyrillic, which was developed in the First Bulgarian Empire during the 9th century AD at the Preslav Literary School. It is the basis of alphabets used in various languages, past and present, in parts of Southeastern Europe and Northern Eurasia, especially those of Slavic origin, and non-Slavic languages influenced by Russian. As of 2011, around 252 million people in Eurasia use it as the official alphabet for their national languages. About half of them are in Russia. Cyrillic is one of the most used writing systems in the world.
+Cyrillic is derived from the [BLOCK:greek-coptic Greek] uncial script, augmented by letters from the older [BLOCK:glagolitic Glagolitic alphabet], including some ligatures. These additional letters were used for [BLOCK:cyrillic-extended-a Old Church Slavonic] sounds not found in Greek. The script is named in honor of the two Byzantine brothers, Saints Cyril and Methodius, who created the Glagolitic alphabet earlier on. Modern scholars believe that Cyrillic was developed and formalized by early disciples of Cyril and Methodius.
+With the accession of Bulgaria to the European Union on 1 January 2007, Cyrillic became the third official script of the European Union, following the Latin and Greek scripts.
@@ -0,0 +1,3 @@
+The Deseret alphabet (/dɛz.əˈrɛt./) is a phonemic English spelling reform developed in the mid-19th century by the board of regents of the University of Deseret (later the University of Utah) under the direction of Brigham Young, second president of The Church of Jesus Christ of Latter-day Saints.
+In public statements, Young claimed the alphabet was intended to replace the traditional [BLOCK:basic-latin Latin alphabet] with an alternative, more phonetically accurate alphabet for the English language. This would offer immigrants an opportunity to learn to read and write English, he said, the orthography of which is often less phonetically consistent than those of many other languages. Similar experiments were not uncommon during the period, the most well-known of which is the Shavian alphabet.
+Young also prescribed the learning of Deseret to the school system, stating "It will be the means of introducing uniformity in our orthography, and the years that are now required to learn to read and spell can be devoted to other studies".
@@ -0,0 +1 @@
+Devanagari Extended is a Unicode block containing cantilation marks for writing the Samaveda, and nasalization marks for the [BLOCK:devanagari Devanagari script].
@@ -0,0 +1,3 @@
+Devanagari is a Unicode block containing characters for writing Hindi, Marathi, Sindhi, Nepali and Sanskrit. In its original incarnation, the code points U+0900..U+0954 were a direct copy of the characters A0-F4 from the 1988 ISCII standard. The [BLOCK:bengali], [BLOCK:gurmukhi], [BLOCK:gujarati], [BLOCK:oriya], [BLOCK:tamil], [BLOCK:telugu], [BLOCK:kannada], and [BLOCK:malayalam] blocks were similarly all based on their ISCII encodings.
+
+Devanagari, also called Nagari, is an abugida alphabet of India and Nepal. It is written from left to right, does not have distinct letter cases, and is recognisable (along with most other North Indic scripts, with a few exceptions like [BLOCK:gujarati] and [BLOCK:oriya]) by a horizontal line that runs along the top of full letters. Since the 19th century, it has been the most commonly used script for writing Sanskrit. Devanagari is used to write Hindi, Nepali, Marathi, Konkani, Bodo and Maithili among other languages and dialects. It was formerly used to write Gujarati. Because it is the standardised script for the Hindi, Nepali, Marathi, Konkani and Bodo languages, Devanagari is one of the most used and adopted writing systems in the world.
@@ -0,0 +1 @@
+A dingbat is an ornament, character, or spacer used in typesetting, sometimes more formally known as a printer's ornament or printer's character often employed for the creation of box frames. The term continues to be used in the computer industry to describe fonts that have symbols and shapes in the positions designated for alphabetical or numeric characters.
@@ -0,0 +1 @@
+Domino Tiles is a Unicode block containing characters for representing game situations in dominoes. The block includes symbols for the standard six dot tile set and backs in horizontal and vertical orientations.
@@ -0,0 +1 @@
+The Duployan shorthand, or Duployan stenography (French: Sténographie Duployé), was created by father Émile Duployé(fr) in 1860 for writing French. Since then, it has been expanded and adapted for writing English, German, Spanish, Romanian, and Chinook Jargon. The Duployan stenography is classified as a geometric, alphabetic, stenography and is written left-to-right in connected stenographic style. The Duployan shorthands, including Chinook writing, Pernin's Universal Phonography, Perrault's English Shorthand, the Sloan-Duployan Modern Shorthand, and Romanian stenography, were included as a single script in version 7.0 of the Unicode Standard / ISO 10646.
@@ -0,0 +1,5 @@
+The native writing systems of Ancient Egypt used to record the Egyptian language include both the Egyptian hieroglyphics and Hieratic from Protodynastic times, the 13th century BC cursive variants of the hieroglyphs which became popular, then the latest Demotic script developed from Hieratic, from 3500 BC onward.
+
+Most remaining texts in the Egyptian language are primarily written in the hieroglyphic script. However, in antiquity, the majority of texts were written on perishable papyrus in hieratic and (later) demotic, which are now lost. There was also a form of cursive hieroglyphic script used for religious documents on papyrus, such as the multi-authored Books of the Dead in the Ramesside Period; this script was closer to the stone-carved hieroglyphs, but was not as cursive as hieratic, lacking the wide use of ligatures. Additionally, there was a variety of stone-cut hieratic known as lapidary hieratic. In the language's final stage of development, the [BLOCK:coptic] alphabet replaced the older writing system. The native name for Egyptian hieroglyphic writing is "writing of the words of god." Hieroglyphs are employed in two ways in Egyptian texts: as ideograms that represent the idea depicted by the pictures; and more commonly as phonograms denoting their phonetic value.
+
+For example, the hieroglyph representing the biliteral pr is typically used as an ideogram to denote the word 'house'. In addition, the same glyph is used as a phonogram to write the word pr(y) 'to go out' due to the similarity in pronunciation. To leave no doubt as to which word is actually meant, a vertical stroke is drawn underneath the glyph to mean 'house', whereas a pair of walking legs is added next to the same glyph to clarify that pr(y) 'go out' is meant instead. To further clarify the pronunciation, the hieroglyph for mouth (ro) is typically added in between the house and the walking legs, so that the whole combination encodes the word pr(y) like this: "Word that sounds like a word for house which ends in an r and is related to walking => to go out". Hieroglyphic writing is thus an intricate mixture of phonetic and semantic components.
@@ -0,0 +1,3 @@
+The Elbasan script is a mid 18th-century alphabetic script used for the Albanian language. It was named after the city of Elbasan where it was invented. It was mainly used in the area of Elbasan and Berat.
+The primary document associated with the alphabet is the Elbasan Gospel Manuscript, known in Albanian as the Anonimi i Elbasanit (The Anonymous of Elbasan). The document was created at St. Jovan Vladimir's Church in central Albania, but is preserved today at the National Archives of Albania in Tirana. Its 59 pages contain Biblical content written in an alphabet of 40 letters.
+Another original script used for Albanian, was Beitha Kukju's script of the 19th century. This script did not have much influence either.
@@ -0,0 +1 @@
+Emoticons is a Unicode block containing graphic representations of faces, which are often associated with classic emoticons. They exist largely for compatibility with Japanese telephone carriers' implementations of Shift JIS.
@@ -0,0 +1 @@
+[BLOCK:enclosed-alphanumerics Enclosed Alphanumeric] Supplement is a Unicode block consisting mostly of [BLOCK:basic-latin Latin alphabet] characters enclosed in circles, ovals or boxes, used for a variety of purposes.
@@ -0,0 +1 @@
+Enclosed alphanumerics is a Unicode block of typographical symbols of an alphanumeric within a circle, a bracket or other not-closed enclosure, or ending in a full stop. There is another block for these characters (U+1F100—U+1F1FF), encoded in the Supplementary Multilingual Plane,[3] which contains the set of Regional Indicator Symbols as of Unicode 6.0.
@@ -0,0 +1 @@
+Enclosed CJK Letters and Months is a Unicode block containing circled and parenthesized [BLOCK:katakana Katakana], [BLOCK:hangul-syllables Hangul], and [BLOCK:cjk-unified-ideographs CJK ideographs]. During the unification with ISO 10646 for version 1.1, the Japanese Industrial Standard Symbol was reassigned from the code point U+32FF at the end of the block to U+3004. Also included in the block are miscellaneous glyphs that would more likely fit in [BLOCK:cjk-compatibility CJK Compatibility] or [BLOCK:enclosed-alphanumerics Enclosed Alphanumerics]: a few unit abbreviations, circled numbers from 21 to 50, and circled multiples of 10 from 10 to 80 enclosed in black squares (representing speed limit signs).
@@ -0,0 +1 @@
+Enclosed Ideographic Supplement is a Unicode block containing characters for compatibility with the Japanese ARIB STD-B24 standard. It contains a squared kana word, and many [BLOCK:cjk-unified-ideographs CJK ideographs] enclosed with squares, brackets, or circles.
@@ -0,0 +1 @@
+[BLOCK:ethiopic Ethiopic] Extended-A is a Unicode block containing Ge'ez characters for the Gamo-Gofa-Dawro, Basketo, and Gumuz languages of Ethiopia.
@@ -0,0 +1 @@
+[BLOCK:ethiopic Ethiopic] Extended is a Unicode block containing Ge'ez characters for the Me'en, Blin, and Sebatbeit languages.
@@ -0,0 +1 @@
+[BLOCK:ethiopic Ethiopic] Supplement is a Unicode block containing extra Ge'ez characters for writing the Sebatbeit language, and Ethiopic tone marks.
@@ -0,0 +1,4 @@
+Ethiopic is a Unicode block containing characters for writing the Ge'ez, Tigrinya, Amharic, Tigre, and Oromo languages.
+Ge'ez (ግዕዝ Gəʿəz), (also known as Ethiopic) is a script used as an abugida (syllable alphabet) for several languages of Ethiopia and Eritrea. It originated as an abjad (consonant-only alphabet) and was first used to write Ge'ez, now the liturgical language of the Ethiopian Orthodox Tewahedo Church and the Eritrean Orthodox Tewahedo Church. In Amharic and Tigrinya, the script is often called fidäl (ፊደል), meaning "script" or "alphabet".
+The Ge'ez script has been adapted to write other, mostly Semitic, languages, particularly Amharic in Ethiopia, and Tigrinya in both Eritrea and Ethiopia. It is also used for Sebatbeit, Me'en, and most other languages of Ethiopia. In Eritrea it is used for Tigre, and it has traditionally been used for Blin, a Cushitic language. Tigre, spoken in western and northern Eritrea, is considered to resemble Ge'ez more than do the other derivative languages. Some other languages in the Horn of Africa, such as Oromo, used to be written using Ge'ez, but have migrated to Latin-based orthographies.
+For the representation of sounds, this article uses a system that is common (though not universal) among linguists who work on Ethiopian Semitic languages. This differs somewhat from the conventions of the [BLOCK:ipa-extensions International Phonetic Alphabet].
@@ -0,0 +1 @@
+General Punctuation is a Unicode block containing punctuation, [BLOCK:spacing-modifier-letters spacing], and formatting characters for use with all scripts and writing systems. Included are the defined-width spaces, joining formats, directional formats, smart quotes, archaic and novel punctuation such as the interobang, and invisible [BLOCK:mathematical-operators mathematical operators].
@@ -0,0 +1 @@
+[BLOCK:geometric-shapes Geometric Shapes] Extended is a Unicode block containing webdings/wingdings symbols, mostly different weights of squares, crosses, and saltires, and different weights of variously spoked asterisks and stars.
@@ -0,0 +1 @@
+Geometric Shapes is a Unicode block of 96 symbols at codepoint range U+25A0-25FF.
@@ -0,0 +1 @@
+Georgian Supplement is a Unicode block containing characters for the ecclesiastical form of the Georgian script, Nuskhuri. To write the full ecclesiastical Khutsuri orthography, the Asomtavruli capitals encoded in the [BLOCK:georgian] block.
@@ -0,0 +1,4 @@
+Georgian is a Unicode block containing the Mkhedruli and Asomtavruli Georgian characters used to write Modern Georgian, Svan, and Mingrelian languages. Another lower case, Nuskhuri, is encoded in a separate Georgian Supplement block, which is used with the Asomtavruli to write the ecclesiastical Khutsuri Georgian script.
+
+The Georgian scripts are the three writing systems used to write the Georgian language: Asomtavruli, Nuskhuri and Mkhedruli. Their letters are equivalent, sharing the same names and alphabetical order and all three are unicameral (make no distinction between upper and lower case). Although each continues to be used, Mkhedruli (see below) is taken as the standard for Georgian and its related Kartvelian languages.
+The scripts originally had 38 letters. Georgian is currently written in a 33-letter alphabet, as five of the letters are obsolete in that language. The Mingrelian alphabet uses 36: the 33 of Georgian, one letter obsolete for that language, and two additional letters specific to Mingrelian and Svan. That same obsolete letter, plus a letter borrowed from [BLOCK:greek-coptic Greek], are used in the 35-letter Laz alphabet. The fourth Kartvelian language, Svan, is not commonly written, but when it is it uses the letters of the Mingrelian alphabet, with an additional obsolete Georgian letter and sometimes supplemented by diacritics for its many vowels.
@@ -0,0 +1,2 @@
+Glagolitic is a Unicode block containing the characters invented by Saint Cyril for translating scripture into Slavonic. The Glagolitic script is the precursor of [BLOCK:cyrillic].
+The Glagolitic alphabet /ˌɡlæɡɵˈlɪtɨk/, also known as Glagolitsa, is the oldest known Slavic alphabet, from the 9th century.
@@ -0,0 +1,2 @@
+The Gothic alphabet is an alphabet for writing the Gothic language, created in the 4th century by Ulfilas (or Wulfila) for the purpose of translating the Bible.
+The alphabet is essentially an uncial form of the [BLOCK:greek-coptic Greek alphabet], with a few additional letters to account for Gothic phonology: Latin F, two Runic letters to distinguish the /j/ and /w/ glides from vocalic /i/ and /u/, and the letter ƕair to express the Gothic labiovelar. It is completely different from the 'Gothic script' of the Middle Ages, a script used to write the [BLOCK:basic-latin Latin alphabet].
@@ -0,0 +1,2 @@
+The Grantha script was widely used between the 6th century and the 19th century CE by Tamil speakers in South India, particularly in Tamil Nadu and Kerala, to write Sanskrit and classical Manipravalam, and is still in restricted use in traditional vedic schools (veda pāṭhaśālā). It is a Brahmic script, having evolved from the [BLOCK:brahmi Brāhmī script] in Tamil Nadu. The [BLOCK:malayalam Malayalam alphabet] is a direct descendant of Grantha as are the Tigalari and [BLOCK:sinhala Sinhala alphabets].
+The rising popularity of [BLOCK:devanagari Devanagari] for Sanskrit and the political pressure created by the Tanittamil Iyakkam for its complete replacement by the modern [BLOCK:tamil Tamil] script led to its gradual disuse and abandonment in Tamil Nadu in the early 20th century.
@@ -0,0 +1,11 @@
+Greek and Coptic is the Unicode block for representing modern (monotonic) Greek. It was originally used for writing Coptic, using the similar Greek letters, in addition to the uniquely Coptic additions. Beginning with version 4.1 of the Unicode Standard, a separate [BLOCK:coptic Coptic block] has been included in Unicode, allowing for mixed Greek/Coptic text that is stylistically contrastive, as is convention in scholarly works. Writing polytonic Greek requires the use of combining characters or the precomposed vowel + tone characters in the [BLOCK:greek-extended Greek Extended] character block.
+
+The Greek alphabet is the script that has been used to write the Greek language since the 8th century BC. It was derived from the earlier [BLOCK:phoenician Phoenician alphabet], and was the first alphabetic script to have distinct letters for vowels as well as consonants. As such, it became the ancestor of numerous other European and Middle Eastern alphabets, including [BLOCK:basic-latin Latin] and [BLOCK:cyrillic Cyrillic]. Apart from its use in writing the Greek language, both in its ancient and its modern forms, the Greek alphabet today also serves as a source of technical symbols and labels in many domains of mathematics, science and other fields.
+ 
+In its classical and modern forms, the alphabet has 24 letters, ordered from alpha to omega. Like [BLOCK:latin] and [BLOCK:cyrillic], Greek originally had only a single form of each letter; it developed the letter case distinction between upper-case and lower-case forms in parallel with Latin during the modern era.
+ 
+Sound values and conventional transcriptions for some of the letters differ between Ancient Greek and Modern Greek usage, owing to phonological changes in the language.
+ 
+In traditional ("polytonic") Greek orthography, vowel letters can be combined with several diacritics, including accent marks, so-called "breathing" marks, and the iota subscript. In common present-day usage for Modern Greek since the 1980s, this system has been simplified to a so-called "monotonic" convention
+
+The Coptic alphabet is the script used for writing the Coptic language. The repertoire of glyphs is based on the Greek alphabet augmented by letters borrowed from the Egyptian Demotic and is the first alphabetic script used for the Egyptian language. There are several [BLOCK:coptic alphabets], as the Coptic writing system may vary greatly among the various dialects and subdialects of the Coptic language.
@@ -0,0 +1 @@
+Greek Extended is a Unicode block containing the accented vowels necessary for writing polytonic Greek. The regular, unaccented Greek characters can be found in the [BLOCK:greek-coptic Greek and Coptic] (Unicode block). Greek Extended was encoded in version 1.1 of the Unicode Standard as is, having had no additions up to 6.2. As an alternative to Greek Extended, combining characters can be used to represent the tones and breath marks of polytonic Greek.
@@ -0,0 +1,5 @@
+Gujarati is a Unicode block containing characters for writing the Gujarati language. In its original incarnation, the code points U+0A81..U+0AD0 were a direct copy of the Gujarati characters A1-F0 from the 1988 ISCII standard. The [BLOCK:devanagari], [BLOCK:bengali], [BLOCK:gurmukhi], [BLOCK:oriya], [BLOCK:tamil], [BLOCK:telugu], [BLOCK:kannada], and [BLOCK:malayalam] blocks were similarly all based on their ISCII encodings.
+
+The Gujarati script, which like all Nāgarī writing systems is strictly speaking an abugida rather than an alphabet, is used to write the Gujarati and Kutchi languages. It is a variant of [BLOCK:devanagari Devanāgarī script] differentiated by the loss of the characteristic horizontal line running above the letters and by a small number of modifications in the remaining characters.
+With a few additional characters, added for this purpose, the Gujarati script is also often used to write Sanskrit and Hindi.
+Gujarati numerical digits are also different from their [BLOCK:devanagari Devanagari] counterparts.
@@ -0,0 +1,8 @@
+Gurmukhi is a Unicode block containing characters for the Punjabi language, as it is written in India. In its original incarnation, the code points U+0A02..U+0A4C were a direct copy of the [https://unicode-table.com/en/alphabets/gurmukhi/ Gurmukhi] characters A2-EC from the 1988 ISCII standard. The [BLOCK:devanagari], [BLOCK:bengali], [BLOCK:gujarati], [BLOCK:oriya], [BLOCK:tamil], [BLOCK:telugu], [BLOCK:kannada], and [BLOCK:malayalam] blocks were similarly all based on their ISCII encodings.
+
+Gurmukhi is the most common script used for writing the Punjabi language in India. An abugida derived from the Laṇḍā script and ultimately descended from [BLOCK:brahmi Brahmi], Gurmukhi was standardised by the second Sikh guru, Guru Angad, in the 16th century. The whole of the Guru Granth Sahib's 1430 pages are written in this script. The name Gurmukhi is derived from the Old Punjabi term "gurumukhī", meaning "from the mouth of the Guru".
+
+Modern Gurmukhi has thirty-eight consonants (vianjan), nine vowel symbols (lāga mātrā), two symbols for nasal sounds (bindī and ṭippī), and one symbol which duplicates the sound of any consonant (addak). In addition, four conjuncts are used: three subjoined forms of the consonants Rara, Haha and Vava, and one half-form of Yayya. Use of the conjunct forms of Vava and Yayya is increasingly scarce in modern contexts.
+
+Gurmukhi is primarily used in the Punjab state of India where it is the sole official script for all official and judicial purposes. The script is also widely used in the Indian states of Haryana, Himachal Pradesh, Jammu and the national capital of Delhi, with Punjabi being one of the official languages in these states. Gurmukhi has been adapted to write other languages, such as Braj Bhasha, Khariboli (and other Hindustani dialects), Sanskrit and Sindhi.
+
@@ -0,0 +1,5 @@
+In [BLOCK:cjk-unified-ideographs CJK] (Chinese, Japanese and Korean) computing, graphic characters are traditionally classed into fullwidth (in Taiwan and Hong Kong: 全形; in CJK and Japanese: 全角) and halfwidth (in Taiwan and Hong Kong: 半形; in CJK and Japanese: 半角) characters. With fixed-width fonts, a halfwidth character occupies half the width of a fullwidth character, hence the name.
+ 
+In the days of computer terminals and text mode computing, characters were normally laid out in a grid, often 80 columns by 24 or 25 lines. Each character was displayed as a small dot matrix, often about 8 pixels wide, and an SBCS (single byte character set) was generally used to encode characters of western languages.
+ 
+For a number of practical and aesthetic reasons, Han characters would need to be twice as wide as these fixed-width SBCS characters. These "fullwidth characters" were typically encoded in a DBCS (double byte character set), although less common systems used other variable-width character sets that used more bytes per character.
@@ -0,0 +1 @@
+Hangul Compatibility Jamo is a Unicode block containing [BLOCK:hangul-syllables Hangul characters] for compatibility with Korean Standard KS X 1001:1998.
@@ -0,0 +1 @@
+[BLOCK:hangul-jamo Hangul Jamo] Extended-A is a Unicode block containing positional (Choseong, Jungseong, and Jongseong) forms of archaic Hangul consonant and vowel clusters. They can be used to dynamically compose syllables that are not available as precomposed archaic [BLOCK:hangul-syllables Hangul syllables] in Unicode containing sounds that have since merged phonetically with other sounds in modern pronunciation.
@@ -0,0 +1 @@
+[BLOCK:hangul-jamo Hangul Jamo] Extended-B is a Unicode block containing positional (Choseong, Jungseong, and Jongseong) forms of archaic Hangul consonant and vowel clusters. They can be used to dynamically compose syllables that are not available as precomposed archaic [BLOCK:hangul-syllables Hangul syllables] in Unicode containing sounds that have since merged phonetically with other sounds in modern pronunciation.
@@ -0,0 +1 @@
+Hangul Jamo is a Unicode block containing positional (Choseong, Jungseong, and Jongseong) forms of the Hangul consonant and vowel clusters. They can be used to dynamically compose syllables that are not available as precomposed [BLOCK:hangul-syllables Hangul syllables] in Unicode, specifically archaic syllables containing sounds that have since merged phonetically with other sounds in modern pronunciation.
@@ -0,0 +1 @@
+Hangul Syllables is a Unicode block containing precomposed Hangul syllable blocks for Modern Korean. The syllables can be directly mapped by algorithm to sequences of characters in the [BLOCK:hangul-jamo Hangul Jamo] Unicode block.
@@ -0,0 +1,3 @@
+Hanunoo is a Unicode block containing characters used for writing the Hanunó'o language.
+
+Hanunó’o is one of the indigenous scripts of the Philippines and is used by the Mangyan peoples of southern Mindoro to write the Hanunó'o language. It is an abugida descended from the [BLOCK:brahmi Brahmic] scripts, closely related to Baybayin, and is famous for being written vertical but written upward, rather than downward as nearly all other scripts (however, it's read horizontally left to right). It is usually written on bamboo by incising characters with a knife. Most known Hanunó'o inscriptions are relatively recent because of the perishable nature of bamboo. It is therefore difficult to trace the history of the script.
@@ -0,0 +1,6 @@
+Hebrew is a Unicode block containing characters for writing the Hebrew, Yiddish, Ladino, and other Jewish diaspora languages.
+
+Hebrew is a West Semitic language of the Afroasiatic language family. Historically, it is regarded as the language of the Hebrew Israelites and their ancestors, although the language was not referred to by the name Hebrew in the Tanakh. The earliest examples of written Paleo-Hebrew date from the 10th century BCE, in the form of primitive drawings, although "the question of the language used in this inscription remained unanswered, making it impossible to prove whether it was in fact Hebrew or another local language".
+Hebrew had ceased to be an everyday spoken language somewhere between 200 and 400 CE, declining since the aftermath of the Bar Kochba War. Aramaic and to a lesser extent Greek were already in use as international languages, especially among elites and immigrants. It survived into the medieval period as the language of Jewish liturgy, rabbinic literature, intra-Jewish commerce, and poetry. Then, in the 19th century, it was revived as a spoken and literary language, and, according to Ethnologue, is now the language of 9 million people worldwide, of whom 7 million are from Israel. The United States has the second largest Hebrew speaking population, with about 221,593 fluent speakers, mostly from Israel.
+Modern Hebrew is one of the two official languages of Israel (the other being Arabic), while pre-modern Hebrew is used for prayer or study in Jewish communities around the world today. Ancient Hebrew is also the liturgical tongue of the Samaritans, while modern Hebrew or Arabic is their vernacular. As a foreign language, it is studied mostly by Jews and students of Judaism and Israel, and by archaeologists and linguists specializing in the Middle East and its civilizations, as well as by theologians in Christian seminaries.
+The Torah (the first five books), and most of the rest of the Hebrew Bible, is written in Biblical Hebrew, with much of its present form specifically in the dialect that scholars believe flourished around the 6th century BCE, around the time of the Babylonian exile. For this reason, Hebrew has been referred to by Jews as Leshon HaKodesh (לשון הקדש), "The Holy Language", since ancient times.
--- a/Show More
+++ b/Show More
				`@@ -0,0 +1 @@`
				`Aegean numbers was the numeral system used by the Minoan and Mycenaean civilizations. They are attested in several Aegean scripts ([BLOCK:linear-a Linear A], [BLOCK:linear-b-syllabary Linear B]). They may have survived in the Cypro-Minoan script where a single sign with "100" value is attested so far on a large clay tablet from Enkomi.`
				`@@ -0,0 +1 @@`
				`Alchemical symbols, originally devised as part of alchemy, were used to denote some elements and some compounds until the 18th century. Note that while notation like this was mostly standardized, style and symbol varied between alchemists.`
				`@@ -0,0 +1 @@`
				`Alphabetic Presentation Forms is a Unicode block containing standard ligatures for the [BLOCK:basic-latin Latin], [BLOCK:armenian Armenian], and [BLOCK:hebrew Hebrew] scripts.`
				`@@ -0,0 +1 @@`
				`Ancient Symbols is a Unicode block containing Roman characters for [https://unicode-table.com/en/sets/currency-symbols/ currency], [https://unicode-table.com/en/2696/ weights], and [https://unicode-table.com/en/1F4CF/ measures].`
				`@@ -0,0 +1 @@`
				`[BLOCK:arabic Arabic] Extended-A is a Unicode block encoding Qur'anic annotations and letter variants used for various non-Arabic languages.`
				`@@ -0,0 +1 @@`
				`[BLOCK:arabic Arabic] Mathematical Alphabetic Symbols is a Unicode block encoding characters used in Arabic mathematical expressions.`
				`@@ -0,0 +1 @@`
				`[BLOCK:arabic Arabic] Presentation Forms-A is a Unicode block encoding contextual forms and ligatures of letter variants needed for Persian, Urdu, Sindhi and Central Asian languages. The presentation forms are present only for compatibility with older standards, and are not currently needed for coding text.`
				`@@ -0,0 +1 @@`
				`Arabic Supplement is a Unicode block that encodes [BLOCK:arabic Arabic letter] variants mostly used for writing African (non-Arabic) languages.`
				`@@ -0,0 +1 @@`
				`Arrows is a Unicode block containing line, curve, and semicircle symbols terminating in barbs or arrows.`
				`@@ -0,0 +1 @@`
				`Bamum Supplement is a Unicode block containing the characters of the historic stage A-F of the Bamum script, used for writing the Bamum language of western Cameroon. The modern stage G characters, which include many characters used for stage A-F orthographies, are included in the [BLOCK:bamum Bamum] block.`