Linguistics Asked on December 14, 2021
Preemptive note: This question is about sound-based writing systems, excluding logographic systems like Chinese. Transitional systems like Egyptian hieroglyphs, Maya script or Man’yōgana are also excluded, as are heterograms, etc. I’m only asking about glyphs that are purely phonetic in value/intention: essentially alphabets, syllabaries, abugidas and abjads (for abjads, assume optional vowel marking to always be present).
The individual glyphs in many writing systems derive from originally logographic depictions but are now used exclusively for their phonetic (or phonemic) value, determined by the language the system is being used to express. In some writing systems, like the Latin alphabet, each glyph has been simplified enormously, to the point that few consist of more than two or three strokes when written. In others, the simplification has been less dramatic, and though rarely recognisable as any erstwhile logogram, many glyphs remain more complex than just a couple of strokes.1
On the other hand, in alphabetic writing, each glyph roughly (very roughly) represents a single phonetic entity, whereas in abugidas, abjads and various other systems, each glyph represents multiple phonetic entities. As a result, if you compare an alphabetic writing system with a high average number of strokes per glyph to an abugida with a low average number of strokes per glyph, the same phonetic sequence – assuming it’s expressible in both – would require far fewer strokes in the abugida than in the alphabet.
As an example, this is the random sequence ⟨kanita⟩ (or close equivalent) expressed in a selection of scripts:
┌───────────────────────┬──────────────┬─────────┐
│ Script (Language)     │ String       │ Strokes │
├───────────────────────┼──────────────┼─────────┤
│ Zhuyin/Bopomofo       │ ㄎㄚㄋㄧㄊㄚ │ 19      │
│ Latin (English)       │ kanita       │ 13      │
│ Ge’ez (Amharic)       │ ካኒታ          │ 11      │
│ Devanagari (Sanskrit) │ कनित         │ 11      │
│ Cherokee Syllabary    │ ᎧᏂᏔ          │ 8       │
│ Inuktitut Syllabics   │ ᑲᓂᑕ          │ 3       │
└───────────────────────┴──────────────┴─────────┘
Even from this very cursory sample (and even with the ambiguity of deciding exactly when something is one or two strokes), it’s clear that there’s a huge variation in how much ink a pen will have to deposit on the paper in order to write the same phonetic sequence.
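To put the table on a common footing, the stroke totals can be divided by the number of sound segments in ⟨kanita⟩. The sketch below is only illustrative: it assumes six segments (k, a, n, i, t, a) for every script, even though some of the strings are only close equivalents, and the names `STROKES`, `SEGMENTS` and `strokes_per_segment` are made up for this example.

```python
# Stroke totals copied from the table above.
STROKES = {
    "Zhuyin/Bopomofo": 19,
    "Latin (English)": 13,
    "Ge'ez (Amharic)": 11,
    "Devanagari (Sanskrit)": 11,
    "Cherokee Syllabary": 8,
    "Inuktitut Syllabics": 3,
}

# Assumption: <kanita> is treated as six segments (k, a, n, i, t, a) in every script.
SEGMENTS = 6


def strokes_per_segment(strokes, segments=SEGMENTS):
    """Return the average number of strokes spent on one sound segment."""
    if segments <= 0:
        raise ValueError("segments must be positive")
    return strokes / segments


per_segment = {name: strokes_per_segment(n) for name, n in STROKES.items()}

if __name__ == "__main__":
    # Print the scripts from most to least stroke-hungry.
    for name, value in sorted(per_segment.items(), key=lambda kv: -kv[1]):
        print(f"{name:<22} {value:.2f} strokes per segment")
```

On these numbers, Zhuyin needs a little over three strokes per segment and Inuktitut syllabics half a stroke, which is roughly a sixfold spread.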
Some writing systems have inordinately complex glyphs to write very simple sounds, like simple vowels in some Brahmic scripts: Malayalam ആ ā (5 strokes), Javanese ꦈꦴ ū (6), Tibetan ཨུ i (7). But this may not necessarily give the system as a whole a high stroke-to-sound ratio – for instance, those three examples all have identical or lower stroke counts if you add a preceding consonant: കാ kā (4), ꦏꦹ kū (6), ཀཱི kī (7). Conversely, some systems may have only quite simple glyphs, but encode very little information in each glyph, requiring more glyphs in total, like Zhuyin/Bopomofo marking each consonant, vowel and tone as a separate glyph – or perhaps a script that marks things like place and manner of articulation, vowel height, phonation, etc., with separate glyphs or markers (if such a script even exists).
This general state of affairs made me wonder – which writing systems have the highest and lowest average stroke counts overall when used to write representative text in a language they are regularly used to write? Or more broadly, which language/writing system pairs are most/least economical in writing strings of comparable phonetic length when it comes to how far the pen will have to travel across the paper?
I realise, of course, that precise numbers are likely impossible here, but rough approximations will do fine as well. If they can be backed up by some sort of data, all the better, though I don’t expect there really is any hard data available.
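One rough way such an approximation could be computed is to give every glyph a stroke count and a segment count, then average over a sample text. The following is a minimal sketch under stated assumptions, not a real dataset: the Inuktitut values follow from the table above (three glyphs, three strokes, each glyph a consonant plus vowel), while the per-letter Latin split is a guess chosen only so that it sums to the 13 strokes given for ⟨kanita⟩. The names `LATIN`, `INUKTITUT` and `average_strokes_per_segment` are hypothetical.

```python
# Each glyph maps to (strokes, segments it encodes).
# Assumption: this per-letter split is illustrative; only the total of 13
# for "kanita" comes from the table above.
LATIN = {"k": (3, 1), "a": (2, 1), "n": (2, 1), "i": (2, 1), "t": (2, 1)}

# Three syllabic glyphs totalling three strokes, each encoding consonant + vowel.
INUKTITUT = {"ᑲ": (1, 2), "ᓂ": (1, 2), "ᑕ": (1, 2)}


def average_strokes_per_segment(text, glyph_table):
    """Average strokes per sound segment for `text`, ignoring whitespace.

    Raises KeyError for a glyph missing from `glyph_table`, so gaps in the
    stroke data are noticed rather than silently skipped.
    """
    total_strokes = 0
    total_segments = 0
    for glyph in text:
        if glyph.isspace():
            continue
        strokes, segments = glyph_table[glyph]
        total_strokes += strokes
        total_segments += segments
    if total_segments == 0:
        raise ValueError("text contains no countable glyphs")
    return total_strokes / total_segments


if __name__ == "__main__":
    print(average_strokes_per_segment("kanita", LATIN))
    print(average_strokes_per_segment("ᑲᓂᑕ", INUKTITUT))
```

Run over a large, representative corpus per language instead of a single made-up word, the same ratio would give the kind of rough figure the question is after. Building the per-glyph stroke tables is the hard part.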
1 ‘Strokes’ are well-defined in CJK writing systems, but not elsewhere. I don’t want to get too bogged down in brushstroke technicalities here, so I’ll use a simplistic definition: a stroke is any continuous, non-intersecting pen movement that does not include sharp corners; so the letter S is one stroke, while ɣ is two (intersection) and Z is three (sharp corners). It can still often be a coin toss whether something should count as one or two strokes, of course.
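The simplistic definition above can be made mechanical for a single continuous pen movement: start at one stroke, then add one for every sharp corner and one for every self-intersection. The sketch below does this for a pen path given as a list of (x, y) points. Several things in it are assumptions rather than part of the definition: the 60-degree threshold for what counts as "sharp", the polyline representation, and the names `count_strokes`, `Z_PATH`, `S_PATH` and `GAMMA_PATH`. The three test shapes are crude stand-ins for Z, S and ɣ.

```python
import math


def _turn_angle(a, b, c):
    """Absolute change of direction, in degrees, at b along a -> b -> c."""
    v1 = (b[0] - a[0], b[1] - a[1])
    v2 = (c[0] - b[0], c[1] - b[1])
    cross = v1[0] * v2[1] - v1[1] * v2[0]
    dot = v1[0] * v2[0] + v1[1] * v2[1]
    return abs(math.degrees(math.atan2(cross, dot)))


def _orient(p, q, r):
    """Signed area test: positive if p -> q -> r turns counter-clockwise."""
    return (q[0] - p[0]) * (r[1] - p[1]) - (q[1] - p[1]) * (r[0] - p[0])


def _properly_cross(p1, p2, p3, p4):
    """True if segments p1-p2 and p3-p4 cross at an interior point of both."""
    return (_orient(p3, p4, p1) * _orient(p3, p4, p2) < 0
            and _orient(p1, p2, p3) * _orient(p1, p2, p4) < 0)


def count_strokes(path, corner_deg=60.0):
    """Strokes in one pen-down path: 1 + sharp corners + self-intersections.

    corner_deg is an assumed threshold, not part of the original definition.
    """
    n_seg = len(path) - 1
    corners = sum(
        1 for i in range(1, n_seg)
        if _turn_angle(path[i - 1], path[i], path[i + 1]) > corner_deg
    )
    crossings = sum(
        1
        for i in range(n_seg)
        for j in range(i + 2, n_seg)  # skip the adjacent segment
        if _properly_cross(path[i], path[i + 1], path[j], path[j + 1])
    )
    return 1 + corners + crossings


# Z: three straight segments joined by two sharp corners.
Z_PATH = [(0, 1), (1, 1), (0, 0), (1, 0)]
# S-like curve: a smoothly sampled sine wave, no corners, no crossings.
S_PATH = [(math.sin(2 * math.pi * k / 40), 2 * math.pi * k / 40) for k in range(41)]
# Gamma-like curve: (t^3 - t, t^2) is smooth and crosses itself once.
GAMMA_PATH = [(t ** 3 - t, t ** 2) for t in (-1.5 + 3.0 * k / 40 for k in range(41))]

if __name__ == "__main__":
    for name, path in [("Z", Z_PATH), ("S", S_PATH), ("gamma", GAMMA_PATH)]:
        print(name, count_strokes(path))
```

A glyph written with several pen lifts would be the sum of `count_strokes` over its separate paths. The coin-toss cases mentioned above show up here as sensitivity to the corner threshold.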