Thursday, 27 May 2021

� or code point blank

Specials, including the replacement character � (U+FFFD), are shunted to the very end of the Basic Multilingual Plane (U+FFF0 to U+FFFF) to act as a substitute for an otherwise unrepresentable glyph (see previously here and here). The garbled text that can result from bad decoding and false rendering is referred to as mojibake (文字化け). Though the effects are most catastrophic across different writing systems, languages that use the extended Latin alphabet, assigned the character set “Western” or ISO-8859-1, encounter problems as well: the Icelandic praise for outstanding hospitality, þjóðlöð, is transmitted as the unintelligible mess Ã¾jÃ³Ã°lÃ¶Ã° or some other character string likely to break one’s computer.
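For the curious, the garbling can be reproduced in a few lines of Python. This is only a sketch under one assumption: that the text was written out as UTF-8 and then read back as ISO-8859-1, which is one common route to mojibake but surely not the only one. The function names garble, repair and decode_with_replacement are made up for the illustration.

```python
# The Icelandic word from the post, spelled with escapes so that each
# letter is a single precomposed code point: þjóðlöð
WORD = "\u00fej\u00f3\u00f0l\u00f6\u00f0"


def garble(text: str) -> str:
    """Encode as UTF-8, then misread the bytes as ISO-8859-1 (assumed mismatch)."""
    return text.encode("utf-8").decode("latin-1")


def repair(mojibake: str) -> str:
    """Undo garble() by reversing the two steps; only works for this exact mismatch."""
    return mojibake.encode("latin-1").decode("utf-8")


def decode_with_replacement(text: str) -> str:
    """The opposite mistake: ISO-8859-1 bytes read as UTF-8.

    The bytes are not valid UTF-8, so the decoder substitutes U+FFFD,
    the replacement character from the Specials block.
    """
    return text.encode("latin-1").decode("utf-8", errors="replace")


if __name__ == "__main__":
    mangled = garble(WORD)
    print(mangled)                        # each accented letter becomes two characters
    print(repair(mangled))                # round-trips back to the original word
    print(decode_with_replacement(WORD))  # contains U+FFFD where bytes were invalid
```

The first mistake is reversible because ISO-8859-1 maps every byte to some character, so nothing is lost. The second is not: once a byte has been swapped for �, the original letter is gone.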