prog

Everything is Unicode, until the exploits started rolling in

38 2021-01-24 19:00 *: >>35
As I see it, the idea is to keep a list of code points (like Unicode but simplified) and then use dictionaries of words to compress semi-rich-text documents. The bloat need not come from the system's built-in dictionaries, since they can be compressed and structured in a backward- and forward-compatible way. The additional dictionaries bundled with each document may add up to a lot, but the total storage efficiency is still to be estimated.
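To make that concrete, here is a rough sketch of how I picture the two-level dictionary working. Everything in it is my own assumption, not a spec: the `SYSTEM_WORDS` list, the `encode`/`decode` names, and the choice to store separators as dictionary entries too. It also uses ordinary Python strings instead of a simplified code point list, and it does no bit packing, so it says nothing about real storage efficiency.

```python
import re

# Hypothetical built-in dictionary that every reader of the format ships with.
SYSTEM_WORDS = ["the", "of", "and", "a", "to", "in", "is"]


def encode(text, system_words=SYSTEM_WORDS):
    """Return (doc_words, tokens).

    Each token is an index into system_words followed by doc_words.
    doc_words is the extra dictionary that would be bundled with the document.
    """
    index = {word: i for i, word in enumerate(system_words)}
    doc_words = []
    tokens = []
    # Split into runs of word characters and runs of everything else,
    # so joining the pieces back together reproduces the text exactly.
    for piece in re.findall(r"\w+|\W+", text):
        if piece not in index:
            index[piece] = len(system_words) + len(doc_words)
            doc_words.append(piece)
        tokens.append(index[piece])
    return doc_words, tokens


def decode(doc_words, tokens, system_words=SYSTEM_WORDS):
    """Rebuild the text from the bundled dictionary and the token list."""
    table = list(system_words) + list(doc_words)
    return "".join(table[t] for t in tokens)
```

Only the words missing from the built-in list end up in the per-document dictionary, which is the part whose total size would have to be measured on real documents.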