Fowler's Law on Unicode: There's always another bug, you just haven't found it yet.
Dr Drang's script counts the number of _characters_, not the number of _glyphs_. This matters because there's more than one way to represent é: either as the single Unicode character \x{e9} ("NFC") or as an "e" followed by the combining character that adds the accent ("NFD").
For example, for "léon" this prints "l3n" for me rather than "l2n", because the decomposed é counts as two characters.
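The miscount is easy to reproduce outside Perl. Here is a minimal Python sketch (not Dr Drang's script; `naive_numeronym` is a hypothetical helper made up for illustration) that counts code points the same way and compares the two forms of "léon":

```python
import unicodedata

# The same visible word in both normalization forms.
word_nfc = unicodedata.normalize("NFC", "l\u00e9on")  # é as one code point
word_nfd = unicodedata.normalize("NFD", word_nfc)     # "e" + U+0301 combining acute


def naive_numeronym(word):
    """First character + number of code points in between + last character."""
    if len(word) < 3:
        return word
    return f"{word[0]}{len(word) - 2}{word[-1]}"


print(len(word_nfc), naive_numeronym(word_nfc))  # 4 l2n
print(len(word_nfd), naive_numeronym(word_nfd))  # 5 l3n
```

Both strings render identically, but the decomposed one has an extra code point, which is where the "3" comes from.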
For fun, I wrote some JavaScript that will numeronymize text, but it will also "de-numeronymize" it again by converting the result back into random words that also match. (If a match can't be found, it returns the original word, and unlike the article, it doesn't handle non-English characters.)
https://www.timpark.org/n10e-s2e-t2t/
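The "de-numeronymize" direction comes down to a lookup from a numeronym to every dictionary word that shares its first letter, last letter and length. Below is a rough Python sketch of that idea, not the code behind the link above: `WORDS` is a made-up four-word list standing in for a real dictionary, and `build_index` and `denumeronymize` are hypothetical names. It only goes in the reverse direction, so when nothing matches it leaves the numeronym as it is.

```python
import random
import re
from collections import defaultdict

# Hypothetical mini word list; a real tool would load a full dictionary.
WORDS = ["accessibility", "applicability", "localization", "locomutation"]


def build_index(words):
    """Map each numeronym (e.g. 'a11y') to all words that produce it."""
    index = defaultdict(list)
    for word in words:
        if len(word) >= 3:
            index[f"{word[0]}{len(word) - 2}{word[-1]}"].append(word)
    return index


def denumeronymize(text, index, rng=random):
    """Replace each letter-digits-letter token with a random matching word."""
    def expand(match):
        candidates = index.get(match.group(0).lower())
        # No match in the word list: keep the token unchanged.
        return rng.choice(candidates) if candidates else match.group(0)

    return re.sub(r"[A-Za-z]\d+[A-Za-z]", expand, text)


index = build_index(WORDS)
print(denumeronymize("a11y l10n x9z", index))
```

Because the choice is random, "a11y" may come back as either "accessibility" or "applicability".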
I dislike numeronyms. They may be shorter to type, but unlike acronyms, where the acronym itself is a valid pronunciation, numeronyms cannot usually be pronounced. The only way to know how is to know what the original word is, so you have to ask every time.
Indeed, a numeronym is strictly worse than an abbreviation made by removing the medial part, e.g. intl'n or acc'y, because it keeps only the first and last characters.
I spent some time staring at l4h, after quickly reading o4e as 'obese' on the way there. I suppose this might be a good Freudian slip generation scheme?
I'm loving the Perl one-liners. I fear it's a dying art!
Tangent:
I worked at a large financial news site for a number of years.
One of our best engineers spun up an "a11y" sub-team. As it was quite involved and they went from team to team doing things, I assumed it was some sort of dev tool initiative.
It was only after I left and I was describing it as the "ally" team that I was told what it meant.
It's like "banal": it's only when you say it out loud amongst (hopefully) friends that you realise you've not got it quite right....
> e14n -> "Andreessen Horowitz" is not a typo, it is a bit of an easter egg/joke (Sorry, I can't help myself.):
> "e14n" has recently shown up in social media as shorthand for @pluralistic's "enshittification" coinage. Andreessen Horowitz often refers to themselves using a numeronym: "a16z".
To fix the miscount from combining characters, normalize to NFC before counting:
> /usr/bin/perl -C -MUnicode::Normalize -pe '$_=NFC($_);s/(.)(.+)(.)/$1 . length($2) . $3/e'
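The same normalize-first idea can be sketched in Python. This is an illustration under my own assumptions rather than a port of the one-liner: `numeronymize` is a hypothetical name, and it shortens each word separately with `\w` instead of treating the whole line as one word.

```python
import re
import unicodedata


def numeronymize(text):
    """Normalize to NFC, then shorten every run of three or more word characters."""
    composed = unicodedata.normalize("NFC", text)
    return re.sub(
        r"(\w)(\w+)(\w)",
        lambda m: f"{m.group(1)}{len(m.group(2))}{m.group(3)}",
        composed,
    )


decomposed = "le\u0301on"  # NFD input: "e" followed by U+0301
print(numeronymize(decomposed))                    # l2n
print(numeronymize("accessibility localization"))  # a11y l10n
```

NFC only helps where a precomposed character exists. A sequence with no single-code-point form still counts as more than one character after normalization.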
Ex: "accessibility localization internationalization multilingualization globalization" becomes "a11y l10n i18n m17n g11n" becomes "applicability locomutation intercrystallization metaphenylenediamin gastrocnemian"
(Lack of an obvious pronunciation is a good objection, tho.)
To this day I don't know what Eli8 means, and I'm not going to even bother to look it up. It's not communication.
ally?
But yes, like every abbreviation, do use the full word the first time you use it: “Considering the topic of internationalisation (i18n)…”
It works ok as long as there’s nobody named Katherine.
> perl -C -pe 's/(\w)(\w+)(\w)/$1 . length($2) . $3/ge'
Or for the less o4e among us, this v5n will only n10e words with l4h six and up:
> perl -C -pe 's/(\w)(\w\w\w\w+)(\w)/$1 . length($2) . $3/ge'
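In the one-liner, the length cut-off is set by the number of `\w` in the middle group. As a hedged Python sketch of the same threshold idea, it can be a parameter instead (`numeronymize_long_words` and `min_length` are names made up here):

```python
import re


def numeronymize_long_words(text, min_length=6):
    """Shorten only words with at least `min_length` characters (min_length >= 3)."""
    middle = min_length - 2  # characters required between first and last
    pattern = re.compile(rf"(\w)(\w{{{middle},}})(\w)")
    return pattern.sub(
        lambda m: f"{m.group(1)}{len(m.group(2))}{m.group(3)}",
        text,
    )


print(numeronymize_long_words("this version will only numeronymize words"))
# this v5n will only n10e words
```

With `min_length=6` the pattern becomes `(\w)(\w{4,})(\w)`, which matches the same words as the four explicit `\w` in the Perl version.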
F3l v5n:
perl -C -pe 's/(\p{L})(\p{L}*)(\p{L})/$1@{[length($2)]}$3/g'
N12g w5t i18n w3d n1t b0e c6e, t2s t2s a u1f-8 c8e v5n. I c2l i0t I16r-v1.0
새0로 오0신 모0든 분1께 인3고 싶2다.
There is no better way to write with long words than to numeronymise them everywhere!
Or for the less obtuse among us, this version will only numeronymise words with length six and up:
[1]: https://megatokyo.com/strip/9
https://mastodon.hccp.org/@igb/112734767519719978