February 15th, 2005

Don’t disable IDN

I couldn’t put it better so I won’t. From Paul Hoffman:

Reading the ensuing Slashdot and other coverage gave me the feeling that nearly everyone talking was from the US, UK, or Australia, the three countries that have the least native need for IDNs.

It also became clear that few of the folks in the discussion knew much about Unicode (and, in some cases, the DNS…). Suggestions like “find all the homographs and map them together” and “ban all domain names that have more than one language in them” reminded me of discussions four years ago with people who were also unfamiliar with the basic topics but felt empowered to speak anyway.

For completeness, I should explain why both of those proposals are silly. The number of homographs in Unicode is in the thousands under the best of situations, and much higher in the worst…

Banning all domain names with more than one “language” would ban names that include both non-ASCII and ASCII characters. This ignores how deeply English and French have mixed with other languages; it is common to find businesses with the word “shop” or “café” in their names throughout the world…

Given that the problem is that domain names with more than one script can cause homograph confusion, the solution should highlight names that have more than one script and say what script the characters come from. This can be done with a hover-over pop-up like this:

idnspoof-art.gif
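
A rough sketch of that idea in Python, not taken from Hoffman's post: label each character of a name with its script and flag names that mix more than one. Python's standard library does not expose the Unicode Script property, so this assumes the first word of a character's Unicode name (LATIN, CYRILLIC, GREEK, …) is a good enough stand-in. The function names `script_of`, `describe_label` and `is_mixed_script` are made up for illustration.

```python
import unicodedata


def script_of(ch):
    """Approximate a character's script from the first word of its Unicode name.

    Assumption: this is a heuristic, not the real Unicode Script property.
    Digits, hyphens and other non-letters are lumped together as "COMMON".
    """
    if not ch.isalpha():
        return "COMMON"
    try:
        return unicodedata.name(ch).split()[0]
    except ValueError:
        # Character has no name in this Python's Unicode database.
        return "UNKNOWN"


def describe_label(label):
    """Return (set of scripts, per-character annotations) for one name."""
    annotations = [(ch, script_of(ch)) for ch in label]
    scripts = {script for _, script in annotations if script != "COMMON"}
    return scripts, annotations


def is_mixed_script(label):
    """True if the name draws its letters from more than one script."""
    scripts, _ = describe_label(label)
    return len(scripts) > 1


if __name__ == "__main__":
    # "p\u0430ypal" uses CYRILLIC SMALL LETTER A in place of the Latin "a".
    for name in ["paypal", "p\u0430ypal", "caf\u00e9"]:
        scripts, annotations = describe_label(name)
        flag = "MIXED" if is_mixed_script(name) else "ok"
        print(flag, sorted(scripts), annotations)
```

Note that “café” is not flagged, since “é” is still a Latin letter, while the spoofed name with a Cyrillic “а” is. A real implementation would need proper script data and some care for writing systems that legitimately combine scripts, such as Japanese; the per-character annotations are what a hover-over would display.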

It is clear that what would be best is that the proposed solutions come from people who have both a reasonable understanding of internationalization and a reasonable amount of care about languages other than English.
