Punycode Encoder/Decoder

Encodes and decodes Punycode, the ASCII encoding used for internationalized domain names (IDNs).


Example: xn----dqo34k.com

Punycode Definition

Punycode, specified in RFC 3492, is a representation of Unicode that uses only the limited ASCII character subset allowed in Internet hostnames. With Punycode, hostnames containing Unicode characters are transcoded to a subset of ASCII consisting of letters, digits, and hyphens, known as the letter–digit–hyphen (LDH) subset.
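
To make the LDH subset concrete, here is a minimal Python sketch that checks whether a label uses only letters, digits, and hyphens. The function name `is_ldh_label` is hypothetical, and the check deliberately ignores other hostname rules (such as length limits), which this page does not cover.

```python
import string

# The LDH subset: ASCII letters, digits, and the hyphen.
LDH_CHARS = set(string.ascii_letters + string.digits + "-")


def is_ldh_label(label: str) -> bool:
    """Return True if the label is non-empty and contains only LDH characters.

    Illustrative helper only: it does not enforce length or hyphen-position rules.
    """
    return bool(label) and all(ch in LDH_CHARS for ch in label)


print(is_ldh_label("xn--mnchen-3ya"))  # already ASCII-only
print(is_ldh_label("münchen"))         # contains "ü", which is outside LDH
```

A label that fails this check is the kind that needs Punycode conversion before it can appear in a hostname.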

How Does Punycode Work?

  1. Unicode Characters: Unicode is a standard for encoding the world's writing systems, including non-Latin scripts such as Arabic, Chinese, and Cyrillic, as well as symbols and other special characters.
  2. ASCII Limitation: The DNS was originally designed to support only a subset of ASCII characters (letters 'a' to 'z', digits '0' to '9', and the hyphen '-'). This means that domain names with characters outside of this range, such as those with accents or in non-Latin scripts, cannot be directly used in DNS queries.
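  3. Punycode Conversion: Punycode encodes these non-ASCII characters as a sequence of ASCII characters. Conversion is applied to each label of the domain name (the parts separated by dots) on its own, and only labels that contain non-ASCII characters are changed. In a domain name, each encoded label starts with the prefix "xn--" to indicate that it contains encoded characters. The rest of the label is produced by a special algorithm that represents the Unicode characters using only ASCII letters, digits, and hyphens.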
  4. Example: Consider the domain name "münchen.de" (Munich in German, where "ü" is a character outside of the ASCII range). The Punycode representation of this domain name would be "xn--mnchen-3ya.de".
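
The conversion above can be tried out with Python's standard library, which includes a `punycode` codec (the raw encoding, without a prefix) and an `idna` codec (which handles each label and adds "xn--"). This is a minimal sketch rather than a full IDN implementation: the helper names `to_ascii` and `to_unicode` are hypothetical, and the built-in `idna` codec follows the older IDNA 2003 rules.

```python
def to_ascii(domain: str) -> str:
    """Convert a Unicode domain name to its ASCII ("xn--") form.

    Uses Python's built-in "idna" codec, which encodes each label separately.
    """
    return domain.encode("idna").decode("ascii")


def to_unicode(domain: str) -> str:
    """Convert an ASCII ("xn--") domain name back to Unicode."""
    return domain.encode("ascii").decode("idna")


# Raw Punycode for a single label: no "xn--" prefix is added at this level.
print("münchen".encode("punycode"))     # b'mnchen-3ya'

# Full domain name: only the label with non-ASCII characters is changed.
print(to_ascii("münchen.de"))           # xn--mnchen-3ya.de
print(to_unicode("xn--mnchen-3ya.de"))  # münchen.de
```

Note that the ASCII characters of the label ("mnchen") are copied through unchanged, and the suffix after the last hyphen ("3ya") encodes which non-ASCII character was removed and where it belongs.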