Character Encodings

This page documents the character encodings supported by the tailor.iconv API in the Function service. Use it as a reference when converting text between character sets, especially for Japanese, Chinese, and Korean text.

Overview

The Function service converts text between character encodings through the tailor.iconv API. It supports more than 100 encodings, with a particular focus on those used by Japanese business systems and enterprise environments.

Unicode

| Encoding Name | Aliases | Description | Character Types |
| --- | --- | --- | --- |
| UTF-8 | UTF8 | Unicode (UTF-8) | Full-width/half-width alphanumeric, kana, kanji, symbols |
| UTF-16 | UTF16 | Unicode (UTF-16) | Full-width/half-width alphanumeric, kana, kanji, symbols |
| UTF-16BE | - | Unicode (UTF-16 Big Endian) | Full-width/half-width alphanumeric, kana, kanji, symbols |
| UTF-16LE | - | Unicode (UTF-16 Little Endian) | Full-width/half-width alphanumeric, kana, kanji, symbols |
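
The UTF-16BE and UTF-16LE entries differ only in byte order. The sketch below is plain JavaScript and does not call tailor.iconv; `utf16Bytes` is a hypothetical helper written only to illustrate how the same 16-bit code units are laid out in each order.

```javascript
// Hypothetical helper (not part of tailor.iconv): lays out the UTF-16 code
// units of a string as bytes in big-endian or little-endian order.
function utf16Bytes(text, littleEndian) {
  const bytes = [];
  for (let i = 0; i < text.length; i++) {
    const unit = text.charCodeAt(i); // one 16-bit code unit
    const high = unit >> 8;
    const low = unit & 0xff;
    if (littleEndian) {
      bytes.push(low, high);
    } else {
      bytes.push(high, low);
    }
  }
  return bytes;
}

// "日" is U+65E5, a single code unit.
console.log(utf16Bytes("日", false)); // [ 101, 229 ], i.e. 0x65 0xE5
console.log(utf16Bytes("日", true));  // [ 229, 101 ], i.e. 0xE5 0x65
```

When the receiving system does not state its byte order, naming UTF-16BE or UTF-16LE explicitly avoids relying on whatever default the plain UTF-16 name uses.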

Japanese Character Encodings

Shift_JIS Family Encodings

| Encoding Name | Aliases | Description | Character Types |
| --- | --- | --- | --- |
| Shift_JIS | SHIFT_JIS, SJIS | Shift JIS code | Half-width alphanumeric, half-width katakana, full-width kana, JIS Level 1 & 2 kanji |
| SJIS | - | Shift_JIS alias | Half-width alphanumeric, half-width katakana, full-width kana, JIS Level 1 & 2 kanji |
| CP932 | Windows-31J, MS932 | Microsoft extended Shift_JIS | Half-width alphanumeric, half-width katakana, full-width kana, JIS Level 1 & 2 kanji, NEC special characters, IBM extended characters |

EUC-JP Family Encodings

| Encoding Name | Aliases | Description | Character Types |
| --- | --- | --- | --- |
| EUC-JP | EUCJP, eucJP | Japanese EUC code | Half-width alphanumeric, full-width kana, JIS Level 1 & 2 kanji |
| EUC-JP-MS | eucJP-ms | Microsoft extended EUC-JP | Half-width alphanumeric, full-width kana, JIS Level 1 & 2 kanji, NEC special characters, IBM extended characters |

ISO-2022 Family Encodings

| Encoding Name | Aliases | Description | Character Types |
| --- | --- | --- | --- |
| ISO-2022-JP | ISO2022JP, JIS | JIS code (7-bit) | ASCII, JIS Roman, half-width katakana, JIS Level 1 & 2 kanji |

Enterprise System Encodings

IBM EBCDIC Japanese Encodings

| Encoding Name | Aliases | Description | Character Types |
| --- | --- | --- | --- |
| IBM037 | EBCDIC-CP-US, CP037 | EBCDIC US/Canada | Alphanumeric |
| IBM290 | EBCDIC-JP-KANA, CP290 | Japanese EBCDIC katakana | Alphanumeric, katakana |
| IBM930 | CP930 | Japanese EBCDIC (kanji) | Alphanumeric, katakana, hiragana, JIS Level 1 & 2 kanji |
| IBM939 | CP939 | Japanese EBCDIC extended | Alphanumeric, katakana, hiragana, JIS Level 1 & 2 kanji, extended characters |
| IBM943 | CP943 | Japanese PC code | Half-width alphanumeric, half-width katakana, full-width kana, JIS Level 1 & 2 kanji |
| EBCDIC-JP-E | - | Japanese EBCDIC alphanumeric katakana | Alphanumeric, katakana |
| EBCDIC-JP-KANA | - | Japanese EBCDIC katakana | Alphanumeric, katakana |

Enterprise System Aliases

| Alias Name | Actual Encoding | Description | Character Types |
| --- | --- | --- | --- |
| HitachiKEIS83 | IBM290 | Hitachi KEIS83 compatible | Alphanumeric, katakana |
| HitachiKEIS90 | IBM290 | Hitachi KEIS90 compatible | Alphanumeric, katakana |
| NECJIPS | IBM290 | NEC JIPS compatible | Alphanumeric, katakana |
| NECJIS | IBM290 | NEC JIS compatible | Alphanumeric, katakana |
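
Because every enterprise alias maps to IBM290, it can be useful to record the underlying encoding when logging or validating a conversion. The following is a minimal sketch built only from the alias table; `ENTERPRISE_ALIASES` and `resolveEnterpriseAlias` are hypothetical names, and the service resolves these aliases itself, so this lookup is not required for conversion.

```javascript
// Illustrative lookup copied from the alias table; not part of tailor.iconv.
const ENTERPRISE_ALIASES = {
  HitachiKEIS83: "IBM290",
  HitachiKEIS90: "IBM290",
  NECJIPS: "IBM290",
  NECJIS: "IBM290",
};

// Hypothetical helper: returns the encoding an alias maps to, or the name
// unchanged when it is not one of the enterprise aliases.
function resolveEnterpriseAlias(name) {
  return Object.prototype.hasOwnProperty.call(ENTERPRISE_ALIASES, name)
    ? ENTERPRISE_ALIASES[name]
    : name;
}

console.log(resolveEnterpriseAlias("HitachiKEIS83")); // IBM290
console.log(resolveEnterpriseAlias("UTF-8"));         // UTF-8
```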

Chinese Encodings

| Encoding Name | Aliases | Description | Character Types |
| --- | --- | --- | --- |
| GB2312 | EUC-CN | Simplified Chinese (basic) | ASCII, simplified Chinese |
| GBK | CP936 | Simplified Chinese (extended) | ASCII, simplified Chinese (extended) |
| GB18030 | - | Simplified Chinese (complete) | ASCII, simplified Chinese (all characters) |
| Big5 | BIG5 | Traditional Chinese | ASCII, traditional Chinese |
| BIG5HKSCS | Big5-HKSCS | Hong Kong extended traditional Chinese | ASCII, traditional Chinese, Hong Kong additional characters |

Korean Encodings

| Encoding Name | Aliases | Description | Character Types |
| --- | --- | --- | --- |
| EUC-KR | EUCKR | Korean EUC | ASCII, Korean |
| UHC | CP949 | Unified Hangul Code | ASCII, Korean (extended) |
| JOHAB | - | Combinatorial Hangul | ASCII, Korean (combinatorial) |
| ISO-2022-KR | ISO2022KR | Korean ISO-2022 | ASCII, Korean characters |

Other Languages

| Encoding Name | Aliases | Description | Character Types |
| --- | --- | --- | --- |
| ISO-8859-1 | Latin-1 | Western European | ASCII, Western European characters |
| ASCII | US-ASCII | American Standard Code | Basic ASCII characters (0-127) |

API Usage Examples

Basic Conversion

    // UTF-8 to Shift_JIS conversion
    const sjisData = tailor.iconv.convert("日本語テキスト", "UTF-8", "Shift_JIS");

    // EUC-JP to UTF-8 conversion
    const utf8Text = tailor.iconv.convert(eucjpData, "EUC-JP", "UTF-8");
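
One way to check whether a target encoding can represent a given text is to convert it and then convert it back. The sketch below assumes that the value returned by a conversion can be passed back in as input and that the UTF-8 result compares as a string; neither is confirmed by this page. `survivesRoundTrip` is a hypothetical helper, and the converter is passed in as a parameter so the example does not depend on the Function runtime.

```javascript
// Hypothetical helper: converts text to `encoding` and back, then reports
// whether the result matches the input. Inside a Function, `convert` could be
// (t, from, to) => tailor.iconv.convert(t, from, to).
function survivesRoundTrip(convert, text, encoding) {
  try {
    const encoded = convert(text, "UTF-8", encoding);
    const decoded = convert(encoded, encoding, "UTF-8");
    return decoded === text;
  } catch (err) {
    // Assumption: a failed conversion may throw; treat that as "not safe".
    return false;
  }
}

// Stand-in converter for illustration only: pretends that an ASCII target
// replaces every non-ASCII character with "?".
const fakeConvert = (t, from, to) =>
  to === "ASCII" ? t.replace(/[^\x00-\x7F]/g, "?") : t;

console.log(survivesRoundTrip(fakeConvert, "Hello", "ASCII"));      // true
console.log(survivesRoundTrip(fakeConvert, "Hello 世界", "ASCII")); // false
```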

Using Enterprise Aliases

    // HitachiKEIS83 (actually IBM290) conversion
    const keisData = tailor.iconv.convert("カタカナ", "UTF-8", "HitachiKEIS83");

Custom Replacement Characters

    // Replace unconvertible characters with asterisk
    const asciiText = tailor.iconv.convert("Hello 世界!", "UTF-8", "ASCII//TRANSLIT:*");
    // Result: "Hello **!"

    // Custom string replacement
    const asciiText2 = tailor.iconv.convert("Test 日本語", "UTF-8", "ASCII//TRANSLIT:[?]");
    // Result: "Test [?][?][?]"

    // Underscore replacement
    const asciiText3 = tailor.iconv.convert("abc世界xyz", "UTF-8", "ASCII//TRANSLIT:_");
    // Result: "abc__xyz"

TRANSLIT with Special Characters

    // Convert special characters using TRANSLIT
    const specialText = "abc ß α € àḃç";

    // Default TRANSLIT (uses ? for unconvertible characters)
    const result1 = tailor.iconv.convert(specialText, "UTF-8", "ASCII//TRANSLIT");
    // Result: "abc ? ? ? ???"

    // Custom replacement with asterisk
    const result2 = tailor.iconv.convert(specialText, "UTF-8", "ASCII//TRANSLIT:*");
    // Result: "abc * * * ***"

    // Custom replacement with underscore
    const result3 = tailor.iconv.convert(specialText, "UTF-8", "ASCII//TRANSLIT:_");
    // Result: "abc _ _ _ ___"

Checking Available Encodings

    // Get all Japanese-related encodings
    // (943 only: CP949 is the Korean UHC alias, so it must not match here)
    const allEncodings = tailor.iconv.encodings();
    const jpEncodings = allEncodings.filter(enc =>
        enc.match(/JP|JIS|Shift|KEIS|93[029]|943|EBCDIC.*JP/i)
    );
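
The same list can be used to validate an encoding name before attempting a conversion. This is a minimal sketch: `isSupportedEncoding` is a hypothetical helper, and matching names case-insensitively is an assumption rather than documented behavior. Any flag suffix such as //TRANSLIT is stripped before the comparison.

```javascript
// Hypothetical helper: checks a name against a list of encoding names such as
// the one returned by tailor.iconv.encodings().
function isSupportedEncoding(name, encodings) {
  // Drop any "//FLAG" suffix, then compare without regard to case (assumption).
  const base = name.split("//")[0].toUpperCase();
  return encodings.some((enc) => enc.toUpperCase() === base);
}

// Stand-in list for illustration; in a Function, pass tailor.iconv.encodings().
const sampleEncodings = ["UTF-8", "Shift_JIS", "EUC-JP", "ASCII"];
console.log(isSupportedEncoding("shift_jis", sampleEncodings));       // true
console.log(isSupportedEncoding("ASCII//TRANSLIT", sampleEncodings)); // true
console.log(isSupportedEncoding("KOI8-R", sampleEncodings));          // false
```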

Error Handling

The tailor.iconv API supports special flags, appended to the target encoding name, for handling characters that cannot be converted:

  • //IGNORE: Silently skips characters that cannot be converted
  • //TRANSLIT: Attempts to transliterate characters to similar ones in the target encoding (default replacement: ?)
  • //TRANSLIT:char: Extension that replaces unconvertible characters with a custom character or string
    • Example: ASCII//TRANSLIT:* replaces unconvertible characters with *
    • Example: ASCII//TRANSLIT:[?] replaces unconvertible characters with [?]
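
The flag syntax can be assembled programmatically instead of concatenating strings by hand. The sketch below is only an illustration of the format described in the list; `targetWithFlag` is a hypothetical helper and is not part of tailor.iconv.

```javascript
// Hypothetical helper: builds a target-encoding string with an optional flag
// and, for TRANSLIT only, an optional custom replacement.
function targetWithFlag(encoding, flag, replacement) {
  if (flag === undefined) return encoding;
  if (flag !== "IGNORE" && flag !== "TRANSLIT") {
    throw new Error("Unknown flag: " + flag);
  }
  if (replacement !== undefined && flag !== "TRANSLIT") {
    throw new Error("A replacement is only valid with TRANSLIT");
  }
  const suffix = replacement === undefined ? "" : ":" + replacement;
  return encoding + "//" + flag + suffix;
}

console.log(targetWithFlag("ASCII", "IGNORE"));          // ASCII//IGNORE
console.log(targetWithFlag("ASCII", "TRANSLIT"));        // ASCII//TRANSLIT
console.log(targetWithFlag("ASCII", "TRANSLIT", "[?]")); // ASCII//TRANSLIT:[?]
```

The resulting string would then be passed as the target encoding argument of tailor.iconv.convert.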

Best Practices

  1. Encoding Selection

    • For text with kanji: Use UTF-8, Shift_JIS, EUC-JP, or IBM930/939
    • For katakana-only text: IBM290 is also available
    • For email transmission: ISO-2022-JP is recommended

  2. Preventing Character Corruption

    • Use the same encoding for both sender and receiver
    • Specify the source encoding accurately

  3. Error Handling

    • Use //IGNORE to skip unconvertible characters
    • Use //TRANSLIT for character substitution
    • Use custom replacement characters for specific requirements
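
The encoding-selection guidance can be sketched as a simple script check. This is an illustration under stated assumptions, not a rule the service applies: `suggestEncodings` is a hypothetical helper, the returned names repeat the recommendations in this section, and "katakana-only" is interpreted here as text made up solely of ASCII and katakana characters.

```javascript
// Encodings this section recommends for text that contains kanji.
const KANJI_CAPABLE = ["UTF-8", "Shift_JIS", "EUC-JP", "IBM930", "IBM939"];

// Hypothetical helper: suggests candidate encodings for a piece of text.
function suggestEncodings(text) {
  // True when every character is ASCII or katakana (including the
  // long-vowel mark, which Unicode shares between hiragana and katakana).
  const katakanaOnly = /^[\x00-\x7F\p{Script_Extensions=Katakana}]*$/u.test(text);
  return katakanaOnly ? [...KANJI_CAPABLE, "IBM290"] : KANJI_CAPABLE.slice();
}

console.log(suggestEncodings("カタカナ").includes("IBM290"));       // true
console.log(suggestEncodings("日本語テキスト").includes("IBM290")); // false
```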

Related Resources