RFC1557 Korean Character Encoding for Internet Messages

1557 Korean Character Encoding for Internet Messages. U. Choi, K.Chon, H. Park. December 1993. (Format: TXT=8736 bytes) (Status: INFORMATIONAL)

Network Working Group                                            U. Choi
Request for Comments: 1557                                       K. Chon
Category: Informational                                            KAIST
                                                                 H. Park
                                                     Solvit Chosun Media
                                                           December 1993


            Korean Character Encoding for Internet Messages

Status of this Memo

   This memo provides information for the Internet community.  This memo
   does not specify an Internet standard of any kind.  Distribution of
   this memo is unlimited.

Introduction

   This document describes the encoding method being used to represent
   Korean characters in both header and body part of the Internet mail
   messages [RFC822].  This encoding method was specified in 1991 and
   has been in use ever since.  It is now widely used in Korean IP
   networks.

   This document also describes the name of the encoding method which is
   to be used in order to match the message header and body format of
   MIME [MIME1, MIME2].

   This document describes only the encoding method for plain text.
   Other text subtypes, rich text and similar forms of text, are beyond
   the scope of this document.

Description

   It is assumed that the starting code of the message is ASCII.  ASCII
   and Korean characters can be distinguished by use of the shift
   function.  For example, the code SO will alert us that the upcoming
   bytes will be a Korean character as defined in KSC 5601.  To return
   to ASCII the SI code is used.

   Therefore, the escape sequence, shift function and character set used
   in a message are as follows:

           SO           KSC 5601
           SI           ASCII
           ESC $ ) C    Appears once in the beginning of a line
                            before any appearance of SO characters.




Choi, Chon & Park                                               [Page 1]

RFC 1557               Korean Character Encoding           December 1993


   The KSC 5601 [KSC5601] character set that includes Hangul, Hanja
   (Chinese ideographic characters), graphic and foreign characters,
   etc., is two bytes long for each character.

   For more information about Korean character sets, please refer to
   the KSC 5601-1987 document [KSC5601].  For more detailed information
   about the escape sequence and the shift function, see ISO 2022
   [ISO2022].
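
   As an illustration only, the shift scheme described in this section
   can be sketched in Python.  The function name encode_iso2022_kr is
   hypothetical.  The sketch assumes that Python's euc_kr codec is an
   acceptable source of KSC 5601 code points (the 7-bit form is obtained
   by clearing the high bit of each EUC-KR byte) and that emitting the
   designator at the very start of the message satisfies the
   beginning-of-line rule.

```python
ESC, SO, SI = b"\x1b", b"\x0e", b"\x0f"
DESIGNATOR = ESC + b"$)C"  # ESC $ ) C


def encode_iso2022_kr(text: str) -> bytes:
    """Encode text as 7-bit bytes using the SO/SI scheme (illustrative sketch)."""
    out = bytearray()
    # Assumption: the designator is written once, at the start of the
    # message, and only when a non-ASCII character is present.
    if any(ord(ch) > 0x7F for ch in text):
        out += DESIGNATOR
    shifted = False  # True while we are between SO and SI
    for ch in text:
        if ord(ch) < 0x80:
            if ch in "\x1b\x0e\x0f":
                raise ValueError("ESC, SO and SI may not appear as text")
            if shifted:
                out += SI  # return to ASCII, also before any CR or LF
                shifted = False
            out += ch.encode("ascii")
        else:
            two = ch.encode("euc_kr")  # raises if the character is unmappable
            if len(two) != 2:
                raise ValueError("not a single two-byte KSC 5601 character")
            if not shifted:
                out += SO
                shifted = True
            out += bytes(b & 0x7F for b in two)  # clear the 8th bit
    if shifted:
        out += SI  # never leave the message shifted out
    return bytes(out)
```

   Because CR and LF are ASCII, the sketch always emits SI before the
   end of a line, so every line starts in ASCII.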

Formal Syntax

   Where this document in its formal syntax does not agree with the
   description part, priority should be given to the formal syntax of
   the document.

   The notations used in this section of the document are according to
   those used in STD 11, RFC 822 [RFC822] with the same meaning.

        * (asterisk) has the following meaning:
             l*m "anything"

   The above means that "anything" has to be used at least l times and
   at most m times.  Default values for l and m are 0 and infinity,
   respectively.

   body            = *e-line *1( designator *( e-line / h-line ))

   designator      = ESC "$" ")" "C"

   e-line          = *text CRLF

   h-line          = *text 1*( segment *text ) CRLF




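   segment         = SO 1*( one-of-94 one-of-94 ) SI

                                               ; ( Octal, Decimal.)

   ESC             = <ISO 2022 ESC, escape>     ; (    33,      27.)

   SO              = <ASCII SO, shift out>      ; (    16,      14.)

   SI              = <ASCII SI, shift in>       ; (    17,      15.)

   SP              = <ASCII SP, space>          ; (    40,      32.)

   one-of-94       = <any one of 94 values>     ; (41-176, 33.-126.)

   CHAR            = <any ASCII character>      ; ( 0-177,  0.-127.)

   text            = <any CHAR, including bare CR & bare LF, but NOT
                      including CRLF, and not including ESC, SI, SO>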
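
   The grammar in this section can be checked mechanically.  The
   following Python sketch translates the rules into one regular
   expression; it is an illustration, not a reference implementation.
   The name is_valid_body is hypothetical, and the sketch assumes that
   "text" is any 7-bit character other than ESC, SO and SI, with CR
   allowed only when it is not followed by LF.

```python
import re

# text: any 7-bit CHAR except ESC (0x1B), SO (0x0E), SI (0x0F);
# a CR is accepted only when it is not the start of a CRLF pair.
TEXT = rb"(?:[\x00-\x0c\x10-\x1a\x1c-\x7f]|\r(?!\n))"

# e-line = *text CRLF
E_LINE = TEXT + rb"*\r\n"

# segment = SO 1*( one-of-94 one-of-94 ) SI
SEGMENT = rb"\x0e(?:[\x21-\x7e]{2})+\x0f"

# h-line = *text 1*( segment *text ) CRLF
H_LINE = TEXT + rb"*(?:" + SEGMENT + TEXT + rb"*)+\r\n"

# designator = ESC "$" ")" "C"
DESIGNATOR = rb"\x1b\$\)C"

# body = *e-line *1( designator *( e-line / h-line ) )
BODY = re.compile(
    rb"(?:" + E_LINE + rb")*"
    rb"(?:" + DESIGNATOR + rb"(?:" + E_LINE + rb"|" + H_LINE + rb")*)?"
)


def is_valid_body(data: bytes) -> bool:
    """Return True if data matches the body rule as transcribed above."""
    return BODY.fullmatch(data) is not None
```

   Under these assumptions a body is rejected when SO appears before
   the designator, when a segment holds an odd number of bytes, or when
   a line ends while still shifted out.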

MIME and RFC 1522 Considerations

   The name to be used for the Hangul encoding scheme in the contents is
   "ISO-2022-KR".  This name when used in MIME message form would be:

                Content-Type: text/plain; charset=iso-2022-kr

   Because the Hangul encoding is inherently a 7-bit format, the
   Content-Transfer-Encoding header does not need to be used.  Note,
   however, that current Hangul message software does not support
   Base64 or Quoted-Printable encoding applied to messages that are
   already encoded in this way.
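
   To illustrate how a receiver might act on the charset label, the
   following Python sketch parses a message that carries no
   Content-Transfer-Encoding header and decodes its 7-bit body by hand.
   The names decode_iso2022_kr and body_text are hypothetical, and the
   sketch assumes that setting the high bit of each shifted-out byte
   yields a pair that Python's euc_kr codec can decode.

```python
import email

SO, SI = 0x0E, 0x0F
DESIGNATOR = b"\x1b$)C"  # ESC $ ) C


def decode_iso2022_kr(data: bytes) -> str:
    """Decode a 7-bit ISO-2022-KR body into text (illustrative sketch)."""
    out = []
    designated = False  # has the designator been seen?
    shifted = False     # are we between SO and SI?
    i = 0
    while i < len(data):
        if data.startswith(DESIGNATOR, i):
            designated = True
            i += len(DESIGNATOR)
        elif data[i] == SO:
            if not designated:
                raise ValueError("SO before the designator")
            shifted = True
            i += 1
        elif data[i] == SI:
            shifted = False
            i += 1
        elif shifted:
            pair = data[i:i + 2]
            if len(pair) != 2 or not all(0x21 <= b <= 0x7E for b in pair):
                raise ValueError("malformed two-byte character")
            # Set the 8th bit to obtain the EUC-KR form of the character.
            out.append(bytes(b | 0x80 for b in pair).decode("euc_kr"))
            i += 2
        else:
            if data[i] > 0x7F:
                raise ValueError("8-bit byte in a 7-bit body")
            out.append(chr(data[i]))
            i += 1
    return "".join(out)


def body_text(raw_message: bytes) -> str:
    """Parse a single-part message and decode its ISO-2022-KR body."""
    msg = email.message_from_bytes(raw_message)
    if msg.get_content_charset() != "iso-2022-kr":
        raise ValueError("not labelled as ISO-2022-KR")
    return decode_iso2022_kr(msg.get_payload(decode=True))
```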

   Hangul in the header part of the message is encoded in Korean EUC
   [EUC-KR].  In the EUC-KR encoding, bytes with the 8th bit set are
   recognized as KSC 5601 characters.  To use Hangul in the header
   part, the EUC-KR encoded Hangul is further "B" or "Q" encoded
   according to the method proposed in RFC 1522.  When doing so, the
   charset name to be used is EUC-KR.
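
   A header value built this way might be produced as in the following
   Python sketch, which uses the "B" encoding.  The function name
   encode_header_word is hypothetical.  The sketch handles only text
   that fits in a single encoded-word, which RFC 1522 limits to 75
   characters, and does not attempt to split longer text.

```python
import base64


def encode_header_word(text: str) -> str:
    """Build one RFC 1522 encoded-word with charset EUC-KR and "B" encoding."""
    raw = text.encode("euc_kr")  # 8-bit EUC-KR bytes, not the SO/SI form
    word = "=?EUC-KR?B?" + base64.b64encode(raw).decode("ascii") + "?="
    if len(word) > 75:
        # Splitting into several encoded-words is left out of this sketch.
        raise ValueError("encoded-word longer than 75 characters")
    return word
```

   The result can be placed in a header field such as Subject, for
   example "Subject: " followed by encode_header_word(title).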

Background Information

   The Hangul encoding system is based on the ISO 2022 [ISO2022]
   environment according to its 4/4 announcement.  However, the Hangul
   encoding does not include the announcement's escape sequence.

   The KSC 5601 used in this document is, in definition, identical to
   the KSC 5601-1987, KSC 5601-1989 and KSC 5601-1992's 94x94 octet
   definition.  Therefore, any revision that refers to KSC-5601 after
   1992 is to be considered as having the same meaning.
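
   The 94x94 layout relates the 7-bit form used in message bodies to
   the 8-bit EUC-KR form used in headers.  The Python sketch below
   shows the arithmetic; the function names and the "row" and "cell"
   terminology (positions 1 to 94 in each dimension) are assumptions
   made for illustration and are not defined by this memo.

```python
from typing import Tuple


def to_row_cell(pair: bytes) -> Tuple[int, int]:
    """Map a 7-bit or EUC-KR byte pair to (row, cell), each in 1..94."""
    b1, b2 = pair[0] & 0x7F, pair[1] & 0x7F  # ignore the 8th bit
    if not (0x21 <= b1 <= 0x7E and 0x21 <= b2 <= 0x7E):
        raise ValueError("bytes outside the 94x94 range")
    return b1 - 0x20, b2 - 0x20


def from_row_cell(row: int, cell: int, eight_bit: bool = False) -> bytes:
    """Map (row, cell) back to a byte pair; eight_bit selects the EUC-KR form."""
    if not (1 <= row <= 94 and 1 <= cell <= 94):
        raise ValueError("row and cell must be in 1..94")
    high = 0x80 if eight_bit else 0x00
    return bytes([(row + 0x20) | high, (cell + 0x20) | high])
```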

   The present Hangul encoding system draws on experience with
   "N-Byte Hangul", which was formerly in wide use among UNIX users.
   In fact, "N-Byte Hangul", an encoding method that also uses SO and
   SI, was the encoding method used in SDN before KSC 5601 was made a
   national standard.

   This code is intended to be used for the information interchange of
   Hangul messages; any other use of the code is not considered
   appropriate.






References

   [ASCII] American National Standards Institute, "Coded character set
           -- 7-bit American national standard code for information
           interchange", ANSI X3.4-1968.

   [ISO2022] International Organization for Standardization (ISO),
             "Information processing -- ISO 7-bit and 8-bit coded
             character sets -- Code extension techniques",
             International Standard, 1986, Ref. No. ISO 2022-1986 (E).

   [KSC5601] Korea Industrial Standards Association, "Code for
             Information Interchange (Hangul and Hanja)," Korean
             Industrial Standard, 1987, Ref. No. KS C 5601-1987.

   [EUC-KR] Korea Industrial Standards Association, "Hangul Unix
            Environment," Korean Industrial Standard, 1992, Ref. No.
            KS C 5861-1992.

   [RFC822] Crocker, D., "Standard for the Format of ARPA Internet
            Text Messages", STD 11, RFC 822, UDEL, August 1982.

   [MIME1] Borenstein, N., and N. Freed, "MIME (Multipurpose
           Internet Mail Extensions): Part One: Mechanisms for
           Specifying and Describing the Format of Internet Message
           Bodies", RFC 1521, Bellcore, Innosoft, September 1993.

   [MIME2] Moore, K., "MIME (Multipurpose Internet Mail Extensions)
           Part Two: Message Header Extensions for Non-ASCII Text",
           RFC 1522, University of Tennessee, September 1993.

Security Considerations

   Security issues are not discussed in this memo.

Acknowledgments

   The authors wish to thank all the people who assisted in writing
   this document.  In particular, we thank Erik von der Poel, Felix M.
   Villarreal, Ienup Sung, Kyoung Namgoong, and Kyuho Kim.













Authors' Addresses

   Uhhyung Choi
   Korea Advanced Institute of Science and Technology
   Department of Computer Science
   Taejon, 305-701, Republic of Korea

   Phone: +82-42-869-8718
   Fax: +82-42-869-3510
   EMail: uhhyung@kaist.ac.kr


   Kilnam Chon
   Korea Advanced Institute of Science and Technology
   Department of Computer Science
   Taejon, 305-701, Republic of Korea

   Phone: +82-42-869-3514
   Fax: +82-42-869-3510
   EMail: chon@cosmos.kaist.ac.kr


   Hyunje Park
   Solvit Chosun Media, Inc.
   748-16 Yeoksam-Dong, Kangnam-Gu
   Seoul, 135-080, Republic of Korea

   Phone: +82-2-561-0361
   Fax: +82-2-569-4847
   EMail: hjpark@dino.media.co.kr




















