Page last modified 07:59, 26 Nov 2013 by juhako

Encoding


    In the context of a text file (such as an XML file), an encoding is a mapping between binary codes and characters. For example, the character a is represented by the same binary value in the common encodings, as the table below shows (US-ASCII is a 7-bit code, so its value is written with seven bits):

    Binary value Encoding Character
    0110 0001 ISO 8859-1 a
    0110 0001 ISO 8859-15 a
    0110 0001 UTF-8 a
     110 0001 US-ASCII a
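
    The table above can be reproduced with a short Python sketch. It relies only on Python's built-in codecs; the helper name bits and the variable encoded_a are illustrative and not part of the original page.

```python
# Encode the character "a" with each encoding from the table and show its bits.
ENCODINGS = ["iso-8859-1", "iso-8859-15", "utf-8", "us-ascii"]


def bits(data: bytes) -> str:
    """Format bytes as space-separated nibbles, e.g. b'a' -> '0110 0001'."""
    return " ".join(f"{b >> 4:04b} {b & 0x0F:04b}" for b in data)


# Map each encoding name to the bytes it produces for "a".
encoded_a = {name: "a".encode(name) for name in ENCODINGS}

for name, data in encoded_a.items():
    print(f"{bits(data)}  {name}")
```

    Every encoding yields the single byte 0x61. The sketch always prints eight bits, so the US-ASCII value appears with a leading zero.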

     

    Problems can arise when the same binary code represents different characters in different encodings. As the table below shows, a file may contain invalid binary values if its actual encoding differs from the one given in the XML declaration.

    Binary value Encoding Character
    1010 0100 ISO 8859-1 ¤
    1010 0100 ISO 8859-15 €
    1010 0100 UTF-8 invalid on its own (continuation byte without a lead byte)
    1010 0100 US-ASCII invalid (outside the 7-bit range)
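
    The mismatch can be demonstrated with a minimal Python sketch that decodes the byte 1010 0100 (0xA4) under each encoding, then parses two tiny XML documents that differ only in their declared encoding. The helper names decode_or_invalid and parses, and the element name amount, are hypothetical and chosen for illustration only.

```python
import xml.etree.ElementTree as ET

RAW = bytes([0b10100100])  # the single byte 0xA4 from the table
ENCODINGS = ["iso-8859-1", "iso-8859-15", "utf-8", "us-ascii"]


def decode_or_invalid(data: bytes, encoding: str) -> str:
    """Decode data, or return 'invalid' if the bytes are illegal in that encoding."""
    try:
        return data.decode(encoding)
    except UnicodeDecodeError:
        return "invalid"


results = {name: decode_or_invalid(RAW, name) for name in ENCODINGS}
for name, text in results.items():
    # !a prints non-ASCII characters as escapes so the output is terminal-safe.
    print(f"{name}: {text!a}")


def parses(xml_bytes: bytes) -> bool:
    """Return True if the document is accepted by the XML parser."""
    try:
        ET.fromstring(xml_bytes)
        return True
    except ET.ParseError:
        return False


# Same payload byte, different declared encodings.
DOC_LATIN1 = b'<?xml version="1.0" encoding="ISO-8859-1"?><amount>\xa4</amount>'
DOC_UTF8 = b'<?xml version="1.0" encoding="UTF-8"?><amount>\xa4</amount>'

print("declared ISO-8859-1 parses:", parses(DOC_LATIN1))
print("declared UTF-8 parses:", parses(DOC_UTF8))
```

    The same byte decodes to the currency sign under ISO 8859-1 and to the euro sign under ISO 8859-15, and is rejected by UTF-8 and US-ASCII. The document that declares UTF-8 but carries the lone 0xA4 byte is rejected by the parser.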

     

     

    ISO 20022 restricts character set usage to UTF-8 only, on the grounds that it is the most efficient (length-wise) way to transport characters; see http://www.iso20022.org/FAQ.page.
