[Pharo-project] XML packages, CDATA and encoding

Cédrick Béler cdrick65 at gmail.com
Tue Jan 11 20:04:28 EST 2011





On 11 Jan 2011, at 20:20, Henrik Sperre Johansen <henrik.s.johansen at veloxit.no> wrote:

> On 11.01.2011 16:58, Cédrick Béler wrote:
>>> 
>>>> Create a file "test.xml" with the following
>>>> contents (note the German umlaut):
>>>> 
>>>> 
>>>>  <?xml version="1.0" encoding="iso-8859-1"?>
>>>>  <test><![CDATA[Zaunkönig]]></test>
>>>> 
>>>> After loading ConfigurationOfXML try to parse it:
>>>> 
>>>> |fs|
>>>> fs := FileStream fileNamed: 'test.xml'.
>>>> XMLDOMParser parseDocumentFrom: 

Actually, I tried it from a file, and it works too.
Pharo 1.1, Cog, OS X, recent version of XML support (from SqueakSource)

Cheers,

Cédrick





>>>> 
>>>> 
>>>> =>  gives an error: 'Invalid utf8 input detected'
>>>> =>  it works if you remove the CDATA section
>>>> 
>>>> Looks like UTF8TextConverter is used independently
>>>> of the encoding declared in the XML...
>>>> 
>>> The problem does not seem to be the XML parser. If you use FileStream class>>fileNamed:, the call is delegated to FileStream class>>concreteStream, which answers MultiByteFileStream. This stream initializes itself with the UTF-8 converter unless a converter is set explicitly.
>>> 
>>> Besides that, I'm not sure whether the XML parser works correctly even when the stream is properly set up for Latin-1 encoding.
>>> 
>>> Norbert
>> Yes, I think the same, since the following works without problems (note that I have the latest SqueakSource version of the XML-related packages):
>> 
>> string :=  '<?xml version="1.0" encoding="iso-8859-1"?>
>>   <test><![CDATA[Zaunkönig]]></test>'.
>> 
>> XMLDOMParser parseDocumentFrom: string readStream.
>> 
>> hth,
>> 
>> Cédrick
> Of course it is the job of the parser:
> http://www.w3.org/TR/REC-xml/#charencoding
> 
> The XMLSupport package is oblivious to this, however, and only works on internal streams.
> 
> Cheers,
> Henry
> 
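
If the cause is the default UTF-8 converter on the file stream, as Norbert suggests above, one possible workaround is to set a Latin-1 converter on the stream before handing it to the parser. This is only a minimal, untested sketch. It assumes that FileStream answers a MultiByteFileStream that understands converter:, and that Latin1TextConverter is present in the image:

```smalltalk
| fs doc |
"Open the file read-only; assumed to answer a MultiByteFileStream."
fs := FileStream readOnlyFileNamed: 'test.xml'.
"Assumption: replace the default UTF-8 converter with Latin-1 (ISO-8859-1)."
fs converter: Latin1TextConverter new.
"Parse, and close the stream even if parsing fails."
doc := [XMLDOMParser parseDocumentFrom: fs]
	ensure: [fs close].
```

This only sidesteps the default converter for a file whose encoding is known in advance. It does not make the parser honour the encoding declaration itself, which is Henry's point.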