[Pharo-project] Error parsing XML File

Alexandre Bergel alexandre at bergel.eu
Fri Mar 26 18:21:17 EDT 2010


Hi Fabrizio,

I think you're in the right place to talk about that.

I haven't been able to reproduce your error.
I added a test:

XMLParserTest>>testNonUTF8Characters

	"Parsing a String that contains non-ASCII characters must not raise an error."
	self
		shouldnt: [XMLDOMParser parseDocumentFrom:
			'<foo>Bean BLABLABLA Eidgenössisches Institut für BLABLALBLA</foo>' readStream]
		raise: Error.

It goes green in my image. Are you getting the readStream in a
different way, for example from a file rather than from a String?
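
In case it helps to narrow this down, here is a minimal, untested sketch
of a variant of that test which parses from a file stream instead of a
String, since the trace below shows a MultiByteFileStream. The selector
and the file name are made up for illustration, and it assumes that
FileStream and FileDirectory behave as in a stock Pharo 1.0 image. Unlike
the reported document, the sketch has no encoding declaration, so it may
not take exactly the same code path.

```smalltalk
XMLParserTest>>testNonAsciiCharactersFromFile
	"Hypothetical variant of testNonUTF8Characters: write the document
	to a file, then parse it from a file stream instead of a String.
	The selector and the file name are made up for this sketch."

	| fileName stream |
	fileName := 'nonAsciiTest.xml'.

	"Write the document with the image's default converter for file streams."
	stream := FileStream forceNewFileNamed: fileName.
	[stream nextPutAll:
		'<foo><![CDATA[Eidgenössisches Institut für BLABLALBLA]]></foo>']
		ensure: [stream close].

	"Parse it back from a read-only file stream, as in the reported trace."
	stream := FileStream readOnlyFileNamed: fileName.
	[self shouldnt: [XMLDOMParser parseDocumentFrom: stream] raise: Error]
		ensure: [
			stream close.
			FileDirectory default deleteFileNamed: fileName]
```

One detail in the trace may be relevant here: the converter fails on a
byte with value 182 followed by two $s characters. In UTF-8, ö is encoded
as the two bytes 195 182, and in "Eidgenössisches" it is followed by "ss".
So the file itself may well be valid UTF-8, with the converter starting
in the middle of a character rather than at its first byte.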

Cheers,
Alexandre

On 26 Mar 2010, at 12:14, Fabrizio Perin wrote:

> Hi,
> I was parsing an XML file with the latest version of XML Parser
> (XML-Parser-JAAyer.68) and I got an error about a non-UTF-8
> character that the parser found in the document. The XML document
> contains some German characters:
>
> <![CDATA[SES: Bean BLABLABLA Eidgenössisches Institut für  
> BLABLALBLA]]>
>
> Actually, I'm not sure whether the error is in the
> UTF8TextConverter or whether something is wrong in how the parser
> invokes it. In any case, I parsed the same document several times
> with an older version of the XML-Parser (XML-Parser-JAAyer.57) and
> it always worked well. I'm not sure whether the Pharo mailing list
> is the right place to report this problem; if it is not, I'm sorry.
>
> Here is the trace from the log:
>
> Error: Invalid utf8 input detected
> 26 March 2010 4:14:07 pm
>
> VM: Mac OS - intel - 1062 - Squeak3.8.1 of '28 Aug 2006' [latest  
> update: #6747] Squeak VM 4.2.2b1
> Image: Pharo-1.0-10515-rc3 [Latest update: #10515]
>
> SecurityManager state:
> Restricted: false
> FileAccess: true
> SocketAccess: true
> Working Dir /Users/fabrizioperin/development/Pharo/WORKINGONNOW/MooseJEE_64
> Trusted Dir /foobar/tooBar/forSqueak/bogus
> Untrusted Dir /Users/fabrizioperin/Library/Preferences/Squeak/Internet/My Squeak
>
> UTF8TextConverter(Object)>>error:
> 	Receiver: an UTF8TextConverter
> 	Arguments and temporary variables:
> 		aString: 	'Invalid utf8 input detected'
> 	Receiver's instance variables:
> an UTF8TextConverter
>
> UTF8TextConverter>>errorMalformedInput
> 	Receiver: an UTF8TextConverter
> 	Arguments and temporary variables:
>
> 	Receiver's instance variables:
> an UTF8TextConverter
>
> UTF8TextConverter>>nextFromStream:
> 	Receiver: an UTF8TextConverter
> 	Arguments and temporary variables:
> 		aStream: 	MultiByteFileStream: '/Users/fabrizioperin/development/ 
> Pharo/WORKINGON...etc...
> 		character1: 	$¶
> 		value1: 	182
> 		character2: 	$s
> 		value2: 	115
> 		unicode: 	nil
> 		character3: 	$s
> 		value3: 	115
> 		character4: 	nil
> 		value4: 	nil
> 	Receiver's instance variables:
> an UTF8TextConverter
>
> MultiByteFileStream>>next
> 	Receiver: MultiByteFileStream: '/Users/fabrizioperin/development/ 
> Pharo/WORKINGONNOW/MooseJEE_64/src/...etc...
> 	Arguments and temporary variables:
> 		char: 	nil
> 		secondChar: 	nil
> 		state: 	nil
> 	Receiver's instance variables:
>
>
> XMLStreamReader>>basicNext
> 	Receiver: a XMLStreamReader
> 	Arguments and temporary variables:
> 		nextChar: 	nil
> 	Receiver's instance variables:
> 		stream: 	MultiByteFileStream: '/Users/fabrizioperin/development/ 
> Pharo/WORKINGONN...etc...
> 		nestedStreams: 	nil
> 		peekChar: 	nil
> 		buffer: 	a WriteStream 'SES: Bean zum Einlesen und updaten der  
> Stako relevanten ...etc...
>
> XMLStreamReader>>next
> 	Receiver: a XMLStreamReader
> 	Arguments and temporary variables:
> 		nextChar: 	nil
> 	Receiver's instance variables:
> 		stream: 	MultiByteFileStream: '/Users/fabrizioperin/development/ 
> Pharo/WORKINGONN...etc...
> 		nestedStreams: 	nil
> 		peekChar: 	nil
> 		buffer: 	a WriteStream 'SES: Bean zum Einlesen und updaten der  
> Stako relevanten ...etc...
>
> XMLStreamReader>>upToAll:
> 	Receiver: a XMLStreamReader
> 	Arguments and temporary variables:
> 		aDelimitingString: 	']]>'
> 	Receiver's instance variables:
> 		stream: 	MultiByteFileStream: '/Users/fabrizioperin/development/ 
> Pharo/WORKINGONN...etc...
> 		nestedStreams: 	nil
> 		peekChar: 	nil
> 		buffer: 	a WriteStream 'SES: Bean zum Einlesen und updaten der  
> Stako relevanten ...etc...
>
> SAXDriver(XMLTokenizer)>>nextCDataContent
> 	Receiver: a SAXDriver
> 	Arguments and temporary variables:
> 		cdata: 	nil
> 	Receiver's instance variables:
> 		streamReader: 	a XMLStreamReader
> 		streamWriter: 	a XMLStreamWriter
> 		entities: 	nil
> 		externalEntities: 	nil
> 		parameterEntities: 	nil
> 		isValidating: 	false
> 		parsingMarkup: 	false
> 		saxHandler: 	an OPOpaxHandler
> 		openTags: 	<ejb-jar>, <enterprise-beans>, <session>, <description>
> 		nestedScopes: 	nil
> 		useNamespaces: 	false
> 		validateAttributes: 	nil
> 		languageEnvironment: 	nil
>
> SAXDriver(XMLTokenizer)>>nextCDataOrConditional
> 	Receiver: a SAXDriver
> 	Arguments and temporary variables:
> 		nextChar: 	$C
> 		conditionalKeyword: 	nil
> 	Receiver's instance variables:
> 		streamReader: 	a XMLStreamReader
> 		streamWriter: 	a XMLStreamWriter
> 		entities: 	nil
> 		externalEntities: 	nil
> 		parameterEntities: 	nil
> 		isValidating: 	false
> 		parsingMarkup: 	false
> 		saxHandler: 	an OPOpaxHandler
> 		openTags: 	<ejb-jar>, <enterprise-beans>, <session>, <description>
> 		nestedScopes: 	nil
> 		useNamespaces: 	false
> 		validateAttributes: 	nil
> 		languageEnvironment: 	nil
>
> SAXDriver(XMLTokenizer)>>nextMarkupToken
> 	Receiver: a SAXDriver
> 	Arguments and temporary variables:
> 		nextChar: 	$[
> 	Receiver's instance variables:
> 		streamReader: 	a XMLStreamReader
> 		streamWriter: 	a XMLStreamWriter
> 		entities: 	nil
> 		externalEntities: 	nil
> 		parameterEntities: 	nil
> 		isValidating: 	false
> 		parsingMarkup: 	false
> 		saxHandler: 	an OPOpaxHandler
> 		openTags: 	<ejb-jar>, <enterprise-beans>, <session>, <description>
> 		nestedScopes: 	nil
> 		useNamespaces: 	false
> 		validateAttributes: 	nil
> 		languageEnvironment: 	nil
>
> SAXDriver(XMLTokenizer)>>nextToken
> 	Receiver: a SAXDriver
> 	Arguments and temporary variables:
> 		whitespace: 	''
> 	Receiver's instance variables:
> 		streamReader: 	a XMLStreamReader
> 		streamWriter: 	a XMLStreamWriter
> 		entities: 	nil
> 		externalEntities: 	nil
> 		parameterEntities: 	nil
> 		isValidating: 	false
> 		parsingMarkup: 	false
> 		saxHandler: 	an OPOpaxHandler
> 		openTags: 	<ejb-jar>, <enterprise-beans>, <session>, <description>
> 		nestedScopes: 	nil
> 		useNamespaces: 	false
> 		validateAttributes: 	nil
> 		languageEnvironment: 	nil
>
> OPOpaxHandler(SAXHandler)>>parseDocument
> 	Receiver: an OPOpaxHandler
> 	Arguments and temporary variables:
>
> 	Receiver's instance variables:
> 		driver: 	a SAXDriver
> 		eod: 	false
> 		stack: 	an OrderedCollection(<?xml version="1.0" encoding="utf-8"?>
> <ejb-jar id=...etc...
> _______________________________________________
> Pharo-project mailing list
> Pharo-project at lists.gforge.inria.fr
> http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project

-- 
_,.;:~^~:;._,.;:~^~:;._,.;:~^~:;._,.;:~^~:;._,.;:
Alexandre Bergel  http://www.bergel.eu
^~:;._,.;:~^~:;._,.;:~^~:;._,.;:~^~:;._,.;:~^~:;.