[Pharo-dev] problem while parsing xml

Stéphane Ducasse stephane.ducasse at inria.fr
Fri Dec 27 11:21:11 EST 2013


I do not know, but monty has been adding worthwhile behavior.
Now, we could always revert to the previous configuration version.
Stef

Here is the mail that he sent a while ago:


These updates add validation against internal and external DTDs, proper replacement of general and parameter entities, customizable resolution of external parsed entities using Zinc and FileSystem, awareness of notations and unparsed entities, preservation of the internal DTD subset by the DOM parser (so printing a parsed doc with a DTD will produce approximately what was input), line number reporting in error messages, and better checking of well-formedness and validity constraints.

I had to largely rewrite the tokenizer to make everything work, but I followed the spec closely, and it is about the same speed as long as there is no DTD to validate against.

One problem is that while the tests I added and the existing tests all pass, for some reason helper messages in some test classes starting with "should" (in the style of should:raise:) are being interpreted as tests and run by TestRunner, even though they don't begin with "test" and take arguments! This is possibly a bug in TestRunner.

Another problem is that there are so many deprecated methods cluttering up classes, some of which have been deprecated for years! It is confusing and hard to see which methods to use just by browsing the protocols. I would really suggest using this code to get rid of the XML-Parser methods that have been deprecated for at least a year:

"Remove XML-Parser methods that have been deprecated for more than a year."
expiry := 1 year.
(SystemNavigation default allClassesInPackageNamed: 'XML-Parser')
    do: [:class |
        class selectors do: [:selector | | compiledMethod timeStamp |
            compiledMethod := class compiledMethodAt: selector.
            "The time stamp reads 'author date time'; drop the author part."
            timeStamp := compiledMethod timeStamp copyAfter: Character space.
            (compiledMethod isDeprecated
                and: [(DateAndTime now - (DateAndTime fromString: timeStamp)) > expiry])
                    ifTrue: [class removeSelector: selector]]].

I ran and checked it myself and it doesn't break anything. Running the above with XML-Writer-Core might not be a bad idea either.
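
A more cautious variant is to do a dry run first and look at what would be removed. The following is only a sketch that reuses the same calls as the script above; the name candidates is made up for illustration, and the result should be checked by hand before anything is deleted.

```smalltalk
"Dry run: collect the methods the removal script would delete, without deleting them.
candidates is a hypothetical name introduced for this sketch."
| expiry candidates |
expiry := 1 year.
candidates := OrderedCollection new.
(SystemNavigation default allClassesInPackageNamed: 'XML-Parser')
    do: [:class |
        class selectors do: [:selector | | compiledMethod timeStamp |
            compiledMethod := class compiledMethodAt: selector.
            timeStamp := compiledMethod timeStamp copyAfter: Character space.
            (compiledMethod isDeprecated
                and: [(DateAndTime now - (DateAndTime fromString: timeStamp)) > expiry])
                    ifTrue: [candidates add: class name -> selector]]].
"Open an inspector on the collected class name -> selector pairs."
candidates inspect.
```

Changing the package name to 'XML-Writer-Core' should give the same preview for that package.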

I also updated BitmapCharacterSet to use less memory.




> In an image with XML-Parser-NorbertHartl.141 both expressions work.
> 
> I see that in PharoExtras/XMLParser there are many commits by ‘monty’, all with empty log messages. How is that possible?
> 
> On 27 Dec 2013, at 16:57, Stéphane Ducasse <stephane.ducasse at inria.fr> wrote:
> 
>> XMLDOMParser parse: '<?xml version="1.0" encoding="UTF-8" standalone="no"?>
>> <!DOCTYPE score-partwise PUBLIC
>> "-//Recordare//DTD MusicXML 3.0 Partwise//EN"
>> "http://www.musicxml.org/dtds/partwise.dtd">
>> <score-partwise version="3.0">
>> <part-list>
>> <score-part id="P1">
>>   <part-name>Music</part-name>
>> </score-part>
>> </part-list>
>> </score-partwise>
>> ' readStream.
>> 
>> produces the error 
>> while 
>> 
>> XMLDOMParser parse: '<?xml version="1.0" encoding="UTF-8" standalone="no"?>
>> <score-partwise version="3.0">
>> <part-list>
>> <score-part id="P1">
>>   <part-name>Music</part-name>
>> </score-part>
>> </part-list>
>> </score-partwise>
>> ' readStream.
>> 
>> does not.
>> 
>> Stef
>> 
> 
> 




