[Pharo-project] ConfigurationOfXMLSupport...

Alexandre Bergel alexandre at bergel.eu
Wed Feb 17 13:22:14 EST 2010

Hi George,

I will be delighted to incorporate some improvements in XML-Support.
Currently, the simplest way to contribute is to send me an email with
the location of your improvements. I look at them, and if they come
with unit tests and make sense to me, I include them.
Someone deeply committed will become administrator of the squeaksource
project. I saw your fix (0007441); I will look at it right after sending
this email.

I haven't checked bugs.squeak.org for a long time, but apparently the
latest improvements proposed by jaayer at zoho.com (thanks for this!)
include some of the changes proposed on bugs.squeak.org (e.g.,
0007465). The comments of the new revision are included below.

* All searching and enumerating methods have been moved from XMLNode  
to XMLNodeWithElements. Unless you can think of a scenario where it  
would make sense to send any of them to a string or PI node, I think  
this is both a safe and sensible move.

* XMLNodeWithElements no longer contains an "elementsAndContents"  
collection. Instead it has a single "nodes" OrderedCollection that  
contains all child nodes in the order in which they appeared. The  
collection returned by #elementsAndContents (and enumerated via  
#elementsAndContentsDo:) is recreated using #nodes.

* #addElement: is now just for elements, and #addNode: can be used for
everything else (#addElement: can still handle non-element nodes,
though). #removeNode: and #removeElement: have also been added. Since,
for backwards compatibility, the #add* methods don't return the object
added, the #remove* methods don't return the object removed either.

* Because the costly depth-first traversing #firstTag* and #tagsNamed*
methods send #elementsDo: so many times, it makes sense to speed up
#elementsDo: by having it enumerate a separate collection containing
only elements; thus, an "elements" collection has been added for
that purpose. The repetitive identity checks in the XMLElement
versions of those methods have been moved into a single #isNamed:
method, which expects a symbol but can also handle strings by sending
its argument #asSymbol before making #== comparisons. (#asSymbol is
free for symbols, so you only pay if you use strings to specify tag
names.)
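
The identity comparison behind #isNamed: can be tried in a workspace. This is only a minimal sketch using core Symbol and String behavior; the variable names are illustrative and none of it is taken from the XML-Support source:

```smalltalk
| tagName fromSymbol fromString |
tagName := #book.

"Sending #asSymbol to a symbol answers the receiver itself."
fromSymbol := #book asSymbol.

"A string has to be looked up in the symbol table, which is the extra cost."
fromString := 'book' asSymbol.

Transcript
    show: (fromSymbol == tagName) printString; cr;   "true"
    show: (fromString == tagName) printString; cr.   "true"
```

Because symbols are unique, both conversions yield the very same object, so #== is enough once the argument has been converted.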

* #tagsNamed:ifReceiverDo:, despite what its name implies (the plural  
"tags"), does no searching/enumeration. Outside of the implementations  
of other #tagsNamed:* messages, I doubt anyone uses this, so I have  
renamed it #ifNamed:do:.

* #tagsNamed:ifReceiverOrChildDo: does not work. The XMLElement  
version evaluates the block if it is so named and then invokes the  
superclass version of #tagsNamed:ifReceiverDo:, which is an empty  
method. Since it doesn't work and is ridiculously named, I have  
removed it.

* #tagsNamed:ifReceiverDoAndRecurse: is, oddly enough, exactly  
equivalent to #tagsNamed:do:. I have removed the XMLElement version  
and reimplemented the XMLNode(WithElements) version to just send  
#tagsNamed:do: (although I'd prefer to remove it altogether).

* XMLNodeWithElements has a new instance variable:  
"elementsDictionary," an IdentityDictionary where keys are the
qualified and unqualified names of child elements and values are  
OrderedCollections of so-named elements. This makes #elementAt: have  
O(1) complexity rather than O(n) and a->b->c traversal have O(n)  
complexity (where n is the number of nodes counting the root, the  
target, and all ancestors between) rather than the previous worst-case  
of O(n^2). This comes at the cost of additional memory and time to
add/remove elements. You can rewrite Pastell accessors in terms of
#elementsAt: and #elementAt:.
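
The idea of indexing child elements by name can be sketched with core collection classes. This is a simplified illustration, not the actual implementation: the addElement block is a hypothetical helper, plain strings stand in for element objects, and only one key per element is recorded (the real dictionary is described as holding both qualified and unqualified names):

```smalltalk
| index elements addElement |
index := IdentityDictionary new.
elements := OrderedCollection new.

"Hypothetical helper: keep document order and also file the element under its name."
addElement := [:name :element |
    elements add: element.
    (index at: name ifAbsentPut: [OrderedCollection new]) add: element].

addElement value: #title value: 'first title element'.
addElement value: #author value: 'an author element'.
addElement value: #title value: 'second title element'.

"Lookup by name is one dictionary access instead of a scan over all elements."
(index at: #title ifAbsent: [#()]) first.   "'first title element'"
(index at: #title ifAbsent: [#()]) size.    "2"
```

The trade-off mentioned above is visible here: every addition touches two collections, and removal would have to update both as well.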

* The #contentsDo: and #elementsAndContentsDo: messages can now only  
be sent to XMLElements. It makes no sense to send them to documents or  
other kinds of nodes, as no other kind of node has text content.

* I moved #parent and #parent: out of XMLNodeWithElements and put them  
in XMLNode instead, as every node in the DOM tree save the root has a  
parent. They actually work now, too, meaning that "((XMLDOMParser  
parseDocumentFrom: '<p>foo<b>bar</b></p>' readStream) firstTagNamed:  
#b) parent" will return the "p" element, not nil as it would before.

* How often have you sent #parseDocumentFrom: with an XML string as  
its argument (instead of a stream) and gotten a "message not  
understood" error? XMLDOMParser's #parseDocumentFrom:useNamespaces:  
method now sends #readStream to "aStream" if it is not already a  
stream, so you will no longer receive such errors.
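
The coercion could look roughly like the following. This is a sketch under assumptions: the use of #isStream as the test is a guess, since the message above only says that #readStream is sent when the argument is not already a stream:

```smalltalk
| source stream |
source := '<p>foo<b>bar</b></p>'.

"Assumed check: leave real streams alone, wrap anything else in a read stream."
stream := source isStream
    ifTrue: [source]
    ifFalse: [source readStream].

stream upToEnd.   "'<p>foo<b>bar</b></p>'"
```

With such a check in place, passing either the string itself or "source readStream" to #parseDocumentFrom: should behave the same way.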

* #addEntity:value: has been removed. It has not been touched in ten  
years and sends a message (#entities) that is not understood.

* Confusing "entityName" parameter names in methods expecting
element/tag names were replaced with "aSymbol."

* (This package also contains the #=/#hash methods submitted as a  
feature addition to bugs.squeak.)

Lastly, this codebase has had too many hands touching it and looks too  
inconsistent. I suggest configuring your formatter like so:

     maxLineLength: 80;
     newLinesAfterMethodPattern: 2;
     newLinesAfterMethodComment: 2;
     retainBlankLinesBetweenStatements: true;
     stringInsideBlocks: ''.

And then reformatting XMLNode and its subclasses.


On 17 Feb 2010, at 08:18, George Herolyants wrote:

> Btw, how can I contribute to XML-Support? Is there some process?
> Because I've created two issues on bugs.squeak.org a month ago and
> still don't know if they were noticed or not?

Alexandre Bergel  http://www.bergel.eu
