[Pharo-project] Fastest utf-8 encoder contest

Sven Van Caekenberghe sven at beta9.be
Wed Jun 13 05:30:38 EDT 2012


On 13 Jun 2012, at 10:29, Marcus Denker wrote:

> We should standardize on *one* converter... what is the use of everyone doing it again?

Ultimately, yes, there should be one.

However, it does not hurt that multiple people are working on the same subject, even if that sometimes means multiple implementations coexist in one image. It is one of the ways that open source software advances.

As for my rationale for ZnUTF8Encoder:

http://zn.stfx.eu/zn/zinc-http-components-paper.html#characterencoding

<< ZnCharacterEncoding is an extension and reimplementation of regular TextConverter. It only works on binary input and generates binary output. It adds the ability to compute the encoded length of a source character, a crucial operation for HTTP. It is more correct and will throw proper exceptions when things go wrong. >>
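
To illustrate why the encoded length matters: HTTP needs the byte count of a body (for Content-Length) before writing it, and for UTF-8 that count can be derived from each code point without actually encoding anything. Below is a minimal sketch of the idea in Python rather than Smalltalk. It is not the Zinc implementation, and the function names are hypothetical, chosen only for this example.

```python
def utf8_encoded_length(ch: str) -> int:
    """Return how many bytes a single character occupies in UTF-8.

    Hypothetical helper for illustration; not part of Zinc or Pharo.
    """
    code_point = ord(ch)  # raises TypeError unless ch is exactly one character
    if 0xD800 <= code_point <= 0xDFFF:
        # Surrogate code points cannot be encoded in well-formed UTF-8.
        raise ValueError(f"cannot encode surrogate U+{code_point:04X}")
    if code_point < 0x80:
        return 1  # ASCII range
    if code_point < 0x800:
        return 2
    if code_point < 0x10000:
        return 3
    return 4  # up to U+10FFFF


def utf8_content_length(text: str) -> int:
    """Sum the per-character lengths to get the encoded size of a string."""
    return sum(utf8_encoded_length(ch) for ch in text)
```

The total computed this way should match the length of the actual encoded bytes, so a header can be written before the body is streamed. Raising an error on a surrogate, instead of silently writing something, is the kind of "proper exception" behaviour the quote refers to.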

Throwing out and replacing fundamental system classes is hard. New designs often are not 100% API compatible or change the semantics, on purpose. Look at the #isBinary test in UTF8TextConverter>>nextFromStream: and nextPut:toStream:, it is so broken. Can it be fixed? I don't know.


Sven


--
Sven Van Caekenberghe
http://stfx.eu
Smalltalk is the Red Pill