[Pharo-users] Slow object printOn: with EURO symbol

Sven Van Caekenberghe sven at stfx.eu
Tue Jan 5 07:20:33 EST 2016

String, with its ByteString and WideString subclasses, has been a standard feature of Squeak/Pharo for a long time. The transparent automatic conversion between the two is a feature, not a limitation.

In itself, there is nothing wrong with it.

Yes, other representations of Strings are possible, but it is far from certain that they would be faster overall. The current implementation favours Latin1 (and thus ASCII), because that is so common. In my work image I count them as follows:

ByteString allInstances size. "301498"

WideString allInstances size. "136"

That is less than 0.05%.
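
To make the conversion concrete, here is a small Playground sketch. It assumes the usual ByteString >> #at:put: fallback, which swaps the receiver for a WideString when a non-Latin1 character is stored. The variable names and the WideString-backed stream at the end are only an illustration of one possible way to sidestep the conversion, not a recommendation from this thread.

```smalltalk
| s euro viaStream |
euro := Character value: 8364.          "U+20AC, the EURO sign; not a Latin1 character"

"A freshly allocated String is a ByteString."
s := String new: 3 withAll: $a.
s class.                                "inspect: ByteString"

"Storing a wide character converts the receiver in place (via become)."
s at: 1 put: euro.
s isWideString.                         "inspect: should now answer true"

"Illustrative workaround (an assumption, not from this thread):
 start from a WideString so the stream's collection never needs converting."
viaStream := WideString streamContents: [ :out |
	out nextPutAll: 'price: '.
	out nextPut: euro ].
viaStream isWideString.
```

The second half trades memory for speed: every character then takes 32 bits, which is why the default favours ByteString.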

> On 05 Jan 2016, at 13:08, Hilaire <hilaire at drgeo.eu> wrote:
> On 04/01/2016 11:05, Henrik Johansen wrote:
>> In the fallback code for WriteStream >> #nextPut:, at:put: is called, so yes, streaming a wide char causes the stream's collection to be converted from ByteString to WideString.
>> Conversion is done using become:, which currently triggers a full heap scan for references, and is thus very slow.
>> One could add a fast path along the lines of #pastEndPut: (which has already broken any assumption that a reference to the collection will reflect all writes for the lifetime of the stream, for the same performance problems one would face using #become:): if the collection is a ByteString and anObject is a wide character, replace the collection with a WideString, and *then* call at:put:.
>> But it is not a very nice thing to add to a generic streaming class, nor is it very attractive at this point in time, considering that making become: a fast operation is one of the problems solved by Spur.
> So wait for Spur and see?
> So that it is not forgotten, it is recorded here, and it should be kept open
> to check later:
> https://pharo.fogbugz.com/f/cases/17315/Slow-object-printOn-with-EURO-symbol
> It is possible to work around this problem, but this sort of annoyance
> with Pharo's internal encoding regularly arises, so I am not sure what to
> think about the state of Pharo regarding internal encoding. Nowadays, isn't
> everything supposed to be UTF-8?
> Thanks
> Hilaire
> -- 
> Dr. Geo
> http://drgeo.eu
