[Pharo-users] Slow object printOn: with EURO symbol
Sven Van Caekenberghe
sven at stfx.eu
Tue Jan 5 08:38:00 EST 2016
> On 05 Jan 2016, at 14:26, Hilaire <hilaire at drgeo.eu> wrote:
> Nevertheless, the performance impact I got from the OOP way of doing
> things can't be wiped out: when you ask an object to give its text
> representation but it uses non-Latin1 characters, you get a significant
> penalty. In the long term, it looks like a problem for Pharo.
No, that is not 100% correct.
You can use any Unicode character anywhere transparently, and the performance penalty is low. Pharo fully supports Unicode everywhere (provided you use the right font).
The problem occurs only when you take a collection of 1000s of these objects in a tool that wants to convert them all at once but separately to strings. Then the cumulative performance penalty becomes quite noticeable, true.
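To see the cumulative effect in your own image, something along these lines could be evaluated in a Playground. It is only a rough sketch: the item count of 5000 and the ' EUR' suffix are arbitrary choices for illustration, and the timings will depend on the image and VM.

```smalltalk
"Rough sketch: convert many items to strings one at a time,
 once with Latin1 text only and once with the euro sign
 (code point 8364), which does not fit in a ByteString."
| euro plainTime wideTime |
euro := Character value: 8364.
plainTime := Time millisecondsToRun: [
	1 to: 5000 do: [ :i |
		String streamContents: [ :out |
			out print: i; nextPutAll: ' EUR' ] ] ].
wideTime := Time millisecondsToRun: [
	1 to: 5000 do: [ :i |
		String streamContents: [ :out |
			out print: i; space; nextPut: euro ] ] ].
"Answer both timings (in milliseconds) for comparison."
{ plainTime. wideTime }
```

Each block in the second loop starts with a ByteString buffer and has to switch to a WideString once the euro sign is written, which is where the extra cost is expected to come from.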
The problem can also be restated: is it really necessary for a tool to convert 1000s of items to strings, even if only 10s are shown on screen at the same time?
I believe that FastTable tries to do better here.
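The general idea of converting only what is visible can be sketched as follows. This is not how any particular tool is implemented; the variable names (firstVisible, visibleCount) and the numbers are made up for the illustration.

```smalltalk
"Hypothetical sketch: only the rows currently on screen
 are converted to strings, not the whole collection."
| items firstVisible visibleCount visibleStrings |
items := (1 to: 10000) asArray.
firstVisible := 1.      "index of the first row shown (assumed)"
visibleCount := 20.     "number of rows that fit on screen (assumed)"
visibleStrings := (items
	copyFrom: firstVisible
	to: (firstVisible + visibleCount - 1 min: items size))
		collect: [ :each | each printString ].
visibleStrings size
```

With this approach the conversion cost, wide characters or not, is paid for a couple of dozen items per redraw instead of thousands up front.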
> On 05/01/2016 13:20, Sven Van Caekenberghe wrote:
>> String with ByteString and WideString subclasses has been a standard feature of Squeak/Pharo for a long time. The transparent automatic conversion between the two is a feature, not a limitation.
>> In itself, there is nothing wrong with it.
>> Yes, other representations of Strings are possible, but it is far from certain that they would be faster overall. The current implementation favours Latin1 (and thus ASCII), because that is so common. In my work image I count them as follows:
>> ByteString allInstances size. "301498"
>> WideString allInstances size. "136"
>> That is less than 0.05%.
>>> On 05 Jan 2016, at 13:08, Hilaire <hilaire at drgeo.eu> wrote:
>>> On 04/01/2016 11:05, Henrik Johansen wrote:
>>>> In the fallback code for WriteStream >> #nextPut:, at:put: is called, so yes, streaming a wide character causes the stream's collection to be converted from a ByteString to a WideString.
>>>> The conversion is done using #become:, which currently triggers a full heap scan for references, and is thus very slow.
>>>> One could add a fast path along the lines of #pastEndPut: (which has already broken any assumption that a reference to the collection will reflect all writes for the lifetime of the stream, because of the same performance problems one would face using #become:); if collection is a ByteString and anObject is a wide character, replace collection with a WideString, and *then* call at:put:
>>>> But it is not a very nice thing to add to a generic streaming class, nor is it very attractive at this point in time, considering that making become: a fast operation is one of the problems solved by Spur.
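The fast path Henrik describes above could look roughly like the method below. It is a hypothetical sketch, not code from Pharo: the selector #nextPutWideAware: is invented for the illustration, and it assumes that the stream's collection instance variable can simply be reassigned, as #pastEndPut: already does.

```smalltalk
"Hypothetical extension method on WriteStream; the selector
 nextPutWideAware: does not exist in Pharo."
WriteStream >> nextPutWideAware: aCharacter
	"If the buffer is a ByteString and aCharacter does not fit in
	 8 bits, replace the buffer with a WideString copy up front,
	 so that the slow #become: based conversion is never triggered."
	(collection class == ByteString and: [ aCharacter asInteger > 255 ])
		ifTrue: [ collection := WideString from: collection ].
	^ self nextPut: aCharacter
```

As noted in the quote, any outside reference to the original ByteString would no longer see later writes, which is the same trade-off #pastEndPut: already makes when it grows the collection.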
>>> So wait and see for Spur?
>>> To not forget about it, it is recorded here, and it should be kept open
>>> for later check:
>>> It is possible to work around this problem, but this sort of annoyance
>>> with Pharo's internal encoding arises regularly, so I am not sure what
>>> to think about the state of Pharo regarding internal encoding. Nowadays,
>>> isn't everything supposed to be UTF-8?
>>> Dr. Geo
> Dr. Geo