[Pharo-dev] Efficient string concatenation - proposed new

Sven Van Caekenberghe sven at stfx.eu
Sun Nov 17 11:03:09 EST 2013


On 17 Nov 2013, at 12:49, Andres Valloud <avalloud at smalltalk.comcastbiz.net> wrote:

> Well, maybe, although it's interesting to consider how many strings streams need to beat string concatenation.  Moreover, streams aren't hyper efficient either…

Yes, indeed. It also depends on what is being concatenated (constants vs objects).

Quick and dirty (totally unscientific):

[ 'foo', 'bar' ] bench. '4,650,000 per second.'
[ '' join: #( 'foo' 'bar' ) ] bench. '814,000 per second.'
[ String streamContents: [ :out | out nextPutAll: 'foo'; nextPutAll: 'bar' ] ] bench. '1,900,000 per second.'

[ 'foo', 'bar', 'baz' ] bench. '2,470,000 per second.'
[ '' join: #( 'foo' 'bar' 'baz' ) ] bench. '719,000 per second.'
[ String streamContents: [ :out | out nextPutAll: 'foo'; nextPutAll: 'bar'; nextPutAll: 'baz' ] ] bench. '1,710,000 per second.'

[ 'foo', 'bar', 'baz', 'foobar' ] bench. '2,030,000 per second.'
[ '' join: #( 'foo' 'bar' 'baz' 'foobar' ) ] bench. '665,000 per second.'
[ String streamContents: [ :out | out nextPutAll: 'foo'; nextPutAll: 'bar'; nextPutAll: 'baz'; nextPutAll: 'foobar' ] ] bench. '1,580,000 per second.' 

[ Date today asString, $- asString, Time now asString ] bench. '61,800 per second.'
[ String streamContents: [ :out | out print: Date today; nextPut: $-; print: Time now ] ] bench. '71,700 per second.'

We’re only measuring execution speed, not memory allocation, which is important too.

The length of the strings is a variable as well, of course.
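
To make that concrete, here is a rough sketch of how the same kind of measurement could be pointed at the case the rule is actually about: concatenation inside an iteration over many strings. The variable name pieces and the count of 1000 are arbitrary assumptions for illustration, not anything taken from the rule or from the measurements above, and no figures are claimed for it.

```smalltalk
| pieces |
"Hypothetical test data: 1000 short strings. Vary the count and the string lengths to see where the balance tips."
pieces := (1 to: 1000) collect: [ :i | i printString ].

"Repeated #, inside an iteration: every step copies the accumulated result so far."
[ pieces inject: '' into: [ :acc :each | acc , each ] ] bench.

"A single stream that grows its buffer as needed."
[ String streamContents: [ :out |
	pieces do: [ :each | out nextPutAll: each ] ] ] bench.
```

Changing the collection size (say 10, 100, 10000) should give a feel for how large n has to be before the stream version pays off on a given image and VM.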

Furthermore, many #printOn: implementations are not as efficient as they should be.

Conclusion: let’s be careful with overly simple advice.

Sven

> On 11/16/13 10:09 , btc at openinworld.com wrote:
>> 
>> Code Critic rule Optimization > String concatenation instead of streams
>> says:
>> 
>> "Check for string concatenation inside some iteration message. Since
>> string concatenation is O(n^2), it is better to use streaming since it
>> is O(n) - assuming that n is large enough. As a general principle avoid
>> , since the receiver is copied. Therefore chaining , messages will lead
>> to multiple useless copies of the receiver."
>> 
>> That is,
>>     String streamContents: [:s |
>>         #('abc' 'def' 'ghi')  do: [:each | s nextPutAll: each asString]]
>> 
>> should be used instead of...
>>     'abc' , 'def' , 'ghi'.
>> 
>> However the first clutters the code.  What about something like...
>>     { 'abc' . 'def' . 'ghi' } asStreamString
>> where
>>     Collection>>asStreamString
>>         ^ String streamContents: [:s | self do: [:each | s nextPutAll: each asString]]
>> 
>> cheers -ben
>> 