I thought streamContents: was faster than using a comma binary message...
I was wrong. Pharo is not Java :-)
Noury
"Run in P11"
a := 'aaaaa'.
b := 'bbbbb'.
c := 'ccccc'.
d := 'ddddd'.
e := 'eeeeee'.
[ a , b , c , d , e ] bench.
"'3958888.090 per second'"
"'3808242.503 per second'"
[
    String streamContents: [ :str |
        str
            << a;
            << b;
            << c;
            << d;
            << e ] ] bench
"'3083603.838 per second'"
"'2927641.144 per second'"
The test is using string literals, which may be optimized in various ways.
Is that representative of your use case?
On Fri, Mar 15, 2024 at 3:12 PM Noury Bouraqadi bouraqadi@gmail.com wrote:
I thought streamContents: was faster than using a comma binary message...
I was wrong. Pharo is not Java :-)
Noury
Let me start by giving some figures from my Smalltalk, on a laptop
with an Intel Core i5-6200U CPU @ 2.3 GHz and 8GB of memory, running
Ubuntu 22.04 and gcc (Ubuntu 11.4.0-1ubuntu1~22.04) 11.4.0. Smalltalk
is compiled to C then finished with the system C compiler. Static
whole-program compilation is allowed by the ANSI standard and the
system was originally written to serve as a baseline for Bryce's JIT.
nsec  technique
 249  replaceFrom:to:with:startingAt:*5
 128  withAll:*5
 486  ,,,,
 492  (,,),(,)
 521  streamContents:
 367  StringWriteStream
 860  StringBuffer>>addAllLast:
 385  StringBuffer>>nextPutAll:
replaceFrom:to:with:startingAt:*5 makes a string of the right size, then
fills it in with five #replaceFrom:to:with:startingAt: sends.
withAll:*5 is String withAll: a withAll: b withAll: c withAll: d
withAll: e (supported up to 6 withAll:s).
This is interesting because the result can be an optionally ReadOnly
ByteArray (UTF-8), ShortArray (UTF-16), or String (UTF-32), and each of
the up to 6 operands can independently be any of these things. It
wasn't intended as a fast alternative to #, .
,,,, is a,b,c,d,e
(,,),(,) is (a,b,c),(d,e).
streamContents: is what you had.
StringWriteStream is basically the same as streamContents: but using a
WriteStream specialised to Strings with some extra primitive support.
There are also StringReadStream and StringReadWriteStream.
StringBuffer is my version of Java's StringBuilder; it's a cross
between a String, an OrderedCollection, and a WriteStream. It can
change size like an OrderedCollection; it has most of the "writing"
methods (but not the "position" ones) of a WriteStream, and at all
times you can use it as a String without having to copy the contents.
You would expect #addAllLast: and #nextPutAll: to have the same
result, and they do, but they were written at different times and
#nextPutAll: was optimised for the case where the operand is a string
while #addAllLast: wasn't.
What does all that mean in practice?
It means that a benchmark like this is VERY SENSITIVE to the details
of how the library is written.
Even just bracketing the commas differently gives you a different time.
It means that techniques which are more efficient for LARGE volumes of
data may have startup costs that make them less efficient for SMALL
volumes of data, and that this is a very small benchmark.
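The cost of a,b,c,d,e is proportional to |a|*4 + |b|*4 + |c|*3 + |d|*2 + |e|,
because the commas are evaluated left to right and each one copies
everything built so far into a fresh string.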
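To make the copying cost concrete, here is a small Playground sketch that counts how many characters the four , sends copy in total for the strings used in the benchmark. The variable names (parts, copied, length) are only for illustration, and it assumes each , allocates a new string and copies both operands into it.

```smalltalk
"Count the characters copied when a , b , c , d , e is evaluated left to right.
 Assumption: each , allocates a new string and copies both operands into it."
| parts copied length |
parts := #('aaaaa' 'bbbbb' 'ccccc' 'ddddd' 'eeeeee').
copied := 0.
length := parts first size.
parts allButFirst do: [ :each |
    "length becomes the size of the string this , creates"
    length := length + each size.
    copied := copied + length ].
copied "71 characters copied to build a 26-character result"
```

A technique that allocates the result once and fills it in copies only the 26 characters of the final string.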
Well, that was astc. What about Pharo?
1,950,528 per second  ,,,,
6,509,256 per second  withAll:*5
Here it is. I've added withAll:*2 to withAll:*6 to ArrayedCollection class.
withAll: c1 withAll: c2 withAll: c3 withAll: c4 withAll: c5
    "Answer a new instance holding the elements of c1 ... c5 in order.
     eK is the index of the last element copied from cK."
    | e1 e2 e3 e4 e5 |
    e1 := c1 size.
    e2 := c2 size + e1.
    e3 := c3 size + e2.
    e4 := c4 size + e3.
    e5 := c5 size + e4.
    ^ (self new: e5)
        replaceFrom: 1 to: e1 with: c1 startingAt: 1;
        replaceFrom: e1 + 1 to: e2 with: c2 startingAt: 1;
        replaceFrom: e2 + 1 to: e3 with: c3 startingAt: 1;
        replaceFrom: e3 + 1 to: e4 with: c4 startingAt: 1;
        replaceFrom: e4 + 1 to: e5 with: c5 startingAt: 1;
        yourself
What's the lesson here? Just because A is faster than B doesn't mean
there isn't a fairly obvious C, D, ..., that will beat A.
Now what is the real argument in favour of StringBuilder in Java and
streamContents: in Smalltalk?
    s := ''.
    1 to: n do: [:i | s := s , 'X'].
makes a string of n Xs but takes O(n^2) time and turns over O(n^2) memory.
    s := String streamContents: [:o | 1 to: n do: [:i | o nextPut: $X]].
makes a string of n Xs while taking O(n) time and turning over O(n) memory.
n does not have to be very big before this gets to be a HUGE difference.
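A rough way to see the difference in a Pharo Playground is to time both loops with #timeToRun. This is only a sketch: the value of n and the variable names are arbitrary choices, and the timings depend on the image and machine, so no figures are given here.

```smalltalk
"Time repeated concatenation against a single stream for the same result.
 n is an arbitrary size chosen for illustration; try larger values too."
| n viaComma viaStream commaTime streamTime |
n := 20000.
commaTime := [
    viaComma := ''.
    1 to: n do: [ :i | viaComma := viaComma , 'X' ] ] timeToRun.
streamTime := [
    viaStream := String streamContents: [ :out |
        1 to: n do: [ :i | out nextPut: $X ] ] ] timeToRun.
"Inspect the two durations; both strings should be identical."
{ commaTime. streamTime. viaComma = viaStream }
```

Doubling n should roughly quadruple the work done by the first loop and roughly double the work done by the second.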
For what it's worth, the Java compiler turns a+b+c+d+e into code that creates
a StringBuilder, stuffs a ... e into it, and then pulls a string out.
There is no point in benchmarking a fixed number of concatenations against a
StringBuilder in Java because they're the same thing. Smalltalk compilers
don't do that.
In Java and in Smalltalk you should seldom concatenate strings, but should
send the fragments directly to their final destination. I've never quite
made up my mind whether being toString()-centric was Java's biggest blunder
or just the second biggest, but it was a pretty darned big one for sure.
Smalltalk got this right: #printOn: is the basic notion and #printString the
derived and best avoided one.
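As an illustration of sending fragments straight to their destination, the sketch below prints a few Points twice: once by building a printString for each and concatenating, and once by asking each Point to print itself on a single stream. The stream made by streamContents: stands in for whatever the real destination is, and the variable names are only for illustration.

```smalltalk
"Compare printString-plus-comma with printOn: onto one shared stream."
| points viaStrings viaStream |
points := { 1@2. 3@4. 5@6 }.

"printString-centric: one temporary string per element, and each , copies again."
viaStrings := points
    inject: ''
    into: [ :acc :each | acc , each printString , ' ' ].

"printOn:-centric: every fragment goes directly onto the same stream."
viaStream := String streamContents: [ :out |
    points do: [ :each |
        each printOn: out.
        out space ] ].

viaStrings = viaStream
```

Both versions produce the same text; the second never creates the intermediate strings.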
On Sat, 16 Mar 2024 at 08:12, Noury Bouraqadi bouraqadi@gmail.com wrote:
I thought streamContents: was faster than using a comma binary message...
I was wrong. Pharo is not Java :-)
Noury
Thank you Richard for the detailed response.
Noury
On Mar 18 2024, at 8:06 am, Richard O'Keefe raoknz@gmail.com wrote: