[Pharo-dev] Integer arithmetic and bit counting performance in Squeak and Pharo

Benoit St-Jean bstjean at yahoo.com
Fri Feb 10 07:21:33 EST 2017


Oh!  Many thanks for all the details!  I thought about special selectors and remembered that #bitAnd: and #bitOr: were those special kind of "beasts" but I falsely assumed that #bitXor: was also a special selector... :(  I should have looked at the code!

So that also explains the "strange" behavior of the 64bit image.  Lots of the tests in the 32bit image fell into the LargeInteger "range" and then, executing the same code on the 64bit image, those numbers suddenly became considered SmallInteger...  That explains it all and why some operations "suddenly" became so fast for no apparent reason !!!
Now, just for my personal curiosity, what is going to be the impact of a 64bit VM on such code? (Unfortunately, I am on Windows so I will have to wait...) ?
P.S. I should have persevered in my google searches as I just came across this *excellent* post :  Arithmetic, inlined and special selectors
:)
Thanks a lot!

  
|  
|   
|   
|   |    |

   |

  |
|  
|   |  
Arithmetic, inlined and special selectors
 Today we are going to discuss how message sends are dispatched in the VM and/or compiled in the image for arithm...  |   |

  |

  |

 
 ----------------- 
Benoît St-Jean 
Yahoo! Messenger: bstjean 
Twitter: @BenLeChialeux 
Pinterest: benoitstjean 
Instagram: Chef_Benito
IRC: lamneth 
Blogue: endormitoire.wordpress.com 
"A standpoint is an intellectual horizon of radius zero".  (A. Einstein)

      From: Clément Bera <bera.clement at gmail.com>
 To: Benoit St-Jean <bstjean at yahoo.com>; Pharo Development List <pharo-dev at lists.pharo.org> 
Cc: The General-purpose Squeak Developers List <squeak-dev at lists.squeakfoundation.org>
 Sent: Friday, February 10, 2017 5:14 AM
 Subject: Re: [Pharo-dev] Integer arithmetic and bit counting performance in Squeak and Pharo
   
Hi,
This is a lovely post. Thanks for posting about Smalltalk / Pharo / Squeak.
   
   - Why exactly does the override work (in 32bit images)?
   - What changed so that things are different in Squeak 5.1 64bit image (overrides partially work)?
If I am correct your questions are exclusively for bitOr: and bitAnd:. I don't know what you are aware of and what you're not aware of, hence I am going to give you some details on how bitAnd: is executed (bitOr: is done exactly the same way), and hopefully there will be something you don't know yet that should help you understand what's going on. 
Execution of bitAnd:
bitAnd: a special selector. If you run Smalltalk specialSelectors you get this array:  #(#+ 1 #- 1 #< 1 #> 1 #<= 1 #>= 1 #= 1 #~= 1 #* 1 #/ 1 #\\ 1 #@ 1 #bitShift: 1 #// 1 #bitAnd: 1 #bitOr: 1 #at: 1 #at:put: 2 #size 0 #next 0 #nextPut: 1 #atEnd 0 #== 1 #class 0 #blockCopy: 1 #value 0 #value: 1 #do: 1 #new 0 #new: 1 #x 0 #y 0) which includes bitAnd: .
Special selectors are encoded in the bytecode differently to save memory, so they can be executed differently without inducing overhead. In Cog, bitAnd: is executed differently using type-prediction. 
In the Stack Interpreter, the execution of bitAnd: is done as follows:    If both operands are SmallIntegers,         push the bitAnd: value of the 2 SmallIntegers    else         Try the primitive BitAnd for SmallIntegers (14)        on success            push the result        on failure            perform the send.
In the JIT, bitAnd: sends are compiled differently that normal send, the compilation logic is as follows:    If both operands are SmallInteger literals        compute the result during compilation and use this value    else        If one of the 2 operands is a SmallInteger literal            generate 2 paths, a quick inlined path testing at runtime that the other operand is a SmallInteger, and if so, Ands the operands and use that result, else use the slow path performing a send        else            generate a normal send
Primitives
primitive 14 -> Works for SmallIntegers and LargePositiveInteger fitting in an unsigned machine word. (Slower than direct SmallInteger bitAnd performed by the special selector)primitive 34 -> Works for SmallIntegers and LargePositiveInteger fitting in an unsigned int64. (Slower than primitive 14 in 32 bits but covering more cases, equivalent in 64 bits)primDigitBitAnd (in Integer) -> Works for any integer. (Slower than the primitive 14 and 34)
SmallInteger encoding
In 32 bits, SmallIntegers are signed 31bits integer.In 64 bits, SmallIntegers are signed 61 bits integer.
Discussion
Let's make some assertions based on previous information:
1) If you have the exact same bitAnd: implementation for #bitAnd: and #bitAnd2:, #bitAnd2: is slower because it is not a special selector.
2) Still with the exact same bitAnd: implementation, using the JIT, things like that:16r1 bitAnd: 1. "1 bits"
16r2 bitAnd: 2. "2 bit"
16r4 bitAnd: 4. "3 bit"
16r8 bitAnd: 3. "4 bit"Are bitAnd: operations between SmallInteger constants and the result is unused. Hence this whole portion of code does not generate any native instructions.On the other hand, things like that:16r1 bitAnd2: 1. "1 bits"
16r2 bitAnd2: 2. "2 bit"
16r4 bitAnd2: 4. "3 bit"
16r8 bitAnd2: 3. "4 bit"Are sends, generating multiple native instructions including calls.

3) Because of the SmallInteger encoding,16r200000000000000 bitAnd: 98490667780264321. "58 bit"
is compiled by the JIT to nothing in 64 bits but to a send in 32 bits (the JIT can't solve LargeInteger arithmetic at compilation time).
4) primitive 34 provides some speed-up over primitive 14 in 32 bits, covering bitAnd: operations from unsigned 33 bits integer to unsigned 64 bits integer, but makes no difference in 64 bits. In 64bits all the cases covered by primitive 34 are covered by primitive 14, so it should even slow down things.
So...
Does this answer your questions ?
Don't hesitate to ask further questions. 
If you want more details, I wrote something about special selectors a while ago: https://clementbera.wordpress.com/2014/08/12/arithmetic-inlined-and-special-selectors/ .
Best,
Clement
PS1: If I'm correct, you referenced my blog a couple times, thank you for doing this, I really appreciate that.PS2: I enjoy chess myself, so if you have an open-source/MIT working algorithm at some point please send it to me, I will play against the AI and look at the code.

On Fri, Feb 10, 2017 at 8:18 AM, Benoit St-Jean via Pharo-dev <pharo-dev at lists.pharo.org> wrote:



---------- Forwarded message ----------
From: Benoit St-Jean <bstjean at yahoo.com>
To: The General-purpose Squeak Developers List <squeak-dev at lists. squeakfoundation.org>, "pharo-dev at lists.pharo.org" <pharo-dev at lists.pharo.org>
Cc: 
Date: Fri, 10 Feb 2017 07:18:02 +0000 (UTC)
Subject: Integer arithmetic and bit counting performance in Squeak and Pharo
For those interested in performance of bit operations and integer arithmetic in Squeak and Pharo :

Bit operations in general : https://endormitoire.wordpress .com/2017/02/10/bits-and- pieces/

Bit counting algorithms : https://endormitoire.wordpress .com/2017/02/10/1001001-sos/

Some questions, some results, some answers... ----------------- 
Benoît St-Jean 
Yahoo! Messenger: bstjean 
Twitter: @BenLeChialeux 
Pinterest: benoitstjean 
Instagram: Chef_Benito
IRC: lamneth 
Blogue: endormitoire.wordpress.com 
"A standpoint is an intellectual horizon of radius zero".  (A. Einstein)




   
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.pharo.org/pipermail/pharo-dev_lists.pharo.org/attachments/20170210/6f115bb4/attachment.html>


More information about the Pharo-dev mailing list