[Pharo-dev] Class allSubInstances size takes 23 seconds to run on Pharo 5 on OS X

Eliot Miranda eliot.miranda at gmail.com
Thu Apr 21 10:48:09 EDT 2016


Hi Bernhard,

> On Apr 20, 2016, at 9:56 PM, Bernhard Pieber <bernhard at pieber.com> wrote:
> 
> Hi Eliot,
> 
> Right, but in the latest Squeak5.1-trunk with the latest CogSpur.r3663.app it takes only 3 milliseconds. I guess it’s just an optimized implementation, there. (Can’t look right now.)

Indeed ;-).  In most Smalltalk implementations an object's header contains a direct reference (a pointer) to its class, but in Spur an object's header contains a "class index" (as in the Dart VM).  IIRC this approach was first tried in a system at Xerox PARC, but what follows I invented for 64-bit VW, to avoid having 64 bits of class reference in every object.

The object header contains both a 22-bit classIndex, for up to 4 million classes, and a 22-bit identityHash field, for 4 million hash values.  The classIndex is used in all method caches, which has advantages for the GC, and means that when sending a message via an inline cache, the object's class index is fetched, not its class.
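To make the field widths concrete, here is a toy sketch in Python (not VM code) that packs a 22-bit classIndex and a 22-bit identityHash into one 64-bit header word.  The bit positions (classIndex in the low bits, identityHash starting at bit 32) and all the names are assumptions for illustration only; a real header carries other fields as well.

```python
# Toy model of a header word holding two 22-bit fields.
# The shift values are illustrative assumptions, not a statement of the real layout.
CLASS_INDEX_BITS = 22
HASH_BITS = 22
CLASS_INDEX_MASK = (1 << CLASS_INDEX_BITS) - 1   # 2**22 - 1
HASH_MASK = (1 << HASH_BITS) - 1
HASH_SHIFT = 32                                  # assumed position of identityHash


def make_header(class_index, identity_hash=0):
    """Pack a class index and an identity hash into one integer header."""
    if not 0 <= class_index <= CLASS_INDEX_MASK:
        raise ValueError("class index does not fit in 22 bits")
    if not 0 <= identity_hash <= HASH_MASK:
        raise ValueError("identity hash does not fit in 22 bits")
    return (identity_hash << HASH_SHIFT) | class_index


def class_index_of(header):
    """What a method cache probe would fetch: the index, not the class."""
    return header & CLASS_INDEX_MASK


def identity_hash_of(header):
    return (header >> HASH_SHIFT) & HASH_MASK
```

22 bits give 2**22 = 4,194,304 distinct values, which is where the "4 million" figure comes from.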

Classes are stored in a hidden two-level sparse array (array of arrays), the class table.  If the object's class object is required (for the #class message or to do a full message lookup searching the inheritance hierarchy) the classIndex is used to index the class table.
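A two-level sparse table of this kind can be sketched as follows, again as a toy model in Python.  The page size of 1024 entries and the names ClassTable, at and at_put are assumptions for illustration; the point is only that second-level pages are allocated on demand, so a mostly empty 22-bit index space costs little.

```python
# Toy two-level sparse table: a first-level array of pages, each page an array of classes.
PAGE_BITS = 10                 # assumed page size: 1024 entries
PAGE_SIZE = 1 << PAGE_BITS
PAGE_MASK = PAGE_SIZE - 1


class ClassTable:
    def __init__(self, index_bits=22):
        # First level: one slot per page, all initially empty.
        self.pages = [None] * (1 << (index_bits - PAGE_BITS))

    def at_put(self, index, cls):
        major, minor = index >> PAGE_BITS, index & PAGE_MASK
        if self.pages[major] is None:
            # Allocate the second-level page only when first needed.
            self.pages[major] = [None] * PAGE_SIZE
        self.pages[major][minor] = cls

    def at(self, index):
        """Map a classIndex back to its class object, or None if unused."""
        page = self.pages[index >> PAGE_BITS]
        return None if page is None else page[index & PAGE_MASK]
```

Looking up a class is then two array accesses, with no searching.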

The classes known to the VM, such as Array, Message, Context, LargeNegativeInteger and LargePositiveInteger, have known indices and occupy the first page of the class table.  So the VM can allocate instances of these by simply writing a header containing the relevant classIndex.  A conventional implementation would have to fetch the class from the specialObjectsArray and write it into the header (and in pre-Spur, there are 31 "compact class indices", of which a handful are used for a subset of the classes).

But how does the VM find the classIndex of an arbitrary class?  Searching the table would be very slow.  So Spur arranges that a class's identityHash is also its index in the class table.  Classes have their own special identityHash primitive; see Behavior>>#identityHash.  Whenever the VM tries to instantiate a class, or whenever the class's identityHash primitive is run, if the class's identityHash is zero, the VM assigns an unused index in the class table as the identityHash and stores the class in the table.  So a class's identityHash is its index in the class table.

To instantiate a class, its identityHash is copied into the classIndex field of the new instance's header.  To search for all instances of a class, the VM searches for objects whose classIndex is that class's identityHash.  A consequence is that if a class doesn't yet have an identityHash (identityHash = 0) it can't have instances, and so there is no need to scan the heap.
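The whole scheme fits in a small toy model.  This Python sketch is not how the VM is written: the names ToyClass, ToyObject and ToyHeap are hypothetical, the class table is a plain dictionary, and "assign an unused index" is simplified to a counter.  It does show the lazy assignment of the identityHash, instantiation by copying the index, and the shortcut that skips the heap scan.

```python
class ToyClass:
    def __init__(self, name):
        self.name = name
        self.identity_hash = 0        # 0 means "not yet entered in the class table"


class ToyObject:
    def __init__(self, class_index):
        self.class_index = class_index    # stands in for the header's classIndex field


class ToyHeap:
    def __init__(self):
        self.class_table = {}         # index -> class (flat stand-in for the real table)
        self.next_index = 1           # simplification: hand out indices in order
        self.objects = []
        self.scans = 0                # counts full heap scans

    def ensure_class_index(self, cls):
        """Assign an unused index as the class's identityHash on first use."""
        if cls.identity_hash == 0:
            cls.identity_hash = self.next_index
            self.class_table[cls.identity_hash] = cls
            self.next_index += 1
        return cls.identity_hash

    def instantiate(self, cls):
        # Copy the class's identityHash into the new instance's classIndex.
        obj = ToyObject(self.ensure_class_index(cls))
        self.objects.append(obj)
        return obj

    def all_instances(self, cls):
        if cls.identity_hash == 0:
            return []                 # never entered in the table: no instances, no scan
        self.scans += 1
        return [o for o in self.objects if o.class_index == cls.identity_hash]
```

In this model, asking for the instances of a class that has never been instantiated (and never been sent identityHash) returns immediately without touching the heap.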


Note that you can't (easily) tell from within the image whether a class is in the table.  If you send identityHash to a class, that will enter it into the table.  There's a hidden hasIdentityHash primitive that can be used, and if you're curious I can post code to get at it.  Note that this implies all the metaclasses are in the table, because each has as its single instance a normal named class; but a Metaclass holds onto its instance via thisClass and so can also avoid traversing the heap.

Some things to try:

{Array. ByteString. Character. Context. LargeNegativeInteger. LargePositiveInteger. Message. SmallFloat64. BoxedFloat64. SmallInteger} collect: #identityHash

and in a pre-Spur system

Smalltalk compactClassesArray select: [:c| Smalltalk specialObjectsArray includes: c] thenCollect: [:c| {c. c indexIfCompact}]

Hope this explains what's going on under the covers.

> Cheers,
> Bernhard
> 
>> On 20.04.2016 at 17:43, Eliot Miranda <eliot.miranda at gmail.com> wrote:
>> 
>> Hi Bernhard,
>> 
>> 
>>> On Apr 20, 2016, at 5:26 AM, Bernhard Pieber <bernhard at pieber.com> wrote:
>>> 
>>> Dear Pharoers,
>>> 
>>> I found something strange:
>>> Time millisecondsToRun: [ Class allSubInstances size ]. "23617"
>> 
>> Because it does an allInstances for Class and all its subclasses, and allInstances visits every object in the heap, so this does thousands of scans of the heap.
>> 
>> 
>>> I did this on a new Pharo 5 image
>>> curl get.pharo.org/alpha+vmLatest | bash
>>> 
>>> Can somebody confirm this on their machine? What might be the reason?
>>> 
>>> Cheers,
>>> Bernhard
> 
> 