[Pharo-project] [Tuning] Set class>>new:

Nicolas Cellier nicolas.cellier.aka.nice at gmail.com
Fri Jan 13 17:13:25 EST 2012


2012/1/11 Stéphane Ducasse <stephane.ducasse at inria.fr>:
> no time to think about it but I found that interesting.
>
>
> Begin forwarded message:
>
> From: Paul Baumann <paul.baumann at theice.com>
> Subject: [vwnc] [Tuning] Set class>>new:
> Date: January 11, 2012 1:11:19 AM GMT+01:00
> To: "'VWNC'" <vwnc at cs.uiuc.edu>
>
> Inspect this code to see hashed collection instances grouped by their free
> space percentage:
>
>                 ^((ObjectMemory allInstancesOfAny: Set withAllSubclasses
> asArray) asOrderedCollection
>                                 reject: [:ea | ea size < 10 ])
>                                                 groupedBy: [:ea | ea
> capacity / ea size - 1.0 ]
>
> Hashing collections in VW need to have extra space allocated to act as
> search limits between logical groups. Sets performance drops there are
> aren't many free slots allocated.  When there are only 25% free slots
> allocated then the collection grows. VW rounds up to a prime number for
> instance creation and then targets 33% rounded up to a prime for growths
> afterwards. A target of 33% rounded to prime skimps on memory at the expense
> of performance.
>
> As a collection grows to the minimum 25% free space goal (in #fullCheck)
> there is a 75% chance of crossing into the next logical bucket before
> finding a nil value. Crossing into the next logical bucket just once can
> double the number of slots searched for a miss. Four logical buckets may be
> traversed for each search-miss by the time the 25% minimum is reached.
>
> When VW skimps on allocation then it has to grow collections more frequently
> and has to do more comparisons for each lookup. That is a general
> performance problem, but there is another problem with how VW allocates
> memory for pre-sized collections...
>
> When you see collections created for an expected size like these:
>
>                 anArray := Array new: 1000.
>                 aSet := Set new: 1000.
>                 anOc := OrderedCollection new: 1000.
>                 aDict := Dictionary new: 1000.
>
> How many items would you expect each of these collections to efficiently
> contain? Clearly the Array will be exactly 1000. The OrderedCollection will
> take up to 1000 additions before it needs to grow. Pre-sized collections are
> created to avoid incremental growth costs. Is it not common and natural to
> think that the Set and Dictionary would also be created to efficiently
> contain 1000 items too? Would you be surprised to learn that all subclasses
> of Set in VW (and likely only VW) are inefficient with this kind of
> pre-allocation when you add 1000 objects to them?
>
> No application code should be expected to know how Sets are implemented and
> to pad sizes having knowledge of VW implementation details! Look at the
> capacity variances that VW creates even though the Set has been created each
> time to anticipate 1000 additions:
>
> (Set withAll: (Array new: 1000)) capacity
> 1511
>
> (Set withAll: ((1 to: 1000) collect: [:i | Object new ])) capacity
> 1511
>
> (Set new: 1000) capacity
> 1009
>
> ((Set new: 1000) addAll: ((1 to: 1000) collect: [:i | Object new ]);
> yourself) capacity
> 2027
>
> ((Set new) addAll: ((1 to: 1000) collect: [:i | Object new ]); yourself)
> capacity
> 1361
>

In st-80, (Set new: 1000) did mean create a Set with a capacity of
1000 slots, not a capacity large enough for increasing size up to 1000
without growing.
So I guess VW have been modified since but not really cleanly refactored.

In squeak, (Set new: 1000) means allocate enough capacity for holding
1000 elements. So here is Squeak answer, should be the same in Pharo I
hope:

(Set withAll: (Array new: 1000)) capacity
 1549

(Set withAll: ((1 to: 1000) collect: [:i | Object new ])) capacity
 1549

(Set new: 1000) capacity
 1549

((Set new: 1000) addAll: ((1 to: 1000) collect: [:i | Object new ]);
yourself) capacity
 1549

((Set new) addAll: ((1 to: 1000) collect: [:i | Object new ]);
yourself) capacity
 1549


> The first four should all have the same result. The last one can be
> different because it is not pre-sized. The third example shows that only
> 0.9% free space is allocated in addition to the requested size. The fourth
> example shows that the under-allocation caused a growth that then
> over-allocated to end at 102.7% free space. The costs of the third and
> fourth example can be seen in more detail with this test:
>
> | objects set timings cap0 t0 t1 |
> objects := (1 to: ObjectMemory maximumIdentityHashValue) collect: [:i |
> Object new].
> timings := OrderedCollection new: objects size.
> t0 := t1 := Time microsecondClock.
> set := Set new: objects size.
> cap0 := set capacity.
> objects size to: 1 by: -1 do: [ :i |
>                 set add: (objects at: i).
>                 i \\ 1000 = 1 ifTrue: [timings add: (t1 negated + (t1 :=Time
> microsecondClock))].
> ].
> (Array new: 5)
>                 at: 1 put: cap0; "Starting capacity"
>                 at: 2 put: set capacity;    "Ending capacity"
>                 at: 3 put: (1 - (set capacity / set size) * -100.0);
> "Percentage of free slots"
>                 at: 4 put: timings asArray;             "Timings of each
> 1000"
>                 at: 5 put: t1 - t0;                "Total time"
>                 yourself
>
> #(17417 34939 113.264 #(147 264 262 261 262 262 265 261 261 260 261 260 260
> 2474 184 184 184) 6312)       "primes"
>
> A difference between the first and second numbers indicates growth costs
> were incurred while adding. The third number shows there were 1.13 free
> slots allocated for each logical bucket. The last number is total execution
> time. Notice that performance of each group improves 41% just after the
> collection grows.
>
> Load a package named ICXVwHashTuning from the Cincom Public Code Repository
> and try the test again.
>
> #(28670 28670 74.9985 #(104 182 179 179 180 263 180 183 185 184 190 189 188
> 188 189 254 188) 3205)        "After loading ICXVwHashTuning"
>

Squeak only has 12 bits for identity hash... SO I'm not sure the
timings are comparable at all, and primMicrosecondClock does not seem
to work either, but I see no problem in Squeak (and 33% free slot).

| objects set timings cap0 t0 t1 |
objects := (1 to: 1<<12) collect: [:i | Object new].
timings := OrderedCollection new: objects size.
t0 := t1 := Time primMillisecondClock.
set := Set new: objects size.
cap0 := set capacity.
objects size to: 1 by: -1 do: [ :i |
                set add: (objects at: i).
                i \\ 1000 = 1 ifTrue: [timings add: (t1 negated + (t1
:=Time primMillisecondClock))].
].
(Array new: 5)
                at: 1 put: cap0; "Starting capacity"
                at: 2 put: set capacity;    "Ending capacity"
                at: 3 put: (1 - (set capacity / set size) * -100.0);
"Percentage of free slots"
                at: 4 put: timings asArray;             "Timings of each 1000"
                at: 5 put: t1 - t0;                "Total time"
                yourself
 #(5471 5471 33.5693359375 #(0 2 0 0 0) 2)

With 1<<14 objects, I get
 #(23159 23159 41.351318359375 #(0 0 0 2 0 0 0 0 0 0 0 0 2 0 0 0 0) 4)

> Total execution time was cut by 49% while consuming 18% less memory and
> ending with a (very efficient) 75% free slots.
>
> When application code says "Set new: 1000" then they mean that they want a
> Set instance that should expect to contain 1000 additions. It is the
> responsibility of the Set to know how much physical space should be
> allocated to contain 1000 items.  These are simple changes that make a broad
> performance improvement while also having intuitive performance. Cincom
> should take notice of opportunities like this but at least you can improve
> the performance of your existing code if they don't.
>
> You might want to also execute "(ObjectMemory allInstancesOfAny: Set
> withAllSubclasses asArray) do: [:ea | ea trim ]." to optimize free space
> allocation for all the hashed instances. You may notice a performance boost
> from doing that once after loading ICXVwHashTuning.
>
> Paul Baumann
>
>
> ________________________________
> This message may contain confidential information and is intended for
> specific recipients unless explicitly noted otherwise. If you have reason to
> believe you are not an intended recipient of this message, please delete it
> and notify the sender. This message may not represent the opinion of
> IntercontinentalExchange, Inc. (ICE), its subsidiaries or affiliates, and
> does not constitute a contract or guarantee. Unencrypted electronic mail is
> not secure and the recipient of this message is expected to provide
> safeguards from viruses and pursue alternate means of communication where
> privacy or a binding message is desired.
> _______________________________________________
> vwnc mailing list
> vwnc at cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/vwnc
>
>
>
>
>




More information about the Pharo-dev mailing list