[Pharo-project] Another fixes to finalization (was Re: [update 1.2] #12161 - #12172)

Levente Uzonyi leves at elte.hu
Mon Oct 4 18:25:20 EDT 2010

On Mon, 4 Oct 2010, Henrik Sperre Johansen wrote:

> On 04.10.2010 22:26, Levente Uzonyi wrote:
>> On Mon, 4 Oct 2010, Igor Stasenko wrote:
>>> On 4 October 2010 22:51, Henrik Sperre Johansen
>> <henrik.s.johansen at veloxit.no> wrote:
>>>  On 04.10.2010 21:47, Igor Stasenko wrote:
>>>> On 4 October 2010 22:09, Stéphane Ducasse<stephane.ducasse at inria.fr>
>>>>  wrote:
>>>>> so let us know in the bug entry what is the conclusion :)
>>>> I think someone should verify my benchmarks i.e.
>>>> [ self loadsomething ] timeToRun
>>>> before and after patch.
>>>> And conclusion is better be written by Henrik, because he's having
>>>> concerns about speed,
>>>> while i don't. :)
>>> I already have:
>>> http://code.google.com/p/pharo/issues/detail?id=1628#c13
>> emm.. wait.
>> WeakKeyDict should not delete associations with nil-ed keys
>> automatically, because otherwise
>> you won't be able to use it in weak finalization scheme.
> It could in 1.0, without anyone noticing.
> #rehash truncated duplicate nil keys, thus if it was triggered (f.ex. by a 
> manual removal, or adding enough to cause growth) in a thread with priority 
> higher than finalization process (ie ran after GC but before finalization 
> thread), the nil keys could be truncated before finalization being run. An 
> unlikely scenario, I admit, which I guess is why noone encountered it.

That's not true. People experienced the issues (leaking Sockets), but 
didn't find the cause of the problem, because it was hardly reproducible. 
Creating a new Socket could trigger #grow in the WeakRegistry's dictionary 
which could trigger a GC when the new array was allocated. For example 
ConnectionQueue's #listenLoop uses #highIOPriority (70) and creates new 

> #grow suffered the same problem. (well, actually, it didn't add any of the 
> nil keyed associations)
>> There is two distinct use cases of weak-key dicts:
>> a) for weak finalization
>> b) for attaching some extra info(value) per object(key), which can be
>> automatically discarded once object become garbage
>> so, while in case (b) you can mercilessly kill/reuse associations with
>> nil keys, once they discovered
>> in case (a) you should preserve them until there is explicit request
>> from outside to finalize values.
> Yes, that's the point I've been trying to make these past couple of months :)
>> The need in (a), apparently prohibits from using (b) in most efficient 
>> manner.
>> So, i think, the solution would be to introduce a specialized weak-key
>> dicts, which can work much better for (b).
> The difference between a and b, and the fact Pharo 1.1's implementation 
> severly screws with the performance if b) (and honestly, isn't that good for 
> a) either) is the point I've been trying to make for months now...
> In Pharo 1.1 we've moved to the other extreme compared to 1.0, nil keys are 
> never removed. b) always works, but a) HAS to be registered to work 
> satisfactory at all.

Um, no. Associations with nil keys are lost during #rehash.

objects := (1 to: 1000) collect: [ :each | Object new ].
dictionary := WeakKeyDictionary new.
objects do: [ :each | dictionary at: each put: 1 ].
objects := nil.
Smalltalk garbageCollect.
counter := 0.
dictionary allAssociationsDo: [ :each | counter := counter + 1 ].
self assert: counter = 1000.
dictionary rehash.
counter := 0.
dictionary allAssociationsDo: [ :each | counter := counter + 1 ].
self assert: counter = 1000 "This will fail, because counter = 0."

> It was changed to do an extra step of keeping finalized assocs in a special 
> state where they can be simply replaced without rehashing after finalization 
> instead of doing rehash, but that's not really the major result of the 
> change.
>> Squeak's implementation works well in both cases.
>> Levente
> The way I read it (correct me if I'm wrong), the solution in Squeak was 
> adding a finalizer inst var to WeakKeyDictionary, that way you can 
> distinguish the two cases and handle them appropriately. (in addition there 
> were speed improvements for rehashing, growing, etc.)

The original goal of the finalizer was to make finalization run in O(n) 
time instead of O(n^2). But it's also handy for doing proper finalization. 
And there are other improvements, like #remove: triggers the finalizer and 
cleans up the slots in the removed element's "chain". So both addition 
(via #grow) and removal of elements can free slots held by GC'd keys.

The implementation in Pharo 1.1 still has an O(n^2) issue. Not 
expired associations with nil keys, are added to the front of the array 
during #grow (see #noCheckAdd:). The free slot for the association is 
found by a linear search started from the first slot in the array (OMG). 
If there are k such elements, then it takes k * (k + 1) / 2 = O(k^2) time. 
Since k can be n if the dictionary is not registered to the finalization 
process, this means O(n^2) time:

objects := (1 to: 10000) collect: [ :each | Object new ].
dictionary := WeakKeyDictionary new.
objects do: [ :each | dictionary at: each put: 1 ].
objects := nil.
Smalltalk garbageCollect.
counter := 0.
dictionary allAssociationsDo: [ :each | counter := counter + 1 ].
self assert: counter = 10000.
[ dictionary grow ] timeToRun "===> 1830"


> I looked for another solution which would keep WeakKeyDictionary oblivious to 
> whether or not it was being finalized or not and could not find one, thus I 
> think it is worth porting.
> Keeping the (pardon my fren.. err, norwegian) POS that is currectly in 1.1, 
> or reverting to the 1.0 version  for 1.2, is not a good alternative.
> Cheers,
> Henry
> _______________________________________________
> Pharo-project mailing list
> Pharo-project at lists.gforge.inria.fr
> http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project

More information about the Pharo-dev mailing list