[Pharo-dev] A weak/leak story

Guille Polito guillermopolito at gmail.com
Tue Apr 12 05:41:23 EDT 2016


Hi list,

Pavel, Christophe and I have spent some time these last weeks chasing 
the memory leaks we were seeing lately. It is a long story to tell, so 
this mail is divided in four parts:

1) A brief intro to weak structures and finalization in Pharo, for those 
that do not know,
2) A bit of history to explain what happened in pre-spur and post-spur,
3) The actual cause of the memory leak today,
4) How to avoid them in your application, and what we are going to do to 
prevent this in the future.

For those that need/want/prefer just the practical explanation, you can 
jump over 2) and just read 1) and 3).

========================================================================
1. A weak explanation
========================================================================

To clean up objects upon garbage collection, Pharo and Squeak use a 
finalization mechanism based on a Weak Registry. That is, if you want to 
execute some cleanup (like closing a file) when an object is about to be 
collected, you have to put your object inside the weak registry with the 
corresponding executor/finalizer object. The object you want to 'track' 
is held weakly by this weak registry, i.e., if the only reference to the 
object is from the weak registry, it will be chosen for garbage 
collection. When this object is collected, a special process in the 
Pharo image will send #finalize to your executor object, where you 
implement your cleanup.

To interact with the weak registry, there are two main subscription 
messages:

- #add:executor:

   Will add an object to the registry with the executor that is sent as 
argument.

- #add:

   Will add an object to the registry, and use as executor a 'shallow 
copy' of the object.

Some conclusions to be drawn from this:
  1) If the executor points strongly to the object that we want to 
collect, the object will never be collected. That is why the #add: 
message creates a copy of the object.
  2) If we do not provide an explicit executor, the registered object 
should already contain all information required for the finalization 
(like file handles or external pointers). If not, the shallow copy will 
not be able to finalize correctly.
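
To make the two conclusions concrete, here is a minimal sketch. It is 
written in Python, not Pharo, and only models the idea: Python's 
weakref.finalize plays the role of the weak registry (it tracks the 
object weakly and holds the callback strongly). The class FileLike and 
the helpers add_executor and add are hypothetical names chosen to 
mirror #add:executor: and #add:.

```python
import copy
import gc
import weakref

closed = []  # handles released so far


class FileLike:
    """Hypothetical resource owner; 'handle' stands in for a file handle."""

    def __init__(self, handle):
        self.handle = handle

    def finalize(self):
        closed.append(self.handle)


def add_executor(obj, executor):
    # Rough analogue of #add:executor: (obj is tracked weakly, the
    # executor's finalize method is held strongly by the registry).
    return weakref.finalize(obj, executor.finalize)


def add(obj):
    # Rough analogue of #add: (the executor is a shallow copy of obj).
    return add_executor(obj, copy.copy(obj))


# Executor is a copy: the object is collected and the copy cleans up.
good = FileLike(3)
good_ref = weakref.ref(good)
add(good)
del good
gc.collect()

# Executor is the object itself: the registry now reaches it strongly.
bad = FileLike(4)
bad_ref = weakref.ref(bad)
add_executor(bad, bad)
del bad
gc.collect()

print(good_ref() is None, closed)  # True [3]
print(bad_ref() is None)           # False
```

The second object stays alive as long as the registry does, which is 
the leak described in conclusion 1).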

Also:
  - Using weak objects/references does not guarantee that #finalize will 
be called, you need to put your object inside the registry!
  - Using weak objects/references does not guarantee that your object 
will be magically collected. You can still cause memory leaks!

========================================================================
2. A weak story
========================================================================

Pharo and Squeak have historically used the weak registry mentioned 
above. Because of the limitations that we mentioned, a different kind of 
weak structure called an Ephemeron is required, or at least more useful. 
To overcome some of these limitations, Igor (Hi Igor! maybe you're 
reading :)) implemented a couple of years ago a new finalization 
mechanism that, IIANM, worked as follows:

- Some weak objects could have a first instance variable with a special 
linked list
- When the object was about to be collected, instead it was removed from 
the weak structure and put into its container's linked list
- On the image side, a special process iterated all special linked lists 
and executed #finalize on the weak objects

This mechanism was called NewFinalization, in contrast with what was 
called LegacyFinalization. Of course these names are context dependent, 
since today's Pharo is back to the so-called legacy one ;). 
NewFinalization was implemented as the default finalization mechanism in 
Pharo, both in VM and image side. But the VM changes remained in the 
Pharo branch of development. After some discussions, I remember Igor and 
Eliot agreed that what they actually needed were Ephemerons, and since 
Eliot had started working on Spur at that time, he said he would provide 
Ephemeric classes with the new object format.

Basically, for those interested, an ephemeron is an association

   weak key -> strong value

with the special quality that upon garbage collection all references to 
the weak key that are reachable from the strong value (directly or 
indirectly) are treated as weak. This allows the collection of the weak 
key even if the strong value points to it, but requires some more 
machinery in the GC/VM. You can read more here [1].
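
To see why a plain weak-key structure is not enough, here is a small 
sketch in Python (not Pharo). Python's weakref.WeakKeyDictionary holds 
keys weakly and values strongly, but it is not an ephemeron table, so a 
value that points back to its key keeps that key alive. The classes Key 
and Value are hypothetical.

```python
import gc
import weakref


class Key:
    pass


class Value:
    def __init__(self, key=None):
        self.key = key  # strong reference back to the key, if any


table = weakref.WeakKeyDictionary()  # weak keys, strong values

k1 = Key()
table[k1] = Value()    # the value does not refer to its key

k2 = Key()
table[k2] = Value(k2)  # the value points back to its own key

del k1, k2
gc.collect()

print(len(table))  # 1
```

Only the first entry goes away. With ephemeron semantics the reference 
from the second value to its key would be treated as weak, so that 
entry could be dropped as well.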

Until a couple of months/weeks ago, Pharo was using the NewFinalization 
mechanism with its special image and VM support. And Squeak was using 
the 'Legacy' one. And then Spur arrived.

So Spur arrived, and Eliot and Esteban made a lot of effort to simplify 
the VM's maintenance, and they merged both branches. As a consequence, 
the Pharo Spur VM no longer supported NewFinalization. This provoked at 
first some leaks because objects were not being finalized. A couple of 
weeks ago, we migrated the image code back to the 'Legacy' mechanism, 
see issue 17537 [2].

And then finalization was not working either. #finalize was not being 
called on executors, and objects in the weak registry were not being 
collected. As a symptom, opening any tool caused 30 new everlasting 
registrations in the weak registry, and no tools were collected.


========================================================================
3. The cause
========================================================================

After lots of digging, we finally found the particular issue that was 
keeping objects in the weak registry from being collected. In a few 
words, it is caused by the common belief that "weak objects are 
magical", which has led to weak references and finalizers being spread 
all over the system without proper care. It is particularly related to 
the usage of announcements.

To explain better, I made some pictures for you :)


***First, imagine you have a morph with its own local announcer. You 
subscribe to two events, and the graph will look like this.

[Figure: strong.png]

- the announcer knows two strong subscriptions
- the subscriptions know the announcer to be able to unregister
- the subscriptions know the registered object to send the message in 
case the event happens

This forms a closed graph that will be collected. No problem so far.


***Second, let's see what happens if we use weak subscriptions:

[Figure: weak.png]

- the announcer knows two weak subscriptions
- these weak subscriptions know the announcer strongly to be able to 
unregister
- they also know the subscriber object but weakly
- THE difference is made by the weak registry: a global object that 
manages when and how objects are finalized. In the case of announcers, 
the weak registry will store weakly the subscriber morph, and strongly 
the weak announcer subscription.

So far so good also: the references to the morph are weak. When the 
morph is collected, the weak registry will execute #finalize on the 
announcement subscriptions. The subscriptions will unregister from the 
announcer.


***The really problematic case is the third one: mixing weak and strong 
subscriptions in the same announcer.

[Figure: both.png]

The object graph is just a mixture of the two other ones: one weak 
subscription and one strong subscription. BUT:

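  - there is a strong path from a global object (the weak registry) to 
the subscriber (the morph): the registry holds the weak subscription 
strongly, the weak subscription holds the announcer strongly, the 
announcer holds the strong subscription, and the strong subscription 
holds the morph
  - then the morph is never collected
  - the weak registry never finalizes the weak announcement subscription
  - the graph remains there forever.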
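
The three cases can be reproduced with a small model. This is a sketch 
in Python, not Pharo code, and every class in it (Announcer, Morph, 
StrongSubscription, WeakSubscription) is a hypothetical stand-in for the 
object graphs described above. Python's weakref.finalize plays the role 
of the global weak registry: it tracks the subscriber weakly and holds 
the subscription's finalize method strongly.

```python
import gc
import weakref


class Announcer:
    def __init__(self):
        self.subscriptions = []


class Morph:
    def __init__(self):
        self.announcer = Announcer()  # local announcer


class StrongSubscription:
    def __init__(self, announcer, subscriber):
        self.announcer = announcer    # strong, needed to unregister
        self.subscriber = subscriber  # strong


class WeakSubscription:
    def __init__(self, announcer, subscriber):
        self.announcer = announcer                 # strong
        self.subscriber = weakref.ref(subscriber)  # weak
        # Global registry: subscriber held weakly, this subscription
        # (through its bound finalize method) held strongly.
        weakref.finalize(subscriber, self.finalize)

    def finalize(self):
        self.announcer.subscriptions.remove(self)


def morph_survives(kinds):
    """Build a morph with the given subscription kinds, drop it, run the GC."""
    morph = Morph()
    for kind in kinds:
        cls = WeakSubscription if kind == "weak" else StrongSubscription
        morph.announcer.subscriptions.append(cls(morph.announcer, morph))
    probe = weakref.ref(morph)
    del morph
    gc.collect()
    return probe() is not None


print(morph_survives(["strong", "strong"]))  # False
print(morph_survives(["weak", "weak"]))      # False
print(morph_survives(["weak", "strong"]))    # True
```

Only the mixed configuration keeps the morph alive: the registry reaches 
the weak subscription, which reaches the announcer, which reaches the 
strong subscription, which reaches the morph.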


And these are the simple cases that show the problem. Imagine that you 
can have this same configuration but in cycles/chains among different 
morphs/announcements. Plus this is aggravated by evil globals (e.g., the 
theme and the HandMorph remember the last focused morph, the system 
window class remembers the last top window even if it was closed...).


========================================================================
4. The solution?
========================================================================

Our solution for the moment is simple. We would like to enforce the 
following two rules for announcements:

- announcers local to a morph should only be used strongly. YES, this 
may cause small hiccups and leaks, for example if you register a morph A 
to the announcer of another morph B. But in the long term, these two 
will form a closed graph and will be collected.

- announcers used globally, such as the System announcer, should be used 
only and uniquely in a weak manner. Like that we ensure that they are 
loosely coupled for real.

So, please, please, do not use weak announcements unless you're really 
sure of what you're doing. At least, until we have ephemerons and we are 
sure everything works as expected. Ephemerons would solve this in a more 
natural way: if we model the weak registry subscription as an ephemeron, 
any reference to the weak #key that arrives from the #value will be 
treated as weak also.

Other action points we are working on:
- fixing tools to follow the rules above
- writing tests to check that tools (gt*, Nautilus, Rubric, FT) do not 
leak
- chasing other small memory leaks created by stepping, focus global 
variables...


((fogbugz allIssues select: [ :each | each relatedToLeak ])
     flatCollect: [ :each | each participants ])
         do: #thanks



[1] https://en.wikipedia.org/wiki/Ephemeron
[2] https://pharo.fogbugz.com/f/cases/17537/SystemAnnouncer-has-far-too-many-subscriptions
