[Pharo-project] [GS/SS Beta] serializing object graphs / ordering graph anchors

Henrik Sperre Johansen henrik.s.johansen at veloxit.no
Fri Jun 3 05:00:33 EDT 2011


On 03.06.2011 10:38, Norbert Hartl wrote:
>
> Am 03.06.2011 um 10:20 schrieb Mariano Martinez Peck:
>
>>
>>
>>>         If I look at the serializers I know the mechanics are simple:
>>>
>>>         1. take an object and look it up in a table.
>>>         2. if the object is in the table it has already been
>>>         serialized. We take a reference from the table and write it down
>>>         3. if it isn't in the table it means the object has not been
>>>         serialized and the contents need to be written. After that
>>>         the object with a reference representation is written to the
>>>         table
>>>         4. while writing the content of an object (inst vars) I
>>>         encounter simple objects (direct string representation) or
>>>         complex objects (composite). In the latter case we start
>>>         over from 1.
>>>
>>>         If the object graph is well interconnected than the odds are
>>>         high that a stack will grow high until the first reference
>>>         can be written. With a small pier kernel I already matched a
>>>         stack size of 1000.
>>>
>>>
>>>     In Fuel we  traverse the graph but instead of using a recursion
>>>     we use our own simple stack implementation where we push and pop
>>>     objects.
>>>     But Martin may explaint better than me.
>>
>>     I'm interested in details how it works and if it avoids the deep
>>     stack. Is there additional information about that?
>>
>>
>> In 4. you said "while writing the content of an object (inst vars) I 
>> encounter simple objects (direct string representation) or complex 
>> objects (composite). In the latter case we start over from 1.".
>> So... Imagine object X with two instVars Y and Z. Imagine the method 
>> of the traverse is called #analye:.  So you do #analize: X.  Then, in 
>> such method you check whether X has regular pointers to regular 
>> objects, and if true, you analize them. So in this case you would 
>> send #analyze: Y and #analize: Z, generating the recursion. Right ?
>
> yes, the time between analyse: Y and analyse: Z is dependent on the 
> graph that is attached to Y. Y can have objects that have further 
> objects etc. So leaving X aside, Y can be the first object to 
> serialize and Z the last.
>>
>> Ok, instead of sending #analyze: Y and #analize: Z  what we do is 
>> just to do a push on a stack:   #push: Y  and #push: Z. And then we 
>> are done with X, so we pop from the stack and we continue with the 
>> next object (at some point in the future we will pop Y and Z) When 
>> there are no more objects in the stack it means we are done.
>
> How do you write X? I can see that you turn a recursive trace into 
> depth-by-depth trace but the difficult part is to write X. In X you 
> need a reference representation of any sort. How do you derive that 
> from the stack?
>
>> For performance (I think) we did FLSimpleStack. You can check also 
>> FLSimpleStackTest. Both are in Fuel repo.
>>
>> Would something like that help for the stack in GemStone?
>>
> Yes. With this approach the resulting stack size is much smaller. To 
> exceed the stack setting you need a graph containing an object which 
> has a shortest route that is larger than the stack limit. And I think 
> this is very unlikely :)
> I would need to change Sixx in order to have it work like this. I need 
> to think about it.
>
> Thanks,
>
> Norbert
Yes, using a specific stack in the way Fuel does relies on there being a 
separate analyze-phase before the actual serialization of  the objects.
(Due to wanting to group objects of the same type together when serializing)

You could do the same in a "raw" serializer like SIXX:
1) pop first object from stack
2) if any of the instvars need to be serialized, and their ids can not 
be found in cache, push self + missing objects
3) if not, serialize self, then goto 1

The key difference is you can not considering an object "done" during an 
analysis phase, but rather push it back on the stack for reevaluation 
after its dependencies have been handled in the same way.

So the resulting stack will be much larger than what you see in Fuel, 
but still kept outside the call stack.

Cheers,
Henry


-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.pharo.org/pipermail/pharo-dev_lists.pharo.org/attachments/20110603/3df50581/attachment-0001.html>


More information about the Pharo-dev mailing list