[Pharo-dev] [Pharo-users] STON materialization corrupts a dictionary if the keys are references

Sven Van Caekenberghe sven at stfx.eu
Wed Apr 6 10:07:38 EDT 2016


Yeah, I just realised that iterating and changing a dictionary or set at the same time won't work ;-)

> On 06 Apr 2016, at 16:01, Henrik Johansen <henrik.s.johansen at veloxit.no> wrote:
> 
> If you are iterating over a Set with incorrectly placed objects, remove: calls aren't going to do you much good ;)
> Not to mention, even nilling the slot directly, then add:'ing, still means you have to scan subsequent entries up to the next empty slot for potentially better placement, if the object ended up being added elsewhere.
> (IOW, if you're gonna do it, better iterate in reverse)
> 
> Cheers,
> Henry
> 
>> On 06 Apr 2016, at 3:46 , Sven Van Caekenberghe <sven at stfx.eu> wrote:
>> 
>> 
>>> On 06 Apr 2016, at 15:34, Nicolai Hess <nicolaihess at gmail.com> wrote:
>>> 
>>> 
>>> 
>>> 2016-04-06 15:25 GMT+02:00 Sven Van Caekenberghe <sven at stfx.eu>:
>>> Hi Nicolai,
>>> 
>>>> On 06 Apr 2016, at 14:56, Nicolai Hess <nicolaihess at gmail.com> wrote:
>>>> 
>>>> 
>>>> 
>>>> 2016-04-06 14:27 GMT+02:00 Sven Van Caekenberghe <sven at stfx.eu>:
>>>> Fix for review:
>>>> 
>>>> ===
>>>> Name: STON-Core-SvenVanCaekenberghe.71
>>>> Author: SvenVanCaekenberghe
>>>> Time: 6 April 2016, 2:22:24.782251 pm
>>>> UUID: 64b8b741-365e-41fe-aa98-565e33ca5d24
>>>> Ancestors: STON-Core-SvenVanCaekenberghe.70
>>>> 
>>>> Fix a bug where STONReferences occurring as keys in Dictionaries or elements in Sets caused those to be unhealthy after materialization. Thx to Peter Uhnák for reporting this issue.
>>>> 
>>>> Add 3 new unit tests to STONReaderTests
>>>> 
>>>> #testDictionaryWithReferenceKeys
>>>> #testSetWithReferenceElements
>>>> #testDeepStructure
>>>> 
>>>> Fix Details
>>>> 
>>>> change the implementation of STONReader>>#processSubObjectsOf: from iterative to recursive (see version 39 of 29 November 2012, this might be a functional regression, see #testDeepStructure; cleanup of stack instance variable for later) so that #stonProcessSubObjects: can be overwritten with code being executed before or after full reference resolution
>>>> 
>>>> imho, recursion stack depth will be equal during both writing and reading, and should be acceptable.
>>>> 
>>>> overwrite #stonProcessSubObjects: in Dictionary and Set to #rehash at the end, but only when needed (minimal optimalization, see Dictionary>>#containsStonReferenceAsKey and Set>>#containsStonReference)
>>>> ===
>>>> Name: STON-Tests-SvenVanCaekenberghe.63
>>>> Author: SvenVanCaekenberghe
>>>> Time: 6 April 2016, 2:22:45.01986 pm
>>>> UUID: 0beb2322-b81a-46ee-a0e2-6648a808774a
>>>> Ancestors: STON-Tests-SvenVanCaekenberghe.62
>>>> 
>>>> (idem)
>>>> ===
>>> 
>>> Thanks for looking at the code.
>>> 
>>>> Hi Sven,
>>>> instead of rehashing the dictionary for every ston reference,
>>> 
>>> (It rehashes only once after resolving all references)
>>> 
>>> Ah, of course. I thought this would be called for every ston reference used as key.
>>> 
>>> 
>>> 
>>>> wouldn't it work to remove and readd the value after processing the subobject:
>>>> 
>>>> Dictionary>>#stonProcessSubObjects: block
>>>>   self keys do:[:key |
>>>>       |value|
>>>>       value := block value:(self removeKey: key ifAbsent:[ nil]).
>>>>       self at: (block value: key) put: value].
>>> 
>>> Interesting idea. I have to think about that approach.
>>> 
>>> Now, Object>>#stonProcessSubObjects: is very general and looks at named and indexed instance variables. But this probably could be replaced by something more high level and specific I guess.
>>> 
>>> Adding and removing each key/value has a cost too. I try to make the simplest case very efficient and only pay a price when really needed. Anyway, time for some calculations.
>>> 
>>> ok, yes running over all keys isn't better than rehashing :-)
>> 
>> Still, your idea might be better than you would expect:
>> 
>> Now, there is 1 (partial) iteration to do the check before, 2 iterations over the keys and values arrays, then an optional full rehash (which also reallocates and thus generates garbage).
>> 
>> Your idea does only 1 iteration over keys, with remove and add (which also happens more efficiently on the arrays above), no check before, no rehash, probably no garbage generation at all in any case.
>> 
>> Like I said, I have to study it (especially the cost of remove/add, maybe that can be optimised a bit as well).
>> 
>>> Thanks again for the suggestion !
>>> 
>>> Sven
>>> 
>>>>> On 06 Apr 2016, at 14:04, Sven Van Caekenberghe <sven at stfx.eu> wrote:
>>>>> 
>>>>> https://pharo.fogbugz.com/f/cases/17946/STON-materializes-unhealthy-Dictionaries-and-Sets-when-references-occur-in-its-keys-or-elements
>>>>> 
>>>>> fix coming
>>>>> 
>>>>>> On 05 Apr 2016, at 13:11, Sven Van Caekenberghe <sven at stfx.eu> wrote:
>>>>>> 
>>>>>>> 
>>>>>>> On 05 Apr 2016, at 13:02, Nicolai Hess <nicolaihess at gmail.com> wrote:
>>>>>>> 
>>>>>>> 
>>>>>>> 
>>>>>>> 2016-04-05 12:32 GMT+02:00 Cyril Ferlicot Delbecque <cyril.ferlicot at gmail.com>:
>>>>>>> 
>>>>>>> 
>>>>>>> On 05/04/2016 12:09, Sven Van Caekenberghe wrote:
>>>>>>> 
>>>>>>>> Like I said, it is a hashing issue, sometimes it will be correct by accident.
>>>>>>>> 
>>>>>>>> I hope you did not have to much trouble with this bug, I guess it must have been hard to chase.
>>>>>>>> 
>>>>>>>> Is it urgent ?
>>>>>>>> 
>>>>>>>> I probably can give you a quick fix, but I would like to think a bit more about this, since rehashing each materialised dictionary seems expensive.
>>>>>>>> 
>>>>>>>> 
>>>>>>>> 
>>>>>>> 
>>>>>>> Hi Sven,
>>>>>>> 
>>>>>>> I got the same kind of problem in a personal application.
>>>>>>> 
>>>>>>> I use Sets that I serialize and I had a lot of trouble because sometimes
>>>>>>> some action had strange behaviours.
>>>>>>> 
>>>>>>> For example in a set with element `aSet remove: aSet anyOne` raised 'XXX
>>>>>>> not found in aSet'.
>>>>>>> 
>>>>>>> I am glad to hear that it is a Ston issue and not me that used sets in a
>>>>>>> bad way :)
>>>>>>> 
>>>>>>> For me too it is not urgent since I have a not of university work for
>>>>>>> the moment.
>>>>>>> 
>>>>>>> 
>>>>>>> 
>>>>>>> How are hashed collections created/filled during ston-parsing ?
>>>>>>> If the position in a hashed collection is created by a ston-reference, that is later replaced by the "real" object,
>>>>>>> the index in the dictionary  (or other hashed collections) may be wrong.
>>>>>> 
>>>>>> Yes, that is indeed it, Nicolai.
>>>>>> 
>>>>>> But I would like to try to minimise the rehashing as it seems expensive. But first I need a more reliable failure.
>>>>>> 
>>>>>>> --
>>>>>>> Cyril Ferlicot
>>>>>>> 
>>>>>>> http://www.synectique.eu
>>>>>>> 
>>>>>>> 165 Avenue Bretagne
>>>>>>> Lille 59000 France
>>>>> 
>> 
>> 
> 




More information about the Pharo-dev mailing list