[Pharo-dev] [Vm-dev] Frequent SegFaults in PharoVM with Pharo 3.0

Stéphane Ducasse stephane.ducasse at inria.fr
Mon Nov 25 10:27:53 EST 2013


Thanks, Clément.
Max, this is a nice catch.

Stef
On Nov 25, 2013, at 1:13 PM, Clément Bera <bera.clement at gmail.com> wrote:

> Hello,
> 
> In this version I added #timesRepeat: as a special selector to be inlined by the compiler (like #to:do:, #ifTrue:, ...). This addition has already caused another side effect, with XStreams. I guess we should remove it.
> 
> Try to edit the default Opal options in:
> OpalCompiler>>#defaultOptions
> 
> instead of:
> 	+ optionInlineTimesRepeat
> put:
>         - optionInlineTimesRepeat
> 
> Then execute: OpalCompiler recompileAll.
> (If you run your code with a DoIt you do not need to recompileAll, but just in case ...)
> 
> And see if your bug comes from the timesRepeat:.
> 
> Now the bytecode for "3 timesRepeat: [ Smalltalk garbageCollect ]" is correct. But there may be some side effect in the VM I am not thinking of.
> 
> 29 <20> pushConstant: 3
> 30 <68> popIntoTemp: 0
> 31 <10> pushTemp: 0
> 32 <76> pushConstant: 1
> 33 <69> popIntoTemp: 1
> 34 <11> pushTemp: 1
> 35 <10> pushTemp: 0
> 36 <B4> send: <=
> 37 <AC 09> jumpFalse: 48
> 39 <41> pushLit: Smalltalk
> 40 <D2> send: garbageCollect
> 41 <87> pop
> 42 <11> pushTemp: 1
> 43 <76> pushConstant: 1
> 44 <B0> send: +
> 45 <69> popIntoTemp: 1
> 46 <A3 F2> jumpTo: 34
> 48 <7C> returnTop
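
For readers less used to bytecode listings, the listing above corresponds roughly to the hand-written loop below. This is a sketch inferred from the listing only, not taken from Opal's source: the names limit and index are made up for temp 0 and temp 1, and the value left on the stack for returnTop is not modelled.

```smalltalk
"Hand-expanded equivalent of: 3 timesRepeat: [ Smalltalk garbageCollect ]"
| limit index |
limit := 3.                      "pc 29-30: store 3 in temp 0"
index := 1.                      "pc 32-33: store 1 in temp 1"
[ index <= limit ] whileTrue: [  "pc 34-37: compare, jumpFalse: 48"
	Smalltalk garbageCollect.    "pc 39-41: send, pop the result"
	index := index + 1 ]         "pc 42-46: increment, jumpTo: 34"
```

Evaluating this in a workspace performs the same three garbage collections without sending #timesRepeat: at all, which may help to tell the compiler change apart from the collector itself.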
> 
> 
> 
> 2013/11/25 Max Leske <maxleske at gmail.com>
> I was able to isolate the image where the SegFault problem appears first: 30435. Here’s the update method:
> 
> update30435
> 	"self new update30435"
> 	self withUpdateLog: '11729 Sync Opa from repo: #timesRepeat optimization enabled
> 	https://pharo.fogbugz.com/f/cases/11729
> 	
> 11720 DateModel: Add #displayBlock:
> 	https://pharo.fogbugz.com/f/cases/11720
> 	
> 11727 remove unused ivar scrollBarOnLeft in ScrollPane
> 	https://pharo.fogbugz.com/f/cases/11727'.
> 	self loadTogether: self script114 merge: false.
> 	(SystemNavigation new allCallsOn: #timesRepeat:) do: [ :each | each method recompile ].
> 	self flushCaches.
> 
> 
> What I noticed is that 11729 is about #timesRepeat: and the code I used to trigger the SegFaults is “3 timesRepeat: [ Smalltalk garbageCollect ]”. So the problem might actually lie in #timesRepeat: rather than in #garbageCollect.
> 
> @Marcus
> I’ve put you into CC because I hope that you know more about #timesRepeat:
> 
> Cheers,
> Max
> 
> 
> 
> On 25.11.2013, at 08:26, Max Leske <maxleske at gmail.com> wrote:
> 
>> Interesting development: no SegFaults (while using exactly the same code) when using an older Pharo image (version 30321 instead of latest).
>> This obviously suggests that the problem lies with the image and not necessarily the vm.
>> 
>> Max
>> 
>> On 10.11.2013, at 20:10, Max Leske <maxleske at gmail.com> wrote:
>> 
>>> 
>>> On 07.11.2013, at 08:20, Max Leske <maxleske at gmail.com> wrote:
>>> 
>>>> On 07.11.2013, at 00:27, vm-dev-request at lists.squeakfoundation.org wrote:
>>>> 
>>>>> Date: Wed, 6 Nov 2013 19:23:47 -0200
>>>>> From: Mariano Martinez Peck <marianopeck at gmail.com>
>>>>> Subject: Re: [Vm-dev] Frequent SegFaults in PharoVM with Pharo 3.0
>>>>> To: Squeak Virtual Machine Development Discussion
>>>>> 	<vm-dev at lists.squeakfoundation.org>
>>>>> Message-ID:
>>>>> 	<CAA+-=mWDtHGQWSHx-m4u37Kj0UAMx=UG=qUCz+wMvgDT8BaQZQ at mail.gmail.com>
>>>>> Content-Type: text/plain; charset="windows-1252"
>>>>> 
>>>>> Max, did you try with Eliot VMs besides the pharo ones?
>>>> 
>>>> No, I haven’t. I can try (though only on my machine), but if I can’t reproduce the SegFault that doesn’t mean much unfortunately. It took me 10 to 20 tries with the PharoVM too…
>>>> 
>>>> I’ll try anyway and let you know what I find.
>>>> 
>>>> Max
>>> 
>>> I just gave CogVM and NBCog a shot. Unfortunately I can’t use the images because they’ve been built with PharoVM and hang after load. At least on the Pharo CI there are no jobs that create Cog / NBCog images.
>>> 
>>> So no luck with different VMs.
>>> 
>>> Max
>>> 
>>>> 
>>>>> 
>>>>> Thanks
>>>>> 
>>>>> 
>>>>> On Wed, Nov 6, 2013 at 6:04 PM, Max Leske <maxleske at gmail.com> wrote:
>>>>> 
>>>>>> 
>>>>>> Hi
>>>>>> 
>>>>>> We’ve been encountering frequent SegFaults when running the Fuel tests on
>>>>>> the Pharo CI (https://ci.inria.fr/pharo-contribution/job/Fuel/). Since
>>>>>> today I’ve also been able to reproduce the SegFaults on my MacBook Pro (OS
>>>>>> X 10.9). We have not been able to determine the cause of the SegFault,
>>>>>> but we can produce SegFaults often, although not reliably (more reliably on
>>>>>> the CI).
>>>>>> 
>>>>>> The CI uses the stable VM:
>>>>>> http://files.pharo.org/vm/pharo/linux/stable.zip
>>>>>> I use a newer version from October on my machine:
>>>>>> http://files.pharo.org/vm/pharo/mac/273.zip
>>>>>> 
>>>>>> I’ve attached all the dumps of the crashes I was able to produce on my
>>>>>> machine, together with Apple’s crash logs.
>>>>>> 
>>>>>> I’ve been able to deduce the following:
>>>>>> - garbage collection seems to be a trigger for the SegFault. When one of
>>>>>> the methods FLMethodContextSerialization>>testFuelShouldIgnoreFuel and
>>>>>> FLMethodContextSerialization>>testMethodContextWithNilPc contains the line
>>>>>> “3 timesRepeat: [Smalltalk garbageCollect]” the SegFault appears nearly
>>>>>> always (on CI). When I remove the line the builds run through.
>>>>>> - Not all methods with the garbage collect line trigger a SegFault (I
>>>>>> could only identify those two)
>>>>>> - the garbage collect line itself suffices as a trigger in the mentioned
>>>>>> methods.
>>>>>> - The number of tests (amount of used memory?) seems to influence the
>>>>>> appearance of the SegFault (e.g. loading "DevelopmentGroup" seems to
>>>>>> trigger it more often than loading “Benchmarks”)
>>>>>> - the SegFault always appears after the tests with the garbage collect
>>>>>> line have been run, never before
>>>>>> - the VM can’t write the crash.dmp every time
>>>>>> 
>>>>>> Since the SegFaults are so random I cannot give you an image to reproduce
>>>>>> the problem. I’ve had the best results using a fresh 3.0 image (
>>>>>> http://files.pharo.org/image/30/30549.zip) and then evaluating the
>>>>>> following in a workspace:
>>>>>> 
>>>>>> Gofer it
>>>>>>        smalltalkhubUser: 'Pharo' project:  'Fuel';
>>>>>>        package: 'ConfigurationOfFuel';
>>>>>>        load.
>>>>>> 
>>>>>> ((Smalltalk at: #ConfigurationOfFuel) project version: #bleedingEdge)
>>>>>> load: 'DevelopmentGroup'.
>>>>>> 
>>>>>> HDTestReport
>>>>>>        runClasses: (TestCase allSubclasses select: [ :class | class name
>>>>>> beginsWith: 'FL'])
>>>>>>        named: 'foo'
>>>>>> 
>>>>>> If it doesn’t work, try using the TestRunner manually. Select the default
>>>>>> Fuel tests (alphabetically at F) and the additional Fuel tests (at the
>>>>>> bottom of the list) and run them.
>>>>>> 
>>>>>> If anybody has any clue about what could be going on I’d really appreciate
>>>>>> any input. I’ll happily provide more information if I can.
>>>>>> 
>>>>>> Thanks for reading :)
>>>>>> Max
>>>>>> 
>>>>>> 
>>>>>> 
>>>>> 
>>>>> 
>>>>> -- 
>>>>> Mariano
>>>>> http://marianopeck.wordpress.com
>> 
> 
> 
