[Pharo-dev] Issue 20309 - Startup should run always in a fresh process

Guillermo Polito guillermopolito at gmail.com
Wed Aug 16 07:28:43 EDT 2017


On Wed, Aug 16, 2017 at 1:21 PM, Denis Kudriashov <dionisiydk at gmail.com>
wrote:

> 2017-08-16 12:28 GMT+02:00 Guillermo Polito <guillermopolito at gmail.com>:
>
>>
>>
>> On Wed, Aug 16, 2017 at 12:13 PM, Denis Kudriashov <dionisiydk at gmail.com>
>> wrote:
>>
>>>
>>>
>>> 2017-08-16 12:02 GMT+02:00 Guillermo Polito <guillermopolito at gmail.com>:
>>>
>>>>
>>>>
>>>> On Wed, Aug 16, 2017 at 10:50 AM, Denis Kudriashov <
>>>> dionisiydk at gmail.com> wrote:
>>>>
>>>>> There is a case where a delay can be used during startup/shutdown: a
>>>>> library can clean resources that are managed by a kind of pool, which
>>>>> uses timeout logic to enter synchronization monitor.
>>>>>
>>>>> I checked Seamless, which manages connections this way, but I did not
>>>>> find any issue there.
>>>>>
>>>>
>>>> Can you explain this in more detail? I want to know exactly what "clean
>>>> resources", "kind of pool" and "timeout logic to enter synchronization
>>>> monitor" mean concretely.
>>>>
>>>> Also, how are you subscribing seamless to the startup list?
>>>>
>>>
>>> Seamless server cleans all opened connections on image save. It
>>> registers using:
>>>
>>> SessionManager default registerNetworkClassNamed: self name
>>>
>>> I copied this logic from ZnServer.
>>> But Seamless manages connections using ObjectPool, so when an image save
>>> is performed, "connectionPool clear" is evaluated. It closes the
>>> connections and resets all caches. The problem is that ObjectPool is
>>> protected by a Monitor with a timeout option, and the #clear method enters
>>> this monitor. It is possible that the monitor is busy at the time of the
>>> image save, and #clear then waits on a delay to enter the critical section.
>>>
>>>
>> Does this mean that there may be a timeout while closing connections
>> during shutdown?
>> Because in that case the session manager will continue shutting down
>> things and Seamless may not be fully cleaned up upon next startup.
>>
>
> Yes, it is possible, but it looks like a very rare case. I wonder whether
> there is a real solution to this.
>

Well... maybe we need some actions that can "cancel" the shutdown? Like in
the operating system, when there is an app that cannot be closed, it asks
you to force close or not...

Otherwise, I'd advise that Seamless should in any case be safer at shutdown
and avoid that situation. Maybe you want to clear the pool without a timeout?
Maybe you want to force kill connections without waiting for them (because
they may never end)?


> On the other hand, I am not sure why connections should be closed when the
> image is saved. In the case of Seamless, the pool is constructed so that it
> checks whether a socket is valid before lending it to the user.
>

Well, that's a Seamless issue, isn't it? Did you try not subscribing
Seamless to the startup list to see if it still behaves well?


>>>>> 2017-08-16 1:46 GMT+02:00 Guillermo Polito <guillermopolito at gmail.com>
>>>>> :
>>>>>
>>>>>>
>>>>>> On Tue, Aug 15, 2017 at 3:25 PM, Ben Coman <btc at openinworld.com>
>>>>>> wrote:
>>>>>>
>>>>>>> In case any of the shutdown/startup scripts use a delay, now or in
>>>>>>> the future, I'd first try highestPriority-1 to avoid influencing the
>>>>>>> DelayScheduler. But then Eliot's suggestion to use valueUnpreemptively
>>>>>>> may avoid that anyway.
>>>>>>>
>>>>>>
>>>>>> Why should a shutdown/startup use a delay? startup and shutdown
>>>>>> should be fast and not be blocked... If there is a delay on client code, it
>>>>>> should block a client thread, not a system thread...
>>>>>>
>>>>>> Moreover, I see a series of issues in having the delay process
>>>>>> running in higher priority than the startup. If I'm wrong, please correct
>>>>>> me because otherwise that means there is something I'm not getting.
>>>>>>
>>>>>> First, today the Delay scheduling process is terminated on shutdown
>>>>>> and re-initialized on startup. Does this mean that if a shutdown/startup
>>>>>> action tries to use a delay, it will fail or block indefinitely?
>>>>>>
>>>>>> Second, what happens with race conditions between the startup and the
>>>>>> delay process? What if the shutdown is in the middle of terminating the
>>>>>> delay process and the delay process suddenly gets activated?
>>>>>>
>>>>>> stopTimerEventLoop
>>>>>>     "Stop the timer event loop"
>>>>>>
>>>>>>     timerEventLoop ifNotNil: [ timerEventLoop terminate ].
>>>>>>     timerEventLoop := nil.
>>>>>>
>>>>>> Maybe before terminating the timerEventLoop we need to suspend it?
>>>>>> That will at least atomically (primitive) remove the process from the
>>>>>> ready list and prevent it from being activated again, no?
>>>>>>
>>>>>> In any case, I see no good in letting a delay work on startup. That
>>>>>> is far too low level and the system would be in a far too unstable state to
>>>>>> run any code other than the startup itself.
>>>>>>
>>>>>>
>>>>>>> btw, what happens if an error occurs inside valueUnpreemptively?
>>>>>>> Does the normal priority debugger still get to run?
>>>>>>>
>>>>>>> cheers -ben
>>>>>>>
>>>>>>> On Mon, Aug 14, 2017 at 6:42 PM, Guillermo Polito <
>>>>>>> guillermopolito at gmail.com> wrote:
>>>>>>>
>>>>>>>> Hi all,
>>>>>>>>
>>>>>>>> I'm proposing a kind-of critical change that I believe is very good
>>>>>>>> for the health of the system: I want the startup of the system to run
>>>>>>>> at maximum priority and become non-interruptible.
>>>>>>>>
>>>>>>>> Right now, when you save your image, the shutdown and startup are
>>>>>>>> run at the same priority as the process that triggered the save
>>>>>>>> (usually the UI or the command line, priority 40). This can cause lots
>>>>>>>> of problems and race conditions: processes with higher priorities can
>>>>>>>> interrupt the shutdown/startup and try to do something while the
>>>>>>>> system is unstable. As a side effect, when you use the command line
>>>>>>>> extensively, you start stacking startup contexts from old sessions:
>>>>>>>>
>>>>>>>> ...
>>>>>>>> session 3 ctxt 4 <- This guy makes a save and a new session starts
>>>>>>>> session 3 ctxt 3
>>>>>>>> session 3 ctxt 2
>>>>>>>> session 3 ctxt 1
>>>>>>>> session 2 ctxt 4 <- This guy makes a save and a new session starts
>>>>>>>> session 2 ctxt 3
>>>>>>>> session 2 ctxt 2
>>>>>>>> session 2 ctxt 1
>>>>>>>> session 1 ctxt 4 <- This guy makes a save and a new session starts
>>>>>>>> session 1 ctxt 3
>>>>>>>> session 1 ctxt 2
>>>>>>>> session 1 ctxt 1
>>>>>>>>
>>>>>>>> Old contexts are never collected, and neither are the objects they
>>>>>>>> reference.
>>>>>>>>
>>>>>>>> To fix these two problems I propose to do every image save/session
>>>>>>>> start in a new process at maximum priority. That way, other processes
>>>>>>>> should not be able to interrupt the startup process. Moreover, every
>>>>>>>> session shutdown/startup should happen in a new clean process, to
>>>>>>>> avoid the session stacking.
>>>>>>>>
>>>>>>>> For normal users, this should have no side effect at all. This
>>>>>>>> change will have a good impact on people working on the debugger and the
>>>>>>>> stack such as fueling-out the stack because they will have a cleaner stack.
>>>>>>>>
>>>>>>>> There is however a side-effect/design point to consider: startup
>>>>>>>> actions should be quick to run. If a startup action needs to run a
>>>>>>>> long-running action such as starting a server or managing a command
>>>>>>>> line action, that should run in a separate process with lower priority
>>>>>>>> (usually userPriority). In other words, the startup action should
>>>>>>>> create a new process to manage its action.
>>>>>>>>
>>>>>>>> If you want to review (and I'd be glad)
>>>>>>>>
>>>>>>>> Pull request: https://github.com/pharo-project/pharo/pull/198
>>>>>>>> Fogbugz issue: https://pharo.fogbugz.com/f/cases/20309
>>>>>>>> Current validation going on:
>>>>>>>> https://ci.inria.fr/pharo-ci-jenkins2/job/Test%20pending%20pull%20request%20and%20branch%20Pipeline/view/change-requests/job/PR-198/
>>>>>>>>
>>>>>>>> Guille
>>>>>>>>
>>>>>>>> --
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> Guille Polito
>>>>>>>>
>>>>>>>>
>>>>>>>> Research Engineer
>>>>>>>>
>>>>>>>> French National Center for Scientific Research -
>>>>>>>> *http://www.cnrs.fr* <http://www.cnrs.fr>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> *Web:* *http://guillep.github.io* <http://guillep.github.io>
>>>>>>>>
>>>>>>>> *Phone: *+33 06 52 70 66 13
>>>>>>>>


-- 



Guille Polito


Research Engineer

French National Center for Scientific Research - *http://www.cnrs.fr*
<http://www.cnrs.fr>



*Web:* *http://guillep.github.io* <http://guillep.github.io>

*Phone: *+33 06 52 70 66 13