[Pharo-users] Voluntarily cancelling requests ("applying an expiration date")

Sabine Manaa manaa.sabine at gmail.com
Tue Feb 11 03:21:27 EST 2020


Hi Holger,

I did not completely understand your mail, but when reading Sven's answer
I remembered that some time ago we also had problems with running out of
semaphores.

After corresponding with Esteban Lorenzano, we did the following (a short
sketch follows below):
1) at startup of the application, evaluating
   Smalltalk vm maxExternalSemaphoresSilently: 65535.
2) setting the default pool size of VOMongoRepository from 10 to 2 (we
   have our own VOMongoRepository subclass)
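
Roughly, it looks like this (MyMongoRepository and the #defaultPoolSize
hook are made-up names for illustration; only the VM call on the first
line is verbatim what we use):

  "Run once at application startup: enlarge the external semaphore table
  so a burst of sockets does not overflow it."
  Smalltalk vm maxExternalSemaphoresSilently: 65535.

  "In our VOMongoRepository subclass we cap the connection pool at 2
  instead of the default 10; where exactly the pool size lives depends on
  your Voyage version."
  MyMongoRepository class >> defaultPoolSize
      ^ 2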

Perhaps this is not your topic, but perhaps it helps.
Sabine


On Mon, 10 Feb 2020 at 15:14, Sven Van Caekenberghe <
sven at stfx.eu> wrote:

> Hi Holger,
>
> That is a complicated story ;-)
>
> But running out of external semaphores means that you are using too
> many sockets, are not closing/releasing them (in time), and/or your GC does
> not run often enough to keep up (it is easy to deplete the external semaphore
> table without the GC kicking in).
>
> You must have a loop somewhere that goes too fast and maybe does not clean
> up properly while doing so (see the sketch below).
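>
> A minimal sketch of the cleanup discipline I mean, assuming plain
> ZnClient usage (example.com is a placeholder):
>
>   "Always release the client in an ensure: block, so its socket --
>   and with it the external semaphore -- is freed deterministically
>   instead of whenever the GC happens to run."
>   | client |
>   client := ZnClient new.
>   [ client url: 'http://example.com'; get ]
>       ensure: [ client close ].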
>
> YMMV, but I have been doing similar things -- implementing/offering REST
> services that call other REST/network services, all with timeouts, in several
> variations -- for years, and I do not have the problems you describe.
>
> I would suggest enabling logging so that you can better see where the
> allocations happen and whether your cleanup code does its work.
>
> Sven
>
> PS: Zinc logging is easy; just do
>
>   ZnLogEvent logToTranscript
>
> > On 9 Feb 2020, at 16:31, Holger Freyther <holger at freyther.de> wrote:
> >
> > tl;dr: I am searching for a pattern (and, later, code) to apply an
> > expiration to operations.
> >
> >
> >
> > Introduction:
> >
> > One nice aspect of MongoDB is that it has built-in data distribution[1]
> > and configurable durability[2]. The upstream project has a document called
> > "Server Discovery and Monitoring" (SDAM), defining how a client should
> > behave. Martin Dias is currently implementing SDAM in MongoTalk/Voyage and
> > I took it for a test drive.
> >
> >
> > Behavior:
> >
> > My software stack uses Zinc, Zinc-REST, Voyage and Mongo. When a new
> > REST request arrives I use Voyage (e.g. >>#selectOne:), which in turn uses
> > MongoTalk. The MongoTalk code needs to select the right server; currently
> > it does so by blocking until a result arrives.
> >
> > Next I started to simulate database outages. The REST clients retried
> > when they did not receive a result within two seconds (no back-off/jitter).
> > What happened was roughly the following:
> >
> >
> > [
> >       "1. ZnServer accepts a new connection."
> >       "2. MongoTalk waits for a server, for longer than 2s."
> >       "Nothing else happens; the step above just waits..."
> > ] repeat.
> >
> >
> >
> >
> > Problem:
> >
> > What happened next surprised me. I expected to have a bad time once the
> > database recovered and all the stale requests (remember, the REST clients
> > had already given up and closed their sockets) were answered. Instead my
> > image crashed early in the test because the ExternalSemaphoreTable was full.
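> >
> > The limit involved can be inspected; assuming current Pharo, where the
> > table size is exposed through SmalltalkImage:
> >
> >   "Answer the current capacity of the external semaphore table."
> >   Smalltalk vm maxExternalSemaphores.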
> >
> > Let's focus on the timeout behavior here, and discuss the existence of the
> > ExternalSemaphoreTable and its number of entries separately, at another
> > time.
> >
> >
> >
> >
> > The two main problems I see are:
> >
> >
> > 1.) Lack of back-pressure in ZnManagingMultiThreadedServer
> >
> > 2.) A disconnect between how long the application layer handling REST is
> > allowed to take and how long, further down the stack, MongoTalk may sleep
> > and wait for a server.
> >
> >
> > The first item is difficult. Even answering HTTP 500 when we are out of
> > space in the ExternalSemaphoreTable is difficult... Let's ignore this for
> > now as well.
> >
> >
> >
> >
> >
> >
> > What I am looking for:
> >
> >
> > 1.) Voluntary Timeout
> >
> > Inside my application code I would like to tag an operation with a
> > timeout, meaning everything done within it should complete within X
> > seconds. It can be used on a voluntary basis.
> >
> >
> > lookupPerson
> >     "We expect all database operations to complete within two seconds."
> >     person := ComputeContext current withTimeout: 2 seconds during: [
> >         repository selectOne: Person where: [ :each | each name = ... ] ]
> >
> >
> >
> > MongoTalk >> stuff
> >     "Check whether the outer context's timeout has expired and signal if
> >     so, e.g. before writing something into the socket, to keep consistency."
> >     ComputeContext current checkExpired.
> >
> >
> > MongoTalk >> other
> >     "Sleep for up to the remaining timeout."
> >     (someSemaphore waitTimeoutContext: ComputeContext current) ifFalse: [
> >         SomethingExpired signal ]
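> >
> > waitTimeoutContext: does not exist today; it could be a thin extension
> > over what Semaphore already offers. A sketch, assuming
> > Semaphore>>waitTimeoutMSecs: (which answers true when the wait timed out)
> > and a hypothetical remainingMilliseconds on the context:
> >
> >   Semaphore >> waitTimeoutContext: aComputeContext
> >       "Wait at most until the context's deadline. Answer false if the
> >       deadline passed before the semaphore was signaled."
> >       ^ (self waitTimeoutMSecs: aComputeContext remainingMilliseconds) not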
> >
> >
> >
> > 2.) Cancellation
> >
> >
> > This is more difficult to write in pseudo code (without TaskIt?). In the
> > case above we are waiting for the database to become ready while the client
> > has already closed the file descriptor, and we are not able to see this
> > until much later.
> >
> > The idea is that, in addition to the timeout, we could pass a block that is
> > called when an operation should be cancelled, and the ComputeContext could
> > be asked whether something has been cancelled (see the sketch below).
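> >
> > A cooperative-cancellation sketch using only kernel classes (the flag
> > and the polling loop are illustrative, not a proposed API):
> >
> >   "The worker polls a flag between bounded units of work; whoever
> >   notices the client went away sets the flag."
> >   | cancelled |
> >   cancelled := false.
> >   [ [ cancelled ] whileFalse: [
> >       "One bounded unit of work, e.g. one poll for a Mongo server."
> >       (Delay forMilliseconds: 100) wait ] ] fork.
> >   "Later, when the party handling the socket notices the client is gone:"
> >   cancelled := true.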
> >
> >
> >
> >
> > The above takes inspiration from Go's context package[3]. In Go the
> > context should be passed as a parameter, but we could make it a Process
> > variable? A sketch follows below.
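> >
> > Pharo's DynamicVariable already stores its value per process, so a
> > minimal ComputeContext could be built on it. A sketch, assuming a
> > class-side API instead of the #current instance above, and with
> > SomethingExpired standing in for a real exception class:
> >
> >   DynamicVariable subclass: #ComputeContext
> >       instanceVariableNames: ''
> >       classVariableNames: ''
> >       package: 'ComputeContext'
> >
> >   ComputeContext class >> withTimeout: aDuration during: aBlock
> >       "Run aBlock with a deadline visible, per process, to everything
> >       it calls."
> >       ^ self value: DateAndTime now + aDuration during: aBlock
> >
> >   ComputeContext class >> checkExpired
> >       "Signal if the deadline set by an enclosing withTimeout:during:
> >       has passed."
> >       | deadline |
> >       deadline := self value.
> >       (deadline notNil and: [ DateAndTime now > deadline ])
> >           ifTrue: [ SomethingExpired signal ]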
> >
> >
> >
> >
> >
> > Question:
> >
> > How do you handle this in your systems? Is this something we can
> consider for Pharo9?
> >
> >
> >
> > thanks
> >       holger
> >
> >
> >
> >
> >
> >
> >
> >
> > [1] It has the concept of a "replica set" and works by having a
> > primary, secondaries and arbiters running.
> > [2] For every write one can configure whether it should succeed
> > immediately (before it is even on disk) or only once it has been written
> > to multiple stores (e.g. a majority, US and EMEA).
> > [3] https://golang.org/pkg/context/
> >
> >
> >
>
>
>