[Pharo-users] Voluntarily cancelling requests ("applying an expiration date")
holger at freyther.de
Sun Feb 9 10:31:31 EST 2020
tl;dr: I am searching for a pattern (later code) to apply expiration to operations.
One nice aspect of MongoDB is that it has built-in data distribution and configurable retention. The upstream project has a document called "Server Discovery and Monitoring (SDAM)", defining how a client should behave. Martin Dias is currently implementing SDAM in MongoTalk/Voyage and I took it on a test drive.
My software stack is using Zinc, Zinc-REST, Voyage and Mongo. When a new REST request arrives I am using Voyage (e.g. >>#selectOne:) which will use MongoTalk. The MongoTalk code needs to select the right server. It's currently done by waiting for a result.
Next I started to simulate database outages. The REST clients retried when not receiving a result within two seconds (no back-off/jitter). What happened was roughly the following:
1.) ZnServer accepts a new connection
2.) MongoTalk waits for a server longer than 2s
"nothing.. the above waits..."
What happened next surprised me. I expected to have a bad time when the database recovers and all the stale (remember the REST clients already gave up and closed the socket) requests will be answered. Instead my image crashed early in my test as the ExternalSemaphoreTable was full.
Let's focus on the timeout behavior and discuss the existence of the ExternalSemaphoreTable and the number of entries separately at a different time.
The two main problems I see are:
1.) Lack of back-pressure for ZnManagingMultiThreadedServer
2.) A disconnect between how much time the application layer handling REST is allowed to take and, further down the stack, how long MongoTalk may sleep and wait for a server.
The first item is difficult. Even answering HTTP 500 when we are out of space in the ExternalSemaphoreTable is difficult... Let's ignore this for now as well.
What I look for:
1.) Voluntary timeout
Inside my application code I would like to tag an operation with a timeout. This means everything that is done should complete within X seconds. It can be used on a voluntary basis.
"We expect all database operations to complete within two seconds"
person := ComputeContext current withTimeout: 2 seconds during: [
    repository selectOne: Person where: [:each name | ...] ].

"See if the outer context timeout has expired and signal. E.g. before writing
something into the socket to keep consistency."
ComputeContext current checkExpired.

"Sleep for up to the remaining timeout"
(someSemaphore waitTimeoutContext: ComputeContext current) ifFalse: [ ... ].
Cancellation is more difficult to write in pseudo code (without TaskIt?). In my above case we are waiting for the database to be ready while the client has already closed the file descriptor. Right now we are not able to see this until much later.
The idea is that, in addition to the timeout, we can pass a block that is called when an operation should be cancelled, and that the ComputeContext can be checked to see whether something has been cancelled?
The above takes inspiration from Go's context package. In Go the context should be passed as a parameter, but we could make it a Process variable?
How do you handle this in your systems? Is this something we can consider for Pharo9?
Notes on MongoDB: it has the concept of a "replicationSet" and works by having a primary, secondaries and arbiters running.
For every write one can configure whether the write should succeed immediately (before it is even on disk) or only once it has been written to multiple stores (e.g. majority, US and EMEA).