-
-
Notifications
You must be signed in to change notification settings - Fork 418
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix Linux epoll event resubscribe performance and race condition. #1564
Conversation
Side note. We've been running with this for 4-6 weeks now a Sendence (I think its been that long), without issue. |
packages/builtin/asio_event.pony
Outdated
@pony_asio_event_get_writeable(ev) | ||
|
||
fun set_writeable(ev: AsioEventID, writeable': Bool) => | ||
@pony_asio_event_set_writeable(ev, writeable') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think we should have these wrapper functions - the net
package actors should be calling the FFI functions directly.
My objection is that these functions provide ways to read/write to a AsioEventID tag
, which does not respect capabilities and memory safety. That is, just because you've used them safely in this PR (and helpfully commenting above each usage how/why it is safe 🏅), doesn't mean that users will be as careful. And they shouldn't have to be careful - part of the point of Pony is being able to write code that respects the compiler rules and assume that it is concurrency safe without thinking about the issue any deeper than that.
If we make the unsafe operations be FFI-only (by removing these wrappers), than a programmer can only do an unsafe operation by doing an FFI call, which is exactly what we want. The fact that it's an FFI call should set off alarm bells for the programmer, and make them think about carefully proving to themselves the safety of what they're doing, just as you've done here (since they now know the compiler won't be proving the safety for them).
Removing the wrapper functions and simply calling the FFI functions directly from TCPConnection
et al is a simple way to avoid this concern while not fundamentally changing the nature of your solution. These actors are already using FFI extensively, so it shouldn't be a burden.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Makes sense. I'll update and force push to make the changes you've suggested.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Changes made and pushed.
#endif | ||
|
||
bool readable; /* is fd readable? */ | ||
bool writeable; /* is fd writeable? */ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just asking for prudence's sake - does adding these struct fields put us into a higher size_t
category that will make allocating events significantly more expensive? I only ask because we recently went through something similar with adding annotations to ast_t
(see ongoing discussion in #1531).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't believe so. The size of asio_event_t
is already over 32 bytes (but well below 64 bytes) before the new additions (assuming I did the math correctly).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for checking!
Also, just wanted to say thank you for the very thorough and detailed ticket description / commit message! ❤️ |
Prior to this change, if ONE_SHOT is enabled, linux epoll based asio wouldn't be subscribed for certain events after another event was fired on the same FD. The specific scenario was the following: * Register event E for READ/WRITE with ONE_SHOT. * E fires for READ and a message is placed on the message queue for the TCPConnection actor. * TCPConnection actor eventually gets to processing this message (after processing any other messages ahread of this one on the queue) and then goes through the work of `_event_notify`. At the end of `_event_notify`, `_resubscribe_event` is called and epoll is told to resubscribe for READ and/or WRITE events as required. Normally, this wouldn't be an issue but the behavior we're seen is that with ONE_SHOT, if a READ (or WRITE) event is fired, all events for the FD are disabled. This means that in the above scenario, when the READ event is fired, the WRITE is also disabled along with the READ because of ONE_SHOT. We could (and did) try adding a call to `_resubscribe_event` in `_event_notify` before any processing is done to resubscribe earlier but there were two problems with this. First, it still meant that the other event was unsubscribed for the duration of the time the asio event notification message was sitting on the message queue of the TCPConnection actor. Second, it resulted in a race condition and incorrect use of epoll where the TCPConnection actor would ask epoll to resubscribe to either READ or WRITE when the event had already fired and was waiting on the message queue. For the READ event case, this is incorrect use of epoll because the manual for epoll strongly indicates that when using ONE_SHOT and Edge Triggering all data should be read from the socket before it is resubcribed again. The final design we settled upon is to move the readable/writeable flags into the Event structure, split the resubscribe for READ and WRITE events, and have epoll automagically resubscribe to the untriggered event if ONE_SHOT is enabled. This ensures there are no race conditions because the decision to resubscribe is based on the the state of the readable/writeable flags on the event which is always up to date based on the latest events that fired. It also ensures that we resubscribe to the untriggered event as soon as possible because the epoll asio thread takes care of it. * The event now holds the readable/writeable flags. * This is only safe because the asio thread will only write to those flags if there's a subscription for an event. * In order to avoid two threads writing/reading to the new flags at the same time, the resubscribe for reads and writes has been split. * Additionally, the epoll code will now automagically resubscribe to the un-triggered event in order to minimize the delay for when an event is triggered for ONE_SHOT code so it behaves the same as for kqueue ONE_SHOT where if an FD is subscribed with ONE_SHOT for both `read` and `write` events and a `read` event fires and is disabled because of ONE_SHOT, the `write` event is still left enabled. This also greatly simplifies where/when resubscribes need to occur in the TCPConnection actor. * Update kqueue code to also split readable/writeable resubscribe.
e745da3
to
ed387a0
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good to me, but I won't merge just yet since the PR hasn't been open very long, and someone else might want to take a look before merging?
Makes sense. Thanks for your feedback on things and the suggested improvement. |
That description was so good, by the time I read the code, I knew exactly what it did and what I was looking for. Looks good to me, thanks @dipinhora ! |
Prior to this change, if ONE_SHOT is enabled, linux epoll based
asio wouldn't be subscribed for certain events after another
event was fired on the same FD.
The specific scenario was the following:
for the TCPConnection actor.
(after processing any other messages ahread of this one on the
queue) and then goes through the work of
_event_notify
. Atthe end of
_event_notify
,_resubscribe_event
is called andepoll is told to resubscribe for READ and/or WRITE events as
required.
Normally, this wouldn't be an issue but the behavior we're seen
is that with ONE_SHOT, if a READ (or WRITE) event is fired, all
events for the FD are disabled. This means that in the above
scenario, when the READ event is fired, the WRITE is also
disabled along with the READ because of ONE_SHOT.
We could (and did) try adding a call to
_resubscribe_event
in_event_notify
before any processing is done to resubscribeearlier but there were two problems with this. First, it still
meant that the other event was unsubscribed for the duration of
the time the asio event notification message was sitting on the
message queue of the TCPConnection actor. Second, it resulted in
a race condition and incorrect use of epoll where the TCPConnection
actor would ask epoll to resubscribe to either READ or WRITE when
the event had already fired and was waiting on the message queue.
For the READ event case, this is incorrect use of epoll because
the manual for epoll strongly indicates that when using ONE_SHOT
and Edge Triggering all data should be read from the socket before
it is resubcribed again.
The final design we settled upon is to move the readable/writeable
flags into the Event structure, split the resubscribe for READ
and WRITE events, and have epoll automagically resubscribe to the
untriggered event if ONE_SHOT is enabled. This ensures there are
no race conditions because the decision to resubscribe is based on
the the state of the readable/writeable flags on the event which is
always up to date based on the latest events that fired. It also
ensures that we resubscribe to the untriggered event as soon as
possible because the epoll asio thread takes care of it.
write to those flags if there's a subscription for an event.
at the same time, the resubscribe for reads and writes has been
split.
to the un-triggered event in order to minimize the delay for when
an event is triggered for ONE_SHOT code so it behaves the same
as for kqueue ONE_SHOT where if an FD is subscribed with ONE_SHOT
for both
read
andwrite
events and aread
event fires andis disabled because of ONE_SHOT, the
write
event is still leftenabled. This also greatly simplifies where/when resubscribes
need to occur in the TCPConnection actor.