Fix use-after-free triggered by fast actor reaping when the cycle detector is active #4616
The logic in the actor run and the cycle detector work together to enable fast reaping of actors with rc == 0. This relies on atomics and on protecting the relevant areas of logic with critical sections. Unfortunately, this logic suffered from a use-after-free bug, identified in ponylang#4614, which could sometimes lead to memory corruption. The bug was caused by a race between two events: the cycle detector receiving the block message and destroying the actor, and the actor's cycle detector critical flag being released.
This commit removes the need to protect the logic with critical sections. It achieves this by ensuring that an actor with rc == 0 that the cycle detector knows about will never be rescheduled again, even if the cycle detector happens to send it a message. The cycle detector is then free to reap the actor when it receives the block message. Before destroying the actor, the cycle detector ensures that the actor's message queue is either empty or contains only the expected messages from the cycle detector, so the destroy is safe.
resolves ponylang#4614
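As a rough illustration of the invariant described above, here is a minimal single-threaded sketch in C. It is not the runtime's actual code: the type names, the `pending_destroy` flag, the message kinds, and every function name are hypothetical, and the real implementation relies on atomics that this model leaves out. It only shows the two rules: an actor that has blocked with rc == 0 is never rescheduled, and the cycle detector reaps it only when no application messages are queued.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical message kinds; the real runtime defines its own ids. */
typedef enum { MSG_APP, MSG_CD_ISBLOCKED, MSG_CD_CONF } msg_kind_t;

typedef struct msg_t {
  msg_kind_t kind;
  struct msg_t* next;
} msg_t;

/* Hypothetical, heavily simplified actor. */
typedef struct {
  uint32_t rc;          /* reference count */
  bool pending_destroy; /* set when the actor blocks with rc == 0 */
  bool destroyed;       /* model-only marker, stands in for freeing */
  msg_t* queue_head;    /* pending messages */
} actor_t;

/* Actor side: on blocking with rc == 0, mark the actor so that it is
 * handed over to the cycle detector for reaping. */
static bool mark_blocked(actor_t* a) {
  if (a->rc == 0) {
    a->pending_destroy = true;
    return true;
  }
  return false;
}

/* Scheduler side: a marked actor is never rescheduled, even if the
 * cycle detector has sent it a message in the meantime. */
static bool should_reschedule(const actor_t* a, bool has_messages) {
  if (a->pending_destroy)
    return false;
  return has_messages;
}

/* Cycle detector side: reaping is safe only if the queue is empty or
 * holds nothing but messages that came from the cycle detector. */
static bool safe_to_reap(const actor_t* a) {
  if (!a->pending_destroy || a->rc != 0)
    return false;
  for (const msg_t* m = a->queue_head; m != NULL; m = m->next) {
    if (m->kind == MSG_APP)
      return false;
  }
  return true;
}

/* Cycle detector side: destroy the actor once the check has passed. */
static void reap(actor_t* a) {
  assert(safe_to_reap(a));
  a->queue_head = NULL;
  a->destroyed = true;
}
```

Because the marked actor can never run again, the cycle detector is the only party that touches it from that point on, which is why no critical section is needed around the destroy in this model.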
@redvers it would be ideal if you're able to confirm that this PR resolves your segfault from #4614. @SeanTAllen it would be ideal if you're able to review the logic changes to confirm I didn't screw anything up, as I believe you're the last person to touch this code and likely know best the nuances/edge cases that need to be handled.
Hi @dipinhora, The changelog - fixed label was added to this pull request; all PRs with a changelog label need to have release notes included as part of the PR. If you haven't added release notes already, please do. Release notes are added by creating a uniquely named file in the The basic format of the release notes (using markdown) should be:
Thanks.
@dipinhora I will review, and I appreciate your believing that I am still familiar with this code that tortured me a few years ago. I especially appreciate it since I believe you found a bug in code I wrote (at least I remember it as "I wrote").
This hurts my brain. I think it is good. But I'd love to have @redvers throw lots of requests at it. Writing the code to be removed also hurt my brain. I remember Dipin and I spending A TON OF TIME on it. A metric shit ton of time.
I definitely have the easiest job here. ;)
Yup, 15.5 million requests later, no SEGV.