-
-
Notifications
You must be signed in to change notification settings - Fork 418
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix invariant violation in Pony runtime hashes #1886
Merged
Merged
Conversation
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
832d599
to
27459c0
Compare
I did a manual CHANGELOG entry to better capture this. |
plietar
reviewed
May 4, 2017
src/libponyrt/ds/hash.c
Outdated
// Reconstute our invariants | ||
// | ||
// During the process of removing "dead" items from our hash, it is | ||
// possible to violate the invariants of our map. We will no proceed to |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We will no_w_ proceed to
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
pushing a fix.
If you were being facetious, you could describe the Pony runtime as a series of hashmaps that are held together by some code. Hash performance and correctness can have a great impact on everything else in the runtime because they are at the basis of most everything else in the runtime. This change fixes a number of issues that appears to be garbage collection bugs but were in fact, problems with invariant violation in the underlying hash implementation. It should be noted that while the rest of this comment discuss invariant violations that exist in our Robin Hood hash implementation, some of the bugs that this closes predate the Robin Hood implementation. This leads me to believe that the previous implementation had some subtle problem that could occur under some rare interleaving of operations. How this occurred is unknown at this time and probably always will be unless someone wants to go back to the previous version and use what we learned here to diagnose the state of the code at that time. This patch closes issues #1781, #1872, and #1483. It's the result of teamwork amongst myself, Sylvan Clebch and Dipin Hora. History should show that we were all involved in this resolution. The skinny: When garbage collecting items from our hash, that is, removing deleted items to free up space, we can end up violating hash invariants. Previously, one of these invariants was correctly fixed, however, it incorrectly assumed another invariant held but that is not the case. Post garbage collection, if any items have been deleted from our hash, we do an "optimize" operation on each hash item. We check to see if the location the item would hash to is now an empty bucket. If it is, we move the item to that location thereby restoring the expected chaining. There is, however, a problem with doing this. It's possible over time to violate another invariant when fixing the first violation. For a given item at a given location in the hash, each item has a probe value. An invariant of our data structure is that items at earlier locations in the hash will always have an equal or lower probe value for that location than items that come later. For example, items: "foo" and "bar". Given a hashmap whose size is 8, where "foo" would made to index 1 and "bar" would map to index "2". When looking at the probe values for "foo" and "bar" at index 1, "foo" would have a probe value of "0" as it is at the location it hashes to whereas "bar" would have a probe value of "7". The value is the number of indexes away from our "natural" hash index that the item is. When search the hash, we can use this probe value to not do a linear search of all indexes for the a given key. Once we find an item whose probe value for a given index is higher than ours, we know that the key can't be in the map past that index. Except our course for when we are restoring invariants after a delete. It's possible, due to the sequential nature of our "optimize" repair step, to violate this "always lower probe value" invariant. The previous implementation of "optimize_item" assumed that in invariant held true. By not detecting the invariant violation and fixing it, we could end up with maps where a key existed in it but it wouldn't be found. When the map in question was an object map used to hold gc'able items, this would result in an error that appears to be a gc error. See #1781, #1872, and #1483. It should be noted, that because of the complex chain of events that needs to occur to trigger this problem that we were unable to devise a unit test to catch this problem. If we had property based testing for the Pony runtime, this most likely would have been caught. Hopefully, PR #1840 to add rapidcheck into Pony happens soon. Closes #1781 Closes #1872 Closes #1483
jemc
added a commit
that referenced
this pull request
Jul 2, 2017
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
If you were being facetious, you could describe the Pony runtime as
a series of hashmaps that are held together by some code. Hash
performance and correctness can have a great impact on everything
else in the runtime because they are at the basis of most everything
else in the runtime.
This change fixes a number of issues that appears to be garbage
collection bugs but were in fact, problems with invariant violation
in the underlying hash implementation. It should be noted that while
the rest of this comment discuss invariant violations that exist in
our Robin Hood hash implementation, some of the bugs that this closes
predate the Robin Hood implementation. This leads me to believe that
the previous implementation had some subtle problem that could occur
under some rare interleaving of operations. How this occurred is
unknown at this time and probably always will be unless someone wants
to go back to the previous version and use what we learned here to
diagnose the state of the code at that time.
This patch closes issues #1781, #1872, and #1483. It's the result of
teamwork amongst myself, Sylvan Clebch and Dipin Hora. History should
show that we were all involved in this resolution.
The skinny:
When garbage collecting items from our hash, that is, removing
deleted items to free up space, we can end up violating hash
invariants. Previously, one of these invariants was correctly fixed,
however, it incorrectly assumed another invariant held but that is
not the case.
Post garbage collection, if any items have been deleted from our
hash, we do an "optimize" operation on each hash item. We check to
see if the location the item would hash to is now an empty bucket.
If it is, we move the item to that location thereby restoring the
expected chaining. There is, however, a problem with doing this.
It's possible over time to violate another invariant when fixing the
first violation.
For a given item at a given location in the hash, each item has a
probe value. An invariant of our data structure is that items at
earlier locations in the hash will always have an equal or lower
probe value for that location than items that come later.
For example, items: "foo" and "bar". Given a hashmap whose size is
8, where "foo" would made to index 1 and "bar" would map to index
"2". When looking at the probe values for "foo" and "bar" at index
1, "foo" would have a probe value of "0" as it is at the location
it hashes to whereas "bar" would have a probe value of "7". The
value is the number of indexes away from our "natural" hash index
that the item is.
When search the hash, we can use this probe value to not do a linear
search of all indexes for the a given key. Once we find an item
whose probe value for a given index is higher than ours, we know
that the key can't be in the map past that index.
Except our course for when we are restoring invariants after a
delete. It's possible, due to the sequential nature of our
"optimize" repair step, to violate this "always lower probe
value" invariant.
The previous implementation of "optimize_item" assumed that in
invariant held true. By not detecting the invariant violation and
fixing it, we could end up with maps where a key existed in it but
it wouldn't be found. When the map in question was an object map
used to hold gc'able items, this would result in an error that
appears to be a gc error. See #1781, #1872, and #1483.
Closes #1781
Closes #1872
Closes #1483