-
Notifications
You must be signed in to change notification settings - Fork 48
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
bugfix: Reuse the same file pointer for repeated find
output
#511
base: main
Are you sure you want to change the base?
Conversation
find
output (#439)find
output
Hi, I see some tests failed, could you please check them? Thanks. |
Codecov ReportAttention: Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #511 +/- ##
==========================================
+ Coverage 87.46% 87.51% +0.05%
==========================================
Files 31 31
Lines 6692 6720 +28
Branches 299 300 +1
==========================================
+ Hits 5853 5881 +28
+ Misses 627 626 -1
- Partials 212 213 +1 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
I executed the test suite on my windows machine and it failed, and I don't know why certainly. |
src/find/matchers/mod.rs
Outdated
// Used on file output arguments. | ||
// When the same file is specified multiple times, the same pointer is used. | ||
struct FileMemoizer { | ||
mem: HashMap<String, Arc<File>>, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Using a string as the hash key won't work. It needs to handle different spellings of the same path, as well as hard links:
$ touch file
$ ln file link
$ find -fprint file -fprint ./file -fprint link
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
True, newbie mistake of my part, will fix it.
src/find/matchers/ls.rs
Outdated
@@ -256,7 +257,7 @@ impl Ls { | |||
impl Matcher for Ls { | |||
fn matches(&self, file_info: &WalkEntry, matcher_io: &mut MatcherIO) -> bool { | |||
if let Some(file) = &self.output_file { | |||
self.print(file_info, file, true); | |||
self.print(file_info, file.clone(), true); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
self.print(file_info, file.clone(), true); | |
self.print(file_info, file.as_ref(), true); |
Same everywhere else, there's usually no need to clone the Arc
…Arc<File> clones. test:Add new tests to symbolic and hardlinks.
…ready imported unix symlink module.
// Based on path inode or file_index check if the file already has been specified. | ||
// If yes, use the same file pointer. | ||
struct FileMemoizer { | ||
mem: HashMap<u64, Arc<File>>, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A single u64
is not enough to uniquely identify a file. On UNIX-like systems, you need the pair (st_dev, st_ino)
(AKA device ID + inode number) at least for a unique ID. Not sure what's necessary on Windows. But using just the inode number is wrong since inode numbers are only unique within a single file system.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure, I read about it sometime ago but forgot, thanks for the precise explanation.
I will use the std::fs::metadata
to take the st_dev and st_ino from the same source, but using std::os::unix::fs::MetadataExt
and std::os::windows::fs::MetadataExt
.
Will do a manual test to check the assurance of this implementation, but I don't think that has a way to do an automated test of it, what do you think?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe use uucore's FileInformation? Check samefile.rs for an example. (Hopefully it impls Hash)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh, that makes sense, FileInformation impls PartialEq and Hash, it's more simple than what I was trying to do. Will implement, test, and send a new commit (again).
Fixes #439
This PR creates a kind of memoization to the file creation inside the findutils/find command.
Uses a hashmap to know which file paths already has been created and returns the Arc to prevent overwrite.
It also add a new testing function
find_using_same_out_multiple_times
to these more "complex" cases.But, the expected outputs are fixed, if the assert would be better by catching the expected value dynamically in case of the test_data/simple structure change, let me know!