I'm working with a large dataset spread across multiple files of PacBio CLR reads, and at the moment filtlong is running on a single combined file containing all the data (267 Gb). It would likely be faster if there were an option to build and save the k-mer hash of the Illumina reads used for QC, so that the same hash could be reused by multiple independent processes, each running on an individual read file. If fast read access to the hash is important, it could be copied to local scratch space on each node, so that every process has its own copy.
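As a rough sketch of the idea (this is not filtlong's actual implementation; the k-mer size, plain-string k-mer set, and pickle serialization are all assumptions for illustration), the hash could be built once from the Illumina reads and serialized, so each per-file process just reloads it instead of rebuilding:

```python
import pickle

def kmers(seq, k=16):
    """Yield all k-mers of a sequence (k=16 is an arbitrary choice here)."""
    for i in range(len(seq) - k + 1):
        yield seq[i:i + k]

def build_kmer_set(fasta_lines, k=16):
    """Collect the set of k-mers from Illumina read sequences in FASTA lines."""
    table = set()
    for line in fasta_lines:
        line = line.strip()
        if line and not line.startswith(">"):
            table.update(kmers(line.upper(), k))
    return table

# Build the k-mer set once from the Illumina QC reads...
reads = [">r1", "ACGTACGTACGTACGTACGT"]
table = build_kmer_set(reads, k=16)

# ...then serialize it so independent processes (one per PacBio file,
# e.g. reading a copy from node-local scratch) can reload it cheaply.
blob = pickle.dumps(table)
restored = pickle.loads(blob)
assert restored == table
```

Deserializing a saved set should be far cheaper than re-streaming the Illumina reads in every process, which is the main win the feature request is after.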