Does porechop_ABI work for Oxford Nanopore DirectRNA sequencing data? (e.g., SQK-RNA002 & SQK-RNA004) #19

Closed
bernardo-heberle opened this issue Dec 11, 2023 · 3 comments
Labels
question Further information is requested

Comments


bernardo-heberle commented Dec 11, 2023

Does porechop_ABI work for Oxford Nanopore DirectRNA sequencing data? (e.g., SQK-RNA002 & SQK-RNA004)

This thread suggests that the original porechop does not work on directRNA, but I am not sure if this extension does.

@qbonenfant
Collaborator

Hi,
Thanks for your interest in Porechop_ABI, and sorry about the delayed response (I have been quite busy).

Short answer: It will find something, but I cannot guarantee it will be the adapter you are looking for. You will most likely have to check manually whether the result is relevant.

Long answer:

Porechop_ABI is a frequency-based tool. Its reliability depends heavily on the frequency difference between the DNA/RNA sequences and the adapter sequences:

  • If you sequence highly expressed RNA (structural proteins, 16S RNA, ...), it will likely fail as a trimming tool (as explained in other issues on this repo).

  • If the inconsistent basecalling of direct RNA adapters (discussed in the thread you sent) results in extremely high variation of the adapter sequence across the dataset, it will likely fail too.

  • If the adapter sequence is only altered from its expected value, but stays consistent, then it should not be a problem at all, even on RNA sequences.
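
To make the frequency argument above concrete, here is a toy sketch in Python. It is not Porechop_ABI's actual algorithm; the function name, the made-up adapter, and the inserts are all assumptions for illustration. It only shows why k-mers from a shared adapter stand out when the inserts differ, and why that signal disappears when every read carries the same highly expressed sequence.

```python
from collections import Counter

def kmer_read_fraction(reads, k=8, window=40):
    """Fraction of reads whose first `window` bases contain each k-mer."""
    counts = Counter()
    for read in reads:
        head = read[:window]
        # Count each k-mer once per read so repeats inside one read do not dominate.
        counts.update({head[i:i + k] for i in range(len(head) - k + 1)})
    n = len(reads)
    return {kmer: c / n for kmer, c in counts.items()}

# Toy data (hypothetical): one made-up adapter followed by different inserts.
adapter = "ACGTTGCAAGGT"
inserts = ["TTTTCCCCGGGGAAAA", "GATTACAGATTACAGA",
           "CCGGAATTCCGGAATT", "AGAGAGTCTCTCAGAG"]
reads = [adapter + ins for ins in inserts]

# Diverse inserts: only k-mers that lie fully inside the adapter are shared by all reads.
freqs = kmer_read_fraction(reads, k=8, window=20)
top = max(freqs.values())
frequent = sorted(kmer for kmer, f in freqs.items() if f == top)

# Same insert in every read (a stand-in for a highly expressed transcript):
# insert k-mers are now as frequent as adapter k-mers, so frequency alone
# can no longer tell where the adapter ends.
same_insert_reads = [adapter + inserts[0]] * 4
freqs_same = kmer_read_fraction(same_insert_reads, k=8, window=20)
```

In the first case the most frequent k-mers all come from the adapter; in the second, every k-mer in the window is equally frequent, which mirrors the first failure mode in the list.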

We did not study in depth the discrepancies between inferred adapters and theoretical adapters for direct RNA, so I can't give you a definite answer on this matter.

We did try some direct RNA datasets during the preliminary assessment of our method and got mixed results.
In our tests, Porechop_ABI definitely found a strong adapter, but it did not align very well to the reference we had.
It was a long time ago and I do not recall the exact details, nor do I currently have access to the raw data to check the exact results (I may have to update this answer later). Also, our method was a lot less refined during those tests, so I may need to rerun them with the latest version.
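
For anyone wanting to quantify how well an inferred adapter matches a reference, a minimal sketch is a plain edit-distance comparison. This is only an illustration with hypothetical helper names and placeholder sequences; it is not how the comparison was done in the tests mentioned above, and a proper aligner would handle partial overlaps better.

```python
def edit_distance(a, b):
    """Levenshtein distance between two sequences (row-by-row dynamic programming)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion from a
                            curr[j - 1] + 1,      # insertion into a
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1]

def identity(a, b):
    """Rough similarity in [0, 1]: 1 minus edit distance over the longer length."""
    longest = max(len(a), len(b))
    return 1.0 if longest == 0 else 1.0 - edit_distance(a, b) / longest

# Placeholder sequences (hypothetical, not real adapters).
inferred = "ACGTACGT"
reference = "ACGTTCGT"
score = identity(inferred, reference)
```

A low score does not by itself mean the inferred adapter is wrong: as noted in the list above, a sequence that is altered but consistent can still be trimmed reliably.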

I hope this helps. If you have any questions, I will keep an eye on this issue.

@bernardo-heberle
Author

Thank you for your very thorough and quick response @qbonenfant. This was not a delayed response at all; it only took you 3 hours to respond.

I appreciate you walking me through the details and nuances. I will go ahead and give it a try on the direct RNA datasets I have and see how it goes. Will update the issue once I am done running tests.

Author

bernardo-heberle commented Dec 14, 2023

I tested porechop_ABI on four different directRNA sequencing samples (SQK-RNA002 kit), and it reported the exact same adapter sequence for all four samples independently. I think that is a pretty strong sign that it works fairly well for directRNA data. The downstream alignment is very similar regardless of whether the adapters were trimmed with porechop_ABI or not.

Here are the adapter sequences it found. They are not the reference adapter sequences provided by ONT, but that is to be expected when basecalling DNA with an RNA model.

Start
Consensus_1_start_(100.0%)
TGTGTTTGTTAGTCGCTATGGCTGTCTCTAAAGTTTACGCTAGATCCGTCTACGACTCCCGTGGTAACCCAACCGTCGAAGTCGAATTAACCAC

End
Consensus_1_end_(100.0%)
TTCCACCACGGTGACAAGTTGTAACATCGTCGTGAGTAGTGAACCGTAAGCAAAATAATCCC
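
One possible way to sanity-check a consensus like the one above is to count how many reads carry an exact seed from it near their start. The sketch below is an assumption-laden illustration, not part of porechop_ABI: the function names, seed length, window size, and toy reads are all hypothetical, and only the start consensus string is taken from the output above.

```python
# Start consensus reported above for the SQK-RNA002 samples.
START_ADAPTER = ("TGTGTTTGTTAGTCGCTATGGCTGTCTCTAAAGTTTACGCTAGATCCGTCTACGACTCCC"
                 "GTGGTAACCCAACCGTCGAAGTCGAATTAACCAC")

def seeds(adapter, k=12):
    """All exact k-mers of the adapter, used as cheap match seeds."""
    return {adapter[i:i + k] for i in range(len(adapter) - k + 1)}

def fraction_with_adapter(reads, adapter, k=12, window=150):
    """Fraction of reads with at least one adapter seed in their first `window` bases."""
    seed_set = seeds(adapter, k)
    hits = 0
    for read in reads:
        head = read[:window]
        if any(head[i:i + k] in seed_set for i in range(len(head) - k + 1)):
            hits += 1
    return hits / len(reads) if reads else 0.0

# Toy reads (hypothetical): exact adapter, adapter with one substitution, no adapter.
toy_reads = [
    START_ADAPTER + "GATTACAGATTACAGATTACA",
    START_ADAPTER[:40] + "A" + START_ADAPTER[41:] + "CCGGAATTCCGGAATT",
    "GATTACA" * 10,
]
frac = fraction_with_adapter(toy_reads, START_ADAPTER)
```

Exact seeds tolerate a few scattered basecalling errors but not heavy variation, so a low fraction on real data would call for an alignment-based check rather than being proof that the consensus is wrong.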
