Lengths ASVs V3-V4 region #2096

Open
andrebolerbarros opened this issue Mar 18, 2025 · 0 comments

Hi everyone,

I am currently working with NovaSeq data, using the modified loess error-fitting function that enforces monotonicity, and I trimmed the reads as discussed in the tutorial (trimLeft for the primers, with truncLen kept at the maximum sequence length).

I ran DADA2, removed chimeras, kept only ASVs classified as "Bacteria", and also removed ASVs that were unclassified at the Phylum level. The final ASV length distribution is the following:

352 353 410 419 420 421 422 423 424 426 427 429 437 438 439 440 441 442 444 445 446
  1   1   1   8 163  74  66  89  12   4   3   1  13  19  91 159   6   1   9  73   3
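
To see how the counts in the table group together, here is a minimal Python sketch (not part of the DADA2 workflow). It copies the table into a dictionary and splits the lengths into clusters wherever consecutive lengths are more than 5 nt apart. The 5 nt threshold (`MAX_GAP_NT`) and the helper name `cluster_lengths` are assumptions chosen for illustration only.

```python
# ASV length table copied from the post: length (nt) -> number of ASVs
length_counts = {
    352: 1, 353: 1, 410: 1, 419: 8, 420: 163, 421: 74, 422: 66, 423: 89,
    424: 12, 426: 4, 427: 3, 429: 1, 437: 13, 438: 19, 439: 91, 440: 159,
    441: 6, 442: 1, 444: 9, 445: 73, 446: 3,
}

MAX_GAP_NT = 5  # hypothetical threshold: a larger gap starts a new cluster


def cluster_lengths(length_counts, max_gap=MAX_GAP_NT):
    """Group sorted lengths into runs separated by gaps larger than max_gap."""
    clusters = []
    for length in sorted(length_counts):
        if clusters and length - clusters[-1][-1] <= max_gap:
            clusters[-1].append(length)
        else:
            clusters.append([length])
    return clusters


clusters = cluster_lengths(length_counts)
total_asvs = sum(length_counts.values())

# One tuple per cluster: (shortest, longest, number of ASVs, most common length)
summary = []
for cluster in clusters:
    n_asvs = sum(length_counts[length] for length in cluster)
    mode = max(cluster, key=lambda length: length_counts[length])
    summary.append((cluster[0], cluster[-1], n_asvs, mode))
    print(f"{cluster[0]}-{cluster[-1]} nt: {n_asvs} ASVs "
          f"({100 * n_asvs / total_asvs:.1f}%), most common length {mode} nt")
```

With this threshold, almost all ASVs fall into two clusters, one peaking at 420 nt and one at 440 nt. A different threshold would draw the cluster boundaries differently.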

According to this previous issue (#896 (comment)) and references such as https://pmc.ncbi.nlm.nih.gov/articles/PMC5785224/, I think I can safely conclude that length variation is expected, with a bimodal distribution.

Doing some "crude" math, we get:

161nt (V3) + 291nt (V4) = 452nt
452nt - (17 primer fwd + 20 primer rev) = 415nt

186nt (V3) + 291nt (V4) = 477nt
477nt - (17 primer fwd + 20 primer rev) = 440nt

So, the two main peaks I get (420 nt and 440 nt) are within ~5 nt of the "theoretical" values (415 nt and 440 nt). If my math is correct, then my results match the expectation, correct? I am still puzzled by the number of intermediate values...
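
The "crude" math above can be written as a short Python sketch. The region and primer lengths are the figures quoted in the post, not independently verified values, and the observed peaks (420 nt and 440 nt) are the most frequent lengths in the table above. The function name `expected_asv_length` is hypothetical.

```python
# Figures quoted in the post (assumed here, not independently verified)
V3_NT_OPTIONS = (161, 186)   # the two V3 lengths considered
V4_NT = 291                  # V4 length
FWD_PRIMER_NT = 17           # forward primer length
REV_PRIMER_NT = 20           # reverse primer length


def expected_asv_length(v3_nt, v4_nt=V4_NT,
                        fwd_nt=FWD_PRIMER_NT, rev_nt=REV_PRIMER_NT):
    """Expected ASV length after removing both primers from the V3-V4 amplicon."""
    return v3_nt + v4_nt - fwd_nt - rev_nt


expected = [expected_asv_length(v3) for v3 in V3_NT_OPTIONS]

# Most frequent observed lengths in the two main groups of the table
observed_peaks = [420, 440]

# Difference between each observed peak and its "theoretical" value
offsets = [obs - exp for obs, exp in zip(observed_peaks, expected)]

for exp, obs, off in zip(expected, observed_peaks, offsets):
    print(f"expected {exp} nt, observed peak {obs} nt, difference {off:+d} nt")
```

Under these assumptions the longer peak matches the expected 440 nt exactly, and the shorter peak sits 5 nt above the expected 415 nt.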
