bcftools missing lower frequency variants? #1523
Comments
What version of bcftools are you using? I just tested with the latest github version and see it detects the deletion, even with default parameters:
Similarly with the two SNVs:
I'm using the latest github version:
If I just use Command:
Truncated output:
I would hope that each of these variants would make it into the final vcf file. However, the first and third one are lost after
Output:
If I use the same command but omit
The two variants that seem real are being removed during
The problem is in the
Thanks for pointing that out. I’m working with viral data (SARS-CoV-2), and we expect there to potentially be intra-host diversity. The hope is to capture the intra-host diversity in a VCF file, if it exists. Is this not going to be possible when specifying the ploidy?
By setting ploidy to 1, the I will close the issue now as it is not a bug.
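To illustrate why a ploidy setting can matter for a variant seen in roughly 19% of reads, here is a minimal sketch of a toy genotype model. It is not bcftools' actual calling model: the function names, the 1% error rate, and the 81/19 read counts are assumptions chosen for illustration only. Under a haploid model the only genotypes are "all reference" or "all alternate", so a minority allele is best explained as reference plus noise; a model that allows a mixed genotype can accommodate it.

```python
import math


def log_likelihood(n_ref, n_alt, alt_fraction):
    """Binomial log-likelihood of the read counts for an expected alt fraction.

    The binomial coefficient is omitted because it is the same for every
    genotype and does not change which genotype scores highest.
    """
    return n_alt * math.log(alt_fraction) + n_ref * math.log(1.0 - alt_fraction)


def best_genotype(n_ref, n_alt, ploidy, error_rate=0.01):
    """Return the alt-allele copy number (0..ploidy) with the highest likelihood.

    Toy model (assumption): reads are drawn independently and each read
    reports the wrong allele with probability `error_rate`.
    """
    scores = {}
    for alt_copies in range(ploidy + 1):
        expected = alt_copies / ploidy
        # Fold in sequencing error so the fraction is never exactly 0 or 1.
        alt_fraction = expected * (1.0 - error_rate) + (1.0 - expected) * error_rate
        scores[alt_copies] = log_likelihood(n_ref, n_alt, alt_fraction)
    return max(scores, key=scores.get)


# Hypothetical counts: 19 of 100 reads support the alternate allele.
n_ref, n_alt = 81, 19
print(best_genotype(n_ref, n_alt, ploidy=1))  # 0: reference is the best haploid genotype
print(best_genotype(n_ref, n_alt, ploidy=2))  # 1: a mixed (het) genotype fits better
```

In this toy setting the haploid model picks the reference genotype for a 19% allele, which is consistent with the site not being reported as a variant, while the diploid model picks a mixed genotype.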
Hi,
Instead of providing the files in #1459, I thought I'd create a new issue since I found a related problem. In #1459, I described that an indel with relatively high coverage was not being recovered by bcftools mpileup | bcftools call. The deletion occurs in ~26% of reads:
The deletion is picked up by a combination of samtools mpileup | ivar, but is not picked up by bcftools mpileup | bcftools call.
Commands:
I've attached an example BAM file (downsampled from the original) in which ivar picks up the 17-base deletion occurring at a variant allele frequency of ~0.19 (19%), but bcftools does not call the variant.
Similarly, I've found that well-supported variants occurring at frequencies of ~0.30 and lower in other samples are missed by bcftools. I've attached an example BAM file where samtools + ivar picks up the following low-frequency variants (among other higher-frequency variants) that are missed by bcftools:
Visualisation of the BAM file in igv shows that the variants appear 'real', e.g.:
The commands are the same as above. Any idea what might be going wrong, or how I should change the command(s)? I'd like to switch to bcftools for everything, but I also need to guarantee that I won't be missing lower-frequency variants. The reference genome is: https://www.ncbi.nlm.nih.gov/nuccore/1798174254
Thanks!
The two necessary BAM files are in:
example_data.zip