Using COUNT with bcftools filter across all samples #2363
Comments
I think that should be possible using the reverse logic of
This issue was closed so I'm not sure if my comment will be seen, but your suggestion doesn't work for me. Using the syntax you gave, it only filters against the first sample (sample 0). I'm looking for a filter statement that will match the condition against ALL samples. Consider the following, in which I run your suggested filter, query for REF, ALT, and AD values, then grep for cases where COUNT(AD) is greater than 2:
Which results in this (shown as a table to make it easier to read):
You can see that for each of these rows, there's at least one sample that has more than two AD values (but it's never sample1). This makes sense, because the syntax of the filter expression you suggested only checks the first sample. My ultimate goal is to retain sites where ALL samples have the same number of AD values as the total number of REF and ALT alleles, something like this (which doesn't work as written, but seems like it should given the expression syntax documentation):
It's easy enough for me to generate something like this, which checks each sample individually:
But I was hoping there was a quicker and more elegant way to do it.
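For anyone landing here with the same goal, the per-sample check can also be sketched outside of bcftools in plain Python. This is only a rough illustration, not a bcftools feature: the function names `ad_counts_match_alleles` and `filter_vcf` are made up for this example, it assumes uncompressed, tab-delimited VCF text lines, and it treats a missing AD value as a mismatch.

```python
def ad_counts_match_alleles(vcf_line: str) -> bool:
    """Return True if every sample has one AD value per REF/ALT allele.

    Hypothetical helper for illustration; expects one tab-delimited VCF record.
    """
    fields = vcf_line.rstrip("\n").split("\t")
    alt = fields[4]
    # One REF allele plus however many ALT alleles are listed.
    n_alleles = 1 + (0 if alt == "." else len(alt.split(",")))

    format_keys = fields[8].split(":")
    if "AD" not in format_keys:
        return False
    ad_index = format_keys.index("AD")

    for sample in fields[9:]:
        values = sample.split(":")
        if ad_index >= len(values):
            return False  # trailing FORMAT fields were dropped for this sample
        ad = values[ad_index]
        # Assumption: a missing AD (".") counts as a mismatch.
        if ad == "." or len(ad.split(",")) != n_alleles:
            return False
    return True


def filter_vcf(lines):
    """Yield header lines unchanged and only the records that pass the check."""
    for line in lines:
        if line.startswith("#") or ad_counts_match_alleles(line):
            yield line
```

Passing the lines of an uncompressed VCF through `filter_vcf` keeps the header and only those sites where all samples pass, without having to write one condition per sample.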
Is there a way to use bcftools filter with COUNT to return only positions with a certain number of values for a given FORMAT entry across all samples? For example, I'd like to filter out positions where the number of all AD values is exactly two. I can do this for one sample, like this:
Is there a way to do this for all samples? Or do I have to enter each position manually? I've tried various permutations of the indexing syntax but it doesn't quite seem to do what I need.
thanks!