-
Notifications
You must be signed in to change notification settings - Fork 51
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Some support for cis/trans arithmetic #322
Conversation
@@ -48,6 +49,7 @@ def sample( | |||
out_clr_path, | |||
count=None, | |||
frac=None, | |||
cis_target=False, | |||
exact=False, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
consider adding ignore_diags
here ? or ultimately dist_min
and dist_max
?
if it does not end up oin this PR - leave an issue, maybe ? for the future
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would ignore diags only apply to cis_count? Or any count?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd say yes ... Also just make sure what ignore_diags
means exactly - ignore diags to set the count target, of course - but what should happen with the ignored_diags in the subsamples matrix ? Should they be zero ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I dunno - maybe leave ignore_diags
as an issue to discuss a bit more - but proceed with this one as is ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes meaning ... which of the options?
I'd say the calculation of the subsampling fraction should not take ignore_diags
diags into account, but subsampling would be applied to all data including those diags
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK! Merge then?
All done except ignore_diags, also added tests for coverage API, and fixed a bug in exact sampling (which was broken for me due to wrong dtype). |
Apart from the idea about ignore_diags this is ready IMO |
tests/test_sample.py
Outdated
# Exact sampling is very slow! So commented out | ||
# cooltools.api.sample.sample( | ||
# clr, | ||
# op.join(request.fspath.dirname, "data/CN.mm9.1000kb.test_sampled.cool"), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can we use a smaller input sample to uncomment this ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is our normal test cooler at 1Mb resolution... So quite small! Idk we could add a 10Mb resolution one?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Even with 10Mb resolution it takes a while... This whole file takes 95 sec to run through pytest on my laptop, which I guess is acceptable?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
let's try this way - w'd need to optimize out tests at some point anyways
Coverage saves total cis contacts into info (when
store=True
).Sample can have total cis contacts as target: tries to use the stored value and calculates it if it's not there.