Small changes for FlashLFQ writer #131
Fixed retention time division by 60. FlashLFQ requires the retention time in minutes, and it is already in minutes.
Hmm, I am not 100% sure it is a valid assumption that the RT from sage will always be in seconds. Would you mind sharing the details of how the mzML was generated?
According to the sage documentation, retention time is given in minutes. The mzML was generated with ThermoRawFileParser 1.4.3 in command-line mode with default parameters.
Cool! For future reference: https://github.com/lazear/sage/blob/73eaf49a4e53179b9bd5329bc1f085835f5f3987/crates/sage-cloudpath/src/mzml.rs#L250-L258
Sage does seem to check the unit when it parses the mzML and converts the retention time to minutes. So I guess that if we specify in the mokapot docs that the rt column is meant to be in minutes, we could go ahead and do that (not sure if other tools that write .pin files will default to minutes ...).
https://github.com/lazear/sage/blob/73eaf49a4e53179b9bd5329bc1f085835f5f3987/crates/sage-cloudpath/src/tdf.rs#L35 + MannLabs/timsrust#34 also makes sure that .d files are in seconds in sage.
@wfondrie do you recall why that
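To illustrate the kind of parse-time unit check discussed in this comment, here is a minimal Python sketch. It is not sage's code (sage is written in Rust), and the function name `rt_to_minutes` is hypothetical. It assumes the retention time unit is available as a Unit Ontology accession, as in the `unitAccession` attribute that mzML attaches to the scan start time.

```python
# Unit Ontology accessions used by mzML for time units.
SECOND = "UO:0000010"
MINUTE = "UO:0000031"


def rt_to_minutes(value: float, unit_accession: str) -> float:
    """Normalize a retention time to minutes (hypothetical helper).

    Converting once at parse time means downstream writers never
    have to guess which unit they were handed.
    """
    if unit_accession == MINUTE:
        return value
    if unit_accession == SECOND:
        return value / 60.0
    # Refuse to guess: a silent wrong unit is worse than an error.
    raise ValueError(f"Unsupported retention time unit: {unit_accession}")
```

With a normalization like this upstream, a writer that receives the value should not convert it a second time.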
I think this should be good. We're in the midst of some other large updates, so I'm going to go ahead and merge it.
Fixed retention time division by 60. Time is required in minutes for FlashLFQ, it's already in minutes

Co-authored-by: William Fondrie <[email protected]>
* (feat) added auto handling of traditional pin and testing
* (fix) added handling of default direction
* (fix) changed intermediate files pin->tsv and fixed tests accordingly
* (chore) formatting on docs and removed T20 from them
* (chore) upgraded to upstream actions
* (chore) removed unused dependency in docs
* (chore) reformatted tests
* Feature/confidence streaming (#127)
* ✨ cherry picks internal fixes from !68 and !70
* Cherry pick feature/confidence_streaming branch
* ✨ adds filelock dependency for tests
* 💄 linting
* 💄 reformat to satisfy linter k
* ✨ imports type annotations from future for python 3.9
* ✨ make pytest and cli behave with type annotations in Python 3.9
* ✨ test dropping Python 3.9 support - inspired by https://github.com/wfondrie/mokapot/pull/126/files#diff-1db27d93186e46d3b441ece35801b244db8ee144ff1405ca27a163bfe878957fL20
* Set scale_to_one to false in *all* cases
* Fixed path problems probably causing errors under windows
* Fix more possible path issues
* Fix warning about bitwise not in python 3.12
* Fix problem with numpy 2.x's different str rep of floats
* Make hashing of rows for splitting independent of numpy version and spectra columns
* Feature/streaming fix windows (#48)
* ✨ log more infos
* ✨ uses uv for env setup; fix dependencies

---------

Co-authored-by: Elmar Zander <[email protected]>

* Small changes for FlashLFQ writer (#131)

Fixed retention time division by 60. Time is required in minutes for FlashLFQ, it's already in minutes

Co-authored-by: William Fondrie <[email protected]>

* wip: formatting and rebasing fixes
* chore: ruff format
* wip,chore: re-adding flashlfq support
* ci,fix: fixed confidence out and ci migration
* format: eof newline
* wip,fix: progress to re-add flashlfq output
* chore: uv lock and formatting
* chore: added pr template
* wip: make brew generic again
* wip,fix: added deleter to on psm dataset
* feat: re-added flashlfq support
* chore: linting + formatting
* fix: fixtures and progress in definition of cols
* test, fix: annotated/commented new fixtures
* lint: formatting
* ci: removed fail on codecov fail
* ci: test speedup replacing RF with dtree
* ci: attempt to fix test docker build
* refactor: change stats to dataclass
* refactor: extracted output writer factory
* refactor: extracted level manager in confidence
* refactor: extracted level writer group
* refactor: extracted more writer builder work to class
* feat: score propagation and unscored confidence
* feat(confidence): add data reading api
* feat,experiment: Experimental qvalue-fdr estimation
* chore,docs: updated basic docs to curr api and updated typing
* chore: updated basic n joint model docs code (md in progress)
* chore: updated notebook
* chore,confidence: update docstrings
* chore,qvalue: removed commented out code
* chore: fixed line length lints in docstrings
* fix,sqlite: fixed path for sqlite writer
* feat, wip: compound key on spectrum
* refactor,wip: centralized column group logic
* refactor,dataset: broke module into files and changed tdc implementation to remove numba
* fix: fixed string to bool target col conversion and added notes on tests
* ci: enabled lint and test on all PRs
* chore: updated triqler and np versions
* feat,doc: fixed empty cols in proteins and better column descriptions in docstrings
* test: added content testing to cli testing + csv -> tsv
* fix: flashlfq and misc fixes
* ci: added extra xml to test makefile
* chore: unify naming schemas
* chore: self-review cleanup
* fix,chore: updated makefile and fixed iterator
* chore: self-review cleanup

---------

Co-authored-by: Siegfried Gessulat <[email protected]>
Co-authored-by: Elmar Zander <[email protected]>
Co-authored-by: Ivan Chudinov <[email protected]>
Co-authored-by: William Fondrie <[email protected]>
I am using mokapot (v0.10.0) as an embedded rescoring engine within the ms2rescore package (https://github.com/compomics/ms2rescore, v3.0.3), which you may have heard of for rescoring results from sage (https://github.com/lazear/sage, v0.14.6).
I tried label-free quantification with the FlashLFQ GUI (v1.2.6), since you kindly provided a writer for its "Generic" format, and ran into zero intensity for all quantified proteins. Comparing `results.sage.ms2rescore.mokapot.flashlfq` with the `*.raw` files in FreeStyle, I found that the writer divides the retention time by 60 when converting to the FlashLFQ-compatible format. FlashLFQ expects the retention time in minutes, but the value is already in minutes, so the division is unnecessary.

To test the theory, I manually deleted the division in the mokapot source files of the ms2rescore venv, in `lib/python3.10/site-packages/mokapot/writers/flashlfq.py` at line 115, and resubmitted the job to ms2rescore and then to FlashLFQ. As a result, all proteins are now quantified appropriately.
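The effect of the extra division can be sketched in a few lines of Python. This is an illustration only, not the actual code in `mokapot/writers/flashlfq.py`; both function names are hypothetical, and the input is assumed to already be in minutes, as sage reports it.

```python
def rt_with_division(rt_minutes):
    """Old behaviour (illustrative): treats the input as seconds."""
    return [rt / 60 for rt in rt_minutes]


def rt_without_division(rt_minutes):
    """Fixed behaviour (illustrative): passes minutes through unchanged."""
    return list(rt_minutes)


# Three PSMs eluting at 30, 60 and 90 minutes.
psm_rts = [30.0, 60.0, 90.0]

shifted = rt_with_division(psm_rts)    # [0.5, 1.0, 1.5]
correct = rt_without_division(psm_rts)  # [30.0, 60.0, 90.0]
```

With the division in place, every identification is reported within the first minutes of the run, far from where the peptide actually elutes, which is consistent with the zero intensities described above.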
Please consider this PR if you find it useful.