Small changes for FlashLFQ writer #131

Merged 2 commits on Dec 4, 2024

unholyparrot
Contributor

I am using mokapot (v0.10.0) as an embedded rescoring engine within the ms2rescore package (https://github.com/compomics/ms2rescore, v3.0.3), which you may have heard of for rescoring results from sage (https://github.com/lazear/sage, v0.14.6).

I tried label-free quantification with the FlashLFQ GUI (v1.2.6), since you kindly provide a writer for its "Generic" format, and got zero intensity for all quantified proteins. Checking results.sage.ms2rescore.mokapot.flashlfq and the *.raw files with FreeStyle, I found that the writer divides the retention time by 60 when converting to the FlashLFQ-compatible format. FlashLFQ expects the retention time in minutes, but the value is already in minutes, so the division is unnecessary.

To test the theory, I manually removed the division from the mokapot source in the ms2rescore venv (lib/python3.10/site-packages/mokapot/writers/flashlfq.py, line 115), then resubmitted the job to ms2rescore and afterwards to FlashLFQ.

As a result, all proteins are now appropriately quantified.

Please consider this PR if you find it useful.
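
To make the effect concrete, here is a minimal sketch of the mismatch described above. It is not the mokapot writer code; `to_minutes` is a hypothetical helper and the retention time of 42.5 min is an invented example value. It only shows what happens when a value that is already in minutes is treated as seconds.

```python
def to_minutes(rt: float, unit: str) -> float:
    """Convert a retention time to minutes (hypothetical helper)."""
    if unit == "minute":
        return rt
    if unit == "second":
        return rt / 60.0
    raise ValueError(f"unsupported retention time unit: {unit!r}")


# Invented example: a PSM identified 42.5 minutes into the run.
rt_from_search_engine = 42.5

# Treating the value as seconds reproduces the unnecessary division by 60.
rt_wrong = to_minutes(rt_from_search_engine, "second")

# Treating the value as minutes leaves it untouched.
rt_right = to_minutes(rt_from_search_engine, "minute")

print(f"wrong: {rt_wrong:.3f} min, right: {rt_right:.3f} min")
```

With the extra division, the retention time written out lands under one minute instead of at 42.5 minutes, which would be consistent with FlashLFQ finding no signal at the reported times.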

Fixed retention time division by 60.
Time is required in minutes for FlashLFQ; it's already in minutes
@jspaezp
Collaborator

jspaezp commented Dec 1, 2024

Hmm, I am not 100% sure it is a valid assumption that the RT from sage will always be in seconds. Would you mind sharing the details of how the mzML was generated?

@unholyparrot
Contributor Author

According to sage documentation, retention time is given in minutes.

The mzML was generated with ThermoRawFileParser 1.4.3 in command-line mode with default parameters.

@jspaezp
Collaborator

jspaezp commented Dec 2, 2024

Cool!

For future reference: https://github.com/lazear/sage/blob/73eaf49a4e53179b9bd5329bc1f085835f5f3987/crates/sage-cloudpath/src/mzml.rs#L250-L258

Sage does seem to check the unit when it parses the mzML and converts the retention time to minutes. So I guess that if we specify in the mokapot docs that the rt column is meant to be in minutes, we could go ahead with this change. (Not sure if other tools that write .pin files will default to minutes ...)

https://github.com/lazear/sage/blob/73eaf49a4e53179b9bd5329bc1f085835f5f3987/crates/sage-cloudpath/src/tdf.rs#L35 + MannLabs/timsrust#34 also makes sure that .d files are in seconds in sage.

@wfondrie do you recall why that /60 is there? Is there any tool that writes the .pin in seconds? (I think we should be able to remove it safely ...)
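
As a rough illustration of the parse-time unit check mentioned above, the sketch below normalizes an mzML "scan start time" cvParam to minutes based on its unit accession. This is not a port of Sage's Rust code; the function name is hypothetical, and raising an error on an unknown or missing unit is an assumption made for the example, not a statement about what Sage does.

```python
import xml.etree.ElementTree as ET

# Controlled-vocabulary accessions used in mzML files.
MS_SCAN_START_TIME = "MS:1000016"  # PSI-MS: scan start time
UO_SECOND = "UO:0000010"           # Unit Ontology: second
UO_MINUTE = "UO:0000031"           # Unit Ontology: minute


def scan_start_time_minutes(cv_param: ET.Element) -> float:
    """Return the scan start time of a cvParam element in minutes.

    Hypothetical helper: unknown or missing units raise an error here,
    which is an assumption made for this sketch.
    """
    if cv_param.get("accession") != MS_SCAN_START_TIME:
        raise ValueError("not a scan start time cvParam")
    value = float(cv_param.get("value"))
    unit = cv_param.get("unitAccession")
    if unit == UO_MINUTE:
        return value
    if unit == UO_SECOND:
        return value / 60.0
    raise ValueError(f"unsupported unit accession: {unit!r}")


# Two invented cvParam elements describing the same time point.
in_seconds = ET.fromstring(
    '<cvParam cvRef="MS" accession="MS:1000016" name="scan start time" '
    'value="150" unitCvRef="UO" unitAccession="UO:0000010" unitName="second"/>'
)
in_minutes = ET.fromstring(
    '<cvParam cvRef="MS" accession="MS:1000016" name="scan start time" '
    'value="2.5" unitCvRef="UO" unitAccession="UO:0000031" unitName="minute"/>'
)

rt_a = scan_start_time_minutes(in_seconds)
rt_b = scan_start_time_minutes(in_minutes)
```

If the unit is resolved once at parse time like this, downstream writers can treat the rt column as minutes and need no further conversion.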

@wfondrie wfondrie self-requested a review December 4, 2024 21:19
Owner

@wfondrie wfondrie left a comment

I think this should be good. We're in the midst of some other large updates, so I'm going to go ahead and merge it.

@wfondrie wfondrie merged commit 0d1f437 into wfondrie:main Dec 4, 2024
1 of 10 checks passed
jspaezp pushed a commit to jspaezp/mokapot that referenced this pull request Dec 5, 2024
Fixed retention time division by 60.
Time is required in minutes for FlashLFQ; it's already in minutes

Co-authored-by: William Fondrie <[email protected]>
wfondrie added a commit that referenced this pull request Feb 27, 2025
* (feat) added auto handling of traditional pin and testing

* (fix) added handling of default direction

* (fix) changed intermediate files pin->tsv and fixed tests accordingly

* (chore) formatting on docs and removed T20 from them

* (chore) upgraded to upstream actions

* (chore) removed unused dependency in docs

* (chore) reformatted tests

* Feature/confidence streaming (#127)

* ✨ cherry picks internal fixes from !68 and !70

* Cherry pick feature/confidence_streaming branch

* ✨ adds filelock dependency for tests

* 💄 linting

* 💄 reformat to satisfy linter

* ✨ imports type annotations from future for python 3.9

* ✨ make pytest and cli behave with type annotations in Python 3.9

* ✨ test dropping Python 3.9 support

- inspired by
  https://github.com/wfondrie/mokapot/pull/126/files#diff-1db27d93186e46d3b441ece35801b244db8ee144ff1405ca27a163bfe878957fL20

* Set scale_to_one to false in *all* cases

* Fixed path problems probably causing errors under windows

* Fix more possible path issues

* Fix warning about bitwise not in python 3.12

* Fix problem with numpy 2.x's different str rep of floats

* Make hashing of rows for splitting independent of numpy version and spectra columns

* Feature/streaming fix windows (#48)

* ✨ log more infos
* ✨ uses uv for env setup; fix dependencies

---------

Co-authored-by: Elmar Zander <[email protected]>

* Small changes for FlashLFQ writer (#131)

Fixed retention time division by 60.
Time is required in minutes for FlashLFQ; it's already in minutes

Co-authored-by: William Fondrie <[email protected]>

* wip: formatting and rebasing fixes

* chore: ruff format

* wip,chore: re-adding flashlfq support

* ci,fix: fixed confidence out and ci migration

* format: eof newline

* wip,fix: progress to re-add flashlfq output

* chore: uv lock and formatting

* chore: added pr template

* wip: make brew generic again

* wip,fix: added deleter to on psm dataset

* feat: re-added flashlfq support

* chore: linting + formatting

* fix: fixtures and progress in definition of cols

* test, fix: annotated/commented new fixtures

* lint: formatting

* ci: removed fail on codecov fail

* ci: test speedup replacing RF with dtree

* ci: attempt to fix test docker build

* refactor: change stats to dataclass

* refactor: extracted output writer factory

* refactor: extracted level manager in confidence

* refactor: extracted level writer group

* refactor: extracted more writer builder work to class

* feat: score propagation and unscored confidence

* feat(confidence): add data reading api

* feat,experiment: Experimental qvalue-fdr estimation

* chore,docs: updated basic docs to curr api and updated typing

* chore: updated basic n joint model docs code (md in progress)

* chore: updated notebook

* chore,confidence: update docstrings

* chore,qvalue: removed commented out code

* chore: fixed line length lints in docstrings

* fix,sqlite: fixed path for sqlite writer

* feat, wip: compound key on spectrum

* refactor,wip: centralized column group logic

* refactor,dataset: broke module into files and changed tdc implementation to remove numba

* fix: fixed string to bool target col conversion and added notes on tests

* ci: enabled lint and test on all PRs

* chore: updated triqler and np versions

* feat,doc: fixed empty cols in proteins and better column descriptions in docstrings

* test: added content testing to cli testing + csv -> tsv

* fix: flashlfq and misc fixes

* ci: added extra xml to test makefile

* chore: unify naming schemas

* chore: self-review cleanup

* fix,chore: updated makefile and fixed iterator

* chore: self-review cleanup

---------

Co-authored-by: Siegfried Gessulat <[email protected]>
Co-authored-by: Elmar Zander <[email protected]>
Co-authored-by: Ivan Chudinov <[email protected]>
Co-authored-by: William Fondrie <[email protected]>
3 participants