Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Update subworkflow for deepvariant to version 1.8.0 #7473

Merged
merged 8 commits into from
Feb 26, 2025

Conversation

fa2k
Copy link
Contributor

@fa2k fa2k commented Feb 12, 2025

Update the deepvariant variant calling subworkflow (nf-core/deepvariant) to version 1.8.0.

Other module subcommands were updated in #7301.

This work is based on @fellen31's original pull request to update the subworkflow, and adds some handling of the "small model". Summary of changes:

  • Container version increased
  • Enable --regions parameter to DEEPVARIANT_POSTPROCESSVARIANTS process
  • Pass the small model outputs from DEEPVARIANT_MAKEEXAMPLES to DEEPVARIANT_POSTPROCESSVARIANTS if small model is used
  • Update tests to use small model (which is default), but add a test to disable the small model
  • Add a test to ensure that the subworkflow output is equal to the integrated DEEPVARIANT_RUNDEEPVARIANT process output

The future of the subworkflow in nf-core is a bit uncertain. Recent updates of DeepVariant have required an excessive amount of changes to be made in the subworkflow. Due to the argument handling with ext.args, they would also require many changes by the pipelines using the subworkflow, to keep up with the default arguments used by the run_deepvariant script.

The purpose of this subworkflow is to optimize compute resource usage, allocating GPUs only when needed. With version 1.8.0, Google released a new "fast pipeline" (https://github.com/google/deepvariant/blob/r1.8/docs/deepvariant-fast-pipeline-case-study.md) that could solve the same problem in a different way. At the moment, the fast pipeline is about as complex as the subworkflow approach, so I'm suggesting we keep the subworkflow for now.

PR checklist

Closes #XXX

  • This comment contains a description of changes (with reason).
  • If you've fixed a bug or added code that should be tested, add tests!
  • If you've added a new tool - have you followed the module conventions in the contribution docs
  • If necessary, include test data in your PR.
  • Remove all TODO statements.
  • Emit the versions.yml file.
  • Follow the naming conventions.
  • Follow the parameters requirements.
  • Follow the input/output options guidelines.
  • Add a resource label
  • Use BioConda and BioContainers if possible to fulfil software requirements.
  • Ensure that the test works with either Docker / Singularity. Conda CI tests can be quite flaky:
    • For modules:
      • nf-core modules test <MODULE> --profile docker
      • nf-core modules test <MODULE> --profile singularity
      • nf-core modules test <MODULE> --profile conda
    • For subworkflows:
      • nf-core subworkflows test <SUBWORKFLOW> --profile docker
      • nf-core subworkflows test <SUBWORKFLOW> --profile singularity
      • nf-core subworkflows test <SUBWORKFLOW> --profile conda

@fa2k fa2k requested a review from fellen31 February 12, 2025 12:56
@fellen31
Copy link
Contributor

Looks like some tests are failing?

@fa2k
Copy link
Contributor Author

fa2k commented Feb 14, 2025

Oh sorry I missed that. I will check it and let you know when it's fixed. It may be related to the nextflow version.

@fa2k
Copy link
Contributor Author

fa2k commented Feb 19, 2025

@fellen31 I've fixed the test for POSTPROCESSVARIANTS, adding the correct input tuple. Now all the tests pass.

Copy link
Contributor

@fellen31 fellen31 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice, thanks for continuing this.

// The following block determines whether the small model was used, and if so, adds the variant calls from it
// to the argument --small_model_cvo_records.
def small_model_arg = ""
def small_model_calls_copy = small_model_calls // Create a copy of the process-level variable so it can be used inside the if{}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure why you would need this?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was working around an error message I got when using the input variable directly. I have to do some more testing, because it seems to work now.

Copy link
Contributor Author

@fa2k fa2k Feb 26, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The issue seems to be that if I remove def small_model_calls_copy = small_model_calls, I get this error:

│   Nextflow stdout:                                                                                                                              │
│                                                                                                                                                 │
│   ERROR ~ Module compilation error                                                                                                              │
│   - file : /data/projects/modules/subworkflows/nf-core/deepvariant/tests/../../../../modules/nf-core/deepvariant/postprocessvariants/main.nf    │
│   - cause: Variable `small_model_calls` already defined in the process scope @ line 49, column 36.                                              │
│              def small_model_matcher = (small_model_calls[0].baseName =~ /^(.+)-\d{5}-of-(\d{5})$/)                                             │
│                                         ^                                                                                                       │
│                                                                                                                                                 │
│   1 error                                                                                                                                       │
│                                                                                                                                                 │
│                                                                                                                                                 │
│    -- Check script '/data/projects/modules/subworkflows/nf-core/deepvariant/tests/../main.nf' at line: 3 or see                                 │
│ '/data/projects/modules/.nf-test/tests/108496a30634aa9373981b51feca7830/meta/nextflow.log' file for more details                                │
│   Nextflow stderr:                                                                                                                              │

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hm, and if you remove the def before small_model_matcher?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems to work if I remove the def's on everything except for small_model_arg. I'll create a commit for this.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the tip :)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this run in the CI?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I haven't done anything explicitly to make it run in CI. I will try to confirm whether it runs. Do you think it should be enabled in CI? We may have a situation (like just happened) where we want to update rundeepvariant without the subworkflow, and can ignore this test result.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you are right. Nice to be able to test that the subworkflow produces the same output as the rundeepvariant, but probably shouldn't be in the CI.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's enabled in CI now, that happened automatically. I have to look into how it can be disabled.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Remove the tags?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've added them to the skip file, but the format is a bit different than usual, because I only want to skip one of the test files. I have to check that it still runs main.nf and that it skips equality.nf

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like it's successfully disabled now

Copy link
Contributor

@fellen31 fellen31 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice!

Have you tested the differences in time and resources for the subworkflow vs rundeepvariant for 1.8? Would be interesting to see how much difference it makes.

@fa2k
Copy link
Contributor Author

fa2k commented Feb 26, 2025

Have you tested the differences in time and resources for the subworkflow vs rundeepvariant for 1.8? Would be interesting to see how much difference it makes.

No, but it would be interesting to know. I'll try to get this information, but also merge when ready.

@fa2k fa2k enabled auto-merge February 26, 2025 12:23
@fa2k fa2k added this pull request to the merge queue Feb 26, 2025
Merged via the queue into nf-core:master with commit 470ac76 Feb 26, 2025
61 checks passed
@fa2k fa2k deleted the update-deepvariant-subworkflow branch February 26, 2025 12:44
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants