Align parabricks subworkflow #6876
Merged
+281
−0
80 commits
df512a6
add parabricks
famosab 296188a
remove config tag
famosab 661bdd4
fix typo
famosab b7abc0a
fix typo
famosab 83bd443
fix typo
famosab dc00c7b
update paths
famosab ec71494
update paths
famosab 202d305
remove ch
famosab 68d2e46
change gpu access
famosab b3e43e8
change fasta
famosab 99cd5c1
update container
famosab d34c6df
low memory
famosab 41c206c
indey
famosab 72c6ea5
index bwamem
famosab 55901ff
index bwamem
famosab b662fa5
index bwa
famosab 2e6a27d
add index file
famosab 2e8dfa9
add index file
famosab 6e7b6f5
add index file
famosab 7eced10
add index file
famosab 5cbee34
stage in
famosab 7367d15
stage in
famosab e676ee9
workdir
famosab 280feec
revert workdir
famosab 472a3a9
revert workdir
famosab ad8cd22
add bwa index
famosab 5e6202a
add bwa index link
famosab e0227b6
add bwa index link
famosab 3ce2d86
add bwa index link
famosab 87daaf9
rm stage
famosab cd9faa6
please work now
famosab 332eea6
remove fq2bam from this PR
famosab 7dceaa3
Merge branch 'master' into parabricks-sbwf
famosab f5c8cc4
update tests
famosab a27baac
change inputs in test and to fq2bam
famosab f9088af
add low memory
famosab fcd7bd8
adjust applybqsr input
famosab 38bbe78
adjust io to be consistent
famosab 52be7aa
Merge branch 'master' into parabricks-sbwf
famosab c179f67
Merge branch 'master' into parabricks-sbwf
famosab 0450b3c
Merge branch 'master' into parabricks-sbwf
famosab 84ff84f
wip
famosab a39b3db
Merge branch 'parabricks-sbwf' of github.com:famosab/modules into par…
famosab 1084460
try applybqsr
famosab 87576ae
Merge branch 'master' into parabricks-sbwf
famosab cad4876
minor updates
sateeshperi 7bb0222
update snap
famosab 9748d56
update snap
famosab eb13562
update snap - problem is the naming in applybqsr
famosab 2a4c49f
add tag gpu
famosab 64963ad
Merge branch 'master' into parabricks-sbwf
famosab 8ce25d8
Merge branch 'master' into parabricks-sbwf
famosab 82a0754
update meta
famosab d760937
Merge branch 'parabricks-sbwf' of github.com:famosab/modules into par…
famosab cd5ec00
update config
famosab 83df033
Merge branch 'master' into parabricks-sbwf
famosab f4ca194
Merge branch 'master' into parabricks-sbwf
famosab 5c967c1
Apply suggestions from code review
famosab c357cc9
Merge branch 'master' into parabricks-sbwf
famosab 27309d1
Merge branch 'master' into parabricks-sbwf
famosab 91712d7
Merge branch 'master' into parabricks-sbwf
famosab 788040e
Merge branch 'master' into parabricks-sbwf
famosab 7ce98df
Merge branch 'master' into parabricks-sbwf
famosab cd4314d
Merge branch 'master' into parabricks-sbwf
famosab ce2b452
Merge branch 'master' into parabricks-sbwf
famosab 7544071
Merge branch 'master' into parabricks-sbwf
famosab 9ce6067
Merge branch 'master' into parabricks-sbwf
sateeshperi 2a7d739
Merge branch 'master' into parabricks-sbwf
famosab 656fca3
Merge branch 'nf-core:master' into parabricks-sbwf
famosab 40f1f20
Merge branch 'master' into parabricks-sbwf
famosab 55c6933
add gpus
famosab d6ac06d
comment out channels
famosab ba0dd9a
change channels
famosab 6879f42
update prefix
famosab 9d506ad
update prefix
famosab 6043d08
correctly assign prefix
famosab 5c1b490
correctly assign prefix
famosab c4f4dd7
add updated snap
famosab b5f4b94
revert config changes
famosab 90cc3d5
Merge branch 'master' into parabricks-sbwf
famosab
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,57 @@ | ||
// | ||
// Alignment and BQSR with Nvidia CLARA Parabricks | ||
// | ||
include { PARABRICKS_FQ2BAM } from '../../../modules/nf-core/parabricks/fq2bam/main' | ||
include { PARABRICKS_APPLYBQSR } from '../../../modules/nf-core/parabricks/applybqsr/main' | ||
|
||
workflow FASTQ_ALIGN_PARABRICKS { | ||
|
||
take: | ||
ch_reads // channel: [mandatory] meta, reads | ||
ch_fasta // channel: [mandatory] meta, fasta | ||
ch_index // channel: [mandatory] meta, index | ||
ch_interval_file // channel: [optional] meta, intervals_bed_combined | ||
ch_known_sites // channel: [optional] known_sites_indels | ||
|
||
main: | ||
ch_versions = Channel.empty() | ||
ch_bam = Channel.empty() | ||
ch_bai = Channel.empty() | ||
ch_bqsr_table = Channel.empty() | ||
ch_qc_metrics = Channel.empty() | ||
ch_duplicate_metrics = Channel.empty() | ||
|
||
PARABRICKS_FQ2BAM( | ||
ch_reads, | ||
ch_fasta, | ||
ch_index, | ||
ch_interval_file, | ||
ch_known_sites | ||
) | ||
|
||
// Collecting FQ2BAM outputs | ||
ch_bam = ch_bam.mix(PARABRICKS_FQ2BAM.out.bam) | ||
ch_bai = ch_bai.mix(PARABRICKS_FQ2BAM.out.bai) | ||
ch_qc_metrics = ch_qc_metrics.mix(PARABRICKS_FQ2BAM.out.qc_metrics) | ||
ch_bqsr_table = ch_bqsr_table.mix(PARABRICKS_FQ2BAM.out.bqsr_table) | ||
ch_duplicate_metrics = ch_duplicate_metrics.mix(PARABRICKS_FQ2BAM.out.duplicate_metrics) | ||
ch_versions = ch_versions.mix(PARABRICKS_FQ2BAM.out.versions) | ||
|
||
// Apply BQSR | ||
PARABRICKS_APPLYBQSR( | ||
ch_bam, | ||
ch_bai, | ||
ch_bqsr_table.ifEmpty([]), | ||
ch_interval_file, | ||
ch_fasta | ||
) | ||
ch_versions = ch_versions.mix(PARABRICKS_APPLYBQSR.out.versions) | ||
|
||
emit: | ||
bam = PARABRICKS_APPLYBQSR.out.bam // channel: [ [meta], bam ] | ||
bai = PARABRICKS_APPLYBQSR.out.bai // channel: [ [meta], bai ] | ||
qc_metrics = ch_qc_metrics // channel: [ [meta], qc_metrics ] | ||
duplicate_metrics = ch_duplicate_metrics // channel: [ [meta], duplicate_metrics ] | ||
bqsr_table = ch_bqsr_table // channel: [ [meta], bqsr_table ] | ||
versions = ch_versions // channel: [ versions.yml ] | ||
} |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,63 @@ | ||
# yaml-language-server: $schema=https://raw.githubusercontent.com/nf-core/modules/master/subworkflows/yaml-schema.json | ||
name: "fastq_align_parabricks" | ||
description: Align a fastq file using GPU-based acceleration | ||
keywords: | ||
- fastq | ||
- align | ||
- parabricks | ||
- gpu | ||
- preprocessing | ||
components: | ||
- parabricks/fq2bam | ||
- parabricks/applybqsr | ||
input: | ||
- ch_reads: | ||
type: file | ||
description: | | ||
Channel containing reads (either one file for single-end or two files for paired-end) | ||
Structure: [ val(meta), [ path(fastq1), path(fastq2) ] ] | ||
- ch_fasta: | ||
type: file | ||
description: | | ||
Channel containing reference fasta file | ||
Structure: [ val(meta), path(fasta) ] | ||
- ch_index: | ||
type: file | ||
description: | | ||
Channel containing reference BWA index | ||
Structure: [ val(meta), path(.{amb,ann,bwt,pac,sa}) ] | ||
- ch_interval_file: | ||
type: file | ||
description: | | ||
(optional) file(s) containing genomic intervals for use in base | ||
quality score recalibration (BQSR) | ||
Structure: [ val(meta), path(.{bed,interval_list,picard,list,intervals}) ] | ||
- ch_known_sites: | ||
type: file | ||
description: | | ||
(optional) known sites file(s) for calculating BQSR. markdups must | ||
be true to perform BQSR. | ||
Structure: [ path(vcf) ] | ||
output: | ||
- bam: | ||
type: file | ||
description: | | ||
Channel containing BAM files | ||
Structure: [ val(meta), path(bam) ] | ||
pattern: "*.bam" | ||
- bai: | ||
type: file | ||
description: | | ||
Channel containing BAM index (BAI) files | ||
Structure: [ val(meta), path(bai) ] | ||
pattern: "*.bai" | ||
- versions: | ||
type: file | ||
description: | | ||
File containing software versions | ||
Structure: [ path(versions.yml) ] | ||
pattern: "versions.yml" | ||
authors: | ||
- "@famosab" | ||
maintainers: | ||
- "@famosab" |
106 changes: 106 additions & 0 deletions
subworkflows/nf-core/fastq_align_parabricks/tests/main.nf.test
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,106 @@ | ||
nextflow_workflow { | ||
|
||
name "Test Subworkflow FASTQ_ALIGN_PARABRICKS" | ||
script "../main.nf" | ||
workflow "FASTQ_ALIGN_PARABRICKS" | ||
config "./nextflow.config" | ||
|
||
tag "subworkflows" | ||
tag "subworkflows_nfcore" | ||
tag "subworkflows/fastq_align_parabricks" | ||
tag "parabricks" | ||
tag "parabricks/fq2bam" | ||
tag "parabricks/applybqsr" | ||
tag "bwa" | ||
tag "bwa/index" | ||
tag "gpu" | ||
|
||
setup { | ||
run("BWA_INDEX") { | ||
script "../../../../modules/nf-core/bwa/index/main.nf" | ||
process { | ||
""" | ||
input[0] = Channel.of([ | ||
[ id:'test' ], // meta map | ||
file(params.modules_testdata_base_path + 'genomics/sarscov2/genome/genome.fasta', checkIfExists: true) | ||
]) | ||
""" | ||
} | ||
} | ||
} | ||
|
||
test("sarscov2 single-end [fastq_gz]") { | ||
|
||
when { | ||
workflow { | ||
""" | ||
input[0] = Channel.of([ | ||
[ id:'test', single_end:true ], | ||
[ file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/fastq/test_1.fastq.gz', checkIfExists: true)] | ||
]) | ||
input[1] = Channel.value([ | ||
[id: 'reference'], | ||
file(params.modules_testdata_base_path + 'genomics/sarscov2/genome/genome.fasta', checkIfExists: true) | ||
]) | ||
input[2] = BWA_INDEX.out.index | ||
input[3] = Channel.value([ | ||
[id: 'intervals'], | ||
file(params.modules_testdata_base_path + 'genomics/sarscov2/genome/picard/baits.interval_list', checkIfExists: true) | ||
]) | ||
input[4] = file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/vcf/test.vcf.gz', checkIfExists: true) | ||
""" | ||
} | ||
} | ||
|
||
then { | ||
assertAll( | ||
{ assert workflow.success}, | ||
{ assert snapshot( | ||
workflow.out.bam.collect { meta, bamfile -> bam(bamfile).getReadsMD5() }, | ||
workflow.out.bai.collect { meta, bai -> file(bai).name }, | ||
workflow.out.versions | ||
).match() | ||
} | ||
) | ||
} | ||
} | ||
|
||
test("sarscov2 paired-end [fastq_gz]") { | ||
|
||
when { | ||
workflow { | ||
""" | ||
input[0] = Channel.of([ | ||
[ id:'test', single_end:false ], | ||
[ | ||
file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/fastq/test_1.fastq.gz', checkIfExists: true), | ||
file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/fastq/test_2.fastq.gz', checkIfExists: true) | ||
] | ||
]) | ||
input[1] = Channel.value([ | ||
[id: 'reference'], | ||
file(params.modules_testdata_base_path + 'genomics/sarscov2/genome/genome.fasta', checkIfExists: true) | ||
]) | ||
input[2] = BWA_INDEX.out.index | ||
input[3] = Channel.value([ | ||
[id: 'intervals'], | ||
file(params.modules_testdata_base_path + 'genomics/sarscov2/genome/picard/baits.interval_list', checkIfExists: true) | ||
]) | ||
input[4] = file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/vcf/test.vcf.gz', checkIfExists: true) | ||
""" | ||
} | ||
} | ||
|
||
then { | ||
assertAll( | ||
{ assert workflow.success}, | ||
{ assert snapshot( | ||
workflow.out.bam.collect { meta, bamfile -> bam(bamfile).getReadsMD5() }, | ||
workflow.out.bai.collect { meta, bai -> file(bai).name }, | ||
workflow.out.versions | ||
).match() | ||
} | ||
) | ||
} | ||
} | ||
} |
40 changes: 40 additions & 0 deletions
subworkflows/nf-core/fastq_align_parabricks/tests/main.nf.test.snap
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,40 @@ | ||
{ | ||
"sarscov2 single-end [fastq_gz]": { | ||
"content": [ | ||
[ | ||
"7e2bd786d964e42ddbc2ab0c9f340b09" | ||
], | ||
[ | ||
"test.recal.bam.bai" | ||
], | ||
[ | ||
"versions.yml:md5,0d8766379e89038cb5fdcd074f3289f6", | ||
"versions.yml:md5,df165e28f025dad39d826caead132115" | ||
] | ||
], | ||
"meta": { | ||
"nf-test": "0.9.2", | ||
"nextflow": "24.10.4" | ||
}, | ||
"timestamp": "2025-02-17T16:25:03.460025311" | ||
}, | ||
"sarscov2 paired-end [fastq_gz]": { | ||
"content": [ | ||
[ | ||
"73e8e89cda8fce1cf07bdebff0f793ec" | ||
], | ||
[ | ||
"test.recal.bam.bai" | ||
], | ||
[ | ||
"versions.yml:md5,0d8766379e89038cb5fdcd074f3289f6", | ||
"versions.yml:md5,df165e28f025dad39d826caead132115" | ||
] | ||
], | ||
"meta": { | ||
"nf-test": "0.9.2", | ||
"nextflow": "24.10.4" | ||
}, | ||
"timestamp": "2025-02-17T16:26:01.468588642" | ||
} | ||
} |
15 changes: 15 additions & 0 deletions
subworkflows/nf-core/fastq_align_parabricks/tests/nextflow.config
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,15 @@ | ||
process { | ||
|
||
withName: 'PARABRICKS_FQ2BAM' { | ||
ext.args = '--low-memory' | ||
} | ||
|
||
// Ref: https://forums.developer.nvidia.com/t/problem-with-gpu/256825/6 | ||
// Parabricks’s fq2bam requires 24GB of memory. | ||
// Using --low-memory for testing | ||
|
||
withName: 'PARABRICKS_APPLYBQSR' { | ||
ext.prefix = { "${meta.id}.recal" } | ||
} | ||
|
||
} |
Should we maybe do a
https://docs.nvidia.com/clara/parabricks/latest/documentation/tooldocs/man_fq2bam.html#man-fq2bam
Also, it would be awesome to do something like
--memory-limit ${task.memory} / 2
by default, or make sure there are 16 CPUs per GPU requested. Just trying to push the resourceLimits syntax to the limits here 😆
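
For illustration, the suggestion above might look roughly like the config fragment below. This is only a sketch and not part of the PR: the CPU, memory and limit values are made up, it assumes a single GPU per task, and it assumes `--memory-limit` takes a whole number of GB (worth checking against the fq2bam documentation linked above). `resourceLimits` needs a Nextflow release that supports it (24.04 or later, as far as I know).

```nextflow
// Hypothetical nextflow.config fragment; all numeric values are illustrative assumptions.
process {

    // Upper bounds that any process resource request is capped to.
    resourceLimits = [ cpus: 32, memory: 128.GB, time: 24.h ]

    withName: 'PARABRICKS_FQ2BAM' {
        cpus   = 16      // the "16 cpus per GPU" idea, assuming one GPU per task
        memory = 64.GB   // assumed allocation, not a value from the PR

        // Pass half of the task's allocated memory, in whole GB, to fq2bam.
        ext.args = { "--memory-limit ${task.memory.toGiga().intdiv(2)}" }
    }
}
```

Writing `ext.args` as a closure means the value is computed per task from the memory actually allocated, so it would follow any retry or `resourceLimits` adjustment rather than being a fixed number. The test config in this PR uses `--low-memory` instead, so the two settings would need to be reconciled before adopting something like this as a default.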