This repository has been archived by the owner on Jan 27, 2020. It is now read-only.

Improve VEP #732

Merged: 33 commits, Feb 27, 2019
Changes from 32 commits
Commits
eaa460c
tabix convert the cache directory
maxulysse Feb 20, 2019
532e82f
remove --database and --offline option from VEP
maxulysse Feb 20, 2019
cdfb689
fix tabix indexation
maxulysse Feb 20, 2019
857ecfe
merge scripts
maxulysse Feb 21, 2019
43b613e
update docs
maxulysse Feb 21, 2019
571c47a
update DockerFile
maxulysse Feb 21, 2019
b454062
add usage of CADD plugin
maxulysse Feb 21, 2019
02b3c8f
add cadd_cache as an autorized params
maxulysse Feb 21, 2019
bcceceb
fix CADD download
maxulysse Feb 21, 2019
dfa0a50
fix containers build
maxulysse Feb 21, 2019
da1e5ba
code polishing
maxulysse Feb 21, 2019
65730c5
add container for DonwloadCADD
maxulysse Feb 21, 2019
2bb95f8
typo
maxulysse Feb 21, 2019
558e2e5
add cadd_version as authorized params
maxulysse Feb 21, 2019
df8bf50
code polishing
maxulysse Feb 21, 2019
69a75dc
fix tabix
maxulysse Feb 21, 2019
65b9270
update VEP
maxulysse Feb 22, 2019
853097f
code polishing
maxulysse Feb 22, 2019
4d2be26
fix path to VEP cache
maxulysse Feb 22, 2019
e360ede
update help
maxulysse Feb 22, 2019
0de01af
use VEP script to download cache
maxulysse Feb 22, 2019
bf8daeb
fix output directory
maxulysse Feb 25, 2019
16256e9
put --offline back
maxulysse Feb 25, 2019
d9a6fa4
use only 4 cpus for VEP
maxulysse Feb 25, 2019
1a4c045
include fasta ref for HGVS
maxulysse Feb 25, 2019
12b6fdd
Tabix index via VEP install script
maxulysse Feb 25, 2019
526ede0
add path to CADD files
maxulysse Feb 26, 2019
bb6c91a
add authorized params
maxulysse Feb 26, 2019
dfa91d1
add docs about CADD VEP plugin usage [skip ci]
maxulysse Feb 26, 2019
3e0292b
update docs [skip ci]
maxulysse Feb 26, 2019
f48c4b9
don't add fasta on containers
maxulysse Feb 26, 2019
12d8a83
Merge branch 'dev' into VEP
maxulysse Feb 27, 2019
d89a474
Merge branch 'dev' into VEP
maxulysse Feb 27, 2019
40 changes: 25 additions & 15 deletions CHANGELOG.md
@@ -16,6 +16,12 @@ and this project adheres to [Semantic Versioning](http://semver.org/spec/v2.0.0.
  - [#719](https://github.com/SciLifeLab/Sarek/pull/719) - Possibility to use cache wen annotating with `snpEff` and `VEP`
  - [#722](https://github.com/SciLifeLab/Sarek/pull/722) - Add path to ASCAT `.gc` file in `igenomes.config`
  - [#728](https://github.com/SciLifeLab/Sarek/pull/728) - Update `Sarek-data` submodule with multiple patients TSV file
+ - [#732](https://github.com/SciLifeLab/Sarek/pull/732) - Add `cadd_WG_SNVs`, `cadd_WG_SNVs_tbi`, `cadd_InDels`, `cadd_InDels_tbi` and `cadd_cache` params
+ - [#732](https://github.com/SciLifeLab/Sarek/pull/732) - Add tabix indexed cache for VEP
+ - [#732](https://github.com/SciLifeLab/Sarek/pull/732) - New `DownloadCADD` process to download CADD files
+ - [#732](https://github.com/SciLifeLab/Sarek/pull/732) - Specify values for `cadd_WG_SNVs`, `cadd_WG_SNVs_tbi`, `cadd_InDels`, `cadd_InDels_tbi` and `cadd_cache` params in `munin.conf` file
+ - [#732](https://github.com/SciLifeLab/Sarek/pull/732) - Use `cadd_cache` param for optional use of CADD VEP plugin in `annotate.nf`
+ - [#732](https://github.com/SciLifeLab/Sarek/pull/732) - VEP cache has now fasta files for `--HGVS`
  - [#735](https://github.com/SciLifeLab/Sarek/pull/735) - Added `--exome` for Manta, and for StrelkaBP
  - [#735](https://github.com/SciLifeLab/Sarek/pull/735) - Added Travis CI test for targeted

@@ -24,33 +30,37 @@ and this project adheres to [Semantic Versioning](http://semver.org/spec/v2.0.0.
  - [#710](https://github.com/SciLifeLab/Sarek/pull/710) - Improve release checklist and script
  - [#711](https://github.com/SciLifeLab/Sarek/pull/711) - Improve configuration priorities
  - [#716](https://github.com/SciLifeLab/Sarek/pull/716) - Update paths to containers and iGenomes
- - [#717](https://github.com/SciLifeLab/Sarek/pull/717) - `mapping` step can now map BAM files too
+ - [#717](https://github.com/SciLifeLab/Sarek/pull/717) - `checkFileExtension` has changed to `hasExtension`, and now only verify if file has extension
  - [#717](https://github.com/SciLifeLab/Sarek/pull/717) - `fastqFiles` renamed to `inputFiles`
+ - [#717](https://github.com/SciLifeLab/Sarek/pull/717) - `mapping` step can now map BAM files too
  - [#717](https://github.com/SciLifeLab/Sarek/pull/717) - `MapReads` can now convert BAM to FASTQ and feed it to BWA on the fly
- - [#717](https://github.com/SciLifeLab/Sarek/pull/717) - `checkFileExtension` has changed to `hasExtension`, and now only verify if file has extension
- - [#717](https://github.com/SciLifeLab/Sarek/pull/717) - Update documentation
+ - [#717](https://github.com/SciLifeLab/Sarek/pull/717), [#732](https://github.com/SciLifeLab/Sarek/pull/732) - Update documentation
  - [#719](https://github.com/SciLifeLab/Sarek/pull/719) - `snpeff` and `vep` containers are now built with conda
  - [#719](https://github.com/SciLifeLab/Sarek/pull/719) - `vepCacheVersion` is now defined in `conf/genomes.config` or `conf/igenomes.config`
  - [#722](https://github.com/SciLifeLab/Sarek/pull/722) - Add path to ASCAT `.gc` file in `igenomes.config`
  - [#722](https://github.com/SciLifeLab/Sarek/pull/722) - Update `Sarek-data` submodule
  - [#723](https://github.com/SciLifeLab/Sarek/pull/723), [#725](https://github.com/SciLifeLab/Sarek/pull/725) - Update docs
  - [#724](https://github.com/SciLifeLab/Sarek/pull/724) - Improved AwsBatch configuration
- - [#728](https://github.com/SciLifeLab/Sarek/pull/728) - VCFs and Annotated VCFs are now ordered by Patient, then tools
- - [#728](https://github.com/SciLifeLab/Sarek/pull/728) - Strelka Best Practices output is now prefixed with `StrelkaBP_`
  - [#728](https://github.com/SciLifeLab/Sarek/pull/728) - Improved usage of `targetBED` params
+ - [#728](https://github.com/SciLifeLab/Sarek/pull/728) - Strelka Best Practices output is now prefixed with `StrelkaBP_`
+ - [#728](https://github.com/SciLifeLab/Sarek/pull/728) - VCFs and Annotated VCFs are now ordered by Patient, then tools
+ - [#732](https://github.com/SciLifeLab/Sarek/pull/732) - Merge `buildContainers.nf` and `buildReferences.nf` in `build.nf`
+ - [#732](https://github.com/SciLifeLab/Sarek/pull/732) - Reduce number of CPUs for `RunVEP` to `4` cf: [VEP docs](https://www.ensembl.org/info/docs/tools/vep/script/vep_other.html)
+ - [#732](https://github.com/SciLifeLab/Sarek/pull/732) - Update VEP from `95.1` to `95.2`

  ### `Removed`
  - [#715](https://github.com/SciLifeLab/Sarek/pull/715) - Remove `defReferencesFiles` function from `buildReferences.nf`
  - [#719](https://github.com/SciLifeLab/Sarek/pull/719) - `snpEff` base container is no longer used
  - [#721](https://github.com/SciLifeLab/Sarek/pull/721) - Remove COSMIC docs
  - [#728](https://github.com/SciLifeLab/Sarek/pull/728) - Remove `defineDirectoryMap()`
+ - [#732](https://github.com/SciLifeLab/Sarek/pull/732) - Removed `--database` option for VEP cf: [VEP docs](https://www.ensembl.org/info/docs/tools/vep/script/vep_other.html)

  ### `Fixed`
  - [#720](https://github.com/SciLifeLab/Sarek/pull/720) - bamQC is now run on the recalibrated bams, and not after MarkDuplicates
  - [#726](https://github.com/SciLifeLab/Sarek/pull/726) - Fix Ascat ref file input (one file can't be a set)
  - [#727](https://github.com/SciLifeLab/Sarek/pull/727) - bamQC outputs are no longer overwritten (name of dir is now the file instead of sample)
- - [#728](https://github.com/SciLifeLab/Sarek/pull/728) - Fix multi sample TSV file [#691](https://github.com/SciLifeLab/Sarek/issues/691)
  - [#728](https://github.com/SciLifeLab/Sarek/pull/728) - Fix issue with annotation that was consuming `cache` channels
+ - [#728](https://github.com/SciLifeLab/Sarek/pull/728) - Fix multi sample TSV file [#691](https://github.com/SciLifeLab/Sarek/issues/691)

  ## [2.2.2] - 2018-12-19

@@ -66,10 +76,10 @@ and this project adheres to [Semantic Versioning](http://semver.org/spec/v2.0.0.

  ### `Changed`

- - [#678](https://github.com/SciLifeLab/Sarek/pull/678) - Changing VEP to v92 and adjusting CPUs for VEP
  - [#663](https://github.com/SciLifeLab/Sarek/pull/663) - Update `do_release.sh` script
  - [#671](https://github.com/SciLifeLab/Sarek/pull/671) - publishDir modes are now params
  - [#677](https://github.com/SciLifeLab/Sarek/pull/677), [#698](https://github.com/SciLifeLab/Sarek/pull/698), [#703](https://github.com/SciLifeLab/Sarek/pull/703) - Update docs
+ - [#678](https://github.com/SciLifeLab/Sarek/pull/678) - Changing VEP to v92 and adjusting CPUs for VEP
  - [#679](https://github.com/SciLifeLab/Sarek/pull/679) - Update old awsbatch configuration
  - [#682](https://github.com/SciLifeLab/Sarek/pull/682) - Specifications for memory and cpus for awsbatch
  - [#693](https://github.com/SciLifeLab/Sarek/pull/693) - Qualimap bamQC is now ran after mapping and after recalibration for better QC
@@ -110,15 +120,15 @@ and this project adheres to [Semantic Versioning](http://semver.org/spec/v2.0.0.
  - [#616](https://github.com/SciLifeLab/Sarek/pull/616) - Update documentation
  - [#620](https://github.com/SciLifeLab/Sarek/pull/620) - Add `tmp/` to `.gitignore`
  - [#625](https://github.com/SciLifeLab/Sarek/pull/625) - Add [`pathfindr`](https://github.com/NBISweden/pathfindr) as a submodule
- - [#639](https://github.com/SciLifeLab/Sarek/pull/639) - Add a complete example analysis to docs
  - [#635](https://github.com/SciLifeLab/Sarek/pull/635) - To process targeted sequencing with a target BED
+ - [#639](https://github.com/SciLifeLab/Sarek/pull/639) - Add a complete example analysis to docs
  - [#640](https://github.com/SciLifeLab/Sarek/pull/640), [#642](https://github.com/SciLifeLab/Sarek/pull/642) - Add helper script for changing version number

  ### `Changed`

  - [#608](https://github.com/SciLifeLab/Sarek/pull/608) - Update Nextflow required version
- - [#616](https://github.com/SciLifeLab/Sarek/pull/616) - Update CHANGELOG
  - [#615](https://github.com/SciLifeLab/Sarek/pull/615) - Use `splitCsv` instead of `readlines`
+ - [#616](https://github.com/SciLifeLab/Sarek/pull/616) - Update CHANGELOG
  - [#621](https://github.com/SciLifeLab/Sarek/pull/621), [#638](https://github.com/SciLifeLab/Sarek/pull/638) - Improve install script
  - [#621](https://github.com/SciLifeLab/Sarek/pull/621), [#638](https://github.com/SciLifeLab/Sarek/pull/638) - Simplify tests
  - [#627](https://github.com/SciLifeLab/Sarek/pull/627), [#629](https://github.com/SciLifeLab/Sarek/pull/629), [#637](https://github.com/SciLifeLab/Sarek/pull/637) - Refactor docs
@@ -128,9 +138,9 @@ and this project adheres to [Semantic Versioning](http://semver.org/spec/v2.0.0.
  - [#638](https://github.com/SciLifeLab/Sarek/pull/638) - Use correct `.simg` extension for Singularity images
  - [#639](https://github.com/SciLifeLab/Sarek/pull/639) - Smaller refactoring of the docs
  - [#640](https://github.com/SciLifeLab/Sarek/pull/640) - Update RELEASE_CHECKLIST
- - [#642](https://github.com/SciLifeLab/Sarek/pull/642) - Update conda channel order priorities
  - [#642](https://github.com/SciLifeLab/Sarek/pull/642) - MultiQC 1.5 -> 1.6
  - [#642](https://github.com/SciLifeLab/Sarek/pull/642) - Qualimap 2.2.2a -> 2.2.2b
+ - [#642](https://github.com/SciLifeLab/Sarek/pull/642) - Update conda channel order priorities
  - [#642](https://github.com/SciLifeLab/Sarek/pull/642) - VCFanno 0.2.8 -> 0.3.0
  - [#642](https://github.com/SciLifeLab/Sarek/pull/642) - VCFtools 0.1.15 -> 0.1.16

@@ -175,14 +185,14 @@ and this project adheres to [Semantic Versioning](http://semver.org/spec/v2.0.0.
  - [#582](https://github.com/SciLifeLab/Sarek/pull/582), [#587](https://github.com/SciLifeLab/Sarek/pull/587) - Update figures
  - [#595](https://github.com/SciLifeLab/Sarek/pull/595) - Function `defineDirectoryMap()` is now part of `SarekUtils`
  - [#595](https://github.com/SciLifeLab/Sarek/pull/595) - Process `GenerateMultiQCconfig` replace by function `createMultiQCconfig()`
- - [#597](https://github.com/SciLifeLab/Sarek/pull/597) - Move `checkFileExtension()`, `checkParameterExistence()`, `checkParameterList()`, `checkReferenceMap()`, `checkRefExistence()`, `extractBams()`, `extractGenders()`, `returnFile()`, `returnStatus()` and `returnTSV()` functions to `SarekUtils`
  - [#597](https://github.com/SciLifeLab/Sarek/pull/597) - `extractBams()` now takes an extra parameter.
- - [#597](https://github.com/SciLifeLab/Sarek/pull/597) - Replace depreciated operator `phase` by `join`.
+ - [#597](https://github.com/SciLifeLab/Sarek/pull/597) - Move `checkFileExtension()`, `checkParameterExistence()`, `checkParameterList()`, `checkReferenceMap()`, `checkRefExistence()`, `extractBams()`, `extractGenders()`, `returnFile()`, `returnStatus()` and `returnTSV()` functions to `SarekUtils`
  - [#597](https://github.com/SciLifeLab/Sarek/pull/597) - Reduce data footprint for Process `CreateRecalibrationTable`
+ - [#597](https://github.com/SciLifeLab/Sarek/pull/597) - Replace depreciated operator `phase` by `join`.
  - [#599](https://github.com/SciLifeLab/Sarek/pull/599) - Merge is tested with `ANNOTATEALL`
  - [#604](https://github.com/SciLifeLab/Sarek/pull/604) - Synching `GRCh38` `wgs_calling_regions` bedfiles
- - [#607](https://github.com/SciLifeLab/Sarek/pull/607) - Update to GATK4
  - [#607](https://github.com/SciLifeLab/Sarek/pull/607) - One container approach
+ - [#607](https://github.com/SciLifeLab/Sarek/pull/607) - Update to GATK4
  - [#608](https://github.com/SciLifeLab/Sarek/pull/608) - Update Nextflow required version
  - [#616](https://github.com/SciLifeLab/Sarek/pull/616) - Update CHANGELOG
  - [#617](https://github.com/SciLifeLab/Sarek/pull/617) - Replace depreciated $name syntax with withName
@@ -252,8 +262,8 @@ and this project adheres to [Semantic Versioning](http://semver.org/spec/v2.0.0.

  ### `Fixed`

- - [#533](https://github.com/SciLifeLab/Sarek/issues/533) - Replace `VEP` `--pick` option by `--per_gene`
  - [#530](https://github.com/SciLifeLab/Sarek/issues/530) - use `$PWD` for default `outDir`
+ - [#533](https://github.com/SciLifeLab/Sarek/issues/533) - Replace `VEP` `--pick` option by `--per_gene`

## [1.2.5] - 2018-01-18

@@ -297,9 +307,9 @@ and this project adheres to [Semantic Versioning](http://semver.org/spec/v2.0.0.

  ### `Fixed`

- - [#475](https://github.com/SciLifeLab/Sarek/issues/475) - 16 cpus for local executor
  - [#357](https://github.com/SciLifeLab/Sarek/issues/357) - `ASCAT` works for GRCh38
  - [#471](https://github.com/SciLifeLab/Sarek/issues/471) - Running `Singularity` on `/scratch`
+ - [#475](https://github.com/SciLifeLab/Sarek/issues/475) - 16 cpus for local executor
  - [#480](https://github.com/SciLifeLab/Sarek/issues/480) - No `tsv` file needed for step `annotate`

  ## [1.2.2] - 2017-10-06
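The changelog entries for #732 introduce five CADD-related params. As a rough illustration of how they might be supplied on the command line, the sketch below assembles them using Nextflow's `--name value` param convention and only prints the resulting command. `cadd_params` and the `/path/to/CADD` directory are placeholders, not part of this PR; the data file names follow the `--plugin CADD,...` string in `annotate.nf`, the `.tbi` names assume the usual tabix index suffix, and any other options `annotate.nf` requires are left out.

```shell
#!/bin/sh
# Hypothetical helper (not part of the PR): prints the CADD-related params,
# one per line. Param names come from the changelog; the directory layout
# and file locations are assumptions.
cadd_params() {
    dir="$1"
    printf '%s\n' \
        "--cadd_cache" \
        "--cadd_WG_SNVs ${dir}/whole_genome_SNVs.tsv.gz" \
        "--cadd_WG_SNVs_tbi ${dir}/whole_genome_SNVs.tsv.gz.tbi" \
        "--cadd_InDels ${dir}/InDels.tsv.gz" \
        "--cadd_InDels_tbi ${dir}/InDels.tsv.gz.tbi"
}

# Dry run: show the command rather than executing it.
# Word splitting of the unquoted substitution is intended here.
echo nextflow run annotate.nf $(cadd_params "/path/to/CADD")
```

Per the changelog, the same values can instead be fixed in a config file, as was done for `munin.conf`.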
20 changes: 14 additions & 6 deletions annotate.nf
@@ -210,6 +210,12 @@ process RunVEP {
  set annotator, variantCaller, idPatient, file(vcf), file(idx) from vcfForVep
  file dataDir from Channel.value(params.vep_cache ? file(params.vep_cache) : "null")
  val cache_version from Channel.value(params.genomes[params.genome].vepCacheVersion)
+ set file(cadd_WG_SNVs), file(cadd_WG_SNVs_tbi), file(cadd_InDels), file(cadd_InDels_tbi) from Channel.value([
+   params.cadd_WG_SNVs ? file(params.cadd_WG_SNVs) : "null",
+   params.cadd_WG_SNVs_tbi ? file(params.cadd_WG_SNVs_tbi) : "null",
+   params.cadd_InDels ? file(params.cadd_InDels) : "null",
+   params.cadd_InDels_tbi ? file(params.cadd_InDels_tbi) : "null"
+ ])

  output:
  set finalAnnotator, variantCaller, idPatient, file("${vcf.simpleName}_VEP.ann.vcf") into vepVCF
@@ -220,16 +226,17 @@ process RunVEP {
  script:
  finalAnnotator = annotator == "snpEff" ? 'merge' : 'VEP'
  genome = params.genome == 'smallGRCh37' ? 'GRCh37' : params.genome
- cache = (params.vep_cache && params.annotation_cache) ? "--dir_cache \${PWD}/${dataDir}" : "--dir_cache /.vep"
+ dir_cache = (params.vep_cache && params.annotation_cache) ? " \${PWD}/${dataDir}" : "/.vep"
+ cadd = (params.cadd_cache && params.cadd_WG_SNVs && params.cadd_InDels) ? "--plugin CADD,whole_genome_SNVs.tsv.gz,InDels.tsv.gz" : ""
  """
- vep \
+ vep \
    -i ${vcf} \
    -o ${vcf.simpleName}_VEP.ann.vcf \
    --assembly ${genome} \
+   ${cadd} \
    --cache \
-   --cache_version ${cache_version} \
-   ${cache} \
-   --database \
+   --cache_version ${cache_version} \
+   --dir_cache ${dir_cache} \
    --everything \
    --filter_common \
    --fork ${task.cpus} \
@@ -355,7 +362,8 @@ def minimalInformationMessage() {
  if (params.containerPath != "") log.info " ContainerPath: " + params.containerPath
  log.info " Tag : " + params.tag
  log.info "Reference files used:"
- log.info " snpeffDb :\n\t" + params.genomes[params.genome].snpeffDb
+ log.info " snpEff DB :\n\t" + params.genomes[params.genome].snpeffDb
+ log.info " VEP Cache :\n\t" + params.genomes[params.genome].vepCacheVersion
  }

  def nextflowMessage() {
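The two ternaries added to `RunVEP` decide which cache directory and which plugin arguments reach `vep`. The shell sketch below mirrors that selection logic outside Nextflow so it can be checked in isolation. `build_vep_options` is a hypothetical helper, not part of this PR; it treats an empty string as an unset param and assumes the staged `dataDir` carries the basename of `params.vep_cache`.

```shell
#!/bin/sh
# Hypothetical helper (not part of the PR): mirrors the Groovy ternaries in
# RunVEP. Each argument is a string; an empty string stands for an unset param.
#   $1 vep_cache  $2 annotation_cache  $3 cadd_cache
#   $4 cadd_WG_SNVs  $5 cadd_InDels
build_vep_options() {
    vep_cache="$1"; annotation_cache="$2"; cadd_cache="$3"
    cadd_snvs="$4"; cadd_indels="$5"

    # Default to the cache directory inside the container.
    dir_cache="/.vep"
    # Use the staged cache only when both params are set
    # (assumption: the staged directory keeps the basename of vep_cache).
    if [ -n "$vep_cache" ] && [ -n "$annotation_cache" ]; then
        dir_cache="${PWD}/$(basename "$vep_cache")"
    fi

    # The CADD plugin is enabled only when all three params are set.
    cadd=""
    if [ -n "$cadd_cache" ] && [ -n "$cadd_snvs" ] && [ -n "$cadd_indels" ]; then
        cadd="--plugin CADD,whole_genome_SNVs.tsv.gz,InDels.tsv.gz"
    fi

    # Emit the options in the same order as the vep command in the diff.
    printf '%s\n' "${cadd:+$cadd }--cache --dir_cache ${dir_cache}"
}

# No params set: container cache, no plugin.
build_vep_options "" "" "" "" ""
# Everything set: staged cache plus the CADD plugin.
build_vep_options /data/vep true true /data/snvs.tsv.gz /data/indels.tsv.gz
```

Note that the plugin string refers to fixed file names rather than the param values, which matches the diff: the CADD files are staged into the task directory under those names.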