Pangenome Genotyping

  1. Download the decomposed HPRC pangenome VCF from Zenodo (https://zenodo.org/records/6797328):
wget https://zenodo.org/records/6797328/files/cactus_filtered_ids.vcf.gz
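Before moving on, it can help to confirm that the download is intact. The helper below is a minimal sketch, not part of this repository: `check_vcf_gz` is a hypothetical name, and the check relies only on `gzip` and on the VCF convention that the first line starts with `##fileformat=VCF`.

```shell
# check_vcf_gz FILE: hypothetical helper, not part of the repository.
# Returns 0 if FILE is a readable gzip stream whose first line is a VCF header.
check_vcf_gz() {
    # gzip -t tests the compressed stream without writing any output.
    gzip -t "$1" || return 1
    # A valid VCF begins with a "##fileformat=VCF..." line.
    gzip -dc "$1" | head -n 1 | grep -q '^##fileformat=VCF'
}

# Usage (after the wget above has finished):
#   check_vcf_gz cactus_filtered_ids.vcf.gz && echo "download looks fine"
```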
  2. Download and build TheGreatGenotyper, Beagle, bcftools, and AnnotSV.

  3. Split the pangenome into 10 slices for faster computation. The following script splits the variants of each chromosome into 10 chunks and forms each slice from one chunk per chromosome.

./slice_pangenome.sh cactus_filtered_ids.vcf.gz 10 sliced_pangenome_10/
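To make the slicing scheme concrete, here is a minimal sketch of the idea described above. It is not the contents of `slice_pangenome.sh`: the function name `slice_sketch`, the output file names `slice_<i>.vcf`, and the assumption that records are already grouped by chromosome are all illustrative. Slice `i` receives the `i`-th contiguous chunk of every chromosome, and every slice gets a copy of the header.

```shell
# slice_sketch VCF_GZ N OUTDIR: hypothetical helper illustrating the slicing idea.
# The body runs in a subshell so its variables do not leak into the caller.
slice_sketch() (
    vcf_gz="$1"; n="$2"; outdir="$3"
    mkdir -p "$outdir"
    plain="$(mktemp)"
    gzip -dc "$vcf_gz" > "$plain"
    # Pass 1 counts the records of each chromosome.
    # Pass 2 copies header lines to every slice and sends the k-th record of a
    # chromosome (0-based) to slice int(k * n / total), i.e. contiguous chunks.
    awk -v n="$n" -v out="$outdir" '
        NR == FNR { if ($0 !~ /^#/) total[$1]++; next }
        /^#/ { for (i = 0; i < n; i++) print > (out "/slice_" i ".vcf"); next }
        { i = int(seen[$1]++ * n / total[$1]); print > (out "/slice_" i ".vcf") }
    ' "$plain" "$plain"
    rm -f "$plain"
)

# Usage:
#   slice_sketch cactus_filtered_ids.vcf.gz 10 sliced_sketch_10/
```

Because each slice contains a share of every chromosome, the slices end up roughly equal in size and can be processed independently.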
  4. Edit config.yaml to set the input and output locations and the paths to the required programs.
| Field | Description |
| --- | --- |
| INPUT_DIR | Input folder containing the sliced pangenome |
| TEMP_FOLDER | Folder to store temporary files |
| INPUT_REFERENCE | Input genome reference |
| INPUT_INDEX | Text file containing a list of CCDG indexes |
| BEAGLE | Path to the Beagle binary |
| BEAGLE_MAP | Beagle map file. Download here (maps) |
| TheGreatGenotyper | Path to The Great Genotyper binary |
| AnnotSV | Path to the AnnotSV binary |
| bcftools | Path to the bcftools binary |
| OUTPUT_dir | Output directory |
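As a starting point, the fields above could be filled in as follows. This is only a sketch: the field names come from the table, but every `/path/to/...` value is a placeholder, the flat `key: value` layout is an assumption, and the file is written as `config.example.yaml` so that the repository's own `config.yaml` is not overwritten. Only `sliced_pangenome_10/` is taken from the slicing step.

```shell
# Write an example config. Every /path/to/... value is a placeholder to replace.
cat > config.example.yaml <<'EOF'
INPUT_DIR: sliced_pangenome_10/
TEMP_FOLDER: /path/to/tmp/
INPUT_REFERENCE: /path/to/reference.fa
INPUT_INDEX: /path/to/ccdg_indexes.txt
BEAGLE: /path/to/beagle
BEAGLE_MAP: /path/to/beagle_map
TheGreatGenotyper: /path/to/TheGreatGenotyper
AnnotSV: /path/to/AnnotSV
bcftools: /path/to/bcftools
OUTPUT_dir: /path/to/output/
EOF
```

Compare the result against the `config.yaml` shipped with the repository before use, since that file is the authoritative reference for key names and expected value types.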
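  5. Run the workflow. With -np, Snakemake performs a dry run: it prints the planned jobs and their shell commands without executing them.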
snakemake --configfile config.yaml -np
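Since `-n` (`--dry-run`) only plans the jobs, an actual run needs a second invocation without it. The wrapper below is a sketch of one way to chain the two; `run_workflow` is a hypothetical name and the core count is an arbitrary example. `--cores` and `-p` (`--printshellcmds`) are standard Snakemake options.

```shell
# run_workflow CONFIG [CORES]: hypothetical wrapper, not part of the repository.
# Performs a dry run first and executes the workflow only if the dry run succeeds.
run_workflow() {
    config="$1"
    cores="${2:-1}"
    # -n (--dry-run) plans the jobs without running them; -p prints shell commands.
    snakemake --configfile "$config" -np || return 1
    # --cores caps the number of CPU cores Snakemake may use at once.
    snakemake --configfile "$config" --cores "$cores" -p
}

# Usage:
#   run_workflow config.yaml 8
```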

Output

The final output can be downloaded from this link.