Configuration
Create a new directory for your data.
mkdir -p data/custom
Copy over your unaligned sequences.fasta and metadata.tsv to data/custom.
Note: GISAID sequences and metadata can be downloaded using the “Input for the Augur pipeline” option on https://gisaid.org/.
metadata.tsv MUST have at minimum the columns strain, date, country.
If collection dates or country are unknown, these fields can be left empty or filled with “NA”.
The first column MUST be strain.
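The column requirements above can be checked by hand before creating the profile. The following is a minimal sketch and is not part of ncov-recombinant: the function name check_metadata_header is hypothetical, and it assumes a tab-separated file whose first line is the header.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of ncov-recombinant): verify that a metadata
# TSV has `strain` as its first column and also contains `date` and `country`.
check_metadata_header() {
  local file="$1"
  local header first col
  # Read the header line and strip a trailing carriage return, if any.
  header="$(head -n 1 "$file" | tr -d '\r')"
  # cut splits on tabs by default; field 1 is the first column name.
  first="$(printf '%s\n' "$header" | cut -f 1)"
  if [ "$first" != "strain" ]; then
    echo "ERROR: first column is '$first', expected 'strain'" >&2
    return 1
  fi
  for col in date country; do
    # Put one column name per line and look for an exact whole-line match.
    if ! printf '%s\n' "$header" | tr '\t' '\n' | grep -qx "$col"; then
      echo "ERROR: missing required column '$col'" >&2
      return 1
    fi
  done
  echo "OK: required columns found"
}
```

With the function defined in the current shell, it could be called as check_metadata_header data/custom/metadata.tsv. The create_profile.sh step performs its own checks, so this is only a convenience for catching header problems early.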
Create a profile for your custom build.
scripts/create_profile.sh --data data/custom
2022-06-17 09:15:06 Searching for metadata (data/custom/metadata.tsv)
2022-06-17 09:15:06 SUCCESS: metadata found
2022-06-17 09:15:06 Checking for 3 required metadata columns (strain date country)
2022-06-17 09:15:06 SUCCESS: 3 columns found.
2022-06-17 09:15:06 Searching for sequences (data/custom/sequences.fasta)
2022-06-17 09:15:06 SUCCESS: Sequences found
2022-06-17 09:15:06 Checking that the metadata strains match the sequence names
2022-06-17 09:15:06 SUCCESS: Strain column matches sequence names
2022-06-17 09:15:06 Creating new profile directory (my_profiles/custom)
2022-06-17 09:15:06 Creating build file (my_profiles/custom/builds.yaml)
2022-06-17 09:15:06 Adding default input data (defaults/inputs.yaml)
2022-06-17 09:15:06 Adding custom input data (data/custom)
2022-06-17 09:15:06 Adding `custom` as a build
2022-06-17 09:15:06 Creating system configuration (my_profiles/custom/config.yaml)
2022-06-17 09:15:06 Adding default system resources
2022-06-17 09:15:06 Done! The custom profile is ready to be run with:
snakemake --profile my_profiles/custom
Note: you can add the param --controls to add the controls build that will run in parallel.
Edit my_profiles/custom/config.yaml, so that the jobs and default-resources match your system.
Note: For HPC environments, see the High Performance Computing section.
#------------------------------------------------------------------------------#
# System config
#------------------------------------------------------------------------------#

# Maximum number of jobs to run simultaneously
jobs : 1

# Default resources for a SINGLE JOB
default-resources:
  - cpus=1
  - mem_mb=4000
  - time_min=60
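To pick values that match your system, it can help to look up how many CPUs and how much memory the machine has. The snippet below is a rough sketch, not something the pipeline provides. It assumes getconf or nproc is available, and it reads total memory from /proc/meminfo, which exists on Linux only.

```shell
#!/usr/bin/env bash
# Report the number of logical CPUs currently online.
# getconf works on Linux and macOS; nproc is a Linux fallback.
cores="$(getconf _NPROCESSORS_ONLN 2>/dev/null || nproc)"
echo "Logical CPUs: $cores"

# Report total memory in MB (Linux only: /proc/meminfo lists MemTotal in kB).
if [ -r /proc/meminfo ]; then
  mem_mb="$(awk '/^MemTotal:/ {print int($2 / 1024)}' /proc/meminfo)"
  echo "Total memory (MB): $mem_mb"
fi
```

As a rule of thumb (an assumption, not a documented requirement), keeping jobs multiplied by cpus at or below the reported CPU count, and jobs multiplied by mem_mb below total memory, avoids oversubscribing the machine.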
Do a “dry run” to confirm setup.
snakemake --profile my_profiles/custom --dry-run
Run your custom profile.
snakemake --profile my_profiles/custom
Important: If you are doing routine production analyses, it is recommended to first delete all previous output before running your profile. This will force ncov-recombinant to download fresh copies of the pango-designation issues (resources/issues.tsv) and the lineage phylogeny (resources/tree.nwk).
snakemake --profile my_profiles/custom --delete-all-output
snakemake --profile my_profiles/custom