Configuration

Create a new directory for your data.
```
mkdir -p data/custom
```
Copy over your unaligned sequences.fasta and metadata.tsv to data/custom.
- Note: GISAID sequences and metadata can be downloaded using the “Input for the Augur pipeline” option on https://gisaid.org/.
- metadata.tsv MUST have at minimum the columns strain, date, country.
  If collection dates or country are unknown, these fields can be left empty or filled with “NA”.
- The first column MUST be strain.
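
The column requirements above can be checked by hand before creating a profile. The following is a minimal sketch, not part of ncov-recombinant: the function name `check_metadata_header` is hypothetical, and it assumes the metadata file is tab-separated with a single header line.

```bash
# Hypothetical helper (not shipped with ncov-recombinant):
# validate the header line of a tab-separated metadata file.
check_metadata_header() {
  metadata_file="$1"
  # Read the header line and strip a trailing carriage return, if any
  header="$(head -n 1 "$metadata_file" | tr -d '\r')"
  # The first tab-separated field must be "strain"
  first_col="$(printf '%s\n' "$header" | cut -f 1)"
  if [ "$first_col" != "strain" ]; then
    echo "ERROR: first column is '$first_col', expected 'strain'" >&2
    return 1
  fi
  # Each required column must appear as an exact field name
  for col in strain date country; do
    if ! printf '%s\n' "$header" | tr '\t' '\n' | grep -qx "$col"; then
      echo "ERROR: missing required column '$col'" >&2
      return 1
    fi
  done
  echo "OK: required columns found"
}
```

After defining the function, it could be run as `check_metadata_header data/custom/metadata.tsv`. It only inspects the header, so empty or “NA” values in the date and country fields are not flagged.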

Create a profile for your custom build.

```
scripts/create_profile.sh --data data/custom
```

```
2022-06-17 09:15:06     Searching for metadata (data/custom/metadata.tsv)
2022-06-17 09:15:06     SUCCESS: metadata found
2022-06-17 09:15:06     Checking for 3 required metadata columns (strain date country)
2022-06-17 09:15:06     SUCCESS: 3 columns found.
2022-06-17 09:15:06     Searching for sequences (data/custom/sequences.fasta)
2022-06-17 09:15:06     SUCCESS: Sequences found
2022-06-17 09:15:06     Checking that the metadata strains match the sequence names
2022-06-17 09:15:06     SUCCESS: Strain column matches sequence names
2022-06-17 09:15:06     Creating new profile directory (my_profiles/custom)
2022-06-17 09:15:06     Creating build file (my_profiles/custom/builds.yaml)
2022-06-17 09:15:06     Adding default input data (defaults/inputs.yaml)
2022-06-17 09:15:06     Adding custom input data (data/custom)
2022-06-17 09:15:06     Adding `custom` as a build
2022-06-17 09:15:06     Creating system configuration (my_profiles/custom/config.yaml)
2022-06-17 09:15:06     Adding default system resources
2022-06-17 09:15:06     Done! The custom profile is ready to be run with:

                        snakemake --profile my_profiles/custom
```

Note: You can add the --controls parameter to also add the controls build, which will run in parallel.
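
If the profile script reports that the metadata strains do not match the sequence names, it can help to list the mismatches directly. The sketch below is an illustration and not the script's own implementation: `compare_strains` is a hypothetical name, and it assumes each FASTA header line consists only of the sequence name and that strain is the first metadata column.

```bash
# Hypothetical helper: list names that appear in only one of the two files.
compare_strains() {
  metadata_file="$1"
  sequences_file="$2"
  work_dir="$(mktemp -d)"
  # Strain names: first column of the metadata, skipping the header row
  tail -n +2 "$metadata_file" | cut -f 1 | sort -u > "$work_dir/meta.txt"
  # Sequence names: FASTA header lines with the leading ">" removed
  grep '^>' "$sequences_file" | sed 's/^>//' | sort -u > "$work_dir/seq.txt"
  # comm -3 keeps only lines unique to one file:
  # metadata-only names are unindented, sequence-only names are tab-indented
  mismatches="$(comm -3 "$work_dir/meta.txt" "$work_dir/seq.txt")"
  rm -rf "$work_dir"
  if [ -n "$mismatches" ]; then
    printf '%s\n' "$mismatches"
    return 1
  fi
  echo "OK: strain column matches sequence names"
}
```

A possible invocation is `compare_strains data/custom/metadata.tsv data/custom/sequences.fasta`. A non-zero exit status means at least one name is missing from one of the files.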

Edit my_profiles/custom/config.yaml so that jobs and default-resources match your system.

Note: For HPC environments, see the High Performance Computing section.

```
#------------------------------------------------------------------------------#
# System config
#------------------------------------------------------------------------------#

# Maximum number of jobs to run simultaneously
jobs : 1

# Default resources for a SINGLE JOB
default-resources:
- cpus=1
- mem_mb=4000
- time_min=60
```
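
When choosing a value for jobs, one rough heuristic is to divide the machine's total CPUs and memory by the per-job resources and take the smaller result. This is an assumption for illustration, not something the pipeline prescribes, and the helper name `suggest_jobs` is hypothetical.

```bash
# Hypothetical helper: largest job count that fits both the CPU and memory budgets.
# Arguments: total CPUs, total memory (MB), CPUs per job, memory per job (MB).
suggest_jobs() {
  total_cpus="$1"
  total_mem_mb="$2"
  cpus_per_job="$3"
  mem_mb_per_job="$4"
  # Integer division: how many jobs each resource can support on its own
  by_cpu=$(( total_cpus / cpus_per_job ))
  by_mem=$(( total_mem_mb / mem_mb_per_job ))
  # The scarcer resource sets the limit
  if [ "$by_cpu" -lt "$by_mem" ]; then
    n_jobs="$by_cpu"
  else
    n_jobs="$by_mem"
  fi
  # Always allow at least one job
  if [ "$n_jobs" -lt 1 ]; then
    n_jobs=1
  fi
  echo "$n_jobs"
}
```

For example, on a machine with 8 CPUs and 16000 MB of memory, `suggest_jobs 8 16000 1 4000` prints 4 with the default per-job resources shown above, because memory is the limiting resource.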

Do a “dry run” to confirm setup.

```
snakemake --profile my_profiles/custom --dry-run
```

Run your custom profile.
```
snakemake --profile my_profiles/custom
```

Important: If you are doing routine production analyses, it is recommended to first delete all previous output before running your profile. This will force ncov-recombinant to download fresh copies of the pango-designation issues (resources/issues.tsv) and the lineage phylogeny (resources/tree.nwk).

```
snakemake --profile my_profiles/custom --delete-all-output
snakemake --profile my_profiles/custom
```