While the ViroMatch code base is hosted at https://github.com/twylie/viromatch, we only support using the official Docker image. Instructions follow for installing and running ViroMatch using Docker.
Gather your input sequences for processing. You may provide either (1) a single unmapped BAM (uBAM) file or (2) paired FASTQ files. If you provide paired FASTQ files, list the R1 file first, followed by the R2 file, separated by a space.
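Before launching a long run, it can help to confirm that the two FASTQ files really are a matched pair. The following is a minimal sketch, not part of ViroMatch itself; it assumes uncompressed FASTQ files with exactly four lines per record, and the function names `fastq_records` and `check_fastq_pair` are hypothetical helpers introduced here for illustration.

```shell
# Count FASTQ records, assuming uncompressed files with four lines per record.
fastq_records() {
    lines=$(wc -l < "$1")
    echo $((lines / 4))
}

# Succeed only if both mate files hold the same number of records.
check_fastq_pair() {
    r1=$(fastq_records "$1")
    r2=$(fastq_records "$2")
    if [ "$r1" -ne "$r2" ]; then
        echo "Record count mismatch: R1=$r1 R2=$r2" >&2
        return 1
    fi
    echo "OK: $r1 read pairs"
}

# Example (file names are placeholders for your own inputs):
#   check_fastq_pair sample.r1.fastq sample.r2.fastq
```

A matching record count does not prove the reads are in the same order, but a mismatch is a quick signal that one of the files is truncated or belongs to a different sample.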
The ViroMatch pipeline, and all of its underlying code dependencies, are provided as a Docker image. Once you’ve installed Docker Desktop on your computer and have downloaded the ViroMatch Docker image, you will be able to run the pipeline containers.
Download Docker
Once you’ve installed Docker Desktop (and the Docker service is running on your system), you will be able to download and run the ViroMatch Docker image. Specifically, you want the viromatch:latest image.
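Downloading the image might look like the sketch below. The repository name `twylie/viromatch` is taken from the run command later in this document; the wrapper function `pull_viromatch` is a hypothetical helper introduced here, and the sketch assumes the `docker` client is on your PATH.

```shell
# Image name as it appears in the run command in this document.
IMAGE="twylie/viromatch:latest"

# Pull the image, then confirm it is listed among the local images.
pull_viromatch() {
    docker pull "$IMAGE" && docker image ls "$IMAGE"
}

# Call the function once the Docker service is running:
#   pull_viromatch
```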
ViroMatch requires several specific databases (--host, --viralfna, --viralfaa, --nt, --nr) for processing. These databases have been pre-compiled and are available for download.
Be aware the databases are quite large in cumulative size (~860 GB).
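Given that size, it is worth checking free disk space before starting the transfer. The sketch below is one way to do that with the standard `df` utility; the function name `has_free_gb`, the example directory, and the 900 GB threshold (the stated size plus some margin) are assumptions made for illustration.

```shell
# Succeed if the filesystem holding DIR has at least NEED_GB gigabytes free.
# "df -Pk" prints POSIX-format output in 1024-byte blocks; column 4 is "Available".
has_free_gb() {
    dir=$1
    need_gb=$2
    avail_kb=$(df -Pk "$dir" | awk 'NR == 2 { print $4 }')
    [ "$avail_kb" -ge $((need_gb * 1024 * 1024)) ]
}

# Example with a placeholder download directory:
#   has_free_gb /path/to/databases 900 || echo "Not enough free space" >&2
```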
The databases are hosted through Globus Connect. Globus Connect is free to install and use for users at non-profit research and education institutions. You will need to log in to Globus to access the databases.
Download ViroMatch Databases Using Globus
ViroMatch runs from the command line via a Docker container. From a terminal, type the following to start the container and run ViroMatch. Adjust the arguments to reflect your own file paths and settings; the host side of each -v mount (the part before the colon) should be an absolute path on your system.
docker \
container run \
-itd \
-v devViroMatchSMK/t/data:/data \
-v /tmp/myTest:/outdir \
-v devViroMatchSMK/t/ncbi/nt:/nt \
-v devViroMatchSMK/t/ncbi/nr:/nr \
-v devViroMatchSMK/t/viral/viralfna:/viralfna \
-v devViroMatchSMK/t/viral/viralfaa:/viralfaa \
-v devViroMatchSMK/t/host:/host \
-v devViroMatchSMK/t/adaptor:/adaptor \
-v devViroMatchSMK/t/taxonomy:/taxonomy \
twylie/viromatch:latest \
viromatch \
--sampleid 'Sample 1' \
--input /data/test.r1.fastq /data/test.r2.fastq \
--outdir /outdir/myTest \
--nt /nt/nt.fofn \
--nr /nr/nr.fofn \
--viralfna /viralfna/viral_genomes.fasta \
--viralfaa /viralfaa/viral_genomes.dmnd \
--host /host/human.fna \
--adaptor /adaptor/adaptor.fqtrim \
--taxid /taxonomy/taxonomy.tsv
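Docker treats the host side of a -v mount as a directory only when it is an absolute path, so relative directories need to be resolved before they are passed in. The helper below is a small sketch of one way to do that; `abs_mount` is a hypothetical name introduced here, not part of Docker or ViroMatch.

```shell
# Print a bind-mount value ("HOST:CONTAINER") with the host directory
# resolved to an absolute path. Fails if the host directory does not exist.
abs_mount() {
    host_dir=$(cd -- "$1" && pwd) || return 1
    echo "$host_dir:$2"
}

# Example, using a directory from the command above:
#   docker container run -itd \
#       -v "$(abs_mount devViroMatchSMK/t/data /data)" \
#       ...
#
# Because of the -d flag, the container runs in the background and Docker
# prints its ID. To follow progress, pass that ID to:
#   docker container logs -f <container-id>
```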
Upon completion, ViroMatch will provide reports detailing viral taxonomic classification and quantification. All report files are at the top level of the --outdir directory provided in the execution command.
Results:
REPORT.nuc_ambiguous_counts.txt
REPORT.nuc_counts.txt
REPORT.trans_nuc_ambiguous_counts.txt
REPORT.trans_nuc_counts.txt
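A quick way to confirm that a run produced all four reports is sketched below. The function name `check_reports` is a hypothetical helper; the report names are the ones listed above. Note that with the example mounts shown earlier (/tmp/myTest mounted at /outdir, and --outdir set to /outdir/myTest), the reports should appear under /tmp/myTest/myTest on the host.

```shell
# Report names as listed above.
REPORTS="REPORT.nuc_ambiguous_counts.txt
REPORT.nuc_counts.txt
REPORT.trans_nuc_ambiguous_counts.txt
REPORT.trans_nuc_counts.txt"

# Print the status of each expected report in DIR; fail if any is missing.
check_reports() {
    status=0
    for name in $REPORTS; do
        if [ -f "$1/$name" ]; then
            echo "found    $name"
        else
            echo "missing  $name"
            status=1
        fi
    done
    return $status
}

# Example, using the host path implied by the example command:
#   check_reports /tmp/myTest/myTest
```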