Calculating fast differential genome coverages among metagenomic sources using micov.
Journal Article
Abstract
Breadth of coverage, the proportion of a reference genome covered by at least one sequencing read, is critical for interpreting metagenomic data and informs analyses from genome assembly to taxonomic profiling. However, existing tools typically summarize coverage breadth at the whole-genome or aggregate-sample level, missing informative variation along genomes and between sample groups. Here we introduce MIcrobiome COVerage (micov), a tool that computes and compares per-sample breadth of coverage across many genomes and samples. micov offers two key advances: (1) rapid cumulative coverage breadth calculations specific to each sample type, and (2) detection of differential coverage breadth along genomes. Applying micov to three metagenomic datasets, we show that it (i) identifies a genomic region in Prevotella copri that explains variation in community composition independent of host country of origin; (ii) uncovers a dietary association with a partially annotated region in an uncharacterized Lachnospiraceae genome, enabling hypothesis generation for genes of unknown function; and (iii) improves sensitivity in low-biomass settings by detecting a single genomic copy of enteropathogenic Escherichia coli (EPEC) in wastewater and distinguishing Mediterraneibacter gnavus across specimen types.
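To make the definition of breadth of coverage concrete, the following is a minimal Python sketch of how breadth, and a cumulative breadth curve across samples, could be computed from read alignment intervals. It is an illustration of the concept only and does not reproduce micov's implementation or API. The function names (`merge_intervals`, `breadth_of_coverage`, `cumulative_breadth`), the half-open interval convention, and the toy coordinates are all assumptions made for this example.

```python
from typing import Iterable, List, Tuple

# Assumption: each aligned read is a half-open interval [start, end)
# in reference coordinates, already clipped to the genome length.
Interval = Tuple[int, int]


def merge_intervals(intervals: Iterable[Interval]) -> List[Interval]:
    """Merge overlapping or touching intervals into disjoint ones."""
    merged: List[Interval] = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1]:
            # Overlaps or touches the previous interval: extend it.
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


def breadth_of_coverage(intervals: Iterable[Interval], genome_length: int) -> float:
    """Fraction of reference positions covered by at least one interval."""
    if genome_length <= 0:
        raise ValueError("genome_length must be positive")
    covered = sum(end - start for start, end in merge_intervals(intervals))
    return covered / genome_length


def cumulative_breadth(samples: List[List[Interval]], genome_length: int) -> List[float]:
    """Breadth after pooling samples one at a time, in the given order."""
    pooled: List[Interval] = []
    curve: List[float] = []
    for sample in samples:
        pooled = merge_intervals(pooled + sample)
        curve.append(breadth_of_coverage(pooled, genome_length))
    return curve


if __name__ == "__main__":
    # Toy data: three samples aligned to a hypothetical 1,000 bp genome.
    samples = [[(0, 100)], [(50, 150)], [(300, 400)]]
    print(breadth_of_coverage([iv for s in samples for iv in s], 1000))
    print(cumulative_breadth(samples, 1000))
```

In this sketch, overlapping reads are merged before counting so that each reference position contributes at most once, which is what distinguishes breadth from depth of coverage. Computing one cumulative curve per sample group would be one simple way to compare groups, although the order in which samples are pooled affects the shape of the curve.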