2.6 Quality control and read mapping
Sequencing reads were quality-trimmed and adapters were removed using Trimmomatic 0.39 (sliding window 5:20) (Bolger, Lohse & Usadel, 2014). The quality of the trimmed reads was evaluated using FastQC (Simon, 2010). Genomes had to meet the following quality criteria to be included in the study: GC content of ~65% without multiple or anomalous peaks, coverage of at least 15X, median read length [...]. Reads were mapped against Mycobacterium tuberculosis H37Rv using Burrows-Wheeler Aligner 0.7.17 (BWA-MEM) (Li, 2010) to identify RDs (regions of difference) and to confirm that each genome had a sequencing coverage of at least 95%. The resulting BAM files were evaluated for the absence of RD4 (M. bovis-specific) and the presence of RD1 (to rule out BCG strains) using a previously described algorithm (Zimpel, 2020). Quality-approved reads were then mapped against M. bovis AF2122/97 using BWA-MEM, and Picard v2.18.23 (https://github.com/broadinstitute/picard) was used to remove duplicates from the resulting files.
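The coverage and RD filters described above can be illustrated with a minimal Python sketch. It is not the pipeline used in the study and rests on several assumptions: "15X" is read as mean depth, "95%" as the fraction of reference positions covered by at least one read, and per-position depths are supplied as a plain list (for example, parsed from the output of `samtools depth -a`). The names `coverage_stats` and `passes_filters` are hypothetical, and the RD4/RD1 calls are taken as precomputed booleans rather than reproducing the algorithm of Zimpel (2020). The GC-content and read-length criteria are not modelled.

```python
from dataclasses import dataclass
from typing import Sequence

# Thresholds taken from the text; how they are measured is an assumption.
MIN_MEAN_DEPTH = 15.0  # "coverage of at least 15X", read here as mean depth
MIN_BREADTH = 0.95     # "sequencing coverage of at least 95%", read here as breadth


@dataclass
class CoverageStats:
    mean_depth: float  # average number of reads per reference position
    breadth: float     # fraction of reference positions with depth >= min_depth


def coverage_stats(depths: Sequence[int], min_depth: int = 1) -> CoverageStats:
    """Summarise per-position depths (one value per reference position)."""
    if not depths:
        raise ValueError("depths must not be empty")
    covered = sum(1 for d in depths if d >= min_depth)
    return CoverageStats(
        mean_depth=sum(depths) / len(depths),
        breadth=covered / len(depths),
    )


def passes_filters(stats: CoverageStats, rd4_present: bool, rd1_present: bool) -> bool:
    """Apply the depth, breadth, and RD criteria (hypothetical helper).

    RD4 must be absent (M. bovis-specific deletion) and RD1 must be
    present (to rule out BCG strains).
    """
    return (
        stats.mean_depth >= MIN_MEAN_DEPTH
        and stats.breadth >= MIN_BREADTH
        and not rd4_present
        and rd1_present
    )


if __name__ == "__main__":
    # Toy example: 100 reference positions, 5 of them uncovered.
    toy_depths = [20] * 95 + [0] * 5
    stats = coverage_stats(toy_depths)
    print(stats, passes_filters(stats, rd4_present=False, rd1_present=True))
```

Keeping the RD calls as inputs separates the coverage arithmetic from the RD detection step, so either part can be replaced without touching the other.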