Materials and Methods
The Global Initiative on Sharing Avian Influenza Data (GISAID) was
founded in 2006, and, since 2010, has been hosted by the German Federal
Ministry of Food, Agriculture and Consumer Protection. GISAID has also
become a coronavirus repository since December 2019. As of 13 May 2020,
the cutoff point for our phylogenetic analysis, the GISAID database
(https://www.gisaid.org/) had compiled 16,667 coronavirus full
genomes, isolated from humans, Chinese pangolins, and the bat Rhinolophus affinis. Among all deposited genome sequences,
1,485 were from Asian countries. Although SARS-CoV-2 is an RNA virus, the
deposited sequences, by convention, are in DNA format. We discarded
partial sequences and used only the most complete genomes, which we
aligned to the full reference genome (NC_045512.2) of Wu et al. (2020)
[5], comprising 29,903 nucleotides and retrieved from NCBI
(https://www.ncbi.nlm.nih.gov/nuccore). Finally, to ensure
comparability, we truncated the flanks of all sequences to the consensus
range 56 to 29,797, with nucleotide position numbering according to the
Wuhan 1 reference sequence [5]. To analyze the first Bangladeshi
SARS-CoV-2 genome (GISAID accession ID: EPI_ISL_437912), which was
obtained from an infected 22-year-old female patient and submitted by
the Child Health Research Lab, Bangladesh, in a phylogenetic context, we
retrieved a dataset of 32 available complete SARS-CoV-2 genomes from
different Asian countries, followed by a few countries on other
continents, from GISAID (https://www.gisaid.org/, last accessed
12 May 2020). At least one sequence was taken from every Asian country
that had submitted a SARS-CoV-2 genome to the GISAID database, to give a
preliminary picture of the SARS-CoV-2 strains circulating in Asia in
comparison with the newly sequenced genome from Bangladesh.
Sequence alignment was performed using Multiple Sequence Comparison by
Log-Expectation (MUSCLE) software
(http://www.clustal.org) [6].
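
The flank truncation described above (reference positions 56 to 29,797) can be illustrated with a minimal sketch. It assumes the alignment is held as a dictionary of equal-length gapped strings that includes the reference. The function names (`reference_columns`, `trim_alignment`) and the toy sequences are hypothetical and are not part of the pipeline actually used in this study.

```python
def reference_columns(aligned_ref, start, end, gap="-"):
    """Map 1-based, inclusive reference positions start..end to a
    half-open slice (lo, hi) of alignment columns."""
    pos = 0
    first = last = None
    for col, base in enumerate(aligned_ref):
        if base == gap:
            continue  # gap in the reference: no reference coordinate here
        pos += 1
        if pos == start:
            first = col
        if pos == end:
            last = col
            break
    if first is None or last is None:
        raise ValueError("reference does not cover the requested range")
    return first, last + 1


def trim_alignment(alignment, ref_id, start=56, end=29797):
    """Cut every aligned sequence to the columns spanning the given
    reference range (defaults follow the range stated in the text)."""
    lo, hi = reference_columns(alignment[ref_id], start, end)
    return {name: seq[lo:hi] for name, seq in alignment.items()}


# Toy alignment (made-up sequences, for illustration only).
toy = {
    "ref":    "AC-GTTGCA",
    "sample": "ACAGT-GCA",
}
trimmed = trim_alignment(toy, "ref", start=2, end=5)
print(trimmed)
```

Working in alignment columns rather than raw string indices keeps the cut consistent across sequences even when insertions relative to the reference introduce gaps.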
Estimation of the best-fitting substitution model (Hasegawa, Kishino,
and Yano; HKY model) and inference of the phylogenetic tree by the
neighbor-joining method were conducted using Molecular Evolutionary
Genetics Analysis across Computing Platforms (MEGA 7;
https://www.megasoftware.net/) [7]. Support for the tree
topology was estimated with 1,000 bootstrap replicates. Using the
alignment, the single nucleotide polymorphism (SNP) composition and
the potentially resulting amino acid changes in the derived protein
sequences, relative to the Wuhan reference sequence (NC_045512), were
further investigated together with six other genome sequences from Asia
and Europe (EPI_ISL_430111, EPI_ISL_437762, EPI_ISL_412974,
EPI_ISL_417444, EPI_ISL_427813, EPI_ISL_437438) that either clustered
or did not cluster with the sequence of the patient in Bangladesh. For
mutation type analysis, MEGA 7 and the Datamonkey.org web server were
used [8]. For analysis of the novel mutation, NCBI BLAST was used
(https://blast.ncbi.nlm.nih.gov/Blast.cgi). Protein structures
were predicted using Phyre2 (Protein homology/analogy recognition engine
v2.0) [9] and I-TASSER (Iterative threading assembly refinement)
[10]. Templates with the highest confidence were used to generate
the model in each case. For Phyre2, the intensive modelling mode was
used. Generated PDB files were analysed and aligned using PyMOL v2.3.2.
Images were processed in Adobe Illustrator CS6. Secondary structures
were predicted using PSIPRED, as included in Phyre2.
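
The SNP comparison against the Wuhan reference described above can be sketched as follows. This is an illustration only, not the MEGA 7 or Datamonkey procedure: it assumes a pairwise alignment of the reference and one sample as gapped strings and uses the standard genetic code to label a codon change as synonymous or non-synonymous. The helper names (`find_snps`, `classify`) and the toy sequences are hypothetical. If the alignment has been truncated to begin at reference position 56, `ref_offset=55` would restore reference numbering.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def find_snps(aligned_ref, aligned_sample, ref_offset=0):
    """List (reference_position, ref_base, sample_base) for every
    single-base substitution; gaps and ambiguous bases are skipped."""
    snps = []
    pos = ref_offset
    for r, s in zip(aligned_ref.upper(), aligned_sample.upper()):
        if r == "-":
            continue  # insertion relative to the reference
        pos += 1
        if r in "ACGT" and s in "ACGT" and r != s:
            snps.append((pos, r, s))
    return snps


def classify(ref_codon, alt_codon):
    """Translate both codons and report whether the amino acid changes."""
    ref_aa, alt_aa = CODON_TABLE[ref_codon], CODON_TABLE[alt_codon]
    kind = "synonymous" if ref_aa == alt_aa else "non-synonymous"
    return ref_aa, alt_aa, kind


# Toy in-frame example (made-up sequences, not SARS-CoV-2 data).
ref = "ATGGATTTC"
sample = "ATGGGTTTT"
print(find_snps(ref, sample))
print(classify("GAT", "GGT"))
print(classify("TTC", "TTT"))
```

Classifying a SNP in a real genome additionally requires the reading frame of the gene containing it, which this sketch leaves to the caller.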