1. Ongoing Positive Selection Drives the Evolution of SARS-CoV-2 Genomes
Hou YALI ; Zhao SHILEI ; Liu QI ; Zhang XIAOLONG ; Sha TONG ; Su YANKAI ; Zhao WENMING ; Bao YIMING ; Xue YONGBIAO ; Chen HUA
Genomics, Proteomics & Bioinformatics 2022;(6):1214-1223
SARS-CoV-2 is a new RNA virus affecting humans and has spread extensively throughout the world since its first outbreak in December 2019. Whether the transmissibility and pathogenicity of SARS-CoV-2 in humans after zoonotic transfer are actively evolving, driven by adaptation to the new host and environments, is still under debate. Understanding the evolutionary mechanism underlying the epidemiological and pathological characteristics of COVID-19 is essential for predicting the epidemic trend and for guiding disease control and treatment. Applying novel strategies for identifying natural selection using within-species polymorphisms and 3,674,076 SARS-CoV-2 genome sequences from 169 countries as of December 30, 2021, we demonstrate with population genetic evidence that during the course of the SARS-CoV-2 pandemic in humans, 1) SARS-CoV-2 genomes are overall conserved under purifying selection, especially for the 14 genes related to viral RNA replication, transcription, and assembly; 2) ongoing positive selection is actively driving the evolution of 6 genes (e.g., S, ORF3a, and N) that play critical roles in molecular processes involving pathogen-host interactions, including viral invasion into and egress from host cells, and viral inhibition and evasion of the host immune response, possibly leading to high transmissibility and mild symptoms in SARS-CoV-2 evolution. According to an established haplotype phylogenetic relationship of 138 viral clusters, a spatial and temporal landscape of 556 critical mutations is constructed based on their divergence among viral haplotype clusters or their repeated increase in frequency within at least 2 clusters. Among these, multiple mutations that potentially alter the transmissibility, pathogenicity, and virulence of SARS-CoV-2 are highlighted as warranting attention.
2. Population Genetics of SARS-CoV-2: Disentangling Effects of Sampling Bias and Infection Clusters
Liu QI ; Zhao SHILEI ; Shi CHENG-MIN ; Song SHUHUI ; Zhu SIHUI ; Su YANKAI ; Zhao WENMING ; Li MINGKUN ; Bao YIMING ; Xue YONGBIAO ; Chen HUA
Genomics, Proteomics & Bioinformatics 2020;18(6):640-647
A novel RNA virus, the severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), is responsible for the ongoing outbreak of coronavirus disease 2019 (COVID-19). Population genetic analysis could be useful for investigating the origin and evolutionary dynamics of COVID-19. However, due to extensive sampling bias and the existence of infection clusters during the epidemic spread, direct application of existing approaches can lead to biased parameter estimations and data misinterpretation. In this study, we first present a robust estimator for the time to the most recent common ancestor (TMRCA) and the mutation rate, and then apply the approach to analyze 12,909 genomic sequences of SARS-CoV-2. The mutation rate is inferred to be 8.69 × 10⁻⁴ per site per year with a 95% confidence interval (CI) of [8.61 × 10⁻⁴, 8.77 × 10⁻⁴], and the TMRCA of the samples is inferred to be Nov 28, 2019 with a 95% CI of [Oct 20, 2019, Dec 9, 2019]. The results indicate that COVID-19 might have originated earlier than, and outside of, the Wuhan Seafood Market. We further demonstrate that genetic polymorphism patterns, including the enrichment of specific haplotypes and the temporal allele frequency trajectories generated from infection clusters, are similar to those caused by evolutionary forces such as natural selection. Our results show that population genetic methods need to be developed to efficiently disentangle the effects of sampling bias and infection clusters in order to gain insights into the evolutionary mechanism of SARS-CoV-2. Software implementing VirusMuT can be downloaded at https://bigd.big.ac.cn/biocode/tools/BT007081.