Research output: Contribution to journal › Article › peer-review
On the space of SARS-CoV-2 genetic sequence variants. / Palyanov, A Yu; Palyanova, N V.
In: Vavilovskii Zhurnal Genetiki i Selektsii, Vol. 27, No. 7, 12.2023, p. 839-850.
TY - JOUR
T1 - On the space of SARS-CoV-2 genetic sequence variants
AU - Palyanov, A Yu
AU - Palyanova, N V
N1 - We gratefully acknowledge all data contributors, i.e., the Authors and their Originating laboratories responsible for obtaining the specimens, and their Submitting laboratories for generating the genetic sequence and metadata and sharing them via the GISAID Initiative and the GenBank SARS-CoV-2 Data Hub, on which this research is based. We are also grateful to the Authors of the Nextclade project, which provides online tools for analysis and visualization of genetic data on various viruses. Copyright © AUTHORS. Publication for correction.
PY - 2023/12
Y1 - 2023/12
N2 - The coronavirus pandemic caused by the SARS-CoV-2 virus, which humanity confronted with the latest advances in science, left behind, among other things, extensive genetic data. Since the end of 2019, virus genome samples have been collected daily around the world, making it possible to trace the evolution of the virus in detail from its emergence to the present. Accumulated testing statistics show at least 767.5 million confirmed cases of SARS-CoV-2 infection (9.5 % of the current world population, not counting asymptomatic cases) and more than 15.7 million sequenced virus genomes (over 2 % of the number of confirmed cases). These new data potentially contain information about the mechanisms of variability and spread of the virus, its interaction with the human immune system, the main parameters that characterize how a pandemic develops, and much more. In this article, we analyze the space of possible SARS-CoV-2 genetic sequence variants both from a mathematical point of view and with regard to the biological constraints inherent in this system, known from general biological knowledge and from the characteristics of this particular virus. We have developed software that loads and analyzes SARS-CoV-2 nucleotide sequences in FASTA format, determines the 5' and 3' UTR positions and the number and location of unidentified nucleotides ("N"), aligns each sequence with the reference sequence by calling a program designed for this purpose, identifies mutations, deletions and insertions, and calculates various characteristics of virus genomes with a given time step (days, weeks, months, etc.).
The data obtained indicate that, despite the apparent mathematical diversity of ways in which the virus could change over time, the corridor of the evolutionary trajectory that the coronavirus has followed appears to be quite narrow. It can therefore be assumed that this trajectory is determined to some extent, which gives hope that the evolution of the coronavirus can be modeled.
UR - https://www.scopus.com/record/display.uri?eid=2-s2.0-85181525801&origin=inward&txGid=5e6a95eac4bb41c62e9e5f13f9e3f21e
U2 - 10.18699/VJGB-23-97
DO - 10.18699/VJGB-23-97
M3 - Article
C2 - 38213712
VL - 27
SP - 839
EP - 850
JO - Вавиловский журнал генетики и селекции
JF - Вавиловский журнал генетики и селекции
SN - 2500-0462
IS - 7
ER -
ID: 59525973
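The abstract mentions software that loads nucleotide sequences in FASTA format and determines the number and location of unidentified nucleotides ("N"). The following is a minimal sketch of that one step only, written for illustration; it is not the authors' software, and the function names parse_fasta and n_runs as well as the toy sequence are assumptions introduced here.

```python
from typing import Dict, Iterable, List, Tuple


def parse_fasta(lines: Iterable[str]) -> Dict[str, str]:
    """Parse FASTA text lines into {header: sequence} (hypothetical helper).

    Sequences are joined across lines and upper-cased; blank lines are skipped.
    """
    records: Dict[str, List[str]] = {}
    header = None
    for raw in lines:
        line = raw.strip()
        if not line:
            continue
        if line.startswith(">"):
            header = line[1:]
            records[header] = []
        elif header is not None:
            records[header].append(line.upper())
    return {name: "".join(parts) for name, parts in records.items()}


def n_runs(seq: str) -> List[Tuple[int, int]]:
    """Return (start, length) for each maximal run of 'N' (0-based start)."""
    runs: List[Tuple[int, int]] = []
    start = None
    for i, base in enumerate(seq):
        if base == "N":
            if start is None:
                start = i  # a new run of unidentified nucleotides begins
        elif start is not None:
            runs.append((start, i - start))
            start = None
    if start is not None:  # run extends to the end of the sequence
        runs.append((start, len(seq) - start))
    return runs


if __name__ == "__main__":
    # Toy input, not real SARS-CoV-2 data.
    example = [">toy_sample", "ACGTNNNACG", "TTNACGT"]
    for name, seq in parse_fasta(example).items():
        runs = n_runs(seq)
        total_n = sum(length for _, length in runs)
        print(name, len(seq), total_n, runs)
```

Reporting runs rather than single positions keeps the output compact for genomes with long unsequenced stretches; alignment to the reference and mutation calling, which the abstract delegates to a separate program, are deliberately left out of this sketch.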