Standard

Acceleration Of Recombinant Viral Sequences Search By 3SEQ Algorithm Via Adding Support Of Multi-Threaded Calculations And Considering Sample Collection Dates. / Devyaterikov, A. P.; Palyanov, A. Y.

In: Mathematical Biology and Bioinformatics, Vol. 19, No. 2, 01.2024, p. 338-353.

Research output: Contribution to journal › Article › peer-review

BibTeX

@article{3d96733fb3664827ae95b3ab5f8c0477,
title = "Acceleration Of Recombinant Viral Sequences Search By 3SEQ Algorithm Via Adding Support Of Multi-Threaded Calculations And Considering Sample Collection Dates",
abstract = "The article presents an efficient multithreaded implementation of the modern 3SEQ algorithm for detecting recombinant genetic sequences, tested on viral genomes. The work was carried out within the framework of the project to create a domestic (Russian) web platform (bioprojects.iis.nsk.su) for solving a wide range of data analysis problems in bioinformatics, virology and epidemiology. A recombinant viral genome emerges when two different variants of virus genomes of the same species exchange parts, which is possible in the case of simultaneous infection with both variants. The emergence of a recombinant is a rare but important event in the context of virus evolution research. 3SEQ is one of the most promising existing algorithms for finding recombinants, but the author{\textquoteright}s version works only in single-threaded mode. We implemented this algorithm with support for multithreaded computing and for taking sample collection dates into account, which provided a significant increase in computing speed. The developed software was used to search for recombinants in samples of influenza A H1N1 (only PB2 segments from 2174 genomes were analyzed), Dengue fever (726 genomes), Ebola virus (865 genomes) and in two samples of SARS-CoV-2 coronavirus (776 and 2132 genomes). No recombinants were found for influenza A H1N1 (PB2 segment) or for the first SARS-CoV-2 dataset (variant from Russia), which agrees with the analysis of the same data by the RDP algorithm. For the second SARS-CoV-2 dataset (variants from the Siberian Federal District), the only recombinant present in the dataset was correctly found. In Dengue fever viruses, 725 recombinants were found, with recombination region lengths ranging from 50 to 1000 nucleotides. In Ebola viruses, the recombination region was shorter: in 572 recombinants it ranged from 50 to 100 nucleotides, and in 249 genomes it was less than 50 nucleotides.",
keywords = "3SEQ algorithm, acceleration, bioinformatics, computational performance, multithreaded, recombinants detection, software, virology, алгоритм 3SEQ, биоинформатика, вирусология, многопоточность, поиск рекомбинантов, программа, ускорение вычислений",
author = "Devyaterikov, {A. P.} and Palyanov, {A. Y.}",
year = "2024",
month = jan,
doi = "10.17537/2024.19.338",
language = "English",
volume = "19",
pages = "338--353",
journal = "Mathematical Biology and Bioinformatics",
issn = "1994-6538",
publisher = "Institute of Mathematical Problems of Biology",
number = "2",

}

RIS

TY - JOUR

T1 - Acceleration Of Recombinant Viral Sequences Search By 3SEQ Algorithm Via Adding Support Of Multi-Threaded Calculations And Considering Sample Collection Dates

AU - Devyaterikov, A. P.

AU - Palyanov, A. Y.

PY - 2024/1

Y1 - 2024/1

AB - The article presents an efficient multithreaded implementation of the modern 3SEQ algorithm for detecting recombinant genetic sequences, tested on viral genomes. The work was carried out within the framework of the project to create a domestic (Russian) web platform (bioprojects.iis.nsk.su) for solving a wide range of data analysis problems in bioinformatics, virology and epidemiology. A recombinant viral genome emerges when two different variants of virus genomes of the same species exchange parts, which is possible in the case of simultaneous infection with both variants. The emergence of a recombinant is a rare but important event in the context of virus evolution research. 3SEQ is one of the most promising existing algorithms for finding recombinants, but the author’s version works only in single-threaded mode. We implemented this algorithm with support for multithreaded computing and for taking sample collection dates into account, which provided a significant increase in computing speed. The developed software was used to search for recombinants in samples of influenza A H1N1 (only PB2 segments from 2174 genomes were analyzed), Dengue fever (726 genomes), Ebola virus (865 genomes) and in two samples of SARS-CoV-2 coronavirus (776 and 2132 genomes). No recombinants were found for influenza A H1N1 (PB2 segment) or for the first SARS-CoV-2 dataset (variant from Russia), which agrees with the analysis of the same data by the RDP algorithm. For the second SARS-CoV-2 dataset (variants from the Siberian Federal District), the only recombinant present in the dataset was correctly found. In Dengue fever viruses, 725 recombinants were found, with recombination region lengths ranging from 50 to 1000 nucleotides. In Ebola viruses, the recombination region was shorter: in 572 recombinants it ranged from 50 to 100 nucleotides, and in 249 genomes it was less than 50 nucleotides.

KW - 3SEQ algorithm

KW - acceleration

KW - bioinformatics

KW - computational performance

KW - multithreaded

KW - recombinants detection

KW - software

KW - virology

KW - алгоритм 3SEQ

KW - биоинформатика

KW - вирусология

KW - многопоточность

KW - поиск рекомбинантов

KW - программа

KW - ускорение вычислений

UR - https://www.scopus.com/record/display.uri?eid=2-s2.0-85211229505&origin=inward&txGid=0ce9100fb4ebbab62dabcfa7b913f680

UR - https://www.mendeley.com/catalogue/9429fb5d-3f20-3ab3-8fa3-a8094630619a/

U2 - 10.17537/2024.19.338

DO - 10.17537/2024.19.338

M3 - Article

VL - 19

SP - 338

EP - 353

JO - Mathematical Biology and Bioinformatics

JF - Mathematical Biology and Bioinformatics

SN - 1994-6538

IS - 2

ER -

ID: 61295730