A tale of caution: How endogenous viral elements affect virus discovery in transcriptomic data - CIRAD - Centre de coopération internationale en recherche agronomique pour le développement
Journal article, Virus Evolution, Year: 2023

A tale of caution: How endogenous viral elements affect virus discovery in transcriptomic data

Abstract

Large-scale metagenomic and metatranscriptomic studies have revolutionized our understanding of viral diversity and abundance. In contrast, endogenous viral elements (EVEs), remnants of viral sequences integrated into host genomes, have received limited attention in the context of virus discovery, especially in RNA-Seq data. EVEs resemble the viruses from which they derive, which makes it difficult to distinguish active infections from integrated remnants, affects virus classification, and biases downstream analyses. Here, we systematically assess the effects of EVEs on a prototypical virus discovery pipeline, evaluate their impact on data integrity and classification accuracy, and provide recommendations for better practices. We examined EVEs and exogenous viral sequences linked to Orthomyxoviridae, a diverse family of negative-sense segmented RNA viruses, in 13 genomic and 538 transcriptomic datasets of Culicinae mosquitoes. Our analysis revealed a substantial number of viral sequences in transcriptomic datasets. However, a significant portion appeared to be transcripts derived from EVEs rather than exogenous viruses. Distinguishing between transcribed EVEs and exogenous virus sequences was especially difficult in samples with low viral abundance. For example, three transcribed EVEs showed full-length segments, devoid of frameshift and nonsense mutations, with mean read depths sufficient to qualify them as exogenous virus hits. Mapping reads to a host genome containing EVEs before assembly somewhat alleviated the EVE burden, but it drastically reduced the number of viral hits and lowered assembly quality, especially in regions of the viral genome relatively similar to EVEs. Our study highlights that our knowledge of the genetic diversity of viruses can be altered by the underestimated presence of EVEs in transcriptomic datasets, leading to false positives and altered or missing sequence information. Thus, recognizing and addressing the influence of EVEs in virus discovery pipelines will be key to enhancing our ability to capture the full spectrum of viral diversity.
Main file
vead088.pdf (11.89 MB)
Origin: Publisher files allowed on an open archive
License: CC BY-NC (Attribution, NonCommercial)

Dates and versions

hal-04548470, version 1 (16-04-2024)

Cite

Nadja Brait, Thomas Hackl, Côme Morel, Antoni Exbrayat, Serafin Gutierrez, et al. A tale of caution: How endogenous viral elements affect virus discovery in transcriptomic data. Virus Evolution, 2023, 10 (1), pp.vead088. ⟨10.1093/ve/vead088⟩. ⟨hal-04548470⟩