By Steven E Massey
Department of Biology, University of Puerto Rico – Rio Piedras,
San Juan, Puerto Rico 00901
This new paper by @stevenemassey is now available to download here
Version 2 will soon be published on ResearchGate and arXiv
The three introductory threads can be viewed here:
RaTG13 is the coronavirus genome most closely related phylogenetically to SARS-CoV-2;
consequently, understanding its provenance is of key importance to understanding the
origin of the COVID-19 pandemic. The RaTG13 NGS dataset is attributed to a fecal
swab from the intermediate horseshoe bat Rhinolophus affinis. However, sequence
analysis reveals that this attribution is unlikely. Metagenomic analysis using Metaxa2 shows that
only 10.3 % of small subunit (SSU) rRNA sequences in the dataset are bacterial,
inconsistent with fecal samples, which are typically dominated by bacterial sequences.
In addition, the bacterial taxa present in the sample are inconsistent with fecal material.
Assembly of mitochondrial SSU rRNA sequences in the dataset produces a contig
98.7 % identical to R. affinis mitochondrial SSU rRNA, indicating that the sample was
generated from this or a closely related species. 87.5 % of the NGS reads map to the
Rhinolophus ferrumequinum genome, the most closely related bat genome to R. affinis available. In
the annotated genome assembly, 62.2 % of mapped reads map to protein-coding
genes. These results clearly demonstrate that the dataset represents a Rhinolophus sp.
transcriptome, and not a fecal swab sample. Overall, the data show that the RaTG13
dataset was generated by the Wuhan Institute of Virology (WIV) from a transcriptome
derived from a Rhinolophus sp. tissue or cell line, indicating that RaTG13 was in live
culture. This raises the question of whether the WIV was culturing additional unreported
coronaviruses closely related to SARS-CoV-2 prior to the pandemic. The implications
for the origin of the COVID-19 pandemic are discussed.
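
The percentages quoted above are simple ratios, and a minimal Python sketch may help show how figures of this kind are computed. It is not the paper's pipeline. The function names `domain_fractions` and `percent_identity` are hypothetical, the inputs are toy values rather than anything from the RaTG13 dataset, and parsing of Metaxa2 or read-mapper output is deliberately left out. Percent identity is taken here over ungapped alignment columns only, which is one common convention among several.

```python
from collections import Counter


def domain_fractions(assignments):
    """Return the percentage of classified SSU rRNA sequences per label."""
    counts = Counter(assignments)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no classified sequences")
    return {label: 100.0 * n / total for label, n in counts.items()}


def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity over aligned columns where neither sequence has a gap.

    Assumption: gapped columns are excluded from the denominator.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == gap or b == gap:
            continue  # skip columns containing a gap
        compared += 1
        if a.upper() == b.upper():
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


# Toy inputs for illustration only; not taken from the RaTG13 dataset.
toy_labels = ["bacteria"] * 2 + ["eukaryota"] * 5 + ["mitochondria"] * 3
toy_fractions = domain_fractions(toy_labels)
toy_identity = percent_identity("ACGT-ACGTA", "ACGTTACGAA")
```

The same ratio logic applies to the mapping statistics: the share of reads mapping to a reference genome is mapped reads divided by total reads, and the share falling in protein-coding genes is computed relative to the mapped reads only.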