randombio.com | Science Dies in Unblogginess | Believe All Science | Follow The Science
Wednesday, June 16, 2021 | Science Commentary

The hidden debate about the SARS-CoV-2 furin cleavage site (updated Jun 18 2021)

The most important scientific discussion of our time is going on in obscure scientific journals

If you get all your news from the news media, you might think the world's opinion on the origin of SARS-CoV-2 has suddenly changed. You might even think the science has changed. It has not. An intense debate, largely hidden from the general public, has been going on for months.

Much of it went on in low-profile journals, leaving big publications like Nature in the dust. Nature is trying to catch up by publishing a summary for the general public on its website titled “The COVID lab-leak hypothesis: what scientists do and don't know.” [1] They write:

Scientists don't have enough evidence about the origins of SARS-CoV-2 to rule out the lab-leak hypothesis, or to prove the alternative--that the virus has a natural origin. . . . . . . Many other coronaviruses have furin cleavage sites, such as coronaviruses that cause colds [2].

They cite a non-peer-reviewed article by Lytras et al. that claims that a virus named RmYN02 is more closely related to SARS-CoV-2 than RaTG13 is.[3] Those of us who are better at remembering names of viruses than names of people will remember that RmYN02 was cited by renegade Chinese dissident Yan Limeng [4] as a possible precursor for an engineered SARS-CoV-2. The Lytras et al. article basically says: “Nothing to see here, just horseshoe bats, or something . . . maybe . . .”, and suggests that wildlife sampling is needed to find it. In other words, “More research is needed.”

Just one detail: RmYN02 is not closer than RaTG13 to SARS-CoV-2. This is an indisputable point that's easy to find. So, why does Nature get it wrong?

The furin cleavage site

The answer is that the real debate is taking place not in Nature, Science, or the big virology journals, but in smaller places like Bioessays and Stem Cell Research. When the sequence of SARS-CoV-2 was published in early 2020, it was immediately obvious that the furin cleavage site could not have evolved by itself. It could only have originated by recombination or by engineering.

The furin cleavage site is important because (1) furin is omnipresent in mammalian tissues, (2) cleavage at this site is necessary for activation and entry of SARS-CoV-2 into mammalian cells, and (3) the probability of a 12-nucleotide insert CTCCTCGGCGGG (which codes for a PRRA amino acid sequence) occurring by random mutation that just happens to contain two arginines (abbreviated as RR) is very low. The addition of a furin site increases the tropism of the virus, meaning the virus is now able to infect more types of tissues. From studying the Indian B.1.617 strain, in which the furin site is even bigger, scientists also know that the furin site increases the virus's pathogenicity and/or transmissibility.[13] (Update: transmissibility is increased, pathogenicity seems not to be.)
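For readers who want to check the reading frame themselves, here is a minimal Python sketch that translates the 12-nucleotide insert quoted above in each of its three frames. The codon table is the standard genetic code; the function and variable names are my own illustration and are not taken from any of the cited papers. Reading from the third base of the insert as written gives Pro-Arg-Arg. The final alanine of PRRA needs two more bases from the flanking sequence, which are not quoted here.

```python
from itertools import product

# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

def translate(seq, offset=0):
    """Translate seq starting at offset, ignoring an incomplete trailing codon."""
    seq = seq.upper()
    codons = (seq[i:i + 3] for i in range(offset, len(seq) - 2, 3))
    return "".join(CODON_TABLE[c] for c in codons)

INSERT = "CTCCTCGGCGGG"  # 12-nucleotide insert as quoted in the text

# Translate the insert in all three forward reading frames.
frames = {offset: translate(INSERT, offset) for offset in range(3)}
for offset, peptide in frames.items():
    print(offset, peptide)
```

Only one of the three frames contains the two adjacent arginines, which is why the placement of the insert relative to the codon boundaries matters when people argue about it.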

The origin of the false claim about RmYN02 was a paper by a group of researchers in Shandong, China who reported the discovery of RmYN02 from the bat Rhinolophus malayanus and claimed it was “characterized by the insertion of multiple amino acids at the junction site of the S1 and S2 subunits of the spike (S) protein”, which is to say it contains a potential furin cleavage site.[5] They also claimed that RmYN02 is “the closest relative of SARS-CoV-2 reported to date in most of the genome.” Their own table, part of which is reproduced here, shows that this is not so.

Percent identity with SARS-CoV-2 (from [5])
Strain   Complete genome   S      RBD    1ab    E
RmYN02   93.3              71.9   61.3   97.2   98.7
RaTG13   96.1              92.9   85.3   96.5   99.6

RmYN02 has only 93.3% identity overall to SARS-CoV-2, and only 61.3% identity in the receptor-binding domain (RBD) of the spike (S) protein, which means it is unlikely to bind the human receptor, angiotensin-converting enzyme 2 (hACE2). The overall Spike sequence is also much less similar (71.9% vs 92.9%) and even the highly conserved E protein is less similar. Only the 1ab region, which encodes the viral RNA replicase, is slightly closer to SARS-CoV-2 in RmYN02 than in RaTG13.
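The comparison is simple enough to do by eye, but a short Python sketch makes it explicit. The numbers are copied from the table from [5]; the dictionary layout and the function name are illustrative assumptions of mine, not anything from the paper.

```python
# Percent identity with SARS-CoV-2, copied from the table (from [5]).
IDENTITY = {
    "RmYN02": {"Complete genome": 93.3, "S": 71.9, "RBD": 61.3, "1ab": 97.2, "E": 98.7},
    "RaTG13": {"Complete genome": 96.1, "S": 92.9, "RBD": 85.3, "1ab": 96.5, "E": 99.6},
}

def closer_strain(region):
    """Return the strain with the higher percent identity in the given region."""
    return max(IDENTITY, key=lambda strain: IDENTITY[strain][region])

regions = list(IDENTITY["RmYN02"])
winners = {region: closer_strain(region) for region in regions}

# Regions in which RmYN02, not RaTG13, is the closer match to SARS-CoV-2.
rmyn02_regions = [region for region, strain in winners.items() if strain == "RmYN02"]
print(rmyn02_regions)
```

On these figures RmYN02 comes out ahead in a single column, which is hard to square with "closest relative in most of the genome" unless you weight that one region very heavily.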

The paper makes other speculations that have since been abandoned, such as the supposed pangolin connection, but their important claim was about the possible origin of the furin site. If true, it would have been evidence that it could have evolved naturally, thereby refuting claims that it was engineered by the Wuhan Institute of Virology (WIV). This is undoubtedly why Nature gives it credence.

Incidentally, it will surely come as a relief to Dr Fauci that in their Acknowledgements there is no mention of funding from the US National Institutes of Health.

Virologists dispute Zhou's claims

The findings in this paper were almost immediately challenged by virologists Rossana Segreto and Yuri Deigin [6] in a Bioessays article published March 2021. They pointed out where in the literature the WIV researchers published gain-of-function experiments on the spike protein and identified a short region on the spike protein (aa 310 to 518) as “necessary and sufficient to convert Rp3-S into a huACE2-binding molecule”[7]. Shi Zhengli at WIV and the Baric group at the University of North Carolina published several papers describing their gain-of-function research.[8–10] Whether one agrees with the wisdom of such experiments, it cannot be denied that the WIV performed them.

Segreto and Deigin's main point, however, was that the supposed furin binding site in RmYN02 is not real. Their paper is not easy to follow, as their reasoning depends on an understanding of the intricacies of virus sequencing (which is not as easy as sequencing a pure clone like we have in the lab), and some of their figures are badly labeled. But their argument is that a comparison with the closest ancestors of RmYN02, including bat-SL-CoVZC45, bat-SL-CoVZXC21, and RacCS203, shows convincingly that Zhou et al. used a methodologically incorrect way of aligning their sequences, in effect shifting their bases to the right to create a 4-aa deletion that is not real, and making it falsely appear that RmYN02 has a primordial furin site (SPAAR) that sort of matches the SPRRAR furin site in SARS-CoV-2.

Below is a small part of the alignment of RmYN02 (MW201982.1) with SARS-CoV-2 (NC_045512.2) and ZC45 (MG772933.1) as aligned by Zhou et al., and the new one made by S & D, who used the well-known CLUSTALW algorithm in Fig. 2 of the Bioessays paper.

[Figure: covid furin site alignment]

What they're saying is that ZC45 and RacCS203, which are known not to have furin cleavage sites, match up with RmYN02 before and after without insertions. There are only mutations. This implies that the NSPAA region in RmYN02 that Zhou et al. claim to be a nascent furin-binding domain is not real. There are no insertions. There is also a two-amino-acid deletion in RmYN02 compared to ZC45 about 30 bases further down (not shown), after which the two sequences start showing high identity.

Update 06/17/2021 We can see this more clearly by lining up the DNA sequences manually. If RmYN02 contained a 12-base insert, it would be easy to see. Here's the result:

[Figure: covid furin site alignment]

That diagram might look pretty small if you're reading it on a small screen, but it makes clearer what's happening: the two sequences align reasonably well before the insert and very well after the insert. The good alignment continues for quite a while on the right side. Around the 12-nucleotide insert in SARS-CoV-2, starting at CAACTCACCT, there are many mutations, which make it hard for the computer to line the two sequences up.

The dot plots below show this. The two sequences have homologous regions before and after the insert as shown by the green line on the right. A variable region starting around the AGT TAT in the diagram above is shown in red. Then the alignment jumps about 18 bases to the right. This is the insert. Regardless of where that pesky CCT fits, it's clear that there is a big gap in RmYN02 that's not present in SARS-CoV-2. This indicates recombination (though engineering can't be ruled out), and it is what confused Zhou et al.


Dot plots of SARS-CoV-2 and RmYN02 Spike. Only a small part of the entire sequence is shown. The variable region is shown in red. This is followed by a clear gap where the alignment shifts to the right. This is the insert in SARS-CoV-2. No other noticeable inserts are in the sequence of S that is in the database.
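If you have never made a dot plot, the idea fits in a few lines of Python. The sketch below is not the program used for the figure. It is a bare-bones word-match dot plot run on invented flanking sequences (not real viral sequence) with the 12-base insert from the text dropped into one of them. The function names, the word size of 6, and the toy sequences are all assumptions for illustration.

```python
def dot_plot(seq_a, seq_b, word=6):
    """Return (i, j) pairs where seq_a[i:i+word] equals seq_b[j:j+word]."""
    positions = {}
    for j in range(len(seq_b) - word + 1):
        positions.setdefault(seq_b[j:j + word], []).append(j)
    return [
        (i, j)
        for i in range(len(seq_a) - word + 1)
        for j in positions.get(seq_a[i:i + word], [])
    ]

def diagonal_counts(points):
    """Count matches on each diagonal j - i. A shift between the dominant
    diagonals corresponds to an insertion or deletion."""
    counts = {}
    for i, j in points:
        counts[j - i] = counts.get(j - i, 0) + 1
    return counts

# Invented flanking sequences (NOT real viral sequence), used only to
# illustrate the geometry of a dot plot around a 12-base insert.
LEFT = "ATGGCATTCAGACCTTGAAC"
RIGHT = "GTAGTCAAGCATTCGTACCA"
INSERT = "CTCCTCGGCGGG"  # insert as quoted in the article

without_insert = LEFT + RIGHT
with_insert = LEFT + INSERT + RIGHT

points = dot_plot(without_insert, with_insert)
counts = diagonal_counts(points)
print(counts.get(0, 0), counts.get(len(INSERT), 0))
```

In this toy case the matches before the insert sit on diagonal 0 and the matches after it sit on diagonal 12, so the line on the plot steps sideways by the length of the insert. Real sequences add mutations on top of this, which is what produces the messy variable region before the jump.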

The original RmYN02 sequence is still in Genbank, but the new alignment from S & D's analysis is clearly better: it matches its relatives better. The alternative hypothesis is unparsimonious—it has too many gaps, which is considered bad because it makes it look like a product of wishful thinking.

This means that if the evolutionary path is from ZC45→RmYN02→SARS-CoV-2, the furin site in SARS-CoV-2 could not have evolved. It had to be a recombination event of some kind. A laboratory origin still cannot be excluded, and suggesting it as a possibility is not a baseless conspiracy theory. This is an entirely plausible conclusion—just as it was back in March when S & D's paper came out.

Confusion in the popular press

Meanwhile, reporters in the popular press are getting totally confused. The UK Daily Mail claims there are two BSL-4 facilities in China instead of one and that Shi Zhengli denied that anyone in the WIV became ill and asked for names. Yet Pharma Industry Review quotes Beijing News Today [11] as saying Patient Zero was Huang Yanling, a researcher at the WIV who started in 2012.

People tend to forget that the Wuhan CDC, where samples of the virus were undoubtedly transported by courier and stored for safekeeping or study, is mere meters away from the wet market.

Yet throughout the pandemic, the American press had only one criterion for factiness: whether the fact could be used to harm President Trump. The Washington Post called the leak idea a conspiracy theory in an article titled “Tom Cotton keeps repeating a coronavirus conspiracy theory that was already debunked.” Fifteen months later they retroactively changed the headline to “Tom Cotton keeps repeating a coronavirus fringe theory that scientists have disputed.” This is what we have to deal with. Who knows what history will be six months from now? As the Russians in the USSR said: the future is certain, it's the past that keeps changing.

There are also claims, cited in a Redstate article, that the sequence CGG CGG, which codes for two arginines, almost never occurs in viruses. This is not really true: it is rare for these viruses, but not non-existent. Only 5% of arginines in SARS-CoV-2 or RaTG13 are coded by CGG, whereas 16.7% of them should be CGG by chance alone. Certainly, gain of function research could have put them there just as easily as Ma and Pa Nature, but it's fairly weak evidence.
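The 16.7% figure is just one in six: the standard genetic code has six codons for arginine, and "by chance alone" assumes all six are used equally, which real genomes rarely do. The Python sketch below shows that arithmetic and a hypothetical helper for measuring the CGG share in an in-frame coding sequence. The helper name and the toy sequences are my own and are not real viral data.

```python
# The six arginine codons in the standard genetic code.
ARG_CODONS = {"CGT", "CGC", "CGA", "CGG", "AGA", "AGG"}

def cgg_fraction(coding_seq):
    """Fraction of in-frame arginine codons that are CGG (None if no arginine)."""
    seq = coding_seq.upper()
    codons = [seq[i:i + 3] for i in range(0, len(seq) - 2, 3)]
    arg = [c for c in codons if c in ARG_CODONS]
    return arg.count("CGG") / len(arg) if arg else None

# Expected share of CGG if all six arginine codons were used equally.
uniform_expectation = 1 / len(ARG_CODONS)
print(f"{uniform_expectation:.1%}")

# Toy sequences for illustration only (not real viral data).
print(cgg_fraction("AGAAGGCGTCGG"))  # four arginine codons, one of them CGG
print(cgg_fraction("CCTCGGCGGGCA"))  # P R R A, both arginines coded by CGG
```

Run on a whole genome's coding sequences, the same count would give the observed CGG share to set against the uniform expectation.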

Most scientists consider gain of function to be a menace, as demonstrated by the condemnation in 2011 of two scientists who made changes that rendered highly pathogenic H5N1 influenza viruses transmissible from ferret to ferret by respiratory droplets.[12] That little stunt led to a US ban, which Obama lifted just before Trump's inauguration. All this is creating comparisons of Fauci, who approved of this kind of research, with Frankenstein in the political media.

(As an aside, there are photos of Anthony Fauci wearing a clean and pressed lab coat, which proves he's not a real scientist. Lab coats are only worn when the risk of getting bad stuff on one's clothes is higher than the risk of knocking over a beaker of it with those big sleeves. Lab coats also have slots that let you access your pockets. These slots routinely get caught on those lever-style door handles. If you're lucky, the lab coat rips. If you're not, whatever you're carrying goes flying across the hallway and dissolves the emergency eyewash station along with the guy from Safety who was inspecting them. That's why we only wear them when the Safety guy is around, or when talking to the press.)

What if biologists created Covid?

The editors of Nature are obviously worried about what might happen to international scientific cooperation, and even more so to their dream of international hand-holding and swaying, if the PRC were found to have created Covid. Of course, we don't do much swaying in the lab these days. After what online lynch mobs did to Tim Hunt, we try not to hold hands in the lab anymore either.

Congratulations go to Bioessays and Stem Cell Research for their courage in publishing these important articles. But why would Nature not even cite them?

At least they now acknowledge the existence of a controversy. Maybe they realized that if you tell everyone you're all super-woke and political, you can't be surprised when people don't believe what you say. Or maybe they finally figured out that the only thing worse than doing dangerous research that kills thousands of people is covering it up.

1. The COVID lab-leak hypothesis: what scientists do and don't know. https://www.nature.com/articles/d41586-021-01529-3

2. Wu Y, Zhao S. Furin cleavage sites naturally occur in coronaviruses. Stem Cell Res. 2020 Dec 9;50:102115. doi: 10.1016/j.scr.2020.102115. PMID: 33340798; PMCID: PMC7836551.

3. Lytras S, Hughes J, Martin D, de Klerk A, Lourens R, Kosakovsky Pond S, Xia W, Jiang X, Robertson D (2021). Exploring the natural origins of SARS-CoV-2 in the light of recombination. Preprint at bioRxiv. https://doi.org/10.1101/2021.01.22.427830

4. Yan LM, Kang S, Guan J, Hu S (2020). Unusual Features of the SARS-CoV-2 Genome Suggesting Sophisticated Laboratory Modification Rather Than Natural Evolution and Delineation of Its Probable Synthetic Route. https://zenodo.org/record/4028830

5. Zhou H, Chen X, Hu T, Li J, Song H, Liu Y, Wang P, Liu D, Yang J, Holmes EC, Hughes AC, Bi Y, Shi W. A Novel Bat Coronavirus Closely Related to SARS-CoV-2 Contains Natural Insertions at the S1/S2 Cleavage Site of the Spike Protein. Curr Biol. 2020 Jun 8;30(11):2196–2203. doi: 10.1016/j.cub.2020.05.023. Epub 2020 May 11. Erratum in: Curr Biol. 2020 Oct 5;30(19):3896. PMID: 32416074; PMCID: PMC7211627.

6. Segreto R, Deigin Y. The genetic structure of SARS-CoV-2 does not rule out a laboratory origin: SARS-COV-2 chimeric structure and furin cleavage site might be the result of genetic manipulation. Bioessays. 2021 Mar;43(3):e2000240. doi: 10.1002/bies.202000240. PMID: 33200842; PMCID: PMC7744920.

7. Maier HJ, Bickerton E, Britton P (2015). Coronaviruses Methods and protocols. London: Humana Press.

8. Wang N, Luo C, Liu H, Yang X, Hu B, Zhang W, Li B, Zhu Y, Zhu G, Shen X, Peng C, Shi Z. Characterization of a New Member of Alphacoronavirus with Unique Genomic Features in Rhinolophus Bats. Viruses. 2019 Apr 24;11(4):379. doi: 10.3390/v11040379. PMID: 31022925; PMCID: PMC6521148.

9. Li X, Zai J, Zhao Q, Nie Q, Li Y, Foley BT, Chaillon A. Evolutionary history, potential intermediate animal host, and cross-species analyses of SARS-CoV-2. J Med Virol. 2020 Jun;92(6):602–611. doi: 10.1002/jmv.25731. PMID: 32104911; PMCID: PMC7228310.

10. Hu B, Zeng LP, Yang XL, Ge XY, Zhang W, Li B, Xie JZ, Shen XR, Zhang YZ, Wang N, Luo DS, Zheng XS, Wang MN, Daszak P, Wang LF, Cui J, Shi ZL. Discovery of a rich gene pool of bat SARS-related coronaviruses provides new insights into the origin of SARS coronavirus. PLoS Pathog. 2017 Nov 30;13(11):e1006698. doi: 10.1371/journal.ppat.1006698. PMID: 29190287; PMCID: PMC5708621.

11. https://www.rfi.fr/cn/中国/20200216-武汉病毒研究所再遭聚焦-研究生染病说法不消-石正丽再担保绝无感染

12. Herfst S, Schrauwen EJ, Linster M, Chutinimitkul S, de Wit E, Munster VJ, Sorrell EM, Bestebroer TM, Burke DF, Smith DJ, Rimmelzwaan GF, Osterhaus AD, Fouchier RA. Airborne transmission of influenza A/H5N1 virus between ferrets. Science. 2012 Jun 22;336(6088):1534–1541. doi: 10.1126/science.1213362. PMID: 22723413; PMCID: PMC4810786.

13. Peacock TP, Sheppard CM, Brown JC, Goonawardane N, Zhou J, Whiteley M, PHE Virology Consortium, de Silva TI, Barclay WS. The SARS-CoV-2 variants associated with infections in India, B.1.617, show enhanced spike cleavage by furin. bioRxiv 2021.05.28.446163. doi: 10.1101/2021.05.28.446163.

june 16 2021, 5:42 am. updated jun 17 2021, 6:36 am. dot plot diagram added jun 18 2021
