A call for an independent inquiry into the origin of the SARS-CoV-2 virus

Neil L. Harrison and Jeffrey D. Sachs

May 19, 2022

119(21) e2202769119

https://doi.org/10.1073/pnas.2202769119

Since the identification of SARS-CoV-2 in Wuhan, China, in January 2020 (1), the origin of the virus has been a topic of intense scientific debate and public speculation. The two main hypotheses are that the virus emerged from human exposure to an infected animal [“zoonosis” (2)] or that it emerged in a research-related incident (3). The investigation into the origin of the virus has been made difficult by the lack of key evidence from the earliest days of the outbreak; there is no doubt that greater transparency on the part of Chinese authorities would be enormously helpful. Nevertheless, we argue here that there is much important information that can be gleaned from US-based research institutions, information not yet made available for independent, transparent, and scientific scrutiny.

The data available within the United States would explicitly include, but are not limited to, viral sequences gathered and held as part of the PREDICT project and other funded programs, as well as sequencing data and laboratory notebooks from US laboratories. We call on US government scientific agencies, most notably the NIH, to support a full, independent, and transparent investigation of the origins of SARS-CoV-2. This should take place, for example, within a tightly focused science-based bipartisan Congressional inquiry with full investigative powers, which would be able to ask important questions—but avoid misguided witch-hunts governed more by politics than by science.

Essential US Investigations

The US intelligence community (IC) was tasked, in 2021 by President Joe Biden (4), with investigating the origin of the virus. In their summary public statement, the IC writes that “all agencies assess that two hypotheses are plausible: natural exposure to an infected animal and a laboratory-associated incident” (4). The IC further writes that “China’s cooperation most likely would be needed to reach a conclusive assessment of the origins of COVID-19 [coronavirus disease 2019].” Of course, such cooperation is highly warranted and should be pursued by the US Government and the US scientific community. Yet, as outlined below, much could be learned by investigating US-supported and US-based work that was underway in collaboration with Wuhan-based institutions, including the Wuhan Institute of Virology (WIV), China. It is still not clear whether the IC investigated these US-supported and US-based activities. If it did, it has yet to make any of its findings available to the US scientific community for independent and transparent analysis and assessment. If, on the other hand, the IC did not investigate these US-supported and US-based activities, then it has fallen far short of conducting a comprehensive investigation.

This lack of an independent and transparent US-based scientific investigation has had four highly adverse consequences. First, public trust in the ability of US scientific institutions to govern the activities of US science in a responsible manner has been shaken. Second, the investigation of the origin of SARS-CoV-2 has become politicized within the US Congress (5); as a result, the inception of an independent and transparent investigation has been obstructed and delayed. Third, US researchers with deep knowledge of the possibilities of a laboratory-associated incident have not been enabled to share their expertise effectively. Fourth, the failure of NIH, one of the main funders of the US–China collaborative work, to facilitate the investigation into the origins of SARS-CoV-2 (4) has fostered distrust regarding US biodefense research activities.

Much of the work on SARS-like CoVs performed in Wuhan was part of an active and highly collaborative US–China scientific research program funded by the US Government (NIH, Defense Threat Reduction Agency [DTRA], and US Agency for International Development [USAID]), coordinated by researchers at EcoHealth Alliance (EHA), but involving researchers at several other US institutions. For this reason, it is important that US institutions be transparent about any knowledge of the detailed activities that were underway in Wuhan and in the United States. The evidence may also suggest that research institutions in other countries were involved, and those too should be asked to submit relevant information (e.g., with respect to unpublished sequences).

Participating US institutions include the EHA, the University of North Carolina (UNC), the University of California at Davis (UCD), the NIH, and the USAID. Under a series of NIH grants and USAID contracts, EHA coordinated the collection of SARS-like bat CoVs from the field in southwest China and southeast Asia, the sequencing of these viruses, the archiving of these sequences (involving UCD), and the analysis and manipulation of these viruses (notably at UNC). A broad spectrum of coronavirus research work was done not only in Wuhan (including groups at Wuhan University and the Wuhan CDC, as well as WIV) but also in the United States. The exact details of the fieldwork and laboratory work of the EHA-WIV-UNC partnership, and the engagement of other institutions in the United States and China, have not been disclosed for independent analysis. The precise nature of the experiments that were conducted, including the full array of viruses collected from the field and the subsequent sequencing and manipulation of those viruses, remains unknown.

EHA, UNC, NIH, USAID, and other research partners have failed to disclose their activities to the US scientific community and the US public, instead declaring that they were not involved in any experiments that could have resulted in the emergence of SARS-CoV-2. The NIH has specifically stated (6) that there is a significant evolutionary distance between the published viral sequences and that of SARS-CoV-2 and that the pandemic virus could not have resulted from the work sponsored by NIH. Of course, this statement is only as good as the limited data on which it is based, and verification of this claim is dependent on gaining access to any other unpublished viral sequences that are deposited in relevant US and Chinese databases (7,8). On May 11, 2022, Acting NIH Director Lawrence Tabak testified before Congress that several such sequences in a US database were removed from public view, and that this was done at the request of both Chinese and US investigators.

Blanket denials from the NIH are no longer good enough. Although the NIH and USAID have strenuously resisted full disclosure of the details of the EHA-WIV-UNC work program, several documents leaked to the public or released through the Freedom of Information Act (FOIA) have raised concerns. These research proposals make clear that the EHA-WIV-UNC collaboration was involved in the collection of a large number of so-far undocumented SARS-like viruses and was engaged in their manipulation within biological safety level (BSL)-2 and BSL-3 laboratory facilities, raising concerns that an airborne virus might have infected a laboratory worker (9). A variety of scenarios have been discussed by others, including an infection that involved a natural virus collected from the field or perhaps an engineered virus manipulated in one of the laboratories (3).

Overlooked Details

Special concerns surround the presence of an unusual furin cleavage site (FCS) in SARS-CoV-2 (10) that augments the pathogenicity and transmissibility of the virus relative to related viruses like SARS-CoV-1 (11, 12). SARS-CoV-2 is, to date, the only identified member of the subgenus sarbecovirus that contains an FCS, although these are present in other coronaviruses (13, 14). A portion of the sequence of the spike protein of some of these viruses is illustrated in the alignment shown in Fig. 1, illustrating the unusual nature of the FCS and its apparent insertion in SARS-CoV-2 (15). From the first weeks after the genome sequence of SARS-CoV-2 became available, researchers have commented on the unexpected presence of the FCS within SARS-CoV-2—the implication being that SARS-CoV-2 might be a product of laboratory manipulation. In a review piece arguing against this possibility, it was asserted that the amino acid sequence of the FCS in SARS-CoV-2 is an unusual, nonstandard sequence for an FCS and that nobody in a laboratory would design such a novel FCS (13).

Fig. 1.

This alignment of the amino acid sequences of coronavirus spike proteins, in the region of the S1/S2 junction, illustrates the sequence of SARS-CoV-2 (Wuhan-Hu-1) and some of its closest relatives. The furin cleavage site (FCS) is indicated (PRRAR'SVAS), and furin cuts the spike protein between R and S, as indicated by the red arrowhead. Adapted from Chan & Zhan (15).

In fact, the assertion that the FCS in SARS-CoV-2 has an unusual, nonstandard amino acid sequence is false. The amino acid sequence of the FCS in SARS-CoV-2 also exists in the human ENaC α subunit (16), where it is known to be functional and has been extensively studied (17, 18). The FCS of human ENaC α has the amino acid sequence RRAR'SVAS (Fig. 2), an eight–amino-acid sequence that is identical to the FCS of SARS-CoV-2 (16). ENaC is an epithelial sodium channel, expressed on the apical surface of epithelial cells in the kidney, colon, and airways (19, 20), that plays a critical role in controlling fluid exchange. The ENaC α subunit has a functional FCS (17, 18) that is essential for ion channel function (19) and has been characterized in a variety of species. The FCS sequence of human ENaC α (20) is identical in chimpanzee, bonobo, orangutan, and gorilla (SI Appendix, Fig. 1), but diverges in all other species, even primates, except one. (The one non-human, non-great-ape species with the same sequence is Pipistrellus kuhlii, a bat species found in Europe and Western Asia; other bat species, including Rhinolophus ferrumequinum, have a different FCS sequence in ENaC α [RKAR'SAAS].)
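The identity described above can be checked position by position. The following is a minimal sketch, not part of the original analysis: it uses only the three eight-residue sequences quoted in the text, drops the apostrophe that marks the cleavage position, and the dictionary labels and the helper name `mismatches` are illustrative assumptions.

```python
# Eight-residue FCS sequences as quoted in the text. The apostrophe that
# marks the cleavage position (between R and S) is dropped for comparison.
# The labels below are illustrative, not database identifiers.
FCS = {
    "SARS-CoV-2 spike": "RRARSVAS",
    "human ENaC alpha": "RRARSVAS",
    "R. ferrumequinum ENaC alpha": "RKARSAAS",
}


def mismatches(a: str, b: str) -> list[tuple[int, str, str]]:
    """Return (1-based position, residue in a, residue in b) for each difference."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return [(i, x, y) for i, (x, y) in enumerate(zip(a, b), start=1) if x != y]


reference = FCS["SARS-CoV-2 spike"]
for name, seq in FCS.items():
    diffs = mismatches(reference, seq)
    print(f"{name}: {len(diffs)} difference(s) {diffs}")
```

Under these assumptions, the human ENaC α sequence shows no differences from the SARS-CoV-2 sequence, while the Rhinolophus sequence quoted in the text differs at two of the eight positions.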

Fig. 2.

Amino acid alignment of the furin cleavage sites of SARS-CoV-2 spike protein with (Top) the spike proteins of other viruses that lack the furin cleavage site and (Bottom) the furin cleavage sites present in the α subunits of human and mouse ENaC. Adapted from Anand et al. (16).

One consequence of this “molecular mimicry” between the FCS of SARS-CoV-2 spike and the FCS of human ENaC is competition for host furin in the lumen of the Golgi apparatus, where the SARS-CoV-2 spike is processed. This results in a decrease in human ENaC expression (21). A decrease in human ENaC expression compromises airway function and has been implicated as a contributing factor in the pathogenesis of COVID-19 (22). Another consequence of this astonishing molecular mimicry is evidenced by apparent cross-reactivity with human ENaC of antibodies from COVID-19 patients, with the highest levels of cross-reacting antibodies directed against this epitope being associated with the most severe disease (23).

We do not know whether the insertion of the FCS was the result of natural evolution (2, 13)—perhaps via a recombination event in an intermediate mammal or a human (13, 24)—or was the result of a deliberate introduction of the FCS into a SARS-like virus as part of a laboratory experiment. We do know that the insertion of such FCS sequences into SARS-like viruses was a specific goal of work proposed by the EHA-WIV-UNC partnership within a 2018 grant proposal (“DEFUSE”) that was submitted to the US Defense Advanced Research Projects Agency (DARPA) (25). The 2018 proposal to DARPA was not funded, but we do not know whether some of the proposed work was subsequently carried out in 2018 or 2019, perhaps using another source of funding.

We also know that this research team would be familiar with several previous experiments involving the successful insertion of an FCS sequence into SARS-CoV-1 (26) and other coronaviruses, and they had a lot of experience in construction of chimeric SARS-like viruses (27–29). In addition, the research team would also have some familiarity with the FCS sequence and the FCS-dependent activation mechanism of human ENaC α (19), which was extensively characterized at UNC (17, 18). For a research team assessing the pandemic potential of SARS-related coronaviruses, the FCS of human ENaC, an FCS known to be efficiently cleaved by host furin present in the target location (epithelial cells) of an important target organ (lung) of the target organism (human), might be a rational, if not obvious, choice of FCS to introduce into a virus to alter its infectivity, in line with other work performed previously.

Of course, the molecular mimicry of ENaC within the SARS-CoV-2 spike protein might be a mere coincidence, although one with a very low probability. The exact FCS sequence present in SARS-CoV-2 has recently been introduced into the spike protein of SARS-CoV-1 in the laboratory, in an elegant series of experiments (12, 30), with predictable consequences in terms of enhanced viral transmissibility and pathogenicity. Obviously, the creation of such SARS-1/2 “chimeras” is an area of some concern for those responsible for present and future regulation of this area of biology. [Note that these experiments in ref. 30 were done in the context of a safe “pseudotyped” virus and thus posed no danger of producing or releasing a novel pathogen.] These simple experiments show that the introduction of the 12 nucleotides that constitute the FCS insertion in SARS-CoV-2 would not be difficult to achieve in a lab. It would therefore seem reasonable to ask that electronic communications and other relevant data from US groups should be made available for scrutiny.

Seeking Transparency

To date, the federal government, including the NIH, has not done enough to promote public trust and transparency in the science surrounding SARS-CoV-2. A steady trickle of disquieting information has cast a darkening cloud over the agency. The NIH could say more about the possible role of its grantees in the emergence of SARS-CoV-2, yet the agency has failed to reveal to the public the possibility that SARS-CoV-2 emerged from a research-associated event, even though several researchers raised that concern on February 1, 2020, in a phone conversation that was documented by email (5). Those emails were released to the public only through FOIA, and they suggest that the NIH leadership took an early and active role in promoting the “zoonotic hypothesis” and the rejection of the laboratory-associated hypothesis (5). The NIH has resisted the release of important evidence, such as the grant proposals and project reports of EHA, and has continued to redact materials released under FOIA, including a remarkable 290-page redaction in a recent FOIA release.

Information now held by the research team headed by EHA (7), as well as the communications of that research team with US research funding agencies, including NIH, USAID, DARPA, DTRA, and the Department of Homeland Security, could shed considerable light on the experiments undertaken by the US-funded research team and on the possible relationship, if any, between those experiments and the emergence of SARS-CoV-2. We do not assert that laboratory manipulation was involved in the emergence of SARS-CoV-2, although it is apparent that it could have been. However, we do assert that there has been no independent and transparent scientific scrutiny to date of the full scope of the US-based evidence.

The relevant US-based evidence would include the following information: laboratory notebooks, virus databases, electronic media (emails, other communications), biological samples, viral sequences gathered and held as part of the PREDICT project (7) and other funded programs, and interviews of the EHA-led research team by independent researchers, together with a full record of US agency involvement in funding the research on SARS-like viruses, especially with regard to projects in collaboration with Wuhan-based institutions. We suggest that a bipartisan inquiry should also follow up on the tentative conclusion of the IC (4) that the initial outbreak in Wuhan may have occurred no later than November 2019 and that therefore the virus was circulating before the cluster of known clinical cases in December. The IC did not reveal the evidence for this statement, nor when parts of the US Government or US-based researchers first became aware of a potential new outbreak. Any available information and knowledge of the earliest days of the outbreak, including viral sequences (8), could shed considerable light on the origins question.

We continue to recognize the tremendous value of US–China cooperation in ongoing efforts to uncover the proximal origins of the pandemic. Much vital information still resides in China, in the laboratories, hospital samples, and early epidemiological information not yet available to the scientific community. Yet a US-based investigation need not wait—there is much to learn from the US institutions that were extensively involved in research that may have contributed to, or documented the emergence of, the SARS-CoV-2 virus. Only an independent and transparent investigation, perhaps as a bipartisan Congressional inquiry, will reveal the information that is needed to enable a thorough scientific process of scrutiny and evaluation.

