Over and over again, genetic mutations are preventing a protein once thought to be key to the virus’s success from being expressed.
The coronavirus genome is 30,000 letters long, encoding more than two dozen different proteins that enable the virus to hijack our cells. Of these, spike gets all the glory and infamy; it is the protein targeted by vaccines, and it is the protein that keeps shape-shifting in new variants. But lately, something strange has been happening with another protein called ORF8, once thought to be a crucial player. The virus keeps losing ORF8—over and over again.
It happened first in Alpha. Then again more recently with the Omicron subvariant BA.5, and now again with the ascendant XBB.1.16, also known as the Arcturus variant. In a few weeks’ time, more than 90 percent of SARS-CoV-2 viruses circulating will likely be missing an intact ORF8. All of this is especially strange, if you are to believe the scientific literature, which has posited several different key roles for ORF8: evading T cells, disrupting human gene regulation, mimicking a human immune protein, and more. Scientists have published whole papers devoted to the importance of ORF8, only to have it disappear repeatedly.
So is ORF8 simply unimportant enough, contra prior claims, that the coronavirus can keep infecting humans just fine without it? Or is the virus actually gaining an advantage from ditching this protein? Losing ORF8 is unlikely to be a big, Omicron-level evolutionary leap, but no one can say for sure why it’s happening. The evolution of SARS-CoV-2 has repeatedly surprised us over the past three and a half years. Even now, this virus, perhaps the most intensely scrutinized of all time, has its mysteries.
These questions have captivated a community of online genetic sleuths. Two of the authors of a recent informal report on the loss of ORF8, Ryan Hisner and Federico Gueli, are not professional scientists. “These guys have become, genuinely, some of the world’s experts of the up-to-the-moment variant stuff,” says Thomas Peacock, a virologist at the Pirbright Institute and the report’s third author. When SARS-CoV-2 sequences get shared publicly, a group of variant hunters—both amateurs and professionals—search through the proverbial haystack, looking for variants and mutations rising in prevalence. Variant-hunting used to be the domain of professional scientists with supercomputers. But the novelty of the coronavirus—no one was a SARS-CoV-2 expert before 2020—and the popularization of new tools that make the work less computationally intensive have allowed dedicated amateurs to become the world’s experts. Hisner, a schoolteacher in Indiana, is now starting a master’s program on account of his mutation work.
The report in which the three researchers shared their ORF8 findings last month has not been peer reviewed; primarily, the trio was hoping to shake loose some hypotheses about what’s going on. Indeed, Peacock told me, he has heard from interested scientists. The variants aren’t necessarily losing the entire sequence of the gene that encodes the ORF8 protein, but they have acquired mutations that prevent it from being expressed. Mutations in ORF8 actually occur frequently in related viruses—in both coronaviruses that infect bats and the original SARS virus from 2002.
Finding a lot of mutations of ORF8 could mean it is an evolutionary hot spot, evidence of a virus adapting to its host. This is certainly true of spike, and it’s what some scientists suggested about ORF8 in SARS-CoV-2 when they observed very early mutations in 2020. But it could also mean the complete opposite: The protein is so unimportant that randomly throwing a bunch of wrenches in it doesn’t matter. The gene encoding ORF8, some scientists suggest, could be a duplicate of one that encodes another somewhat similar viral protein called ORF7a. “When you have two copies of a gene, the second copy usually isn’t so important,” Peacock said. This allows the second copy to take on mutations—most of which will have zero or negative effects, but the right combination of which might eventually give the second copy a new function. “This is kind of like the raw juice of evolution,” he added. ORF8 might still be somewhere in that process.
That could also explain why ORF8 seems to be somewhat involved in many functions but a standout in none. “There’s, like, 20 different things that it might be doing, depending on who you ask,” Peacock said, exaggerating perhaps a touch. The situation is confusing, though. The protein doesn’t make up the virus’s physical structure like spike, nor is it directly involved in the virus’s replication; it instead belongs to an “other” category of proteins called “accessory proteins.” In a lot of viruses the functions of accessory proteins are unclear. With SARS-CoV-2, scientists can’t even agree on exactly how many accessory proteins it has—six, nine, or even more depending, again, on whom you ask. (Hence my vagueness in the first paragraph of this article, if you noticed, on the total number of SARS-CoV-2 proteins.) ORF8 is in the most enigmatic, least understood part of the virus’s genome.
Identifying mutations may be much easier these days, but figuring out their effect on a virus is still challenging and laborious. For example, in the lab, scientists found reasons to suspect that ORF8 tamps down the human immune system: An abundance of ORF8 in infected cells seems to discourage them from flagging down T cells. This, combined with those observations of early mutations, led researchers to wonder whether ORF8 might be mutating to help the virus become even better at hiding from T cells. But no. The first named variants, Alpha, Beta, and Gamma, did not improve upon the original virus in this way. Surprisingly, Omicron did get better at hiding—but not because of changes in ORF8. Rather, mutations in another protein, called E, seem to have enhanced the variant’s abilities on this count. SARS-CoV-2’s proteins can have redundant functions, according to Akiko Iwasaki, an immunologist at Yale, whose lab conducted some of these ORF8 experiments. These overlaps just make elucidating the role of any one protein even harder.
Most of the research into SARS-CoV-2 has focused, rightly, on spike and other potential targets for drugs and vaccines, says Gavin Smith, an infectious-disease researcher at Duke-NUS Medical School Singapore. But spike alone does not a virus make. This tiny capsule of protein and genetic code is a complex biological machine. Three-plus years and hundreds of thousands of studies later, we’re still figuring out how all the protein pieces fit together. We’re still figuring out, actually, exactly how many pieces there are.