No doubt you’ve heard about the novel coronavirus variants that are evolving around the world. There now appear to be more than a dozen versions of SARS-CoV-2, which are of varying degrees of concern because some are linked to increased infectivity and lethality while others are not. It’s easy to be overwhelmed by this diversity and to fear that we’ll never achieve herd immunity. Yet evidence is growing that these variants share similar combinations of mutations. This may not be the multifront war that many are dreading, with an infinite number of new viral versions.
I am an evolutionary microbiologist who studies how bacteria and viruses adapt to new environments or hosts. Like many microbiologists, my colleagues and I have turned our attention to understanding how SARS-CoV-2 is evolving adaptations for reproducing and transmitting in humans. Our favorite laboratory method is experimental evolution, where we grow multiple populations of microbes started from the same strain under identical conditions for weeks or months. We study problems like how antibiotic resistance evolves and how infections become chronic. The power of this method is that using multiple populations allows us to “replay the tape of life” and study how repeatable and ultimately predictable evolution might be.
One pattern we see is called convergent evolution, where the same trait emerges in independent lineages over time, usually as they adapt to similar environments. Some of the best examples of convergent evolution include the sandy color of diverse desert animals; the lobed swimming fins of whales, walruses and manatees (which are actually distantly related); and even the ability of humans to digest lactose into adulthood, which arose several times in geographically isolated populations.
In the case of SARS-CoV-2, the complete genome sequences of viruses from thousands of patients enable us to look for convergent patterns. While most mutations are one-offs that go extinct, some establish new lineages that become more frequent as the virus succeeds in replicating and infecting many people. If the same part of the virus repeatedly mutates in different samples around the world and becomes more frequent, this mutation very likely encodes an adaptation that helps the virus reproduce and transmit.
With the benefit of increased genome surveillance of the coronavirus, several recent studies have identified signatures of convergent evolution. Here in the U.S. our laboratory found at least seven genetically independent lineages that acquired a mutation at one particular spot on the virus’s infamous spike protein, the one it uses to latch onto human cells. Spike has a sequence of linked amino acids, and the mutation occurs at position number 677. In the original SARS-CoV-2 this is the amino acid glutamine, abbreviated as Q.
In six lineages, this Q mutated to another amino acid, histidine (H), and the change is called 677H. In the seventh lineage, Q mutated instead to proline (P). Each lineage also has a mutation called S:614G, which was the first notable change in the virus to be identified, several months ago, and which has spread so widely that it is now found in 90 percent of all infections. We named these seven U.S. lineages after common birds (“robin” and “pelican,” for example) to help us distinguish and track them, and also to avoid creating prejudice by naming them after the areas where they were first detected.
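The logic of this kind of convergence check can be sketched in a few lines of Python. This is only an illustration, not our laboratory's actual pipeline: the lineage labels are placeholders, the function names are invented for this example, and the set of inherited positions is supplied by hand. A real analysis would place each mutation on a phylogenetic tree to confirm that the lineages acquired it independently.

```python
import re
from collections import Counter, defaultdict

# Toy data mirroring the pattern described above: seven lineages that all
# carry 614G, six with Q677H and one with Q677P.
# The lineage labels are hypothetical placeholders, not real lineage names.
LINEAGE_MUTATIONS = {
    "lineage_1": ["D614G", "Q677H"],
    "lineage_2": ["D614G", "Q677H"],
    "lineage_3": ["D614G", "Q677H"],
    "lineage_4": ["D614G", "Q677H"],
    "lineage_5": ["D614G", "Q677H"],
    "lineage_6": ["D614G", "Q677H"],
    "lineage_7": ["D614G", "Q677P"],
}

# Matches labels such as "Q677H": original amino acid, position, replacement.
SUBSTITUTION = re.compile(r"^([A-Z])(\d+)([A-Z])$")


def parse_substitution(label):
    """Split a label like 'Q677H' into ('Q', 677, 'H')."""
    match = SUBSTITUTION.match(label)
    if match is None:
        raise ValueError(f"not a simple substitution: {label!r}")
    original, position, replacement = match.groups()
    return original, int(position), replacement


def convergent_positions(lineage_mutations, inherited_positions=frozenset(),
                         min_lineages=2):
    """Count lineages mutated at each position, skipping inherited changes.

    Assumption: every lineage in the input acquired its non-inherited
    mutations independently. This sketch does not verify that.
    """
    lineages_by_position = defaultdict(set)
    for lineage, labels in lineage_mutations.items():
        for label in labels:
            _, position, _ = parse_substitution(label)
            lineages_by_position[position].add(lineage)
    return {
        position: len(lineages)
        for position, lineages in lineages_by_position.items()
        if position not in inherited_positions and len(lineages) >= min_lineages
    }


def replacement_counts(lineage_mutations, position):
    """Tally which replacement amino acids appear at one position."""
    counts = Counter()
    for labels in lineage_mutations.values():
        for label in labels:
            _, pos, replacement = parse_substitution(label)
            if pos == position:
                counts[replacement] += 1
    return dict(counts)


if __name__ == "__main__":
    # 614G is treated as inherited from a shared ancestor, not as convergence.
    print(convergent_positions(LINEAGE_MUTATIONS, inherited_positions={614}))
    print(replacement_counts(LINEAGE_MUTATIONS, 677))
```

On this toy input, position 677 is the only site flagged, with all seven lineages mutated there, six to H and one to P. Position 614 is excluded because a change shared through common ancestry says nothing about repeated, independent adaptation.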
Lineages outside the U.S. have also acquired 677H, including lineages in Egypt, Denmark and India and a large cluster in Macedonia. A new variant of concern called B.1.525 also has 677H, as do several lineages that descended from B.1.1.7, one of the first worrisome versions to be spotted. The coincident, global emergence of S:677 mutations and their fivefold gain in prevalence offer strong evidence that these changes improve viral fitness in some way. We don’t know how yet, but it is noteworthy that S:677 borders a region of the spike protein that helps the virus enter and infect human cells.
This is far from the only example of convergence in SARS-CoV-2. Mutations in at least eight different positions in the spike protein are simultaneously on the rise around the world, appearing in B.1.1.7 and in other major variants of concern known as B.1.351, P.1 and P.3. These variants share combinations of mutations at positions 18, 69–70, 417, 452, 501 and 681, along with a particularly concerning mutation, E484K, that helps the virus evade neutralizing antibodies. For this reason, two of the leading scientific websites that track variants (http://covariants.org/ and http://outbreak.info) now report these shared, defining mutations to simplify and focus our attention. The U.S. Centers for Disease Control and Prevention and the media have been slow to recognize the importance of these key mutations, but this is changing, because it is these changes that likely alter virus functions such as contagiousness or the ability to evade vaccines.
One way to envision this type of convergent evolution is as a game of Tetris, where a limited number of building blocks can be assembled in different ways, in different combinations, to achieve the same winning structures. For example, it is now known that the combination of mutations in B.1.1.7 makes it especially contagious, and that the B.1.351 lineage can evade antibodies because of E484K.
Because many newly discovered variants appear to be resampling the mutations found in other established variants, we can speculate that the virus is beginning to run out of new, major adaptations. But this doesn’t mean that the forces of evolution will stop as we begin to approach herd immunity and loosen restrictions. History tells us that viruses can evolve rapidly to evade barriers to transmission, especially when infections remain numerous. We must remember that the more infections there are, the more chance mutations will occur, and those that best help the virus to survive will proliferate. This is why stopping new infections is key. These viral adaptations are already rewriting our biology textbooks on convergent evolution; let’s strive to limit new material.
It’s also critical that we make significant investments in building an early-warning system to detect new SARS-CoV-2 variants as well as many other emerging pathogens, both known and yet to be discovered. Viral genome surveillance and sequencing are the key. Many variants have been detected in the U.K. because of visionary investments by researchers and public health officials in these technologies.
In the U.S., a significant influx of money to the CDC from the new federal stimulus package is already increasing the frequency with which researchers can sequence and analyze virus samples. This must be sustained by building the public health expertise and research infrastructure to decode genetic changes in the virus and anticipate the need for future vaccine modifications. It was basic science that provided hope in this pandemic through new vaccine technology, and given renewed support it will also be our guardian against future threats.