Conservation by Coupled Evolution

The cis-acting sequences of RNA viruses are very well conserved. The following presents a model for the mechanism whereby they remain so well conserved.

The model: Coupled evolution

Suppose that the specificity of recognition is determined by a host protein. The host protein evolves far more slowly than the virus and remains unchanged for relatively long time spans. The virus, however, mutates at high rates, so the cis-acting sequence is rapidly selected to achieve an optimal interaction with the host protein. Once this occurs, most mutations in the cis-acting sequence will be sub-optimal and will be selected against. Thus, the cis-acting sequence will now evolve only at a rate comparable to that of the cognate host protein.

If, instead, recognition of the cis-acting sequence is mediated by a viral protein, their interaction should also be rapidly optimized [Steinhauer and Holland, 1987]. Once this occurs, they become mutually constrained: neither can change independently without disturbing the optimized interaction. Change is possible only when both mutate coincidentally, and in an exactly compensatory fashion.

Assuming that compensating mutations are possible, and that the mutation rate is μ, then the probability of both mutating simultaneously, and in an exactly compensating manner, is approximately

[Equation: probability of compensating mutations]

(See discussion on mutations for derivation). For example, if the mutation rate μ = 10^-4 and the genome length L is 10^4, the probability of simultaneous and mutually compensating mutations is about 4 x 10^-10. We will call these events specificity shifts, in analogy to changes in the influenza virus hemagglutinin and neuraminidase.

With the same assumptions, the probability of a point mutation is 4 x 10^-5 (about 1 in 27,000). If a point mutant, although sub-optimal, is able to produce reasonably large numbers of progeny genomes (e.g., 10^5) in one or more infection cycles, the compensatory, suppressing mutation will be present in at least one of its progeny genomes with high probability. These events will be called specificity drifts. Their rate will also be low, as it depends on the initial mutation, the viability of the mutant, and a second mutation event.

[Figure: mutually compensating mutations]
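The figures quoted above can be checked with a few lines of arithmetic. The sketch below is illustrative only: the two formulas are assumptions chosen because they happen to reproduce the quoted numbers, not the derivation referred to in the text. It assumes that a point mutation means one given site changes while the other L - 1 sites are copied faithfully, and that a specificity shift requires two given sites to each change to one particular base (a factor of 1/3 each). The function names are hypothetical. The last step takes the quoted point-mutation probability of 4 x 10^-5 and asks how likely it is that at least one of 10^5 progeny genomes carries the compensating mutation, assuming progeny mutate independently.

```python
# Illustrative arithmetic only. The formulas are assumptions that reproduce
# the figures quoted in the text; they are not the source's own derivation.

MU = 1e-4          # mutation rate per nucleotide per replication (from the text)
L = 10**4          # genome length in nucleotides (from the text)
N_PROGENY = 10**5  # progeny genomes produced by a point mutant (from the text)


def p_point_mutation(mu, length):
    """Assumed: one given site mutates, the other length - 1 sites do not."""
    return mu * (1 - mu) ** (length - 1)


def p_specificity_shift(mu, length):
    """Assumed: two given sites each change to one specific base
    (factor 1/3 each) while the other length - 2 sites do not mutate."""
    return (mu / 3) ** 2 * (1 - mu) ** (length - 2)


def p_at_least_one(p, n):
    """Probability that at least one of n independent genomes carries an
    event that occurs with probability p per genome."""
    return 1 - (1 - p) ** n


p_single = p_point_mutation(MU, L)
p_shift = p_specificity_shift(MU, L)
# Second step of a specificity drift, using the 4 x 10^-5 quoted in the text.
p_drift_second_step = p_at_least_one(4e-5, N_PROGENY)

print(f"point mutation:      {p_single:.2e} (1 in {1 / p_single:,.0f})")
print(f"specificity shift:   {p_shift:.2e}")
print(f"compensating mutant among progeny: {p_drift_second_step:.3f}")
```

Under these assumptions the second step of a drift is close to certain once a viable point mutant exists, so the overall rate of drift is limited mainly by the first mutation and by the viability of the mutant, as the text argues.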


The model makes two strong predictions.

  1. The wildtype sequence is optimal, i.e., it is functionally superior over most or all mutants.
  2. The recognition of the cis-acting sequence is functionally conserved, i.e., any virus is able to recognize efficiently the cis-acting sequence of a closely related virus.

The available evidence supports both predictions.

The conserved 'wildtype' sequence is optimal

Although RNA viruses have high mutation rates, the predominant or wildtype genome persists with remarkable stability during passaging in culture. This is true even though substantial numbers of mutants are detectable at each passage (for examples, see Domingo, et al., 1978; Steinhauer and Holland, 1987). To reconcile this apparent paradox, it was proposed that the relevant sequences are quickly optimized when environmental conditions change (e.g., adaptation to culture) resulting in a predominant, wildtype sequence [Steinhauer and Holland, 1987]. The wildtype sequence persists because, among the distribution of mutants generated during virus growth, none have a competitive advantage over the wildtype, so long as the environmental conditions remain stable [Steinhauer and Holland, 1987]. The initial optimization process is likely to be facilitated by the high mutation rates and large population sizes that generate an enormous diversity for selection to operate upon efficiently. When the environmental conditions are altered, some other sequence might be selectively advantageous, and it becomes the dominant species, superior to most of the mutants that arise. This explanation for the persistence of the wildtype in culture may be generalized to evolution in nature: As the viruses diverge over time, to adapt to disparate niches or environmental conditions, only those features that are the most strongly selected for under a variety of environmental conditions will remain conserved.

Hantaviruses provide an example of the conservation of the cis-acting sequences, despite the evolutionary divergence of many other properties of the viruses. Hantaviruses have been found in Asia (e.g., Hantaan virus, the type species), Europe (Puumala, Dobrava-Belgrade viruses), North America (Sin Nombre virus), and South America (Andes virus). They are found in Arctic to tropical environments, and they are vectored by different rodent species [Clement, et al., 1997; Schmaljohn and Hjelle, 1997]. Infection by the different viruses results in clinical abnormalities in humans that differ considerably in severity and nature. The available data indicate that the RNA sequences of different strains of the same virus diverge by 0.1 to 1% per year, while differences between different hantavirus species are substantially greater, 20% or more [Hjelle, et al., 1995; Plyusnin, et al., 1996].

In contrast, the terminal sequences at the ends of the genome segments are very well conserved. These are presumed to be the cis-acting sequences for initiating viral RNA synthesis. The exact extent of the cis-acting sequences remains to be determined, but if they are comparable to those of the Rift Valley fever virus (13 nt; Prehaud, et al., 1997), then most of the cis-acting sequence is still absolutely conserved. Thus, despite the evolutionary divergence of the viruses, which presumably took place over many thousands of years [Schmaljohn and Hjelle, 1997], and adaptation to potentially quite different host environments, their cis-acting sequences have remained largely unchanged. In all that time, a very large number of mutants must have been generated. Any mutant that is superior to the wildtype should have had an opportunity to expand in numbers and to leave descendants that differ from the wildtype. Such sequences have not been found to date. This suggests that the wildtype cis-acting sequences were optimized a long time ago, and have remained superior over the ensuing time and under all the environmental conditions encountered by the viruses. Similar arguments apply to the conservation of the cis-acting sequences of other RNA viruses.

The notion that the wildtype sequence is optimal is sufficiently unusual that we sought experimental evidence to support or refute it. The strategy is simply to randomize a portion of the cis-acting sequence, to make a library of viruses that together contain all possible sequences in the region that was randomized. The viruses that grow best are then selected by passaging the library through multiple infection cycles. The question is, does the wildtype sequence become enriched at the expense of mutants, finally to dominate the population?

A 5-nt region of the Sindbis virus promoter was randomized, and the library of viruses was passaged in cultured insect or mammalian cells. In both cases, the wildtype sequence became the dominant sequence in the population within 2-4 infection cycles [Hertz and Huang, 1995; Hertz and Huang, 1995]. We conclude that the wildtype sequence is indeed better than any other sequence, at least in the 5-nt region examined. To test the generality of this conclusion, randomization of other regions of the promoter and of other cis-acting sequences of Sindbis virus is in progress.
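To give a sense of the scale of such an experiment, the sketch below enumerates the 4^5 = 1024 possible sequences of a 5-nt randomized region and asks how large a growth advantage the wildtype would need in order to make up more than half of the population within 2 to 4 infection cycles. This is a toy model, not a description of the published experiments: it assumes that every sequence starts at equal frequency, that all non-wildtype sequences grow equally well, that selection is deterministic, and that "dominant" means a majority. The function names and the relative yield parameter are hypothetical.

```python
from itertools import product

BASES = "ACGU"
REGION_LENGTH = 5  # length of the randomized region (from the text)

# Every possible sequence for the randomized region: 4**5 = 1024 entries.
library = ["".join(seq) for seq in product(BASES, repeat=REGION_LENGTH)]


def wildtype_fraction(start_fraction, relative_yield, cycles):
    """Wildtype fraction after `cycles` infection cycles, assuming it
    out-produces every other sequence by `relative_yield` in each cycle."""
    wildtype = start_fraction * relative_yield ** cycles
    others = 1 - start_fraction
    return wildtype / (wildtype + others)


def min_yield_for_majority(start_fraction, cycles):
    """Per-cycle advantage at which the wildtype reaches exactly half of the
    population after `cycles` cycles; anything larger gives a majority."""
    return ((1 - start_fraction) / start_fraction) ** (1 / cycles)


# Assumption: every sequence is equally represented in the starting library.
start = 1 / len(library)

print(f"library size: {len(library)}")
for cycles in (2, 3, 4):
    threshold = min_yield_for_majority(start, cycles)
    print(f"{cycles} cycles: relative yield must exceed about {threshold:.1f}")
```

In this simplified picture, a sequence that starts as one in 1024 can only take over within a few cycles if its per-cycle advantage is several-fold or more, which is consistent with the text's conclusion that the wildtype is clearly superior in the region examined.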

The recognition of the cis-acting sequence is functionally conserved

Whether the cis-acting sequence is recognized by a host or a viral protein, the model predicts that it should evolve quite slowly compared to most of the rest of the genome. If this is true, then the recognition of cis-acting sequences should be functionally conserved: the mechanism for recognizing the cis-acting sequence should not have diverged much, despite the divergence in many other properties of the viruses. We should expect the cis-acting sequence of a given virus to be recognized efficiently by closely related viruses. In principle, the efficiency of recognition might vary, depending on exactly how much the cis-acting sequences have diverged.

The following are some examples of the functional conservation of the cis-acting sequences.

Picornaviruses: enteroviruses and rhinoviruses
  Poliovirus type 3 grew with varying efficiencies when its 3' untranslated region was replaced with that from coxsackievirus B4, hepatitis A, bovine enterovirus, or human rhinovirus 14 (HRV14) [Rohill, et al., 1994].
  A chimera of poliovirus type 1 (PV1) with the 5' cis-acting cloverleaf structure of human rhinovirus type 2 is viable, but a similar PV1-HRV14 chimera was not [Xiang, et al., 1995].
Togaviruses: alphavirus
  Sindbis virus is able to recognize the promoter of other alphaviruses [Hertz and Huang, 1992].
  Ross River virus-Sindbis virus chimeras are viable, i.e., they are able to use each other's cis-acting sequences for replication and transcription [Kuhn, et al., 1996; Kuhn, et al., 1991].
  The Western equine encephalitis (WEE) virus is a recombinant between parental viruses related to Sindbis and eastern equine encephalitis viruses, suggesting that the progenitor viruses were able to use each other's cis-acting sequences, and that the newly formed chimera grew well enough to become the ancestor of the present-day WEE virus [Hahn, et al., 1988].
Positive strand plant viruses
  Brome mosaic virus and cowpea chlorotic mottle virus (bromoviruses) are able to recognize each other's cis-acting sequences for replication and transcription [Pacha and Ahlquist, 1991].
  The 3' end of brome mosaic virus can functionally replace the 3' end of the tobacco mosaic virus, a tobamovirus [Ishikawa, et al., 1991].
Paramyxoviruses
  Parainfluenza viruses 1 and 3 (PIV1, PIV3) are able to support the replication of a Sendai virus DI genome [Curran and Kolakofsky, 1991].
  On the other hand, PIV3 could not rescue a respiratory syncytial virus (RSV) vRNA, while the homologous RSV could [Dimock and Collins, 1993].
Rhabdoviruses: lyssaviruses
  Lyssavirus serotypes 2, 3 and 4 can all support the replication and transcription of a rabies virus DI genome [Conzelmann, et al., 1991].
Orthomyxoviruses
  Reconstituted cores of the tick-borne Thogoto virus are able to transcribe an influenza A virus vRNA-like promoter and hybrid Thogoto-influenza A promoters. Hence, the vRNA promoter of orthomyxoviruses appears to be structurally and functionally conserved [Leahy, et al., 1997].
  Recognition of the cis-acting sequences of influenza A and B viruses is similar, with subtle differences [Lee, et al., 1998].
Bunyaviruses
  Several bunyaviruses (La Crosse and Tahyna; La Crosse and snowshoe hare; Batai, Bunyamwera, and Maguari viruses) reassort freely under some conditions to produce all of the expected genotypes, suggesting that they are able to recognize each other's cis-acting sequences [Chandler, et al., 1990; Chandler, et al., 1991; Pringle, et al., 1984].
  Bunyamwera viral proteins are able to promote the propagation of a reassortant of Bunyamwera and Maguari viruses [Bridgen and Elliott, 1996].
Reoviruses: rotaviruses
  The simian and chicken rotaviruses efficiently complement the replication and gene expression of a synthetic analog of porcine rotavirus gene 9 RNA [Gorziglia and Collins, 1992].

In some instances, viruses were unable to recognize the cis-acting sequence of a related virus. This should not be surprising, since the cis-acting sequences are very well conserved, but not immutable. As discussed above, viable mutants should be produced at low frequencies during evolution, and become amplified over time. As the mutational events accumulate, the divergent viruses become less and less able to recognize each other's cis-acting sequences, i.e., functional conservation can and does break down over time.

Implications for antiviral drug design

The conservation of the cis-acting sequences suggests that they may be good targets for antiviral drugs. We might hope that, because the cis-acting sequences are slow to change during evolution, they will also be slow to change when confronted with an inhibitory drug. If so, we may be able to design broad-spectrum drugs to which resistance arises at much lower rates.