SARS-CoV-2: Variants of Concern
The definitive reference sequence of the SARS-CoV-2 RNA is that of the Wuhan strain, isolated from a patient admitted to hospital in Wuhan on 26 December 2019 and published in Nature [5]. It is no more important than any of the other closely related sequences. It has not been established that this sequence has a seminal status, in the sense that other human strains were in fact derived from it, but because it is the arbitrarily chosen reference strain, it will seem to be seminal.
Single-stranded RNA is notoriously susceptible to mutation (which may be why 'higher' forms of life settled on double-stranded DNA to carry genetic information).
As early as January 2020 a variant was sequenced in Germany in which the amino acid at position 614 of the Spike protein was not Aspartate (D), as in the Wuhan strain, but Glycine (G); this is described as the D614G mutation [4]. This form spread rapidly among humans and by the summer was found in 97.5% of isolates world-wide. In December this mutation was shown to cause tighter binding to the ACE-2 receptor and higher infectivity than the Wuhan reference strain [3]. It is now essentially ubiquitous in humans, to whom it seems advantageously adapted (in the Darwinian sense). Further adaptations to humans are to be expected. No doubt 'cat' variants are arising and spreading in cat populations, but cat travel is limited in comparison with human travel.
The GISAID repository, set up in 2008 as a place where flu virus sequences could be housed and shared, now (2020-12-20) contains some 270,000 SARS-CoV-2 sequences, of which 120,000 were contributed by the UK consortium "Covid-19 Genomics UK (COG-UK)". Some 4,000 mutations have so far been recorded in the spike gene alone [7]. Many are 'silent': not every mutation in the RNA shows in the protein, because most amino acids have several alternative codons; these are 'synonymous' mutations. Many others are trivial, in that an amino acid is replaced by a similar one; these are 'conservative' mutations.
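The difference between a silent and a protein-changing mutation can be illustrated with a small sketch. It builds the standard genetic code and reports whether a single-codon change is synonymous. The function name classify_codon_change and the example codons are chosen here for illustration only (GAT to GGT is one way of turning Aspartate into Glycine); codons are written in DNA letters, with T standing in for the U of RNA. The sketch does not attempt to judge whether a replacement is 'conservative'.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def classify_codon_change(before, after):
    """Say whether changing one codon into another alters the amino acid."""
    aa_before = CODON_TABLE[before.upper()]
    aa_after = CODON_TABLE[after.upper()]
    if aa_before == aa_after:
        return f"synonymous ({aa_before} unchanged)"
    return f"non-synonymous ({aa_before} -> {aa_after})"


# Illustrative codons: both GAT and GAC encode Aspartate (D); GGT encodes Glycine (G).
print(classify_codon_change("GAT", "GAC"))
print(classify_codon_change("GAT", "GGT"))
```

The first call reports a synonymous change, because the third position of the Aspartate codon can vary without effect; the second reports a D to G replacement.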
Two publications from the European Centre for Disease Prevention and Control (ECDC) are useful [1,6]. There are now two "variants of concern" (VOC): the rapid-spreader that saw Britain cut off from Europe on 21 December, named VOC 202012/01; and a rapid-spreading variant in South Africa called 501.V2.
The British rapid-spreading variant was first sequenced in October [1,7], but may have occurred earlier. As of 2020-12-26 there have been 3000 cases of VOC 202012/01 identified in the UK by sequencing. Throughout December the UK has been sequencing some 3000 to 7000 COVID genomes a week. From being a rarity (<0.1%) the variant grew exponentially to become 11% of the genomes sequenced by the end of November, roughly doubling its market share each week [1]. In Norfolk it accounts for 20% of COVID cases [7].
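As a rough check on the arithmetic of weekly doubling, the sketch below computes how many weeks it takes for a share to grow from one value to another. It assumes a starting share of exactly 0.1% (the text gives only an upper bound) and pure doubling with no saturation, which can only hold while the variant is still a minority; the function name weeks_to_reach is hypothetical.

```python
import math


def weeks_to_reach(start_share, target_share, doubling_time_weeks=1.0):
    """Weeks needed for a share to grow from start to target under steady doubling."""
    if not 0 < start_share <= target_share:
        raise ValueError("need 0 < start_share <= target_share")
    doublings = math.log2(target_share / start_share)
    return doublings * doubling_time_weeks


# Assumed starting share of 0.1%, rising to the reported 11%.
weeks = weeks_to_reach(0.001, 0.11)
print(f"{weeks:.1f} weeks of weekly doubling")
```

Under these assumptions the rise from 0.1% to 11% corresponds to a little under seven doublings.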
VOC 202012/01 is defined by nine spike protein mutations (deletion 69-70, deletion 144, N501Y, A570D, D614G, P681H, T716I, S982A, D1118H), and 8 mutations in other genomic regions [1,7]. The D614G substitution (see above) is present. The N501Y replacement lies in the receptor-binding domain [8] and so probably affects binding. The P681H replacement lies next to the furin cleavage site between the S1 and S2 subunits of Spike. N501Y is also found in the South African rapid-spreader, but by a different change in the RNA, so it arose there independently.
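The shorthand used for these mutations (reference amino acid, position, replacement amino acid, or a deleted range of positions) is regular enough to be read by a short program. The following is a minimal sketch that parses the labels exactly as written in the list above; the function name parse_spike_mutation and the dictionary layout are illustrative choices, not an established format.

```python
import re

# "N501Y": reference residue, position in Spike, replacement residue.
SUBSTITUTION = re.compile(r"^([A-Z])(\d+)([A-Z])$")
# "deletion 69-70" or "deletion 144": one position or an inclusive range.
DELETION = re.compile(r"^deletion (\d+)(?:-(\d+))?$")


def parse_spike_mutation(label):
    """Turn a label such as 'N501Y' or 'deletion 69-70' into a small dict."""
    match = SUBSTITUTION.match(label)
    if match:
        return {
            "kind": "substitution",
            "reference": match.group(1),
            "position": int(match.group(2)),
            "variant": match.group(3),
        }
    match = DELETION.match(label)
    if match:
        first = int(match.group(1))
        last = int(match.group(2)) if match.group(2) else first
        return {"kind": "deletion", "first": first, "last": last}
    raise ValueError(f"unrecognised mutation label: {label!r}")


VOC_202012_01_SPIKE = [
    "deletion 69-70", "deletion 144", "N501Y", "A570D", "D614G",
    "P681H", "T716I", "S982A", "D1118H",
]

for label in VOC_202012_01_SPIKE:
    print(label, "->", parse_spike_mutation(label))
```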
There are 5 other substitutions in Spike, and two small deletions. The deletion 69-70 causes one of the PCR diagnostic probes for COVID-19 (the ThermoFisher TaqPath probe) to come up negative [2], though other primer pairs can still identify the presence of SARS-CoV-2 RNA. Volz et al. [2] tentatively use the failure to detect the virus with that probe as a surrogate identifier for VOC 202012/01, saving them the need to sequence the genome. On that assumption they conclude that the VOC is over-represented in the younger (0-20 yrs) cohorts of COVID-positive patients.
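The logic of using a failed probe as a surrogate can be written down in a few lines. The sketch below assumes an assay with three targets, here labelled ORF1ab, N and S, and treats each as simply detected or not; real assays report cycle thresholds, and any cut-offs applied in [2] are not modelled. The function name and return strings are hypothetical.

```python
def classify_pcr_result(orf1ab_detected, n_detected, s_detected):
    """Classify one test from three yes/no target results (assumed assay layout)."""
    other_targets_positive = orf1ab_detected and n_detected
    if other_targets_positive and not s_detected:
        # Virus present but the S probe failed: the surrogate signal for the variant.
        return "positive, S-gene target failure"
    if other_targets_positive and s_detected:
        return "positive, S detected"
    if not (orf1ab_detected or n_detected or s_detected):
        return "negative"
    return "inconclusive"


print(classify_pcr_result(True, True, False))
print(classify_pcr_result(True, True, True))
print(classify_pcr_result(False, False, False))
```

A result flagged as S-gene target failure is only a surrogate: any other variant carrying the same deletion would be flagged in the same way, which is why the identification is described as tentative.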
As of 2021-01-02, no one has found any evidence that the variant viruses cause a more aggressive disease.
It is estimated that VOC 202012/01 is 56% more transmissible than preexisting SARS-CoV-2 [9]. Under a regime where the reproduction number (R) was 1.0, it might rise to 1.56. There is as yet little evidence as to why the variant seems to have a higher reproduction rate; possibilities include (a) tighter binding of mutant spike to the human ACE-2 receptor, requiring a lower titre for infection, (b) a shorter lag between infection and shedding, (c) longer survival of infective RNA in the air or on surfaces, (d) evasion of RNA hydrolysis inside cells, and doubtless others.
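To see what a rise in the reproduction number from 1.0 to 1.56 means in practice, the sketch below projects case numbers over a few generations of infection. It is a deliberately crude illustration: it assumes a constant reproduction number, discrete generations, an arbitrary starting figure of 100 cases, and no depletion of susceptible people. The function name cases_after is hypothetical.

```python
def cases_after(generations, reproduction_number, initial_cases=100.0):
    """Cases in a given generation if each case infects a fixed number of others."""
    return initial_cases * reproduction_number ** generations


# Compare a steady epidemic (1.0) with one that is 56% more transmissible (1.56).
for generation in range(6):
    steady = cases_after(generation, 1.0)
    variant = cases_after(generation, 1.56)
    print(f"generation {generation}: {steady:.0f} vs {variant:.0f} cases")
```

With a reproduction number of 1.0 the case count stays level, whereas at 1.56 it grows roughly ninefold over five generations under these assumptions.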
References
[1] https://www.ecdc.europa.eu/sites/default/files/documents/SARS-CoV-2-variant-multiple-spike-protein-mutations-United-Kingdom.pdf
[2] https://www.imperial.ac.uk/media/imperial-college/medicine/mrc-gida/2020-12-31-COVID19-Report-42-Preprint-VOC.pdf
[3] Science. 2020 Dec 18;370(6523):1464-1468. doi: 10.1126/science.abe8499
[4] Cell. 2020 Oct 29;183(3):739-751.e8. doi: 10.1016/j.cell.2020.09.032. Epub 2020 Sep 15
[5] Nature. 2020 Mar;579(7798):265-269. doi: 10.1038/s41586-020-2008-3. Epub 2020 Feb 3. https://pubmed.ncbi.nlm.nih.gov/32015508/
[6] https://www.ecdc.europa.eu/en/publications-data/covid-19-risk-assessment-spread-new-sars-cov-2-variants-eueea
[7] https://www.bmj.com/content/371/bmj.m4857
[8] https://www.who.int/csr/don/21-december-2020-sars-cov2-variant-united-kingdom/en/
[9] https://doi.org/10.1101/2020.12.24.20248822