Biologists have a way of coming up with cool abbreviations: ChIP (chromatin immunoprecipitation), DAPPER (DAtabase for Protein-Protein intERactions), ERGIC (ER–Golgi intermediate compartment) and TESCO (testis-specific enhancer of Sox9 core element). One of my personal favourites, though, is CRISPR*: clustered regularly interspaced short palindromic repeats.
*And as Hermione might say, it’s not crispaar, it’s crisper.
You may have heard the word ‘CRISPR’ a lot recently, so hopefully this will explain the basics and elucidate some radical recent studies with it!
Let’s clear CRISPR up:
When you think of CRISPR, you may think of gene editing, but there is a lot of CRISPR lingo to get the hang of first. Plus, it’s interesting to know about the original function of CRISPR.
As the name suggests, CRISPRs are short, palindromic repeats of DNA interspersed with so-called ‘spacer’ sequences (more on them in a sec). They are found in many prokaryotes (around 40% of bacteria and almost all archaea), where they are used in anti-viral defence, and they are arranged in arrays, as can be seen in Figure 1.
When a bacterium gets infected by a virus, the virus injects its genetic information which can be read by the bacterium’s transcriptional machinery, allowing for mass viral production. This is bad for the bacterium. This can be prevented if the bacterium can detect the ‘foreign’ viral DNA and destroy it. CRISPR, in combination with CRISPR associated proteins (Cas proteins), provides a way.
The mechanism is best explained in three stages:
- SPACER ACQUISITION
First the bacterium or archaeon needs to acquire its immunity. This involves getting the foreign DNA into the CRISPR array. That’s right, the foreign DNA makes up the ‘spacer’ regions of the CRISPR array (Figure 1). These spacer sequences can then be used to recognise the foreign DNA if it enters again.
Before insertion, the ‘to be inserted’ foreign DNA is referred to as a protospacer. The sequence of protospacers can be highly variable, indicated by the various colours of the spacers in Figure 1, but selection requires proximity to a protospacer adjacent motif (PAM) sequence. PAM sequences are required for correct DNA binding, cleavage and discrimination of self vs. non-self DNA. Recognition is achieved by the only Cas proteins required for new spacer acquisition, Cas1 and Cas2.
Exactly how the Cas1-Cas2 complex recognises the CRISPR array and integrates new spacers into it is still not fully understood. However, studies have shown that the Cas1-Cas2 proteins specifically recognise the AT-rich leader sequence, which may be why the two proteins are highly conserved throughout the CRISPR types* (1).
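To make the idea of PAM-dependent selection a bit more concrete, here is a toy sketch in Python. It scans a made-up 'viral' sequence for a PAM and reports the bases next to it as candidate protospacers. The motif (AAG), the choice to take the bases immediately downstream of the PAM, the default length of 33 bases (borrowed from the E. coli spacers mentioned later) and the function name are all illustrative assumptions. It is not a model of what Cas1-Cas2 actually does.

```python
def find_protospacers(dna, pam="AAG", length=33):
    """Return (position, candidate) pairs for every PAM followed by `length` bases.

    Toy assumption: the protospacer is taken immediately downstream of the PAM.
    """
    dna = dna.upper()
    hits = []
    start = dna.find(pam)
    while start != -1:
        begin = start + len(pam)
        candidate = dna[begin:begin + length]
        if len(candidate) == length:  # skip PAMs too close to the end
            hits.append((begin, candidate))
        start = dna.find(pam, start + 1)
    return hits


# Hypothetical viral fragment, invented for illustration only.
viral_dna = "CCTAAGGTCATTGCACGTTAGCCTAGGATCCGTTACAGCAAGTT"
for position, candidate in find_protospacers(viral_dna, length=10):
    print(position, candidate)
```

Changing `pam` or `length` is enough to play with other hypothetical systems.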
- CRISPR RNA PRODUCTION
The CRISPR array is transcribed, and each transcript of a repeat together with a spacer region combines with Cas proteins to form the CRISPR complex, which can detect the foreign DNA if it re-enters.
- INTERFERENCE
After the recognition of foreign DNA, degradation is triggered, saving the cell from infection. Cas9 is one protein that can achieve this; it possesses two nuclease domains.
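The three stages can be caricatured in a few lines of Python. This is a toy model under loud assumptions: the array is just a list of strings, transcription is skipped entirely, and recognition is an exact substring match, whereas the real system works through RNA, Cas proteins and PAM checks. The class and method names are hypothetical.

```python
class ToyCrisprArray:
    """A cartoon of a CRISPR array: spacers stored newest first."""

    def __init__(self):
        self.spacers = []  # index 0 is the leader-proximal (newest) spacer

    def acquire(self, protospacer):
        """Stage 1: insert a new spacer at the leader-proximal end."""
        self.spacers.insert(0, protospacer.upper())

    def recognises(self, incoming_dna):
        """Stages 2 and 3, collapsed: does any stored spacer match the incoming DNA?"""
        incoming_dna = incoming_dna.upper()
        return any(spacer in incoming_dna for spacer in self.spacers)


array = ToyCrisprArray()
print(array.recognises("GGATCCGTTACAGC"))  # checked before any spacer is stored
array.acquire("CCGTTACA")
print(array.recognises("GGATCCGTTACAGC"))  # checked after acquiring a matching spacer
```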
*There are different classes and subtypes of CRISPR which I won’t go into here!
My introduction to CRISPR came when I watched this TED talk given by Jennifer Doudna, one of the pioneers of CRISPR technology (https://www.youtube.com/watch?v=TdBAHexVYzc). Doudna explains how the CRISPR-Cas9 system in Streptococcus pyogenes can be modified to allow targeted gene editing. Bypassing the ‘spacer acquisition’ phase, DNA encoding the Cas9 protein, the CRISPR repeat and a sequence to be targeted can be introduced into cells, where it is transcribed to form the CRISPR complex, which can then directly target the cell’s DNA.
The ease with which we can edit genomes blew me away, but it also made me realise the huge responsibility that comes with its potential use in humans. Doudna has recently published a book, ‘A Crack in Creation: Gene Editing and the Unthinkable Power to Control Evolution’, with her colleague Sam Sternberg. It is definitely worth a read if you are interested in the origin of the technology and the moral consequences in store.
The CRISPR-Cas9 gene editing technology has far-reaching potential across various fields beyond medical treatment. Genetically engineered animals are already reaching the market: salmon containing an added growth hormone gene (AquAdvantage Salmon) are being sold in Canada, providing a means for more sustainable food production (https://www.nature.com/news/first-genetically-engineered-salmon-sold-in-canada-1.22116). These salmon were made with older transgenic methods rather than CRISPR, but they show where easier gene editing could lead.
The revamp of CRISPR
The CRISPR technology evidently has a vast diversity of uses, many of which are valuable for experimental studies. Mutations that de-activate the nuclease domains of Cas9 allow the CRISPR complex to still recognise a DNA sequence without cutting it. This form has been given the name dCas9, and it can be fused to various other proteins and domains. CRISPR activation (CRISPRa) uses dCas9 fused with the ω-subunit of E. coli RNA polymerase, allowing for recruitment of the holoenzyme, so the complex works as a transcriptional activator if it targets a promoter region. Conversely, if dCas9 is fused with a KRAB effector domain, the CRISPR complex acts as a transcriptional repressor; this is otherwise known as CRISPR interference (CRISPRi). Both CRISPRa and CRISPRi can be used simultaneously, and together they provide a powerful new tool to control gene expression levels for experimental studies.
dCas9 can also be tagged with eGFP (the enhanced version of the biologist’s favourite green fluorescent protein), allowing for chromosome imaging in live cells! However, even with eGFP rather than conventional GFP, either multiple constructs with overlapping target sequences or repetitive DNA sequences are required to get sufficient signal for detection. This can be taken one step further by using multiple dCas9 proteins tagged with different fluorophores, which allows distances between loci to be determined; this can be useful for studying promoter-enhancer interactions. What is most brilliant about these applications is the ease with which different target sequences can be designed, which speeds up the route to experimental results (for more detail see (2)).
CRISPR and DNA storage
What’s compact, long-lasting, durable, and perfect for storing information? DNA, of course! Besides the slight snag of having to encode and synthesise the DNA needed for storage, this could serve as an efficient and sustainable way to solve our current information overflow crisis. Storing information in DNA is not a brand-new idea. In fact, it has already been tried and tested by Nick Goldman and colleagues at the European Bioinformatics Institute (EBI) in Hinxton, UK, who encoded all 154 of Shakespeare’s sonnets into a sequence of As, Ts, Cs and Gs. (To hear more about this, and how a test tube of DNA can store the same as 1 million CD-ROMs, watch Nick’s talk: https://www.youtube.com/watch?v=tBvd7OSDGgQ) A year earlier, George Church’s lab encoded the text of one of his books.
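To get a feel for what 'encoding text into A, T, C and G' means, here is about the simplest mapping imaginable, sketched in Python: two bits per base, so four bases per byte. This is an assumption made purely for illustration and is not the scheme used by Goldman's or Church's groups, whose encodings are more elaborate. The mapping table and function names are made up.

```python
BASES = "ACGT"  # assumed mapping: A=00, C=01, G=10, T=11


def text_to_dna(text):
    """Encode text as DNA, four bases per byte of UTF-8."""
    dna = []
    for byte in text.encode("utf-8"):
        for shift in (6, 4, 2, 0):  # take the byte two bits at a time
            dna.append(BASES[(byte >> shift) & 0b11])
    return "".join(dna)


def dna_to_text(dna):
    """Decode DNA produced by text_to_dna back into text."""
    data = bytearray()
    for i in range(0, len(dna), 4):
        byte = 0
        for base in dna[i:i + 4]:
            byte = (byte << 2) | BASES.index(base)
        data.append(byte)
    return data.decode("utf-8")


encoded = text_to_dna("Shall I compare thee")
print(encoded)
print(dna_to_text(encoded))
```

A naive mapping like this happily produces long runs of the same base, which is one reason practical encodings need extra rules.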
Since the CRISPR system inserts foreign segments of DNA into the genome, this provides a mechanism that can be exploited to store introduced DNA sequences. To recover the information, all that is required is sequencing.
A galloping success
Recent work by Seth Shipman and colleagues has done just that, highlighting the many hurdles that must be overcome for it to work reliably (3). Working with E. coli, they managed to store 5 frames of Muybridge’s classic movie of the galloping horse, Annie G. One of the first hurdles was how to code the information into DNA. Since each spacer is 33 bases long, many individual synthetic protospacers had to be synthesised, with the pixel values transformed into a nucleotide code. Several considerations went into the code to increase acquisition efficiency: the %GC content, avoiding mononucleotide repeats and avoiding internal PAMs. The frames were electroporated one after another over 5 days into a population of E. coli.
Once the sequences had been added to the CRISPR arrays, the next challenge was to sequence them and recall the information. The key feature this depended on was the chronological insertion of DNA spacers: as seen in Figure 1, (almost) all new spacers are inserted at the leader-proximal end, effectively ‘pushing’ older spacers further away. The ordering information must therefore be read from the arrays of individual cells for temporal reconstruction. However, no one cell contains all the protospacers needed to reconstruct the movie; they are distributed across the bacterial population, so mass sequencing was required. After computational reconstruction, the movie was re-created. Awesome!
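The design rules for the synthetic protospacers are easy to express as a filter. The Python sketch below checks a candidate sequence against the three kinds of rule named above. The thresholds (40 to 60% GC, runs of at most 3 identical bases), the PAM (AAG) and the function names are placeholder assumptions, not the values used in the study, and only one strand is checked.

```python
def gc_fraction(seq):
    """Fraction of bases that are G or C."""
    return sum(base in "GC" for base in seq) / len(seq)


def longest_run(seq):
    """Length of the longest stretch of one repeated base."""
    longest = current = 1
    for previous, base in zip(seq, seq[1:]):
        current = current + 1 if base == previous else 1
        longest = max(longest, current)
    return longest


def passes_design_rules(seq, pam="AAG", min_gc=0.4, max_gc=0.6, max_run=3):
    """Check GC content, mononucleotide runs and internal PAMs.

    All thresholds and the PAM are placeholder assumptions.
    """
    seq = seq.upper()
    return (
        min_gc <= gc_fraction(seq) <= max_gc
        and longest_run(seq) <= max_run
        and pam not in seq
    )


print(passes_design_rules("ACGTACGTAC"))  # balanced GC, no runs, no AAG
print(passes_design_rules("ACGTAAGTAC"))  # contains an internal AAG
```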
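The reconstruction step can be pictured as merging many partial orderings into one. In the Python sketch below, each cell's array lists its spacers newest first, and a topological sort combines them into a single oldest-to-newest order. This is a simplified stand-in, assuming clean data with no contradictions, and is not a description of the analysis actually used in the study. The frame labels stand in for spacer sequences and are hypothetical.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def reconstruct_order(arrays):
    """Merge per-cell spacer orders into one oldest-to-newest ordering.

    Each array lists a cell's spacers newest first (leader-proximal first).
    Assumes the arrays never contradict each other.
    """
    sorter = TopologicalSorter()
    for array in arrays:
        for spacer in array:
            sorter.add(spacer)
        # Each spacer sits in front of the one acquired before it.
        for newer, older in zip(array, array[1:]):
            sorter.add(newer, older)  # `older` must come before `newer`
    return list(sorter.static_order())


# Three hypothetical cells, none of which holds all three spacers.
cells = [
    ["frame3", "frame1"],
    ["frame2", "frame1"],
    ["frame3", "frame2"],
]
print(reconstruct_order(cells))
```

If two arrays disagreed about the order, `static_order` would raise a `CycleError`, so real, noisy data would need something more forgiving.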
This method involved the targeting of introduced DNA. However, other CRISPR-Cas systems can convert RNA to DNA before insertion, which has great potential for following gene expression levels without having to extract RNA.
Honestly, I think it’s rather clever. But are some bacteria better to use than others? E. coli was used in this pioneering study, but as you now know, CRISPR repeats are much more widespread.
(1) J. Nuñez et al. Cas1-Cas2 complex formation mediates spacer acquisition during CRISPR-Cas adaptive immunity. Nature Structural & Molecular Biology 21, 528-534 (2014)
(2) A. Dominguez et al. Beyond editing: repurposing CRISPR–Cas9 for precision genome regulation and interrogation. Nature Reviews Molecular Cell Biology 17, 5-15 (2016)
(3) S. Shipman et al. CRISPR-Cas encoding of a digital movie into the genomes of a population of living bacteria. Nature 547, 345-349 (2017)
P.S. I’ve come up with my own acronym for this page: CRISPR Overview and Storing Movies In DNA (COSMID). COSMID that.