The silencing efficiency of various crRNAs is highly variable
To elucidate PspCas13b crRNA design principles, we developed a quantitative fluorescence-based silencing assay, in which we targeted the transcript of the mCherry reporter gene. To achieve this, we cotransfected HEK 293T cells with three plasmids encoding mCherry, PspCas13b-BFP and either nontargeting (NT) or mCherry-targeting crRNAs (Fig. 1a). Fluorescence microscopy analysis of cells transfected with mCherry-targeting crRNAs showed pronounced silencing activity in contrast to no appreciable silencing in cells expressing NT crRNAs. However, when we used a second crRNA with a spacer sequence fully matching the coding sequence of mCherry RNA, we did not observe any silencing activity (Fig. 1b). This contrast in the silencing efficiency of the two crRNAs led us to question whether parameters such as the efficiency of crRNA transcription, crRNA loading, spacer nucleotide composition, target accessibility and the presence of a potential protospacer-flanking sequence (PFS) may influence the efficiency of PspCas13b. To test these hypotheses, we designed 16 crRNAs that fully base pair with the coding sequence of the mCherry mRNA at various positions (Fig. 1c). To accurately determine the silencing efficacy of each crRNA, we performed crRNA dose-dependent silencing assays in which cells were transfected with 0, 1, 5 and 20 ng of each of the 16 mCherry-targeting crRNAs. We noticed marked differences in the silencing efficacy of the various crRNAs, even when they were designed to target neighboring RNA locations (Fig. 1d and Extended Data Fig. 1a,b). We confirmed that the mCherry silencing is mediated by the nuclease activity of PspCas13b crRNA, as catalytically inactive dPspCas13b or crRNA alone failed to silence mCherry (Fig. 1e–g).
Overall, this finding suggested there are key determinants of PspCas13b silencing activity beyond target accessibility. Identifying such determinants is crucial for efficient reprogramming.
Single-base resolution view on PspCas13b silencing activity
To further understand the spectrum of crRNA potency, we investigated the silencing activity of PspCas13b across a defined targeted region, reasoning that silencing efficiency is likely intrinsic to the spatial characteristics of the crRNA sequence and binding sites. We focused our study on crRNA 12 (binding position 455) and crRNA 16 (binding position 655), which exhibited high and moderate silencing, respectively (Fig. 1d). We designed three-nucleotide-resolution tiled crRNAs spanning a 30-nucleotide target region surrounding the crRNA 12 and crRNA 16 binding positions (Fig. 2a and Extended Data Fig. 2a). In this tiled design, the binding sites of each adjacent crRNA are spaced by just three nucleotides; thus, their silencing profiles should reveal the relationship between efficacy and target accessibility. We again observed considerable heterogeneity in the potency of these tiled crRNAs despite their physical proximity, with some adjacent crRNAs demonstrating contrasting silencing efficacy. These data indicated that physical barriers such as RNA-binding proteins or structured RNA motifs are unlikely to explain the fluctuation in silencing between spatially adjacent crRNAs that are separated by just three nucleotides (Fig. 2a and Extended Data Fig. 2a–c).
To further enhance our understanding, we maximized the spatial resolution of this approach by designing 61 single-base tiled crRNAs targeting the mCherry coding sequence (Fig. 2b). Consistent with previous data, we again observed markedly diverse silencing profiles of neighboring crRNAs. For instance, crRNA 13 achieved silencing exceeding 95% efficiency but shifting the targeted region by only one nucleotide (crRNA 14) dramatically reduced the efficiency to ~30%. Similarly, crRNA 51 yielded ~99% silencing efficiency while its adjacent crRNA 52 did not show any appreciable silencing activity (Fig. 2b). We observed similarly high fluctuations in the silencing profiles of single-base crRNAs targeting the oncogenic fusion transcript BCR-ABL1. For instance, the binding sites of crRNAs 14 and 15 have 29 overlapping consecutive nucleotides and are separated by just a single nucleotide yet their silencing efficiency is drastically different (Fig. 2c).
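The tiling designs described above can be sketched in a few lines of Python. This is only an illustration, not the pipeline used in the study: the function names `reverse_complement_rna` and `tile_spacers` are hypothetical, and the sketch assumes that each 30-nucleotide spacer is the exact reverse complement of its target window.

```python
def reverse_complement_rna(seq: str) -> str:
    """Return the reverse complement of a DNA/RNA sequence, written as RNA (5'->3')."""
    pairs = {"A": "U", "C": "G", "G": "C", "U": "A", "T": "A"}
    return "".join(pairs[base] for base in reversed(seq.upper()))


def tile_spacers(target: str, spacer_len: int = 30, step: int = 1):
    """Yield (start, spacer) for each window of `spacer_len` along `target`.

    `start` is the 0-based offset of the window in the target sequence.
    `step=1` gives single-base tiling; `step=3` gives three-nucleotide tiling.
    """
    for start in range(0, len(target) - spacer_len + 1, step):
        window = target[start:start + spacer_len]
        yield start, reverse_complement_rna(window)
```

With 30-nucleotide spacers, single-base tiling of a 90-nucleotide stretch gives 90 − 30 + 1 = 61 windows, and the same stretch tiled with `step=3` gives 21.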
These data strengthened our contention that silencing efficacy is unlikely to be solely dependent on the target accessibility and that other factors including specific nucleotide positions within the spacer or target, as well as a possible PFS, may influence key steps of target silencing such as crRNA transcription, loading and target recognition.
Key design principles for potent RNA silencing
In an effort to uncover fundamental principles that dictate PspCas13b silencing efficiency, we expanded our dataset by analyzing the silencing profiles of 201 individual crRNAs targeting various transcripts (ref. 17). First, we questioned whether crRNA or target folding, spacer–target stability and spacer nucleotide content correlate with PspCas13b potency. The data suggest that the folding of the crRNA and the targeted sequence into complex secondary structures can only moderately limit PspCas13b silencing efficiency, possibly perturbing crRNA loading or target accessibility (Extended Data Fig. 3). Interestingly, we observed a significant negative correlation between C nucleotide enrichment in the spacer sequence and crRNA potency (r = −0.30, P < 0.0001; Extended Data Fig. 3h). In contrast, spacers that are rich in G nucleotides showed a positive correlation with crRNA potency (Extended Data Fig. 3i).
Next, we pooled these 201 crRNAs and ranked them by silencing efficiency. The crRNAs that achieved >90% silencing efficiency were designated as potent crRNAs and those with less than 50% efficiency were considered ineffective crRNAs. The crRNAs with ambiguous silencing profiles (efficiencies ranging from 50% to 90%) were excluded from the analysis. We sought to identify molecular features capable of differentiating potent and ineffective crRNA cohorts (Fig. 3a). Many CRISPR–Cas variants possess an upstream or downstream PFS that restricts targeting activity and prevents degradation of their own nucleic acids (ref. 42). To investigate the existence of a PFS that could constrain PspCas13b silencing, we generated weight matrix plots that analyzed nucleotide composition at each position four bases upstream and downstream of the targeted sequence in the highly potent and ineffective cohorts of crRNAs. We found no detectable bias in nucleotide composition at various target flanking sites, suggesting that PspCas13b activity is not subject to PFS motifs (Fig. 3b).
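The cohort definition and the flanking-site extraction described above can be expressed as a short sketch. The thresholds (>90% and <50%) and the four-base flank come from the text. The function names, the dictionary input format and the 0-based `site_start` coordinate are assumptions made for illustration only.

```python
def split_cohorts(efficiency_by_crrna: dict) -> tuple:
    """Split crRNAs into potent (>90 %) and ineffective (<50 %) cohorts.

    crRNAs with 50-90 % silencing are treated as ambiguous and dropped.
    """
    potent = [name for name, eff in efficiency_by_crrna.items() if eff > 90]
    ineffective = [name for name, eff in efficiency_by_crrna.items() if eff < 50]
    return potent, ineffective


def flanking_bases(target: str, site_start: int, site_len: int = 30, flank: int = 4) -> tuple:
    """Return the bases immediately upstream and downstream of a target site.

    `site_start` is the 0-based offset of the site in `target`. Flanks are
    shortened when the site lies close to either end of the sequence.
    """
    upstream = target[max(0, site_start - flank):site_start]
    site_end = site_start + site_len
    downstream = target[site_end:site_end + flank]
    return upstream, downstream
```

Collecting the flanks for every crRNA in each cohort gives the aligned sequences from which a positional weight matrix can be built.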
Lastly, we questioned whether the nucleotide composition of the spacer could influence PspCas13b silencing efficiency. Nucleotide content analysis revealed an enrichment of G bases in the potent crRNA group and an enrichment of C bases in the ineffective crRNA cohort (Extended Data Fig. 4c–e), indicating that G-enriched spacers are associated with higher potency, whereas C-enriched spacers are associated with low potency.
To elucidate the significance of G and C bases at specific positions within the spacer sequence, we performed unbiased analyses of nucleotide composition at all 30 positions of the spacer in both highly potent and ineffective crRNA cohorts (Fig. 3c,d and Extended Data Fig. 4f,g). The analysis revealed marked differences in nucleotide positions between the two crRNA cohorts. We found that G bases at the 5′ end, particularly a G-G sequence at the first and second positions, were strongly associated with highly potent crRNAs. Conversely, G bases were depleted and C bases were enriched at the 5′ end of spacers in the ineffective crRNA cohort. In addition to this C-rich motif at the 5′ end of ineffective crRNAs, we also identified a significant enrichment of C bases at positions 11, 12, 15, 16 and 17 (Fig. 3c,d and Extended Data Fig. 4f,g). These data revealed key nucleotide positions that may determine the potency of crRNAs, which could serve as predictive parameters of crRNA potency.
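A position-by-position comparison of this kind can be sketched as follows. The code computes per-position base frequencies in two cohorts and reports their difference. It is a simplified stand-in, not the statistical procedure used for Fig. 3c,d, and the function names are hypothetical.

```python
from collections import Counter


def position_frequencies(spacers: list, alphabet: str = "ACGU") -> list:
    """Return one {base: frequency} dict per spacer position."""
    length = len(spacers[0])
    if any(len(s) != length for s in spacers):
        raise ValueError("all spacers must have the same length")
    matrix = []
    for pos in range(length):
        counts = Counter(s[pos] for s in spacers)
        matrix.append({base: counts[base] / len(spacers) for base in alphabet})
    return matrix


def enrichment(potent: list, ineffective: list, base: str) -> list:
    """Frequency of `base` in the potent cohort minus the ineffective cohort, per position."""
    freq_potent = position_frequencies(potent)
    freq_ineffective = position_frequencies(ineffective)
    return [p[base] - q[base] for p, q in zip(freq_potent, freq_ineffective)]
```

Positive values mark positions where the base is more common among potent crRNAs and negative values mark positions where it is more common among ineffective ones. A real analysis would also need a significance test for each position.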
Functional validation of crRNA prediction and design
The above in silico analysis enabled us to identify consensus motifs within the spacer to predict potent and ineffective crRNAs. We postulated that potent crRNAs should include a G-G sequence at the first and second positions of the spacer and should lack C bases in positions 11, 12, 15, 16 and 17 (GGNNNNNNNNDDNNDDDNNNNNNNNNNNNN, where D is a G, U or A nucleotide and N is any nucleotide). Conversely, crRNAs containing a C base in spacer positions 1, 2, 3, 4, 11, 12, 15, 16 and 17 are predicted to yield poor silencing efficiency (CCCCNNNNNNCCNNCCCNNNNNNNNNNNNN).
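The two consensus motifs can be written as regular expressions over the 30-nucleotide spacer. The sketch below applies them literally as stated above. The function name `classify_spacer` and the "unclassified" label are assumptions for illustration, and a practical design tool would probably score partial matches rather than require an exact motif match.

```python
import re

# Positions 1-2 are G-G; positions 11, 12, 15, 16 and 17 are not C (D = A, G or U).
POTENT_MOTIF = re.compile(r"^GG[ACGU]{8}[AGU]{2}[ACGU]{2}[AGU]{3}[ACGU]{13}$")
# C at positions 1-4, 11, 12, 15, 16 and 17.
INEFFECTIVE_MOTIF = re.compile(r"^C{4}[ACGU]{6}C{2}[ACGU]{2}C{3}[ACGU]{13}$")


def classify_spacer(spacer: str) -> str:
    """Classify a 30-nt spacer (5'->3') as 'potent', 'ineffective' or 'unclassified'."""
    rna = spacer.upper().replace("T", "U")
    if len(rna) != 30:
        raise ValueError("expected a 30-nucleotide spacer")
    if POTENT_MOTIF.match(rna):
        return "potent"
    if INEFFECTIVE_MOTIF.match(rna):
        return "ineffective"
    return "unclassified"
```

For example, a spacer that begins with G-G but carries C at positions 11 and 12 matches neither motif and is returned as "unclassified".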
We tested the predictive accuracy of this spacer-based design through a prospective unbiased design of crRNAs targeting eGFP (enhanced green fluorescent protein) and TagBFP (blue fluorescent protein), two mRNA targets that we did not investigate previously. Notably, of the 21 predicted potent crRNAs, 20 achieved very high silencing efficiency of either eGFP or TagBFP mRNA. Conversely, the majority of predicted ineffective crRNAs failed to efficiently silence eGFP and TagBFP transcripts (Fig. 3e–j). By generating predictions on the basis of an existing dataset and verifying their accuracy in previously unexplored transcripts, these findings illustrated that our design, which relies on spacer nucleotides, is both precise and applicable across different transcripts.
Previous studies used library screening and machine learning approaches to uncover the hidden design principles of RfxCas13d (refs. 30,32,33,34), the most commonly used Cas13 ortholog. We sought to compare the efficiency of our design of PspCas13b to the benchmark crRNA design tool that is available for RfxCas13d (ref. 32; Fig. 3k). We selected the top ten predicted potent crRNAs for RfxCas13d targeting the coding sequence of the mCherry reporter and probed their silencing efficiency, which achieved an average silencing of 80.7% (Fig. 3l). Our PspCas13b design of potent crRNAs showed ~87.8% average silencing efficiency (eGFP and TagBFP together; Fig. 3g,j), slightly outperforming the RfxCas13d design algorithm (Fig. 3l), which further validated the accuracy of our prediction tool.
To further investigate the enrichment of a G-rich motif at the 5′ end of potent crRNAs and C bases at the 5′ end of ineffective crRNAs, we hypothesized that altering these sequences in a bona fide spacer sequence may either worsen or improve their silencing efficiency. First, we selected 11 crRNAs that possess a G-G sequence at the first and second positions of the spacer, which we altered to C-C by spacer mutagenesis. The data showed substantial compromise in the silencing efficiency of these crRNAs (Extended Data Fig. 5a). We also mutated one, two or three G bases at the 5′ end of the spacer to C and found that the substitution of three or two C bases at the 5′ end of the spacer reduced silencing by >99% and ~70%, respectively, while a single G>C base substitution at spacer positions 1, 2 or 3 had no significant effect on the potency of the crRNA (Extended Data Fig. 5b,c).
Next, we selected seven ineffective crRNAs lacking a G-G sequence at their 5′ end, and then modified them by inserting an additional G at the first position, substituting the first nucleotide to G or substituting the first and second nucleotides to G-G (Fig. 4a–g). Importantly, the data demonstrated that G sequences at the 5′ end of the spacer greatly increased the potency of crRNA despite the introduction of spacer–target mismatch (Fig. 4a–g).
A previous study indicated that promoters dependent on RNA polymerase III, such as U6, can achieve an increased transcription rate when the resulting small RNA possesses A or G bases at the 5′ end (ref. 43). We questioned whether the improvement in silencing efficiency of crRNAs harboring a G-rich motif at their 5′ end could be secondary to changes in crRNA abundance. We quantified the expression levels of the original crRNA or mutated crRNAs harboring G motifs at their 5′ end using reverse transcription (RT)–qPCR. Although not statistically significant, we observed a trend of increased crRNA abundance when a G-rich motif was present at the 5′ end (Extended Data Fig. 6a–e).
To further understand whether a G-G sequence at the 5′ end can enhance crRNA potency beyond transcription upregulation, we in vitro transcribed (IVT) four pairs of crRNAs that possessed or did not possess a 5′ G-G motif (Extended Data Fig. 6f). A 15% PAGE and TapeStation analysis confirmed that the IVT crRNAs were 66 nucleotides in length, consistent with the size of synthetic crRNAs (purchased from Integrated DNA Technologies (IDT)) (Extended Data Fig. 6f).
We cotransfected equal amounts of these IVT crRNAs into HEK 293T cells together with the PspCas13b plasmid to compare their silencing efficiency (Fig. 4h). At 24 h after transfection, all four crRNAs with a 5′ G-G sequence achieved significantly higher silencing efficiency than their unmodified counterparts. At 48 h, one IVT crRNA containing a 5′ G-G sequence maintained significantly superior silencing activity compared with its unmodified counterpart (Fig. 4i). Similar results were obtained using synthetic commercial crRNAs with or without a 5′ G-G motif (Fig. 4h,j). Together, these experiments showed that the 5′ G-G motif further enhances the silencing activity of PspCas13b beyond augmented crRNA transcription.
The 5′ G-G motif confers higher catalytic activity
We questioned whether the superior RNA silencing activity obtained with crRNAs harboring a 5′ G-G motif could be because of enhanced crRNA loading and/or increased cleavage activity. To test this hypothesis, we purified recombinant PspCas13b (Fig. 4k) and tested its crRNA loading and RNA cleavage activity in vitro. We first preincubated synthetic crRNAs with increasing concentrations of recombinant PspCas13b and performed electrophoretic mobility shift assays (EMSAs). The data indicated that PspCas13b could bind both crRNAs with a similar affinity regardless of the 5′ G-G motif (Extended Data Fig. 7). Next, we developed an in vitro cleavage assay by reconstituting a ternary nucleoprotein complex containing recombinant PspCas13b, synthetic crRNA and a 100-nucleotide-long mCherry RNA as a target, which was labeled with 6-carboxyfluorescein (6-FAM) at its 5′ end. When loaded with a targeting crRNA (crRNA 39) containing a 5′ G-G motif, the recombinant PspCas13b exhibited potent cleavage of the target, which resulted in two RNA fragments with sizes ranging from 40 to 50 nucleotides. This cleavage pattern was absent when we used an NT crRNA or a targeting crRNA without PspCas13b protein (Extended Data Fig. 6g). When we compared the cleavage activity obtained with wild-type or mismatched 5′ G-G crRNAs, the latter exhibited a higher cleavage potency in vitro as evidenced by the appearance of two cleaved RNA fragments (Fig. 4l,m). Together, these data suggested that the incorporation of a 5′ G-G sequence enhances both crRNA transcription and PspCas13b cleavage activity. The results also indicated that when crRNA design choices are restricted, the de novo design of crRNAs incorporating mismatched G bases at these key positions can substantially increase their potency despite introducing nucleotide mismatches with the target.
The 5′ G-G motif enables potent silencing of endogenous RNAs
Next, we aimed to determine whether the design principles we discovered using reporter fluorescent target transcripts could also apply to endogenous mRNA targets. To achieve this, we prospectively predicted ten potent crRNAs (5′ G-rich) and ten ineffective crRNAs (5′ C-rich) targeting the endogenous β-2-microglobulin (B2M) transcript that encodes a surface protein component of the major histocompatibility complex class I complex (Fig. 5a,b).
The findings supported our predictions, as all predicted potent crRNAs effectively downregulated the B2M protein, while the majority of predicted ineffective crRNAs failed to do so (Fig. 5a,b). We further showed that substituting the G bases of three potent B2M crRNAs with C bases at spacer position 1 or positions 1 and 2 compromised their silencing activity (Fig. 5c). Together, these data further supported the generalizability of our targeting rules and their relevance for silencing exogenous and endogenous transcripts.
To facilitate the use of our optimized and validated spacer nucleotide-based design of potent crRNA, we created a user-friendly web page (https://cas13target.azurewebsites.net/) to assist the community with their silencing assays. This in silico tool requires only the target sequence as input to create single-base tiled spacer sequences and rank them on the basis of their predicted potency. The web application can also assess potential off-target transcripts within the human transcriptome for the top ten most potent spacers (see Methods for more details).
Spacer mutagenesis reveals PspCas13b mismatch tolerance
Understanding the specificity, off-targeting potential and capability of PspCas13b to discriminate between two transcripts that share extensive sequence homology is extremely important for evaluating the potential and limitations of PspCas13b-based RNA silencing. To study crRNA spacer promiscuity and the consequent PspCas13b targeting resolution, we conducted a comprehensive spacer mutagenesis to introduce mismatches with the target at various spacer positions. First, we introduced successive mismatches of 3, 6, 9, 12, 15, 18, 21, 24, 27 and 30 nucleotides starting from the 3′ and 5′ ends of the spacer (Fig. 6a,b). The three-nucleotide mismatches at the 3′ end of the spacers (positions 28–30) did not affect the silencing efficiency of this crRNA, whereas mismatches greater than three nucleotides completely abrogated its silencing (Fig. 6a). In contrast to the 3′ end, all 5′ end mismatches resulted in a complete loss of silencing, including three-nucleotide mismatches at the 5′ end (Fig. 6b). Consistent with our earlier findings (Extended Data Fig. 5), silencing loss consequent to the introduction of a three-nucleotide mutation at the 5′ end is likely attributable to the substitution of a G-G-G motif by a C-C-C sequence rather than spacer–target mismatch itself (Figs. 3–5).
We also created crRNA constructs harboring mismatches of six, five, four and three nucleotides at different spacer positions and probed their silencing efficiency in live cells (Fig. 6c–f). Overall, the six-nucleotide mismatches largely compromised the efficiency of PspCas13b regardless of the mismatch position (Fig. 6c). The five-nucleotide mismatches at positions 6–10, 11–15 and 26–30 exhibited a partial loss of silencing, while mismatches at positions 1–5, 16–20 and 21–25 led to a near-complete or complete loss of silencing (Fig. 6d). The four-nucleotide mismatches at positions 9–12, 13–16 and 17–20 retained partial silencing activity, whereas mismatches at positions 1–4, 5–8, 21–24 and 25–28 yielded a complete loss of silencing (Fig. 6e). Notably, crRNA constructs harboring three-nucleotide mismatches at various spacer positions were well tolerated and yielded no or minor loss of silencing, except for mutations at positions 1–3 that, as anticipated, led to a total loss of silencing, likely because of G-G-G removal at the 5′ end (Fig. 6f).
Whilst the preceding experiments established the tolerance for consecutive spacer–target mismatches, we questioned whether the silencing profile of nonconsecutive mismatches may differ. We destabilized the spacer–target interaction by introducing nonconsecutive mismatches of 2, 3, 4, 5, 6, 7, 10 and 15 nucleotides spread throughout the spacer (Fig. 6g). We noticed that nonconsecutive mismatches of 2–4 nucleotides were tolerated and led to a negligible loss of silencing. However, nonconsecutive mismatches of more than four nucleotides led to a substantial or complete loss of silencing. Likewise, multiple successive mismatches of two or three nucleotides spread throughout the spacer sequence also completely abolished its silencing activity (Fig. 6g).
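Mutant spacers for this kind of mutagenesis can be generated programmatically. The sketch below replaces each chosen base with its complement (A↔U, G↔C). This substitution rule is an assumption for illustration: it matches the G-G-G to C-C-C example in the text, but the rule applied at every position in the study is not stated here. Swapping to the complement guarantees a mismatch, because the new spacer base is then identical to the target base it faces. Positions are 1-based and counted from the 5′ end of the spacer.

```python
SWAP = str.maketrans("ACGU", "UGCA")


def mutate_positions(spacer: str, positions) -> str:
    """Return `spacer` with the bases at the given 1-based positions swapped to their complement."""
    bases = list(spacer.upper().replace("T", "U"))
    for pos in positions:
        if not 1 <= pos <= len(bases):
            raise ValueError(f"position {pos} is outside the spacer")
        bases[pos - 1] = bases[pos - 1].translate(SWAP)
    return "".join(bases)


def three_prime_block(spacer_len: int, n: int) -> range:
    """1-based positions of the last `n` bases of the spacer (for 30 nt and n=3: 28-30)."""
    return range(spacer_len - n + 1, spacer_len + 1)


def five_prime_block(n: int) -> range:
    """1-based positions of the first `n` bases of the spacer."""
    return range(1, n + 1)
```

Consecutive designs pass a contiguous block of positions such as `three_prime_block(30, 3)`, and nonconsecutive designs pass any scattered list of positions such as `[5, 15, 25]`.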
Next, we wanted to benchmark the mismatch tolerance of PspCas13b with the commonly used RfxCas13d ortholog. A bona fide RfxCas13d crRNA1 (used in Fig. 3l) efficiently silenced the mCherry transcript. We generated 14 additional RfxCas13d crRNA mutants that harbored either consecutive or nonconsecutive mismatches with the target at various spacer positions. Overall, by comparing the mismatch tolerance profiles of PspCas13b and RfxCas13d, we showed that these two enzymes exhibit distinct patterns of position-dependent mismatch sensitivity (Extended Data Fig. 8a,b). This is unsurprising given the poor sequence homology between these two Cas13 orthologs.
We further questioned whether introducing mismatches versus truncations in PspCas13b spacers at the 5′ or 3′ ends would lead to similar or distinct silencing profiles. The truncation of a motif three nucleotides or longer from the 5′ end led to a substantial loss of silencing. Conversely, the truncation of a three-nucleotide motif from the 3′ end did not reduce the silencing efficiency, whereas the excision of 6, 9, 12 and 15 nucleotides led to a gradual loss of silencing activity (Extended Data Fig. 8c,d). Interestingly, the comparison of mismatched and truncated spacers suggested that unpaired nucleotides within the spacer–target duplex may create further steric hindrances that exacerbate the loss of silencing activity.
Next, we questioned whether mismatch tolerance can be influenced by the length of the spacer. To test this hypothesis, we used a 27-nucleotide spacer sequence with a truncated 3′ end (Δ28–30) that previously exhibited full silencing activity compared to the full-length spacer. Then, we generated 16 additional mutant spacers by introducing consecutive or nonconsecutive mismatches at various positions. As expected, the substitution of the 5′ G-G-G nucleotides with a C-C-C motif led to a complete loss of silencing. The introduction of three consecutive mismatches at spacer positions 22–24 also led to near-complete loss of silencing, whereas unpaired bases at spacer positions 4–6, 7–9 and 10–12 led to substantial loss of silencing. Conversely, three consecutive mismatches at positions 13–15, 16–18 and 19–21 were fully tolerated (Extended Data Fig. 8e). Additionally, single-nucleotide mismatches at spacer positions 7 and 14 were fully tolerated, whereas a single mismatch at position 21 led to a substantial loss of silencing activity. Likewise, two, three, four and five nonconsecutive mismatches led to a substantial or complete loss of activity regardless of their positions (Extended Data Fig. 8f). Together, these data suggested that shortening the PspCas13b spacer exacerbates its mismatch intolerance.
These data revealed the targeting resolution of PspCas13b and suggested that nonconsecutive mismatches of more than four nucleotides compromise PspCas13b activity. In addition, the data suggested that endogenous targets with partial sequence homology are unlikely to be impacted by off-target silencing because of the required minimum base pairing of ~25 consecutive or nonconsecutive nucleotides. These mutagenesis data provided further evidence that highly effective crRNAs can be readily designed with minimal or no off-target effects.
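The approximate pairing requirement can be turned into a simple screen for candidate off-target sites. This is a rough sketch under stated assumptions: it counts Watson–Crick pairs only and ignores G·U wobble, it ignores the position-dependent effects described above, and it treats 25 paired positions as a hard cutoff. The function names are hypothetical and this is not the off-target search used by the web tool.

```python
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def to_rna(seq: str) -> str:
    """Uppercase a sequence and write T as U."""
    return seq.upper().replace("T", "U")


def reverse_complement(seq: str) -> str:
    """Reverse complement of an RNA/DNA sequence, written as RNA (5'->3')."""
    return "".join(COMPLEMENT[base] for base in reversed(to_rna(seq)))


def paired_positions(spacer: str, site: str) -> int:
    """Count Watson-Crick pairs between a spacer and an equal-length site.

    Both sequences are given 5'->3'; the duplex is antiparallel, so the
    spacer 5' end faces the site 3' end.
    """
    spacer, site = to_rna(spacer), to_rna(site)
    if len(spacer) != len(site):
        raise ValueError("spacer and site must have the same length")
    return sum(COMPLEMENT[s] == t for s, t in zip(spacer, reversed(site)))


def may_be_silenced(spacer: str, site: str, min_pairs: int = 25) -> bool:
    """Flag a site whose pairing with the spacer reaches the assumed threshold."""
    return paired_positions(spacer, site) >= min_pairs
```

Under these assumptions, a 30-nucleotide site with five mismatches (25 pairs) is flagged, whereas one with six mismatches (24 pairs) is not.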
Quantitative proteomics reveals the high specificity of PspCas13b
Permissive target recognition because of spacer–target mismatch tolerance, together with collateral degradation of neighboring RNAs, can cause off-target effects and undermine the specificity of various RNA-targeting CRISPR enzymes (refs. 28,29,36,44). The comprehensive mutagenesis described above (Fig. 6) suggested that PspCas13b is highly specific and is unlikely to silence other cellular transcripts because of its stringent requisite base pairing and low mismatch tolerance. Yet, a direct assessment of the specificity of PspCas13b through a proteome-wide analysis in mammalian cells could simultaneously probe off-target effects related to both mismatch tolerance and collateral activity. We aimed to quantitate global protein abundance when the PspCas13b nuclease domain was in an active state (through spacer–target base pairing) or inactive state (no spacer–target base pairing). We chose the BCR-ABL1-eGFP transcript as an RNA target to assess the specificity of PspCas13b in HEK 293T cells. If PspCas13b possesses high fidelity, its on-target activity would suppress BCR-ABL1-eGFP protein expression alone, without impacting the expression profile of other endogenous cellular proteins. However, if target-activated PspCas13b exhibits lower specificity through mismatch tolerance and/or collateral activities, nonspecific endogenous protein degradation would occur. To test these hypotheses, we harvested protein lysates from HEK 293T cells after 48 h of expression of BCR-ABL1-eGFP, PspCas13b and an NT or BCR-ABL1-targeting crRNA (Fig. 7a). Fluorescence, RT–qPCR and western blot analyses all showed potent cleavage of the BCR-ABL1 transcript and efficient silencing of its protein (Fig. 7b–d). In addition to BCR-ABL1, we observed a minor reduction (~18%) in the expression level of PspCas13b protein in the targeting condition, whereas the housekeeping protein β-actin remained unchanged (Fig. 7d).
We then performed a comparative mass spectrometry (MS) analysis of global protein abundance in NT and targeting conditions. The data revealed that, among the 3,837 human proteins we detected, only the target BCR-ABL1 protein and its eGFP reporter (encoded in the same BCR-ABL1-IRES-eGFP transcript) were strongly silenced when PspCas13b was activated, without any noticeable perturbation of human endogenous proteins (Fig. 7e,f). Consistent with the western blot analysis, the PspCas13b expression level was only moderately reduced. Importantly, this proteomic analysis further indicated that PspCas13b is highly specific and the activation of its HEPN nuclease domains does not lead to global off-target or collateral silencing of endogenous transcripts. To confirm that the silencing of BCR-ABL1 is mediated by the activation of the nuclease domain of PspCas13b, we performed additional control experiments using dPspCas13b or a targeting crRNA alone (no PspCas13b). Fluorescence, RT–qPCR, western blot and proteomic analyses showed that dPspCas13b or a BCR-ABL1-specific crRNA alone is unable to silence the target (Fig. 7f–m).
To further investigate the specificity of PspCas13b, we benchmarked its nuclease activity against the RfxCas13d ortholog that is known to possess pronounced collateral activity (refs. 27,28,36,37,38). We reprogrammed PspCas13b to silence mCherry in HEK 293T cells while coexpressing eGFP as a nontarget transcript. Fluorescence analysis demonstrated strong suppression of mCherry with no observable collateral effect on eGFP (Extended Data Fig. 9a–c). However, RfxCas13d exhibited a robust silencing of both mCherry (intended target) and eGFP (unintended nontarget) (Extended Data Fig. 9d–f). Together, these results indicated that PspCas13b exhibits superior specificity because of the absence of noticeable collateral activity.