November 15, 2025

Rapid Multi

Transforming Spaces, Enriching Lives

Generative and predictive neural networks for the design of functional RNA molecules

Generative and predictive neural networks for the design of functional RNA molecules

Implementing an efficient representation of RNA secondary structure

In order to predict function using both sequence and structure, we first developed a representation of RNA secondary structure that could be supplied to a predictive model in parallel with one-hot-encoded sequences (Fig. 1, Supplementary Fig. 1). To do so, we took inspiration from the field of image classification, where manually engineered data features have been surpassed by deep convolutional models which learn their own abstractions of input data40. By constructing a basic array that encoded only the locations of possible base pairing interactions, we hypothesized that a deep convolutional neural network (CNN) would be able to construct similar abstractions related to the structure of an RNA molecule during the training process. The novel structural array we present (Supplementary Fig. 2) is thematically similar to others that have been developed for RNA inverse design15,19,35, but offers the time efficiency necessary to be practically useful in a deep learning pipeline while avoiding any assumptions inherent to classic structural prediction algorithms (Supplementary Fig. 3).

Fig. 1: Overview of the SANDSTORM and GARDN models.
figure 1

a SANDSTORM expands upon previous sequence-to-function neural networks by incorporating both sequence and secondary structure array input channels. These paired inputs are passed through parallel convolutional stacks that form an ensemble prediction of input RNA function (see Supplementary Fig. 1a for a detailed depiction of SANDSTORM). b GARDN is a generative adversarial network architecture which accepts a random variable input (Z) and is tasked with designing realistic examples of functional RNAs (see Supplementary Fig. 1b for a detailed depiction of the GARDN generator). c A trained GARDN generator can be paired with a SANDSTORM predictive model to return realistic sequences with targeted experimental values.

Learning interpretable structural information with a simulated dataset

We evaluated a CNN’s ability to convert our structural array into interpretable structural information using a simulated dataset of toehold switches and several decoy sequences (Fig. 2a). Toehold switches are riboregulators that modulate translation of an output protein in response to a complementary RNA input molecule3. These devices are designed to hide the RBS and start codon from the translational machinery by placing them within a cis-repressing stem. This stem can be unwound by an RNA trigger that is complementary to the 5′ end of the switch, releasing both the RBS and start codon. Because these devices require both sequence and structural motifs to function, they serve as an ideal candidate to demonstrate the performance improvement realized by using both properties as inputs. A dataset was constructed consisting of canonical toehold switches and four types of decoy sequences: sequences containing only an RBS motif and random nucleotides; sequences containing only a start codon motif; sequences containing both a start codon and RBS, but do not fold into the toehold switch structure; and sequences that fold into the toehold switch structure, but do not contain the necessary sequence-level motifs.

Fig. 2: Extracting secondary structure information using a simulated dataset.
figure 2

a A dataset of toehold switches and several types of decoy sequences was utilized to determine whether a CNN could differentiate sequences on a structural basis. Sequences consisted of canonical toehold switches, decoys containing only an RBS motif (RBS decoys), decoys containing only a start codon motif (AUG decoys), decoys containing both an RBS and start codon motif (RBS + AUG decoys), and decoys that adopted the canonical secondary structure but did not contain the necessary sequence motifs (binding decoys). b A CNN trained only on one-hot-encoded sequences was not able to perfectly classify the canonical toehold switches from the RBS + AUG decoys, which are only differentiable at a structural level. c The SANDSTORM CNN accepting paired sequence and structure arrays was able to identify both the sequence and structural features required to classify the simulated dataset. Source data are provided as a Source Data file.

We compared the performance of a model accepting only one-hot-encoded sequences as inputs to the SANDSTORM model accepting a paired sequence-structure input at the task of classifying each member of this simulated dataset (Fig. 2b, c). As expected, the one-hot-encoding model did not perform as well as the dual-input model when differentiating between canonical toehold switches and decoys that contained the necessary sequence motifs but did not fold into the correct secondary structure (single-input sequence-only model AUC = 0.72, SANDSTORM model AUC = 0.97). As these two categories are only differentiable at the secondary structure level, we concluded that our novel array enabled the model to learn useful abstractions of RNA structure to inform predictions.

To further validate that the paired-input classification model was identifying meaningful representations of secondary structure, we applied the integrated gradients technique41 to the trained predictive model along the structural input channel (Supplementary Fig. 4a, b). When examining a canonical toehold switch, this approach reveals a strong model preference for base pairing interactions in the region of the stem and a strong aversion for proximal off-target base pairing interactions. This agrees closely with the minimum free energy predicted structure of the sequence, indicating the model’s ability to correctly identify and leverage interpretable structural motifs. As structural features are products of primary sequence relationships, we further hypothesized that a sufficiently complex sequence-only model would be able to eventually approach the performance of the dual-input classifier. We explored this question by increasing the size of the receptive field for the sequence-only model and by increasing the number of trainable hyperparameters (Supplementary Fig. 5), both of which should allow the sequence-only model to construct more complex abstractions of the inputs it receives. In agreement with this hypothesis, we found a steady increase in the sequence-only model’s ability to correctly classify the structure-dependent decoy class as we increased the size of the receptive field and total number of trainable hyperparameters. Despite these improvements for the sequence-only model, the dual-input method demonstrated consistently better performance across parameter variations (Supplementary Fig. 5).

SANDSTORM models predict the function of multiple classes of RNAs

After demonstrating the capability of the SANDSTORM architecture to account for secondary structure in a more efficient manner than a sequence-only comparator, we sought to determine if this approach could be generalized to predict the function of different classes of real RNA molecules (Fig. 3). As our goal was to develop a generalized framework for predicting RNA function, we designed a single, efficient SANDSTORM CNN architecture that could be applied to many RNA functional prediction tasks without sacrificing performance. Our model consisted of a sequence input channel and a structure input channel, each connected to an independent stack of convolutional layers. The outputs of the convolutional channels were concatenated before being passed through a final series of densely connected layers to allow the model to learn complex relationships between the features abstracted from each input. The model consisted of as few trainable parameters as possible and the hyperparameters were not tuned for each presented test case. The mean squared error (MSE), R2, and Spearman correlation for SANDSTORM and the previously published model for each regression task are compared, averaging across three randomized training-testing splits (Supplementary Data 1 for datasets).

Fig. 3: SANDSTORM models can predict the function of several classes of RNA molecules.
figure 3

a SANDSTORM prediction metrics for toehold switch ON and OFF values compared to NuSpeak/STORM. b SANDSTORM predictions for 5′ UTR (untranslated region) mean ribosome loading compared to Optimus 5-prime. c SANDSTORM model predicting RBS (ribosome binding site) translation efficiency compared to SAPIENS. d SANDSTORM predictions of Cas13a collateral cleavage efficiency using guide RNA-target pairs compared to ADAPT. Bars represent means across three independent training-testing splits ± s.d. Links to comparator datasets and code repositories are available in Supplementary Data 1. Source data are provided as a Source Data file.

Toehold switch prediction

Angenent-Mari et al.35 conducted an assay that sorted a large library of toehold switches based on their resulting fluorescence levels measured in both the native folding conformation, which corresponds to the OFF state, and when fused to the cognate trigger, which is equivalent to the ON state. This dataset therefore consists of toehold switch sequences that are mapped to a fluorescence readout in the ON and OFF states, with desirable sequences having the highest ON/OFF ratio possible. Despite the wealth of data mapping sequence to function for toehold switches, there is currently no theoretical framework that can reliably link sequence or thermodynamic characteristics to ON/OFF ratios. Furthermore, previous studies attempting to use engineered thermodynamic features as the inputs to a multi-layer-perceptron similarly failed to produce strong predictions35. The SANDSTORM-toehold architecture can match the most successful model from previous studies35,36 (MSE p-value = 0.13, two-sided t-test) values while using 67% fewer trainable parameters (Fig. 3a).

5′ UTR prediction

Several cis-regulatory elements affect the translational efficiency of full-length messenger RNAs in eukaryotic cells. One such element that can have a profound effect on protein production is the 5′ UTR; however, a detailed prediction of the regulatory effect cannot be captured by first-order sequence or structural relationships42,43. To survey the space of regulatory UTRs, Sample et al.37 conducted a high-throughput polysome profiling assay and constructed a predictive CNN using the resulting sequence-function dataset. Our SANDSTORM-UTR architecture was able to match the previously reported performance metrics (MSE p-value = 0.32, two-sided t-test) on this dataset (Fig. 3b) while using only 9.7% of the number of trainable parameters, suggesting that the predictive capabilities of SANDSTORM models can be applied across RNA classes.

RBS-flanking sequence prediction

The RBS sequence motif is required for translation initiation in prokaryotic systems. While the core motif is necessary for translation to occur, the variable nucleotides surrounding the RBS can have a large impact on the resulting degree of protein production11,44. This effect was quantified in a high-throughput manner by promoting translation of a site-specific DNA recombinase off a library of different RBS-flanking sequences39. When translated, this recombinase can flip the orientation of downstream section of the DNA molecule, allowing the fraction of DNA readouts that are flipped after sequencing to serve as a proxy for translation efficiency. As with the other cases reported here, the relationship between RBS-flanking sequences and translation efficiency was originally modelled using a sequence-only probabilistic deep learning framework. The SANDSTORM-RBS model was able to again match the predictive performance of the previously published approach (R2 p-value = 0.50, Spearman p-value = 0.95, two-sided t-test) while using 95% fewer trainable parameters (Fig. 3c). As the original model published with this dataset used a probabilistic loss instead of MSE, the performance comparison consists of only the correlations between predicted and ground-truth outputs.

CRISPR Cas13a target-guide prediction

CRISPR/Cas nucleases enable site-specific cleavage of nucleic acids based on the spacer sequence supplied by a guide RNA2. Predicting the efficiency of on- and off-target sequences that could be cleaved by Cas nucleases is a major challenge in the development of CRISPR systems and has been approached with a variety of different methods. To tackle this challenge, high-throughput methods31,38 have been developed to map the relationship between gRNA/target RNA pairs and resulting Cas13a collateral cleavage efficiency. The SANDSTORM-Cas model is capable of significantly improving on the performance of the previously characterized sequence-only model38 (MSE p-value = 2.5 × 10−8, two-sided t-test) while using only 42% of the trainable parameters (Fig. 3d). Additionally, this test case demonstrates that SANDSTORM architectures are capable of predicting RNA functions that are not tied directly to a translational readout, further supporting the generalizability of this approach.

Implementing a generative architecture for RNA design

After establishing a generalized approach to predict RNA function, we aimed to develop a similarly robust framework for designing functional RNA sequences. While generative modelling has recently gained traction across a number of different molecular design tasks45,46,47,48,49,50,51,52,53 the heterogeneous functions of RNA create a distinct set of design constraints that are not easily accounted for in a generalized manner. Certain functional RNA molecules, such as 5′ UTR sequences, do not require the inclusion of specific sequence or structural motifs in order to function, while many naturally occurring RNA sequences must strictly adhere to a target structure or incorporate regulatory motifs to demonstrate their intended effect. More complex still are synthetic RNA switches, which must adopt a target conformation while retaining nucleotide diversity within their programmable target recognition regions. Various generative and thermodynamic design tools have flourished individually in each of these domains9,10,11,13,14,15,18,22,23,24,26,28,36,37,38,39,45,47,49,50,51,52,53; however, a generative counterpart to our SANDSTORM architecture that can be used regardless of the sequential or structural constraints of the design task has yet to be developed for functional RNA molecules.

In order to unify the strengths of both thermodynamic- and deep learning-based design approaches, we developed Generative Adversarial RNA Design Network (GARDN) models that could incorporate both variable and conserved sequence motifs, experimental predictions, and adherence to a target secondary structure. Generative Adversarial Networks54,55 have demonstrated striking results across a number of different domains, and further offer the ability to be directly optimized by a trained predictive model45. As a proof of concept for utilizing a generative adversarial network in tandem with our structure-informed predictive tools, we began with the least-constrained task of generating 5′ UTR sequences. While these sequences can have a profound effect on translation levels, there are no specific requirements that a design algorithm must adhere to for a candidate UTR to be considered valid. We first characterized a GARDN model that could faithfully recreate the sequential and structural attributes of the entire dataset used during training (Appendix 2, Supplementary Fig. 6). When optimizing this model with our trained SANDSTORM-UTR predictor, we could furthermore fine-tune the generated outputs to recreate the distribution of lowest- or highest-expression sequences while reflecting both global and local structural profiles (Appendix 2, Supplementary Fig. 7).

Following this proof of concept, we investigated the utility of the GARDN approach on the more specific design task presented by RBSs. These sequences must incorporate some portion of the Shine-Dalgarno sequence (AGGAGG) to efficiently promote translation; however, the context surrounding the core motif can greatly impact the resulting protein expression. We trained a GARDN model on the RBS dataset and optimized the output sequences using a trained SANDSTORM-RBS predictor. The resulting post-optimization sequence group favored the incorporation of the core Shine-Dalgarno sequence compared to the non-optimized sequences returned by the GARDN model (Fig. 4a, top, middle), indicating that sequence-function relationships learned by the predictor can directly guide the sequences returned by the generator during optimization. This preference was in line with the three highest-performing sequences reported in the experimental dataset (Fig. 4a, bottom), which also demonstrate the inclusion of this motif. We experimentally validated the performance of 5 RBS sequence pairs optimized by the GARDN-SANDSTORM routine, demonstrating a geometric mean fold change of 28.2 between pre- and post-optimized sequences (Fig. 4b). Strikingly, 3 out of 5 of the GARDN-SANDSTORM optimized sequences also exceeded the translational activity of the three highest performing sequences reported in the original dataset used for training (Fig. 4b). Adherence to the constraints of RBS design while also accounting for experimental performance highlights the utility of the GARDN-SANDSTORM across diverse RNA settings.

Fig. 4: GARDN design of ribosome binding sites.
figure 4

a The nucleotide distributions of RBS (Ribosome-binding site) sequences returned by the GARDN model (top), the GARDN model optimized by a trained SANDSTORM-RBS predictor (middle) and the three highest performing sequences from the experimental dataset used for training (bottom) (see Supplementary Data 2 for sequences). b GFP expression measurements in BL 21 Star (DE3) E. coli as determined by flow cytometry for RBS sequences designed using the GARDN-SANDSTORM approach and the three highest-performing sequences reported in the original dataset used for training. See Methods (Flow Cytometry Analysis) and Supplemental Information for representative gating strategy. Bars represent the fluorescence measurement mean ± s.d. Post-optimized bars denoted with * indicate a statistically significant (p < 0.05) increase in fluorescence compared to all three high-throughput examples as measured by a one-sided t test. Source data are provided as a Source Data file.

While the GARDN-RBS designs show significant performance improvements, the specificity of this RNA design task represents a challenge that generative models are intrinsically well-suited to address. Notably, when a highly conserved sequence motif is predominantly driving RNA function, a generative model must primarily identify and reproduce those nucleotides which are conserved across active members of the sequence group in order to return a valid candidate. We aimed to expand this framework to the far more challenging task proposed by the design of RNA switches, in which candidate sequences must adhere to a target secondary structure in a sequence-independent manner while also incorporating conserved nucleotide domains. For this task, we focused on the design of toehold switch riboregulators.

During preliminary tests, we found the generative architectures that enabled UTR and RBS generation struggled to adhere exactly to the target stem-loop structure required for toehold switch devices to function (Supplementary Fig. 8). Previous generative approaches to toehold switch design have approached this problem by manually correcting generated constructs that have been optimized using activation maximization, which introduces non-trivial changes to the predicted score36. Here, toehold switch generation was accomplished by adapting a thematically similar approach to previous RNA inverse design software, where a target secondary structure can be specified by a user. This took the form of a reverse complementation layer in the generator network which operated on the latent nucleotide array, mapping the regions upstream and downstream of the RBS that base pair in the toehold switch before being passed through a final convolutional output layer. This development allowed several improvements offered through differentiable design while converging efficiently (Supplementary Fig. 9) and retaining the ability to design molecules with a programmable target secondary structure.

We analyzed the resulting secondary structures of GARDN-generated toehold switches as well as those designed using activation maximization, constrained activation maximization (where the conserved sequence regions are included in the input seed)49,52, and NUPACK14. For a qualitative assessment of design quality, we computed the MFE structures of the toehold switch designs and overlaid them on top of one another with a degree of transparency to visualize the ensemble of secondary structures generated by the different design strategies (Fig. 5a–f). In this representation, the most likely sequential character at each position was additionally calculated, and the resulting overall structure dictated by this calculation is displayed in bold (a linear, unpaired structure is displayed for sequence groups that return an invalid structure). For quantitative assessment, we employed a metric that could provide single-nucleotide resolution of structural agreement based on the expected paired or unpaired state of each nucleotide across an entire group of sequences (see Methods). This structural agreement metric was used to score how well the toehold switches designed using each algorithm adhered specifically to the target canonical toehold switch structure (Fig. 5g, h). Sequences designed using NUPACK, without any added constraints on the target RNA sequence, showed the optimal structural distribution, as expected (Fig. 5a, h). The experimental toehold switch dataset used during the training process, which featured target RNA sequences from viruses and human genome sites, displayed lower structural agreement due to the additional sequence constraints but adopted the expected toehold switch secondary structure (Fig. 5b, h). In contrast, toehold switches designed using either standard or constrained activation maximization did not converge to a structure that would be expected to exhibit the key switching mechanism required to function (Fig. 5c, d), resulting in a significant reduction in structural agreement (Fig. 5h). Designs generated using activation maximization also failed to recapitulate the conserved RBS and start codon sequences necessary for translation (Supplementary Fig. 10). GARDN-generated toehold switches, however, demonstrated a steady increase in structural agreement over adversarial training iterations, until finally converging to a value which replicates the distribution of secondary structures of the experimental toehold switch dataset used during the training process (Fig. 5g, h). We also used GARDN in tandem with the SANDSTORM-toehold predictor to generate toehold switches optimized to have extreme ON values. These GARDN-SANDSTORM designs not only retained the ensemble secondary structures of the training dataset during optimization, but additionally incorporated specific sequence preferences that matched those previously correlated with high ON-state expression in experiments based on thermodynamic considerations (Fig. 5f, h and Supplementary Fig. 10). These results demonstrate that a combination of the adversarial training setup and optimization guided by a structurally informed predictive model can allow for thermodynamically realistic, high-performance sequences to be designed using GARDN models.

Fig. 5: GARDN Models can design toehold switches with realistic secondary structures.
figure 5

af Ensemble overlays of the secondary structures of toehold switch sequences designed using a variety of algorithms and those included in the experimental training data: (a) NUPACK design algorithm applied without target RNA sequence constraints, (b) the high-throughput experimental dataset35 used for model training, (c) single sequence activation maximization, (d) single sequence activation maximization with starting seeds containing the constant motifs, (e) GARDN, and (f) GARDN-SANDSTORM with toehold switches optimized to have high ON values (n = 300 af). Predicted structures were calculated using standard structural prediction software13,14,15, with the bolded structure representing the most likely structural character at each position (unpaired for heterogenous sequence groups where the most likely structure is not valid). g Structural agreement of GARDN-generated toehold switch sequences increases over adversarial training iterations, until converging to the experimental average. h Final structural agreement between toehold switches designed using different algorithms and the target canonical toehold switch secondary structure (quantification of af see Methods). i ON-value optimization results for GARDN-designed toehold switch sequences against the SANDSTORM-toehold predictive model. 300 calls to the predictive model resulted in a 37% increase in average ON score. Source data are provided as a Source Data file.

Previous approaches have relied on post-hoc editing of designed sequences to achieve a plausible binding structure. Introducing manual corrections after completing the optimization procedure introduces a non-trivial disconnect between the model’s prediction of the sequence’s function before and after the adjustment. By optimizing a generator with an output that adheres to the expected grammar of a toehold switch, the latent space providing inputs to the GARDN models can be efficiently explored with respect to ON values without having to make manual edits to the resulting sequence that are destructive to the validity of the prediction (Fig. 5f). The structural agreement traces of both GARDN and GARDN-SANDSTORM constructs (Supplementary Fig. 11) reveal that these sequences deviate from the target toehold switch structure at the same loci as the experimental dataset, demonstrating that both global and local secondary structural profiles are faithfully recreated by the generator. While sequences designed using NUPACK showed the optimal structural distribution, this effect is expected as the GARDN models were trained to recreate the experimental distribution (Fig. 5b) encountered during training.

We found that a pre-trained GARDN model also demonstrates better nucleotide variability (Supplementary Fig. 10) and faster generation time (Supplementary Fig. 11) than standard inverse design algorithms13,14,15, with the added benefit of having a score predicting design function. This variability is critical in in diagnostic applications where a limited set of solutions can impede the range of possible target molecules which activate the system and synthetic biology applications where high overlap between multiple constructs can lead to unintended cross-talk3. The lightweight SANDSTORM prediction architecture results in a simplified optimization landscape allowing for efficient and reliable sequence design. Using GARDN-SANDSTORM, we were able to achieve an average 37% percent increase in predicted normalized ON values of GARDN-generated sequences with only 300 calls to the predictive model (Fig. 5i). We repeated this procedure using several variations of the SANDSTORM architecture that varied in total parameter count, finding that models with a larger number of trainable parameters tended to over-estimate the level of improvement demonstrated between pre- and post-optimization sequences (Supplementary Fig. 12). Improving the predicted scores of generated constructs was possible without sacrificing the realistic sequence contents of the resulting designs (Supplementary Fig. 10). Toehold switches optimized by GARDN-SANDSTORM demonstrated superior variation in the trigger and stem domains compared to other approaches while maintaining the constant RBS and start codon motifs. Moreover, comparing the sequence contents of GARDN-optimized toehold switches to those optimized with standard activation maximization and constrained activation maximization demonstrates that in the absence of a generator network, optimized sequences will trend towards solutions that trivially maximize the score of the predictor at the expense of both sequence and structural design constraints (Fig. 5c, d).

Experimental validation of GARDN-SANDSTORM toehold switches

To verify the performance improvement afforded by GARDN-SANDSTORM generative RNA design, we experimentally characterized the performance of several groups of toehold switch sequences designed using different algorithms. These groups consisted of non-optimized outputs from the GARDN models, GARDN sequences optimized to have high ON-state GFP expression using an ON-only SANDSTORM predictive model, GARDN sequences designed to have both a high ON state GFP expression and low OFF state GFP expression using a joint ON/OFF SANDSTORM model, and those designed using NUPACK (see Source Data for sequences). For the most straightforward comparison between the design algorithms, we utilized the core NUPACK algorithm to design the 60-nt switch sequence only, rather than incorporating potential additional context such as the coding sequence or RBS, as the effect of incorporating these factors is not represented in the high-throughput training data. The resulting designs were characterized in Escherichia coli in conditions matching the toehold switch dataset used for training35 with the switch RNA on its own used to evaluate the OFF state and an RNA featuring a switch-trigger fusion used for the ON state (see Methods). Compared to the default GARDN output, sequences optimized with an ON-only SANDSTORM predictive model show a 3.8-fold increase in the geometric mean of GFP fluorescence from their parent sequences in the presence of the cognate trigger after only 300 calls to the predictive model (Fig. 6a; total optimization runtime ~11 s). The ON-optimized GARDN-SANDSTORM sequences additionally demonstrated an average fluorescence 4.8-fold higher than NUPACK (Fig. 6b). This result highlights the ability of our optimization process to improve specific attributes of RNA sequences.

Fig. 6: GARDN-designed toehold switches show improved experimental performance in E. coli.
figure 6

a Individual toehold switch sequences designed by GARDN to have a high ON-state expression (see Supplementary Data 3 for sequences) show a geometric mean of 3.8-fold increase in GFP expression in the presence of their cognate trigger in E. coli after optimization by a SANDSTORM predictive model (p = 0.016, one-sided Wilcoxon ranked-sum test). b GARDN and ON-optimized GARDN-SANDSTORM toehold switches demonstrate a 1.9-fold and 4.8-fold increase in ON state fluorescence (p = 0.008, p = 0.002, one-sided Mann–Whitney U test). c, Toehold switches designed to exhibit high ON/OFF ratios by the GARDN-SANDSTORM routine demonstrate a geometric mean increase in ON/OFF ratio of 3.7-fold using 300 calls to a predictive SANDSTORM model (p = 0.016, Wilcoxon ranked-sum test). d Both non-optimized GARDN sequences and GARDN-SANDSTORM optimized groups show a significant increase in performance of 3.8-fold (p = 0.002, one-sided Mann–Whitney U test) and 11.9-fold (p = 0.002, one-sided Mann–Whitney U test), respectively, compared to those designed using classic inverse design software. See Methods (Flow Cytometry Analysis) and Supplemental Information for representative flow cytometry gating strategy. Bars represent the mean ± s.d. Source data are provided as a Source Data file.

Optimization of ON/OFF ratios is a more complex task, as the predictor must convey meaningful information related to the performance of each sequence in two independent experimental states. Furthermore, the sequence or structural factors that minimize leakage in the OFF state, such as high GC content that creates a strong stem, are frequently at odds to those that maximize expression in the ON state, where an AU-rich stem can more readily unwind leading to increased expression3. Despite this dynamic tradeoff, GARDN-SANDSTORM optimization by a predictor trained on the ON/OFF ratios of toehold switch sequences demonstrated a geometric mean of 3.7-fold increase in ON/OFF ratios compared to their non-optimized counterparts (Fig. 6c). These findings demonstrate the capability of the GARDN-SANDSTORM approach to optimize RNA molecules with complex, multi-faceted functions, a key factor for the design of RNA across modern biotechnology settings.

Both the non-optimized GARDN and GARDN-SANDSTORM sequence groups demonstrated remarkable improvements of 3.8-fold and 11.9-fold, respectively, in experimental ON/OFF ratios compared to those designed using NUPACK (Fig. 6d). Importantly, this increase in performance does not come at the expense of either the sequence or structural realism of the designs.

Screening a limited library of aptaswitches for SANDSTORM training

The use of generative modeling for molecular design is fundamentally constrained by the need to train on large quantities of experimentally collected data. This limitation can be prohibitive for the creation of new classes of riboregulators, in which only a handful of examples may show the desired regulatory effect in initial experiments. Despite this, there are many common design motifs that appear across a diverse range of functional RNA classes. For example, many recent works have utilized programmable RNA-RNA interactions to control cellular translation levels, or they have coupled toehold-switch-like secondary structures to the formation of a various regulatory motifs56,57,58,59,60,61. These systems have been engineered in settings ranging from cell-free systems to in vivo animal models. However, these same design motifs appear to consistently regulate diverse RNA processes. The use of generative modelling for these applications has thus far been limited by a lack of available training data.

We investigated if the design approach of coupling toehold switch-like structures to new regulatory motifs could be achieved by optimizing our GARDN model with a predictor trained in a new domain. To emulate settings in which high-throughput data may not be available, we used a low-throughput, manually collected library of 384 sequences. As the template for this investigation, we used the recently developed aptaswitch system58, which couples a stem-loop RNA secondary structure to the formation of a reporter aptamer via toehold-mediated strand displacement (Fig. 7a). Toehold switches and aptaswitches have common design features and critical differences that made them a plausible, yet challenging, candidate for our design strategy. Similar to the toehold switch, the aptaswitch stem-loop structure is unwound in response to a complementary RNA or DNA target. Unlike the toehold switch, however, the region of the aptaswitch released by target binding is used to form a downstream stem domain, promoting folding of the reporter aptamer and enabling binding of the ligand. For diagnostic purposes, the aptaswitch reporter aptamer is often Broccoli, which associates with its ligand DFHBI-1T to generate a bright green fluorescent signal62 Thus, output from the aptaswitches is not tied to translation of a reporter and aptaswitch activation requires a more complicated shift in structural conformations between the ON and OFF states compared to toehold switches. We constructed and screened an aptaswitch library by sampling 384 sequences from the Angenet-Mari et al.35 toehold switch dataset and converting these template hairpins into 137-nt aptaswitch constructs using the Broccoli aptamer as the reporter (Fig. 7b, c).

Fig. 7: Screening a limited dataset of aptaswitches.
figure 7

a Aptaswitches couple a toehold switch-like RNA hairpin to the conditional formation of a reporter aptamer. Fluorescence is generated in the presence of both the complementary target DNA molecule and the aptamer ligand. b A library of 384 aptaswitches were screened experimentally to evaluate the SANDSTORM model’s capacity on limited datasets. c The aptaswitch library was constructed by converting toehold switch sequences from Angenet-Mari et al35. into aptaswitches and manually characterized for their ON/OFF ratios (see Supplementary Data 4 for sequences). d The performance of converted aptaswitch elements varied greatly from their original context, demonstrating the need for a domain-specific predictor to be trained on the novel dataset. Toehold switch ON/OFF ratios are reported in the normalized value range of −1 to 1 from the original high throughput dataset. e The corresponding aptaswitch for all possible tiles of the covid N gene (n = 1230) were ranked in silico using SANDSTORM predicted ON/OFF ratios. The same process was repeated, instead using the NUPACK ensemble defect of the switch complex or switch-trigger complex to identify optimal regions from the same pool of candidate tiles. f The ON/OFF ratios of the top 6 aptaswitches identified using each approach are reported in addition to the 6 predicted lowest sequences identified using SANDSTORM. Bars represent the population mean ± s.d. Source data are provided as a Source Data file. Created in BioRender. Riley, A. (2025) https://BioRender.com/j80a514.

The converted aptaswitch library demonstrated a range of ON/OFF ratios spanning three orders of magnitude (Fig. 7b). While these devices contain many of the same structural elements as toehold switches, they were screened against free DNA targets and controlled a distinct biological output, resulting in performance that varied greatly from their original context (Fig. 7d). In order to apply our generative approach given this variation, a SANDSTORM predictive model needed to be characterized on the sequences as measured in their aptaswitch conformation. In data-scarce conditions such as these, deep learning predictive models are extremely susceptible to overfitting. While transfer learning can be efficiently used to fine-tune a pretrained predictive model on scarce examples in a new domain, we sought to investigate if the computational efficiency of the SANDSTORM architecture would enable us to train from scratch using this dataset. After benchmarking the model on 100 unique training-testing splits, we confirmed that the SANDSTORM-aptaswitch model reliably converged in 37 out of 100 training instances (Supplementary Figs. 13 and 14), achieving a median spearman rank correlation of 0.195 (Supplementary Fig. 13a, Appendix 3). We further validated that variation in predictive model performance was not due to favorable splitting of the data (Supplementary Fig. 13a–e, Appendix 3). This allowed us to confidently filter for models that demonstrated accurate training with minimal risk of selecting based on trivial performance driven by specific, favorable data samples.

We experimentally validated the predictive capacity of the trained SANDSTORM-aptaswitch models on unseen sequences in a setting emulating novel diagnostic design, where a high-performance aptaswitch sequence providing a high ON/OFF ratio against a specific pathogen target must be rapidly identified from a pool of hundreds of candidate sequences (Fig. 7e). A library of unseen aptaswitch sequences designed to recognize all 30-nt tiles of the SARS-CoV-2 N gene (N = 1230) was created in silico, and these sequences were ranked according to their predicted ON/OFF values using mean predictions across each of the successfully trained models. For comparison, we used NUPACK to create a ranked list of sequences from the same pool of possible tiles, using the ensemble defect of the switch (NUPACK Switch) or switch/trigger complex (NUPACK Complex) to rank preferrable sequences. The six highest predicted sequences for SANDSTORM demonstrated a mean ON/OFF ratio of 62.46, nearly tenfold higher than the mean ON/OFF ratio of the six predicted lowest of 6.69 (Fig. 7f, p = 0.0076, one-sided Mann–Whitney U test), confirming the performance of the model on unseen sequences despite the extreme scarcity of available data. In comparison, the top six sequences identified by NUPACK Switch and Complex approaches demonstrated mean ON/OFF ratios of 38.59 and 56.25 (Fig. 7f). SANDSTORM additionally identified the highest individual aptaswitch from the pool of tiles across each of the algorithmic approaches tested, a critical result for the deployment of this technology for rapid diagnostic design. Evaluating the entire N gene using a single SANDSTORM model required less than a second, while structural evaluations of this pool using NUPACK required 34.7 s. This efficiency is particularly relevant for real pandemic response settings, where the full-length genomes of multiple variants and closely related pathogens must be considered for diagnostic targeting on rapid timelines. This overall performance of SANDSTORM compared to thermodynamic tools demonstrates the potential for utilizing this framework to rapidly design RNA devices with limited experimental screening.

GARDN design of aptaswitches

To investigate whether the GARDN-toehold model could be optimized by the aptaswitch predictors to return high-performance sequences in this new domain, we conducted a computational sequence design experiment using each of the 37 successfully trained models. During optimization, the 60-nt toehold switch returned by the GARDN model was concatenated to the aptamer core and stem-forming b domain before being scored by a specific SANDSTORM predictor (Fig. 8a). Each SANDSTORM model was tasked with independently optimizing 100 converted aptaswitch sequences returned by the GARDN-toehold model for 300 optimization steps per sequence. From this group, we identified the specific model that demonstrated the greatest improvement between the pre- and post-optimized sequences (Model A) as ranked by the mean predicted ON/OFF of the 36 other models to carry forward for experimental validation. We hypothesized that the consensus between models regarding the performance of sequences designed under the guidance of model A would translate to the largest improvement in experimental performance. The GARDN-SANDSTORM optimized sequences demonstrated a geometric mean improvement in ON/OFF ratio of 1.49 across twelve pre- and post-optimization aptaswitches (Fig. 8b), with the post-optimization group demonstrating a mean ON/OFF ratio of 65.21 compared to the pre-optimization average of 40.45 (p = 0.1, one-sided Wilcoxon signed-rank test). For comparison, we utilized NUPACK to design a group of 12 aptaswitches without any additional sequence constraints, which demonstrated a similar mean ON/OFF ratio of 67.92 (Fig. 8b).

Fig. 8: GARDN design of aptaswitches using limited data.
figure 8

a The pretrained GARDN-toehold model was optimized by a SANDSTORM predictor trained on aptaswitch performance. The GARDN model generates 60-nt toehold switch hairpins from a random variable (Z), which were manually concatenated to the aptamer core as well as the necessary repeat of the (b) domain for aptamer formation during runtime. b The ON/OFF ratios of sequences designed using NUPACK, GARDN-SANDSTORM, and GARDN-SANDSTORM under an extended optimization routine (see Supplementary Data 5 for sequences). Data points are the normalized ON/OFF ratios for individual aptaswitch sequences. Bars are the population mean ± s.d. c The ON/OFF ratios of the paired pre- and post-optimization samples designed by SANDSTORM model A for 300 optimization steps. 7/12 samples show an improved post-optimization ON/OFF value. d The ON/OFF ratios for paired samples resulting from 600 optimization steps under the guidance of model A. 10/12 samples show an improved post-optimization ON/OFF value. Bars indicate the normalized ON/OFF value extrapolated from the ON and OFF state measurements demonstrated at three different aptaswitch concentrations. Source data are provided as a Source Data file.

Unlike in our investigation of toehold switches and RBSs, where every optimized sequence showed improvement, we found that 7/12 sequence optimization pairs demonstrated improvements for model A under identical optimization conditions to those explored throughout our previous experiments (Fig. 8c). While this change was not as drastic as the case of using SANDSTORM alone to screen aptaswitch candidates, this was to be expected given that the pre-optimization sequences were not initially selected for low performance, but were random. We further explored if a prolonged optimization relative to the initial settings could improve the downstream performance of the returned aptaswitches. We increased the number of steps in the GARDN-SANDSTORM design process from 300 to 600, again using model A to guide the procedure. Under these conditions, the post-optimization group demonstrated a mean ON/OFF ratio of 74.02 compared to the average of 52.57 for the initial group (p = 0.039, one-sided Wilcoxon signed-rank test) with 10/12 sequences improving. The geometric mean of the improvement demonstrated by optimized sequences was 2.07, (Fig. 8d) indicating the utility of this approach despite the limited availability of data. The highest ON/OFF ratios for the post-optimized GARDN-SANDSTORM aptaswitch groups were 161.9× and 158.3×, despite encountering only 14 sequences (3.6% of the dataset) greater than or equal to this activity during training. These results demonstrate the potential for SANDSTORM and GARDN models to serve as valuable components of the RNA design toolkit regardless of the functional complexity or data limitations of the design task.

link

Leave a Reply

Your email address will not be published. Required fields are marked *

Copyright © All rights reserved. | Newsphere by AF themes.