1. Introduction
The trajectory of modern pharmacology is shifting in ways that are both subtle and consequential. For decades, drug discovery has followed a protein-centric paradigm: identify a disease-associated protein, characterize its structure, and design small molecules capable of modulating its function. This approach has been successful in many therapeutic domains, but its limitations have become increasingly apparent. Estimates suggest that only a modest fraction (roughly 10–15%) of disease-associated proteins have been effectively targeted by existing drugs, leaving much of the proteome either inaccessible or “undruggable” because of structural constraints such as the absence of suitable ligand-binding pockets (Hopkins & Groom, 2002; Overington et al., 2006).
At the same time, advances in genomics have expanded our understanding of what constitutes the functional genome. It is now widely recognized that a substantial proportion of the human genome, approaching 70%, is transcribed into RNA, yet only a small subset of these transcripts encodes proteins. The remainder comprises a diverse class of molecules collectively termed non-coding RNAs (ncRNAs). Initially dismissed as transcriptional noise, these molecules are now understood to play critical regulatory roles across nearly all layers of gene expression (Bartel, 2004). This realization has fundamentally altered the conceptual landscape of therapeutic targeting.
Among the various classes of ncRNAs, microRNAs (miRNAs), long non-coding RNAs (lncRNAs), and circular RNAs (circRNAs) have emerged as particularly influential. These molecules do not merely accompany gene expression; they actively regulate it. miRNAs, for instance, regulate gene expression post-transcriptionally by binding to target mRNAs, thereby influencing translation and degradation pathways (Bartel, 2004; Calin & Croce, 2006). lncRNAs, in contrast, exhibit a broader functional repertoire, participating in chromatin remodeling, transcriptional regulation, and scaffolding of protein complexes (Chen et al., 2013). CircRNAs, once considered rare anomalies, are now recognized as stable, conserved molecules capable of acting as miRNA sponges, thereby modulating gene regulatory networks indirectly (Hansen et al., 2013; Memczak et al., 2013; Salzman et al., 2012).
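As a rough illustration of how miRNA binding to target mRNAs is often approximated computationally, the sketch below scans an mRNA fragment for sites complementary to the miRNA seed (nucleotides 2–8). It assumes canonical Watson-Crick seed pairing only, which is a simplification of the target recognition described by Bartel (2004). The function names and the toy mRNA fragment are hypothetical; the miRNA string is the commonly listed mature let-7a-5p sequence and is used here purely as an example input.

```python
# Watson-Crick complements for RNA (G:U wobble pairs are ignored in this sketch).
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def reverse_complement(rna: str) -> str:
    """Return the reverse complement of an RNA sequence written 5'->3'."""
    return "".join(COMPLEMENT[base] for base in reversed(rna))


def seed_sites(mirna: str, mrna: str) -> list[int]:
    """Return 0-based start positions in `mrna` that pair with miRNA nt 2-8."""
    seed = mirna[1:8]                 # nucleotides 2-8 (1-based), read 5'->3'
    site = reverse_complement(seed)   # what the mRNA must contain, read 5'->3'
    width = len(site)
    return [i for i in range(len(mrna) - width + 1) if mrna[i:i + width] == site]


# Example inputs: mature let-7a-5p and a made-up mRNA fragment (hypothetical).
LET7A = "UGAGGUAGUAGGUUGUAUAGUU"
TOY_MRNA = "AAGCUACCUCAUUUCUACCUCGG"

if __name__ == "__main__":
    print(seed_sites(LET7A, TOY_MRNA))
```

Practical target prediction goes well beyond exact seed matching, for example by weighing site context and conservation, so this should be read as a starting point rather than a predictor.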
The biological significance of these ncRNAs is further underscored by their involvement in disease. Aberrant expression or dysfunction of miRNAs has been linked to cancer progression, neurodegenerative disorders, and metabolic diseases (Calin & Croce, 2006; Jiang et al., 2009). Similarly, lncRNAs and circRNAs have been implicated in pathological processes ranging from tumorigenesis to drug resistance mechanisms (Chen et al., 2013; Hansen et al., 2013). In this context, ncRNAs are not merely biomarkers; they represent a vast, largely untapped reservoir of therapeutic targets.
Yet, translating this promise into practical drug discovery strategies has proven to be anything but straightforward. The concept of targeting RNA is not entirely new. Indeed, the bacterial ribosome—an RNA-rich complex—has long served as a successful target for antibiotics, demonstrating that RNA structures can, under certain conditions, be selectively modulated by small molecules (Davis, 1987; Poehlsgaard & Douthwaite, 2005). More recently, the development and clinical approval of risdiplam, a small molecule that modifies mRNA splicing in spinal muscular atrophy, has provided compelling evidence that RNA-targeted therapies can achieve clinical efficacy in humans (Ratni et al., 2018).
Despite these advances, several challenges persist. One of the most immediate—and perhaps underappreciated—limitations lies in our incomplete understanding of RNA structure. Unlike proteins, which often adopt well-defined and relatively stable three-dimensional conformations, many RNAs exhibit dynamic, context-dependent structures that are difficult to characterize experimentally (Rouskin et al., 2014; Rivas et al., 2017). Techniques such as X-ray crystallography and nuclear magnetic resonance (NMR) spectroscopy, while powerful, are not always feasible for large or flexible RNA molecules. Consequently, a substantial portion of the RNA structural landscape remains unresolved.
This structural ambiguity has direct implications for computational modeling. Traditional structure-based approaches, such as molecular docking, rely heavily on accurate three-dimensional representations of both targets and ligands. When applied to RNA, these methods often struggle with issues related to conformational flexibility, inadequate sampling, and limitations in scoring functions (Ruiz-Carmona et al., 2014; Trott & Olson, 2010). While adaptations of docking algorithms have been developed for nucleic acids, their predictive accuracy remains inconsistent, particularly for complex or poorly characterized RNA targets.
In response to these challenges, the field has increasingly turned toward data-driven methodologies, particularly those grounded in artificial intelligence (AI) and machine learning (ML). These approaches, rather than relying solely on explicit structural information, leverage patterns embedded within large-scale biological and chemical datasets. Early efforts in this direction often employed similarity-based methods, drawing on curated databases of known RNA–ligand interactions to infer potential binding relationships. Platforms such as Inforna exemplify this strategy, enabling sequence-based design of small molecules targeting structured RNAs (Disney et al., 2016; Velagapudi et al., 2014).
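The similarity-based idea can be sketched as a nearest-neighbour lookup: a query compound is compared against ligands with known RNA partners, and partners of sufficiently similar ligands are proposed as candidates. The example below assumes ligands are encoded as sets of substructure feature identifiers and compared with the Tanimoto (Jaccard) coefficient. It is not a description of Inforna's algorithm; the ligand names, feature sets, motif labels, and the 0.5 threshold are all invented placeholders.

```python
def tanimoto(a: frozenset, b: frozenset) -> float:
    """Tanimoto (Jaccard) similarity between two feature sets."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


# Hypothetical reference set: ligand -> (feature set, RNA motif it is known to bind).
KNOWN = {
    "ligand_A": (frozenset({1, 2, 3, 4}), "motif_X"),
    "ligand_B": (frozenset({3, 4, 5, 6}), "motif_Y"),
    "ligand_C": (frozenset({7, 8, 9}), "motif_Z"),
}


def infer_targets(query: frozenset, known: dict, threshold: float = 0.5) -> list:
    """Propose RNA motifs whose known ligands resemble the query compound."""
    hits = []
    for name, (features, motif) in known.items():
        score = tanimoto(query, features)
        if score >= threshold:
            hits.append((motif, name, score))
    # Most similar reference ligand first.
    return sorted(hits, key=lambda hit: hit[2], reverse=True)


if __name__ == "__main__":
    query = frozenset({1, 2, 3, 5})   # hypothetical query compound
    for motif, ligand, score in infer_targets(query, KNOWN):
        print(f"{motif} (via {ligand}, similarity {score:.2f})")
```

The threshold controls the trade-off between coverage and confidence: lowering it proposes more candidate interactions at the cost of weaker supporting similarity.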
More recently, however, there has been a shift toward more expressive modeling frameworks. Deep learning architectures, particularly graph convolutional networks (GCNs) and attention-based models, have demonstrated a strong capacity to capture complex, non-linear relationships within high-dimensional data (Kipf & Welling, 2017; Alipanahi et al., 2015). By representing RNA sequences and drug molecules as graphs or embeddings, these models can extract meaningful features without requiring detailed structural annotations. This capability is especially valuable in the context of ncRNAs, where structural data are often sparse or unavailable.
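To make the graph representation concrete, the following sketch implements a single graph convolution step in the form given by Kipf & Welling (2017), H' = ReLU(D^(-1/2) (A + I) D^(-1/2) H W), in plain Python. The three-node chain graph, the one-hot node features, and the weight matrix are hypothetical toy values chosen for readability; a real model would learn the weights and stack several layers.

```python
import math


def normalized_adjacency(adj):
    """Add self-loops and apply symmetric degree normalization."""
    n = len(adj)
    a_hat = [[adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)] for i in range(n)]
    deg = [sum(row) for row in a_hat]
    return [[a_hat[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)] for i in range(n)]


def matmul(x, y):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*y)] for row in x]


def gcn_layer(adj, features, weights):
    """One propagation step: aggregate neighbours, transform, apply ReLU."""
    propagated = matmul(normalized_adjacency(adj), features)
    transformed = matmul(propagated, weights)
    return [[max(0.0, value) for value in row] for row in transformed]


# Toy graph (hypothetical): a three-node chain, e.g. three bonded atoms.
ADJ = [[0, 1, 0],
       [1, 0, 1],
       [0, 1, 0]]
# One-hot node features for two node types (assumed encoding).
FEATURES = [[1.0, 0.0],
            [1.0, 0.0],
            [0.0, 1.0]]
# Untrained, hand-picked weights (2 input features -> 2 hidden features).
WEIGHTS = [[1.0, -1.0],
           [0.5, 1.0]]

if __name__ == "__main__":
    for row in gcn_layer(ADJ, FEATURES, WEIGHTS):
        print([round(value, 3) for value in row])
```

The same function applies unchanged to any adjacency matrix, so a drug molecule (atoms and bonds) and an RNA (for instance nucleotides linked by backbone and base-pair edges, an assumed encoding) can be embedded with one shared mechanism.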
The application of these AI-driven approaches to ncRNA–drug interaction prediction holds considerable promise, particularly in areas such as drug resistance and personalized medicine. miRNAs, for example, can regulate the expression of genes involved in drug metabolism and efficacy, thereby influencing therapeutic outcomes (Rees et al., 2016). CircRNAs, through their interaction with miRNAs, add a further layer of regulatory complexity that may affect drug sensitivity in ways that are only beginning to be understood (Hansen et al., 2013; Jeck et al., 2013). Accurately modeling these interactions could enable the identification of novel therapeutic targets and inform the design of more effective, individualized treatment strategies.
Nevertheless, it would be premature to suggest that current models have fully overcome the inherent challenges of this domain. Data sparsity remains a significant obstacle, as experimentally validated RNA–drug interaction datasets are limited in both size and diversity. Moreover, issues of model interpretability and generalization persist, raising important questions about the reliability of predictions in novel biological contexts. Addressing these limitations will likely require the integration of heterogeneous data sources—combining sequence information, structural predictions, functional annotations, and clinical data—alongside the development of more robust validation frameworks.
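One way to probe generalization to novel biological contexts is to hold out entire RNAs, so that no RNA in the test set was seen during training. The sketch below shows such a grouped split under stated assumptions: interactions are stored as (RNA, drug, label) tuples, and the function name, the 25% test fraction, and the example records are hypothetical. It illustrates one possible validation design, not a protocol drawn from the studies cited here.

```python
import random


def split_by_rna(pairs, test_fraction=0.25, seed=0):
    """Split (rna, drug, label) records so test RNAs never appear in training."""
    rnas = sorted({rna for rna, _drug, _label in pairs})
    rng = random.Random(seed)          # fixed seed keeps the split reproducible
    rng.shuffle(rnas)
    n_test = max(1, round(len(rnas) * test_fraction))
    test_rnas = set(rnas[:n_test])
    train = [p for p in pairs if p[0] not in test_rnas]
    test = [p for p in pairs if p[0] in test_rnas]
    return train, test


# Placeholder records (invented identifiers; label 1 = interaction reported).
PAIRS = [
    ("rna_1", "drug_a", 1), ("rna_1", "drug_b", 0),
    ("rna_2", "drug_a", 0), ("rna_2", "drug_c", 1),
    ("rna_3", "drug_b", 1), ("rna_4", "drug_c", 0),
]

if __name__ == "__main__":
    train, test = split_by_rna(PAIRS)
    print(len(train), "training records;", len(test), "held-out records")
```

A random split over individual pairs would let the same RNA appear on both sides and can overstate performance; grouping by RNA (or, analogously, by drug) gives a more conservative estimate.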
In sum, the field of ncRNA-targeted therapeutics, particularly when viewed through the lens of AI-driven modeling, appears to be at a critical juncture. There is reason for cautious optimism: the tools are becoming more powerful, the data more abundant, and the biological insights more nuanced. Yet the path forward remains uncertain. It is within this tension between possibility and limitation that the present review situates itself, examining both the progress achieved and the challenges that remain in the prediction of ncRNA–drug interactions.