1. Introduction
Rare diseases, by their very nature, occupy a paradoxical space in global health—each condition is uncommon, yet collectively they affect hundreds of millions of individuals worldwide. This duality creates a persistent tension in clinical practice: while the burden is substantial, the evidence base often remains fragmented, sparse, and unevenly distributed. For many patients, the journey toward diagnosis is neither linear nor timely; rather, it unfolds over years, sometimes decades, marked by uncertainty, misclassification, and, not infrequently, frustration. It is within this landscape—defined by complexity, rarity, and unmet clinical need—that artificial intelligence (AI) has begun to attract attention as a potentially transformative tool.
AI in healthcare is not entirely new, yet its recent evolution feels qualitatively different. Earlier computational approaches relied heavily on predefined rules or relatively constrained datasets, but contemporary machine learning (ML) and deep learning (DL) systems are increasingly capable of identifying subtle, nonlinear relationships across vast and heterogeneous biomedical data (Beam & Kohane, 2018; Jiang et al., 2017). These advances have led to notable successes in domains such as medical imaging and pattern recognition, where algorithms have, in some instances, approached or even surpassed human-level diagnostic performance (Esteva et al., 2017; He et al., 2019). Still, whether such achievements can be translated effectively into the rare disease context remains an open—and, perhaps, more complicated—question.
One of the most pressing challenges in rare disease care is delayed or missed diagnosis. Conventional diagnostic pathways often depend on clinician expertise, access to specialized testing, and the availability of prior comparable cases—resources that are inherently limited when dealing with rare conditions. AI-based systems, however, offer a different kind of promise. By integrating multimodal data—ranging from clinical symptoms and imaging to genomics and electronic health records—they can uncover patterns that may not be immediately apparent to human observers (Davenport & Kalakota, 2019). For instance, ML algorithms applied to acoustic signals have demonstrated potential in the early detection of neurological disorders such as Parkinson’s disease, highlighting how even non-traditional data sources can contribute to earlier clinical insight (Alalayah et al., 2023; Dao et al., 2025). Yet, these advances should be interpreted cautiously, particularly given ongoing concerns about generalizability across datasets and populations (Hireš et al., 2023).
Beyond diagnosis, AI is increasingly being explored for its role in modelling disease trajectories and informing clinical decision-making. Rare diseases often lack well-characterized natural histories, largely due to small patient populations and fragmented data collection. In such settings, predictive modelling may serve as a surrogate for long-term observational evidence. Multi-omics integration, for example, has begun to reveal complex biological signatures associated with disease subtypes and progression, particularly in areas such as hematological malignancies (Alhamrani et al., 2025). While these approaches hold considerable promise, they also raise methodological questions—especially regarding model validation, reproducibility, and the interpretation of effect sizes in small or heterogeneous cohorts (Deeks et al., 2008).
Still, it would be overly simplistic to frame AI as a purely technical solution to a clinical problem. Its integration into healthcare systems introduces a range of ethical, legal, and societal considerations that are difficult to disentangle from the technology itself. Issues of bias, for instance, are not merely theoretical; they emerge from the data on which models are trained and can perpetuate or even amplify existing inequities in care (Challen et al., 2019). Similarly, concerns about transparency and explainability—particularly in deep learning models—have prompted calls for greater accountability in algorithmic decision-making (Amann et al., 2020; Grote & Berens, 2020). The notion that clinicians should trust systems they cannot fully interpret remains, for many, an unresolved tension.
These concerns extend into broader discussions about the governance of AI in healthcare. Regulatory bodies, including the U.S. Food and Drug Administration, have begun to outline frameworks for evaluating AI-based medical technologies, emphasizing the need for continuous monitoring, validation, and post-market surveillance (FDA, 2021). Parallel efforts, such as the SPIRIT-AI and CONSORT-AI guidelines, aim to standardize the reporting of clinical trials involving AI interventions, thereby improving transparency and reproducibility (Cruz Rivera et al., 2020). Yet, despite these developments, there remains a lack of consensus on how best to operationalize such standards across diverse healthcare settings.
Ethical frameworks have also evolved in response to these challenges, often emphasizing principles such as fairness, accountability, and respect for human autonomy. Scholars have proposed unified approaches to AI ethics that attempt to balance innovation with societal responsibility, though the practical implementation of these principles is far from straightforward (Floridi & Cowls, 2019). In the context of precision medicine and rare diseases, questions of fairness take on additional significance, particularly when data scarcity may disproportionately affect already underrepresented populations (Ferryman & Pitcan, 2018). Likewise, concerns about the unintended consequences of large-scale AI systems—such as the propagation of misleading or biased outputs—have been raised in broader discussions about the limitations of modern machine learning architectures (Bender et al., 2021).
At the clinical level, the deployment of AI systems also necessitates careful consideration of workflow integration and practitioner training. While the theoretical benefits of AI—improved efficiency, enhanced diagnostic accuracy, and personalized treatment strategies—are often highlighted, their realization depends heavily on contextual factors such as infrastructure, clinician acceptance, and interoperability with existing systems (Char et al., 2018; Gerke et al., 2020). In some cases, the introduction of AI may even create new forms of complexity, particularly if systems are poorly calibrated or insufficiently validated.
Despite these challenges, it would be difficult to ignore the growing body of evidence suggesting that AI can meaningfully contribute to rare disease research and care. From early detection and phenotypic classification to predictive modelling and clinical decision support, AI technologies are beginning to reshape how rare diseases are understood and managed. Yet this transformation remains incomplete: it unfolds unevenly, shaped by technical limitations, ethical considerations, and the realities of clinical practice.
This review, therefore, takes a deliberately balanced approach. Rather than presenting AI as a definitive solution, it seeks to critically examine its current capabilities and limitations within the context of rare disease research. Specifically, the review synthesizes existing evidence on diagnostic accuracy, modelling approaches, and clinical utility, while also considering the methodological, ethical, and regulatory challenges that accompany the implementation of these technologies. In doing so, it aims not only to highlight what has been achieved, but also to clarify what remains uncertain—and, perhaps more importantly, what must still be addressed before AI can fully realize its potential in rare disease healthcare.