Bioinfo Chem

System biology and Infochemistry | Online ISSN 3071-4826
REVIEWS   (Open Access)

AI-Driven Predictive Toxicology: Integrating Systems Biology and Machine Learning for the Future of Drug Safety

Meenakshi Singh 1*, Sonia Yadav 2*
 

 


Bioinfo Chem 2 (1) 1-13 https://doi.org/10.25163/bioinformatics.2110738

Submitted: 17 June 2020 Revised: 10 August 2020  Published: 18 August 2020 


Abstract

Toxicology, as a discipline, seems to be standing at a subtle yet decisive crossroads. For decades, safety assessment relied on animal-based models—robust in some respects, yet increasingly questioned for their cost, ethical implications, and, perhaps most critically, their imperfect translation to human biology. This review explores the emerging paradigm of Integrative Predictive Toxicology (IPT), where systems biology, machine learning, and biokinetic modeling converge to offer a more mechanistically grounded and human-relevant framework. Rather than focusing solely on late-stage apical outcomes, IPT shifts attention toward early molecular perturbations, often structured through Adverse Outcome Pathways (AOPs). Advances in high-throughput screening and multi-omics technologies now generate data at a scale that, not long ago, would have seemed unmanageable. Yet, when paired with machine learning algorithms and supported by curated databases, these datasets begin to reveal predictive patterns of toxicity. Importantly, the integration of physiologically based pharmacokinetic (PBPK) modeling and quantitative in vitro-to-in vivo extrapolation (QIVIVE) provides a necessary bridge between in vitro signals and real-world human exposure. Still, this transition is not without uncertainty. Questions of model interpretability, data harmonization, and regulatory acceptance remain. And yet, taken together, these approaches suggest something quite compelling: a shift from observing toxicity after it occurs to anticipating it before it manifests—arguably redefining the future of drug safety itself.

Keywords: Predictive toxicology; Machine learning; Systems biology; PBPK modeling; Adverse outcome pathways

1. Introduction

The science of drug and chemical safety assessment—once grounded in a relatively stable set of experimental traditions—now appears to be undergoing a rather profound, if somewhat uneven, transformation. For decades, toxicological evaluation has relied predominantly on animal-based models, with endpoints that are, in many cases, overt and late-stage: mortality, organ pathology, or measurable biochemical disruption. These approaches have undeniably contributed to regulatory safety frameworks, yet their limitations have become increasingly difficult to overlook. High financial costs, extended study durations, and ethical concerns surrounding animal use have long been cited. Perhaps more importantly, however, there is a growing recognition that interspecies differences—in gene expression, metabolic capacity, and physiological regulation—complicate the extrapolation of animal findings to human outcomes, often contributing to unexpected failures in clinical development (Hartung, 2010; Leist et al., 2014).

This sense of inadequacy is not entirely new. As early as the late 20th century, discussions around improving risk assessment frameworks were already emerging (National Research Council [NRC], 1983). Yet, it was the landmark report Toxicity Testing in the 21st Century: A Vision and a Strategy that crystallized a more decisive shift in thinking. The report proposed, somewhat boldly at the time, that toxicology should transition from an observational discipline—focused on identifying adverse outcomes after exposure—to a predictive science rooted in mechanistic understanding at the cellular and molecular levels (Council, 2007). In retrospect, this proposal did not merely suggest methodological refinement; it called for a conceptual reorientation of the field.

The momentum generated by this vision has since been reinforced by regulatory and societal pressures. Legislative initiatives, particularly within the European Union, have accelerated the move away from animal testing, effectively compelling the scientific community to explore alternative methodologies. At the same time, advances in computational biology, high-throughput experimentation, and systems-level data integration have made such alternatives not only feasible but increasingly compelling. Still, the transition has been anything but linear. While enthusiasm for new approaches is evident, their validation, standardization, and regulatory acceptance remain ongoing challenges.

Central to this evolving landscape is the emergence of New Approach Methodologies (NAMs)—a collective term encompassing in vitro assays, in silico models, and high-content data streams derived from omics technologies. These approaches attempt to capture biological responses at a resolution that traditional methods often cannot achieve. For instance, high-throughput screening platforms, such as those developed under programs like ToxCast and Tox21, generate vast datasets describing chemical interactions across diverse biological targets (Kavlock et al., 2012). When combined with computational techniques, these datasets offer the potential to identify patterns and predictive signatures that would otherwise remain obscured.

Within this framework, the concept of the Adverse Outcome Pathway (AOP) has emerged as a particularly useful organizing principle. Rather than treating toxicity as a singular endpoint, AOPs describe a sequence of causally linked events, beginning with a molecular initiating event and progressing through intermediate biological changes to an adverse outcome at the organism level (Ankley et al., 2010). This structured representation allows researchers to focus on early, mechanistically relevant perturbations, thereby enabling earlier and potentially more sensitive predictions of toxicity. Yet, even here, one encounters a degree of complexity: biological systems are rarely linear, and mapping these pathways often involves navigating overlapping networks and context-dependent responses.

At the intersection of these developments lies an increasingly influential role for machine learning and computational modeling. Techniques such as random forests and other ensemble learning methods have demonstrated considerable utility in handling high-dimensional toxicological data (Breiman, 2001). Similarly, quantitative structure–activity relationship (QSAR) models and broader in silico toxicology approaches aim to predict chemical hazards based on molecular features and known biological interactions (Raies & Bajic, 2016). These tools, while powerful, are not without limitations. Their predictive performance depends heavily on data quality, representativeness, and the underlying assumptions embedded within model architectures. As such, their integration into regulatory decision-making continues to require careful scrutiny.
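To make the idea concrete, the sketch below trains a random-forest classifier on Morgan fingerprints, loosely mirroring the fingerprint-based models summarized in Table 1. It is a minimal illustration only: the SMILES strings, labels, and hyperparameters are placeholders rather than a curated toxicity dataset, and it assumes RDKit and scikit-learn are available.

```python
# Minimal sketch: random-forest toxicity classification from Morgan fingerprints.
# The compounds and labels below are placeholders, not a real hepatotoxicity dataset.
import numpy as np
from rdkit import Chem, DataStructs
from rdkit.Chem import AllChem
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

smiles = ["CCO", "c1ccccc1O", "CC(=O)Oc1ccccc1C(=O)O", "CCN(CC)CC"]  # placeholder compounds
labels = np.array([0, 1, 0, 1])                                      # placeholder toxic/non-toxic calls

def featurize(smi, radius=2, n_bits=2048):
    """Morgan fingerprint of a SMILES string as a NumPy bit vector."""
    fp = AllChem.GetMorganFingerprintAsBitVect(Chem.MolFromSmiles(smi), radius, nBits=n_bits)
    arr = np.zeros((n_bits,), dtype=int)
    DataStructs.ConvertToNumpyArray(fp, arr)
    return arr

X = np.array([featurize(s) for s in smiles])

clf = RandomForestClassifier(n_estimators=500, random_state=0)  # ensemble of decision trees
scores = cross_val_score(clf, X, labels, cv=2)                  # tiny CV only because the toy set is tiny
print("Cross-validated accuracy:", scores.mean())
```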

A persistent challenge in this domain—one that perhaps sits at the core of predictive toxicology—is the translation of in vitro findings to in vivo human contexts. Cellular assays, while highly informative, often lack the physiological complexity of whole organisms. They capture snapshots of biological activity but may not fully account for processes such as absorption, distribution, metabolism, and excretion (ADME). To address this, physiologically based pharmacokinetic (PBPK) modeling and quantitative in vitro-to-in vivo extrapolation (QIVIVE) have become essential components of modern toxicological workflows. These approaches aim to contextualize in vitro bioactivity within realistic exposure scenarios, effectively bridging the gap between experimental systems and human biology (Paini et al., 2017; Wetmore, 2015).

PBPK models, in particular, simulate the movement of chemicals through the body using mathematical representations of physiological compartments. When integrated with in vitro data, they allow for the estimation of internal exposure metrics, such as tissue concentrations over time. This integration supports the calculation of safety margins that are more directly relevant to human health than traditional external dose metrics (Bessems et al., 2014; Rotroff et al., 2010). Still, these models require extensive parameterization and validation, and uncertainties can propagate through each stage of the modeling process.
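A full PBPK model couples many physiological compartments through mass-balance differential equations; the fragment below shows the same principle on a deliberately reduced scale, simulating a single central compartment with first-order oral absorption. All parameter values (absorption and elimination rate constants, volume of distribution, dose) are assumed for illustration and are not drawn from any cited study.

```python
# Minimal sketch: one-compartment kinetics with first-order oral absorption, solved as ODEs.
# Parameter values are illustrative; a real PBPK model uses many physiological compartments.
import numpy as np
from scipy.integrate import solve_ivp

ka = 1.0         # absorption rate constant (1/h), assumed
ke = 0.2         # elimination rate constant (1/h), assumed
Vd = 42.0        # volume of distribution (L), assumed
dose_mg = 100.0  # single oral dose (mg), assumed

def rhs(t, y):
    """y[0]: amount in gut (mg); y[1]: amount in central compartment (mg)."""
    gut, central = y
    return [-ka * gut, ka * gut - ke * central]

sol = solve_ivp(rhs, (0.0, 24.0), [dose_mg, 0.0], dense_output=True)
t = np.linspace(0.0, 24.0, 200)
conc = sol.sol(t)[1] / Vd  # plasma concentration (mg/L) over time
print(f"Peak concentration ≈ {conc.max():.2f} mg/L at t ≈ {t[conc.argmax()]:.1f} h")
```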

Large-scale collaborative initiatives have played a significant role in advancing these methodologies. Programs such as SEURAT-1 have sought to develop integrated testing strategies capable of replacing repeated-dose animal studies, emphasizing data sharing, methodological standardization, and interdisciplinary collaboration (Whelan & Schwarz, 2011). Similarly, efforts to create centralized data repositories and standardized workflows have improved the accessibility and reproducibility of toxicological research (Berggren et al., 2017). Yet, despite these advances, the aspiration of a fully animal-free regulatory paradigm remains, at least for now, an ongoing process rather than an accomplished reality.

What becomes increasingly apparent is that no single methodology—whether experimental or computational—can fully address the complexities of toxicological prediction. Instead, the field appears to be moving toward integrated, systems-level approaches, where diverse data types and modeling strategies are combined into cohesive frameworks. These “metamodels,” as they are sometimes described, attempt to unify mechanistic insights, exposure assessments, and predictive analytics into a single interpretive structure. The promise here is considerable: a more accurate, efficient, and ethically aligned approach to drug safety evaluation. And yet, the realization of this promise depends on overcoming substantial scientific, technical, and regulatory hurdles.

In this context, the integration of systems biology with machine learning does not merely represent a technological advancement; it reflects a broader shift in how toxicity itself is conceptualized. No longer seen as an isolated endpoint, toxicity is increasingly understood as an emergent property of complex biological networks responding to chemical perturbation. Capturing this complexity—without oversimplifying it—remains one of the central challenges of modern toxicology.

2. Methodology

2.1 Study Design and Conceptual Framework

This study adopts a narrative review methodology, chosen deliberately for its flexibility in synthesizing interdisciplinary evidence across computational toxicology, systems biology, and regulatory science. Unlike systematic reviews that prioritize quantitative aggregation, the present approach seeks to construct a conceptual and integrative understanding of how predictive toxicology is evolving. In doing so, it aligns with the broader shift in toxicology from descriptive observation toward mechanistic interpretation and predictive modeling (Council, 2007).

The methodological framework was guided by principles of Integrative Predictive Toxicology (IPT), wherein diverse streams of evidence—ranging from in silico models to in vitro assays and biokinetic simulations—are interpreted collectively rather than in isolation. Particular emphasis was placed on frameworks such as the Adverse Outcome Pathway (AOP), which provides a structured means of linking molecular events to adverse biological outcomes (Ankley et al., 2010). This conceptual lens informed both the selection and synthesis of literature.

2.2 Literature Identification and Selection Strategy

Relevant literature was identified through a targeted search of scientific databases and authoritative reports, including peer-reviewed journals, regulatory publications, and foundational methodological papers. Priority was given to studies published before 2019, ensuring consistency with the reference framework of this review.

The selection strategy focused on four core domains:
(i) computational toxicology and machine learning approaches,
(ii) high-throughput in vitro and omics-based methodologies,
(iii) physiologically based pharmacokinetic (PBPK) modeling and QIVIVE, and
(iv) regulatory and collaborative initiatives supporting non-animal testing.

Rather than applying rigid inclusion/exclusion criteria, studies were selected based on conceptual relevance, methodological rigor, and citation prominence. Foundational works—such as those describing QSAR modeling (Raies & Bajic, 2016), ensemble learning techniques (Breiman, 2001), and PBPK frameworks (Bessems et al., 2014)—were included to anchor the discussion, while key programmatic initiatives such as ToxCast and SEURAT-1 were incorporated to reflect real-world implementation (Dix et al., 2007; Whelan & Schwarz, 2011).

2.3 Data Extraction and Thematic Synthesis

Data extraction was conducted in a qualitative manner, focusing on methodological features, application domains, and reported limitations of each study. Rather than compiling numerical datasets, the process involved identifying recurring themes and conceptual linkages across the literature.

The extracted information was subsequently organized into four thematic domains, which are reflected in the structured tables presented in this review:

  • computational modeling techniques
  • data repositories and knowledge infrastructures
  • biokinetic parameters for translational modeling
  • strategic frameworks and regulatory initiatives

These tables function not merely as summaries but as analytical anchors, enabling cross-comparison of methodologies and highlighting areas of convergence. For instance, computational models described in Table 1 were interpreted in light of the data ecosystems outlined in Table 2, emphasizing the interdependence between algorithmic performance and data quality (Hardy et al., 2012).

2.4 Integration of Computational and Biological Evidence

A central methodological step involved the integration of computational predictions with biological and physiological context. This was achieved by aligning in silico outputs—such as toxicity classifications or structure–activity relationships—with mechanistic insights derived from AOP frameworks and omics-based studies (Ankley et al., 2010; Berggren et al., 2017).

To address the well-recognized gap between in vitro findings and human outcomes, particular attention was given to QIVIVE and PBPK modeling approaches. These methods enable the translation of bioactive concentrations into human-relevant exposure metrics, thereby providing a bridge between experimental systems and clinical scenarios (Rotroff et al., 2010; Wetmore, 2015). The inclusion of these approaches reflects an effort to move beyond isolated data points toward a systems-level interpretation of toxicity.

2.5 Limitations of the Methodological Approach

While the narrative review design allows for conceptual depth and interdisciplinary synthesis, it is not without limitations. The absence of formal meta-analytic procedures means that findings are inherently interpretative and may be influenced by the selection of sources. Additionally, the reliance on pre-2019 literature, while intentional, may exclude more recent methodological advancements.

Furthermore, the integration of diverse data types—ranging from computational predictions to biological assays—introduces challenges related to comparability and standardization. These limitations were mitigated, where possible, by prioritizing well-established frameworks and widely cited studies.

2.6 Ethical and Scientific Considerations

The methodological approach is grounded in the ethical principles of the 3Rs (Reduction, Refinement, and Replacement), which advocate for minimizing animal use in scientific research (Russell & Burch, 1960). By focusing on NAMs and predictive modeling strategies, this review aligns with global efforts to develop more humane and human-relevant approaches to safety assessment.

At the same time, scientific rigor remains paramount. The integration of computational and experimental evidence was approached cautiously, with attention to model limitations, data quality, and the need for validation. In this sense, the methodology reflects a balance between innovation and critical evaluation, acknowledging both the promise and the current constraints of predictive toxicology.

3. The Renaissance of Risk Assessment: Bridging Human Biology and Artificial Intelligence in Integrative Predictive Toxicology

The field of toxicology, for much of its history, has operated with a certain pragmatic simplicity—observe, record, and infer. Yet, as the complexity of chemical exposure has expanded and the limitations of traditional methods have become increasingly visible, this simplicity has begun to feel insufficient, perhaps even misleading.

Table 1. Computational Methods and Algorithms for Toxicity Prediction. This table outlines the primary in silico tools used to link chemical structures to biological outcomes, highlighting the transition from traditional QSAR models toward ensemble “metamodels” that integrate machine learning and biokinetic approaches to improve predictive accuracy and mechanistic interpretability (Raies & Bajic, 2016; Rodríguez-Belenguer et al., 2023).

| Method | Core Algorithm | Primary Application | Toxicological Endpoint | Input Data Type | Performance Metric | Key Strengths | Key Limitations |
|---|---|---|---|---|---|---|---|
| QSAR | Linear Regression | Activity Prediction | Mutagenicity | 2D Descriptors | R² > 0.70 | Fast, cost-effective | Limited domain applicability |
| Random Forest | Decision Trees (Ensemble) | Classification | Hepatotoxicity | Molecular fingerprints | Accuracy ≈ 0.76 | Captures non-linearity | Risk of overfitting |
| Support Vector Machine (SVM) | Hyperplane Optimization | Binary Classification | Cardiotoxicity | Molecular graphs | AUC ≈ 0.85 | High robustness | Low interpretability (“black box”) |
| Artificial Neural Networks (ANN) | Deep Learning | Complex Prediction | LD50 | Physicochemical data | RMSE ≈ 0.3 | High modeling capacity | Requires large datasets |
| LASSO | Regularization | Feature Selection | Biomarker discovery | Multi-omics | Model efficiency | Simplicity, sparsity | Linear assumptions |
| Bayesian Inference | Probabilistic Modeling | Parameter Estimation | PBPK inputs | Prior + assay data | Uncertainty metrics | Probabilistic outputs | Computationally intensive |
| HHTK | Reverse Dosimetry | Dosimetry | Steady-state conc. (Css) | In vitro clearance | ~3.2-fold error | High-throughput | Simplified physiology |
| PBTK Models | Differential Equations | Biokinetics | Tissue concentration | ADME parameters | Mechanistic traceability | Biologically realistic | Data-intensive |
| TTC | Decision Tree | Safety screening | Low exposure risk | Chemical structure | Threshold-based | Rapid screening | No biological context |
| Read-Across | Similarity-Based | Data gap filling | Genotoxicity | Analog compounds | Weight-of-evidence | No new experiments | Subjective interpretation |

Table 2. Key Databases for Systems Toxicology and Predictive Modeling. High-quality, curated databases are the foundation of machine learning models in drug safety, facilitating the integration of chemical and biological information (Gaulton et al., 2012; Igarashi et al., 2015). This table presents widely used repositories that enable the integration of chemical, biological, and clinical data for predictive modeling and systems-level toxicological analysis.

| Database | Data Type | Scale | Key Feature | Format | Source | IPT Application |
|---|---|---|---|---|---|---|
| ChEMBL | Bioactivity | >1M records | Structure–activity mapping | SQL/CSV | Literature | Target identification |
| PubChem | Chemical | >100M compounds | High-throughput assay data | JSON/XML | Global labs | Large-scale screening |
| e-Drug3D | FDA-approved drugs | 1,800+ | 3D structures | MOL2/SDF | Drug labels | PK modeling |
| DrugBank | Pharmacological | 10K+ | Drug–target interactions | XML/Web | Clinical data | Drug–drug interaction prediction |
| SIDER | Side effects | 1,400+ drugs | Adverse drug reactions | TSV/Text | Post-marketing | AOP mapping |
| ToxCast | In vitro assays | 4,000+ chemicals | 700+ bioassays | MySQL | US EPA | Pathway perturbation |
| TG-GATEs | Omics | 170+ drugs | Time-course gene expression | CEL/FASTQ | JSTP | Biomarker discovery |
| DrugMatrix | Toxicogenomics | 600+ compounds | Organ-specific responses | Microarray | NTP | Mechanistic toxicology |
| ACToR | Integrated | 500K+ entries | Data aggregation | Web portal | US EPA | Data mining |
| ToxBank | Framework | | Standardized protocols | API/Wiki | SEURAT-1 | Data harmonization |

What is now emerging, somewhat gradually but unmistakably, is a reconfiguration of toxicological thinking—one that attempts to bridge biological realism with computational precision. This evolving paradigm, often referred to as Integrative Predictive Toxicology (IPT), does not rely on a single methodological breakthrough. Rather, it is defined by its integration of data, of disciplines, and, importantly, of perspectives.

At its core, IPT seeks to harmonize New Approach Methodologies (NAMs), systems biology, and machine learning into a unified framework capable of predicting human-relevant toxicity. This shift—from apical observation to mechanistic interpretation—is not merely technical. It reflects a deeper conceptual transition: toxicity is no longer treated as an endpoint alone, but as a dynamic process unfolding across biological scales. As such, the renaissance of risk assessment is less about replacing old tools and more about rethinking how evidence is constructed, interpreted, and ultimately trusted (Council, 2007; Hartung, 2010).

3.1 The Legacy Problem: Why Traditional Models Are Falling Short

For decades, animal models have served as the cornerstone of safety evaluation. Toxicologists, working within well-established protocols, have relied on observable outcomes—mortality rates, weight changes, histopathological alterations—to infer potential human risks (Whelan & Schwarz, 2011). These models have undoubtedly provided a protective framework for public health. And yet, their reliability, particularly in predicting human-specific responses, has come under increasing scrutiny.

One of the more persistent challenges lies in interspecies variability. Biological systems, even among mammals, differ in subtle but consequential ways—enzyme expression, receptor sensitivity, metabolic pathways. These differences can lead to divergent toxicological outcomes, making extrapolation inherently uncertain (Leist et al., 2014). It is perhaps not surprising, then, that a significant proportion of drug candidates fail during clinical trials due to unforeseen toxicities, despite having passed preclinical animal testing (Council, 2007).

Beyond scientific concerns, practical limitations also weigh heavily. Traditional toxicological studies are resource-intensive, often requiring years to complete and substantial financial investment. At the same time, ethical considerations—formalized through the 3Rs principle of Reduction, Refinement, and Replacement—have shifted from aspirational guidelines to regulatory expectations (Russell & Burch, 1960). Taken together, these factors suggest that while animal models remain informative, they are no longer sufficient as the sole foundation for modern risk assessment.

3.2 A Global Turning Point: The 2007 Vision

If one were to identify a moment when the trajectory of toxicology began to change more decisively, the publication of the National Research Council’s 2007 report would stand out. Toxicity Testing in the 21st Century: A Vision and a Strategy did not merely critique existing practices; it proposed an alternative vision—one centered on mechanistic understanding and predictive capability (Council, 2007).

The report argued that toxicology should move away from observational endpoints and instead focus on how chemicals perturb biological pathways at the cellular and molecular levels. This idea, while conceptually straightforward, had far-reaching implications. It suggested that toxicity could be anticipated—modeled, even—before it manifests as overt damage. Regulatory developments, particularly within the European Union, reinforced this shift. Policies such as REACH and restrictions on animal testing for cosmetics created both pressure and opportunity for innovation. In response, the scientific community began to invest more heavily in NAMs, exploring alternatives that could offer greater human relevance while aligning with ethical mandates (Berggren et al., 2017). Still, the transition has been incremental. Translating vision into validated practice remains, in many ways, an ongoing process.

3.3 The Adverse Outcome Pathway (AOP): Opening the “Black Box”

One of the more compelling conceptual tools to emerge from this transition is the Adverse Outcome Pathway (AOP) framework. Traditionally, toxicity has often been treated as something of a “black box”: a chemical is administered, and an adverse outcome is observed. The AOP framework attempts to illuminate what happens in between. An AOP begins with a Molecular Initiating Event (MIE)—for example, a chemical binding to a receptor or disrupting a key enzyme. From there, it traces a sequence of Key Events (KEs) across different levels of biological organization, ultimately leading to an adverse outcome at the tissue or organism level (Ankley et al., 2010; Zaldívar-Comenges et al., 2016). This structured representation allows researchers to identify mechanistic links rather than relying solely on empirical associations. There is, however, an inherent complexity in this approach. Biological systems are not strictly linear, and pathways often intersect, diverge, or compensate for perturbations. Despite these challenges, the AOP framework provides a scaffold for organizing knowledge, enabling the identification of early biomarkers and improving the interpretability of predictive models (Hartung, 2010). In this sense, it transforms the “black box” into something more like a network—still complex, but increasingly navigable.
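Because an AOP is essentially a causal chain of events, it maps naturally onto a directed graph. The sketch below encodes a hypothetical pathway with networkx; the specific molecular initiating event, key events, and adverse outcome are illustrative labels, not a curated AOP-Wiki entry.

```python
# Minimal sketch: an Adverse Outcome Pathway represented as a directed graph,
# from a Molecular Initiating Event (MIE) through Key Events (KEs) to an Adverse Outcome (AO).
# The pathway below is illustrative only.
import networkx as nx

aop = nx.DiGraph()
aop.add_edges_from([
    ("MIE: receptor binding",        "KE1: altered gene expression"),
    ("KE1: altered gene expression", "KE2: oxidative stress"),
    ("KE2: oxidative stress",        "KE3: hepatocyte apoptosis"),
    ("KE3: hepatocyte apoptosis",    "AO: liver fibrosis"),
])

# Enumerate the causal chain from initiating event to adverse outcome.
path = nx.shortest_path(aop, "MIE: receptor binding", "AO: liver fibrosis")
print(" -> ".join(path))
```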

3.4 Methodological Pillars: Integrating In Silico, In Vitro, and Omics

If IPT has a defining strength, it lies in its ability to integrate diverse methodologies. No single system—whether computational or experimental—can fully capture biological complexity. Yet, when combined thoughtfully, these approaches begin to approximate it.

3.4.1 The Digital Shield: In Silico Modeling

Computational toxicology has evolved rapidly, moving beyond simple structure-based predictions toward more sophisticated, data-driven models. Quantitative Structure–Activity Relationship (QSAR) models, for instance, use molecular descriptors to estimate toxicity, while machine learning techniques—such as random forests—enable the analysis of high-dimensional datasets with improved predictive accuracy (Breiman, 2001; Raies & Bajic, 2016). More recently, there has been a shift toward ensemble or “metamodel” approaches, where multiple algorithms are combined to capture different aspects of chemical behavior. These models can screen large chemical libraries efficiently, identifying potential hazards early in the development pipeline (Enoch et al., 2011). Still, their performance depends heavily on training data quality and domain applicability—limitations that must be acknowledged when interpreting results.
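As a rough illustration of such a “metamodel”, the following sketch combines three heterogeneous learners in a soft-voting ensemble with scikit-learn. The feature matrix and labels are assumed to come from an upstream descriptor pipeline (for example, the fingerprint featurization sketched earlier), and the choice of component models is arbitrary.

```python
# Minimal sketch: a soft-voting ensemble ("metamodel") combining heterogeneous learners.
# X_train, y_train, and X_new are assumed to come from an upstream descriptor pipeline.
from sklearn.ensemble import RandomForestClassifier, VotingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.svm import SVC

metamodel = VotingClassifier(
    estimators=[
        ("qsar_lr", LogisticRegression(max_iter=1000)),    # linear, descriptor-based view
        ("rf", RandomForestClassifier(n_estimators=300)),  # non-linear ensemble view
        ("svm", SVC(probability=True)),                    # margin-based view
    ],
    voting="soft",  # average predicted class probabilities across component models
)
# Usage (with data from an upstream pipeline):
# metamodel.fit(X_train, y_train)
# hazard_probabilities = metamodel.predict_proba(X_new)
```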

3.4.2 The Biological Mirror: High-Throughput In Vitro Systems

Parallel to computational advances, in vitro technologies have undergone a transformation of their own. Programs such as Tox21 and ToxCast have demonstrated the feasibility of testing thousands of compounds across hundreds of biological targets using automated, high-throughput platforms (Kavlock et al., 2012).

These systems, often based on human-derived cell lines, provide a more direct window into human biology than animal models. They allow researchers to observe how chemicals perturb specific pathways—oxidative stress, receptor signaling, DNA damage—at a scale that would be impractical using traditional methods (Tice et al., 2013). Yet, they are not without limitations. Cellular systems, by design, lack the systemic interactions present in whole organisms, raising questions about how well these findings translate beyond the dish.
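High-throughput concentration-response data of this kind are typically reduced to potency estimates such as an AC50 by curve fitting. The sketch below fits a three-parameter Hill model to synthetic readings; the concentrations, responses, and starting guesses are placeholders rather than data from any cited assay.

```python
# Minimal sketch: fitting a Hill (log-logistic) curve to synthetic concentration-response
# data from a high-throughput assay to estimate an AC50.
import numpy as np
from scipy.optimize import curve_fit

conc = np.array([0.01, 0.1, 1.0, 10.0, 100.0])  # test concentrations (µM), assumed
resp = np.array([2.0, 5.0, 30.0, 80.0, 95.0])   # % activity, synthetic readings

def hill(c, top, ac50, slope):
    """Three-parameter Hill model with the baseline fixed at 0% activity."""
    return top * c**slope / (ac50**slope + c**slope)

params, _ = curve_fit(hill, conc, resp, p0=[100.0, 1.0, 1.0])
top, ac50, slope = params
print(f"Estimated AC50 ≈ {ac50:.2f} µM (Hill slope ≈ {slope:.2f})")
```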

3.4.3 The Deep Lens: Multi-Omics Integration

To address this gap, IPT increasingly relies on multi-omics data—integrating transcriptomics, proteomics, and metabolomics to capture the molecular “fingerprint” of exposure. These datasets provide a level of detail that is both powerful and, at times, overwhelming. One of the key challenges is distinguishing between adaptive responses—temporary adjustments that maintain cellular homeostasis—and true toxicological tipping points. Multi-omics analysis, when combined with pathway frameworks such as AOPs, offers a way to make this distinction more systematically (Berggren et al., 2017). Still, interpreting these high-dimensional datasets requires careful statistical and biological validation.
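One common dimension-reduction step in this setting, echoed in the LASSO row of Table 1, is sparse feature selection across concatenated omics blocks. The sketch below applies an L1-penalized logistic regression to random placeholder matrices purely to show the mechanics; real analyses require normalization, batch correction, and independent biological validation.

```python
# Minimal sketch: sparse (L1/LASSO-style) selection of candidate biomarkers from a
# concatenated multi-omics matrix. The data are random placeholders.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n_samples = 40
transcriptomics = rng.normal(size=(n_samples, 200))  # placeholder expression features
metabolomics = rng.normal(size=(n_samples, 50))      # placeholder metabolite features
X = np.hstack([transcriptomics, metabolomics])
y = rng.integers(0, 2, size=n_samples)               # placeholder toxic/non-toxic labels

# The L1 penalty drives most coefficients to zero, leaving a sparse candidate panel.
model = LogisticRegression(penalty="l1", solver="liblinear", C=0.1).fit(X, y)
selected = np.flatnonzero(model.coef_[0])
print(f"{selected.size} candidate features retained out of {X.shape[1]}")
```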

3.5 Biokinetics: Translating Signals into Human Reality

Despite the advances described above, a fundamental challenge remains: how to translate experimental findings into meaningful predictions of human exposure and risk. This is where biokinetic modeling becomes indispensable.

Physiologically Based Pharmacokinetic (PBPK) models simulate how chemicals are absorbed, distributed, metabolized, and excreted within the human body. By integrating physiological parameters with chemical-specific data, these models provide a dynamic representation of internal exposure (Bessems et al., 2014). When combined with Quantitative In Vitro to In Vivo Extrapolation (QIVIVE), they allow researchers to convert in vitro concentrations into oral equivalent doses (OEDs)—a process sometimes referred to as reverse dosimetry (Rotroff et al., 2010; Paini et al., 2017). This integration is not merely technical; it is essential for ensuring that in vitro findings are interpreted within a realistic biological context. Without it, experimental results risk remaining abstract—detached from the conditions under which humans are actually exposed. Even so, uncertainties in parameter estimation and model assumptions continue to present challenges, underscoring the need for ongoing refinement.
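In its simplest linear-kinetics form, reverse dosimetry amounts to scaling an in vitro bioactive concentration by the steady-state plasma concentration predicted for a unit oral dose. The sketch below walks through that arithmetic with assumed values for bioavailability, clearance, and molecular weight; it is a simplified stand-in for the more detailed models used in the cited studies.

```python
# Minimal sketch of reverse dosimetry (QIVIVE): scale an in vitro bioactive concentration
# to an oral equivalent dose (OED) via a predicted steady-state plasma concentration.
# All parameter values are assumed, and the steady-state expression is a deliberately
# simplified linear-kinetics form rather than the full published models.
MW = 300.0          # molecular weight (g/mol), assumed
F_bio = 0.8         # oral bioavailability (fraction), assumed
CL_total = 10.0     # total plasma clearance (L/h), assumed
body_weight = 70.0  # kg, assumed

# Steady-state plasma concentration for a continuous 1 mg/kg/day oral dose
dose_rate_mg_per_h = 1.0 * body_weight / 24.0         # mg/h
css_mg_per_L = F_bio * dose_rate_mg_per_h / CL_total  # mg/L at steady state
css_uM = css_mg_per_L / MW * 1000.0                   # mg/L -> µM

# In vitro point of departure (e.g. an AC50 from a high-throughput assay), assumed
ac50_uM = 5.0

# Linear scaling: dose (mg/kg/day) that would reproduce the bioactive concentration
oed_mg_per_kg_day = ac50_uM / css_uM * 1.0
print(f"Css at 1 mg/kg/day ≈ {css_uM:.2f} µM; OED ≈ {oed_mg_per_kg_day:.1f} mg/kg/day")
```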

3.6 Collaborative Ecosystems: Building Confidence Through Integration

The transition toward IPT has been facilitated, in large part, by collaborative initiatives that bring together expertise across disciplines. The SEURAT-1 program, for example, represents a coordinated effort to develop alternative approaches capable of replacing repeated-dose animal testing (Whelan & Schwarz, 2011).

A key outcome of such initiatives has been the development of shared infrastructures, such as the ToxBank data warehouse, which standardizes experimental protocols and promotes data accessibility (Hardy et al., 2012). These platforms are critical for ensuring reproducibility and for building the kind of scientific confidence required for regulatory acceptance. Importantly, collaboration extends beyond data sharing. It involves aligning methodologies, validating models across different contexts, and, perhaps most challenging of all, establishing common standards for interpretation. In this sense, IPT is as much a social and institutional endeavor as it is a scientific one.

3.7 Toward a Human-Centered Future of Toxicology

What emerges from this convergence of methodologies is a vision of toxicology that is, in many ways, more aligned with human biology than ever before. By integrating mechanistic insights, computational models, and biokinetic frameworks, IPT offers a pathway toward predicting risk with greater accuracy and ethical responsibility.

And yet, it would be premature to suggest that the transition is complete. Regulatory frameworks continue to evolve, validation studies are ongoing, and methodological uncertainties persist. Still, the direction is clear. Toxicology is moving—perhaps cautiously, but decisively—beyond the constraints of the “black box” toward a more transparent, predictive, and human-relevant science (Council, 2007; Paini et al., 2017).

4. Synthesizing the New Era of Predictive Toxicology

The transition from traditional toxicology toward a predictive, human-centered paradigm is often described as a technological shift. Yet, that description, while convenient, feels somewhat incomplete. What is unfolding is not simply the adoption of new tools, but a gradual redefinition of how toxicological knowledge itself is constructed. The earlier reliance on animal-based observations—valuable though it has been—now appears increasingly constrained by its inability to capture the complexity and variability of human biology. In its place, a more integrated framework is emerging, one that draws simultaneously on computational modeling, high-throughput experimentation, and systems-level biological insight.

This synthesis is not abstract; it is concretely illustrated in the structured progression of methodologies summarized across Tables 1–4. Taken together, these tables do more than catalog tools—they reveal a layered architecture of modern toxicology, where prediction, data integration, mechanistic reasoning, and regulatory alignment begin to converge. As shown in Table 1, computational models form the analytical backbone of this transformation, while Table 2 highlights the data ecosystems that sustain them. Meanwhile, Table 3 introduces the biokinetic parameters that anchor predictions in physiological reality, and Table 4 situates these advances within broader strategic and regulatory frameworks. When considered collectively, these components suggest not a fragmented field, but one that is slowly, perhaps cautiously, becoming coherent.

4.1 The Algorithmic Landscape: From Prediction to Interpretation

At the heart of predictive toxicology lies an expanding repertoire of computational methods. Early approaches, particularly QSAR models, offered a relatively straightforward means of linking chemical structure to biological activity. They remain useful, especially for screening endpoints such as mutagenicity (Raies & Bajic, 2016). However, as highlighted in Table 1, these models are increasingly supplemented—or even replaced—by more sophisticated machine learning techniques capable of capturing non-linear relationships.

Algorithms such as Random Forest and Support Vector Machines have demonstrated notable success in classifying complex toxicological outcomes, including hepatotoxicity and cardiotoxicity (Breiman, 2001; Raies & Bajic, 2016). Yet, even these methods, when used in isolation, can feel insufficient when confronted with the layered complexity of biological systems. It is here that the concept of ensemble or “metamodel” approaches becomes particularly relevant. By integrating outputs from multiple algorithms, these models can account for structural, mechanistic, and statistical dimensions simultaneously.


Table 3. Critical Parameters in PBPK and Biokinetic Modeling. Translating in vitro concentrations to oral equivalent doses (OEDs) requires precise parameterization of biokinetic models to capture internal exposure scenarios (Bessems et al., 2014; Paini et al., 2017).

| Parameter | Symbol | Definition | Source | Role in IVIVE | Biological Focus | Units |
|---|---|---|---|---|---|---|
| Intrinsic Clearance | Clint | Metabolic capacity | Hepatocytes | Clearance scaling | Liver | µL/min/10⁶ cells |
| Fraction Unbound | fu | Free drug fraction | Plasma assays | Bioavailability | Plasma | Ratio |
| Partition Coefficient | Kp | Tissue:blood ratio | QSPR models | Distribution | Multi-organ | Ratio |
| Maximum Velocity | Vmax | Enzyme capacity | Microsomes | Saturation kinetics | Metabolism | nmol/min/mg |
| Affinity Constant | Km | Substrate affinity | Microsomes | Enzyme scaling | Metabolism | µM |
| Absorption Rate | kabs | Uptake rate | Caco-2 assays | Dosimetry | Gut | h⁻¹ |
| No-Effect Concentration | NEC | Toxicity threshold | VCBA | Safety limits | Cellular | µM |
| Killing Rate | kr | Cell death rate | VCBA | Potency estimation | Cellular | h⁻¹ |
| Bioavailability | F | Systemic exposure fraction | PBPK | OED calculation | Whole body | % |
| Blood Flow | Q | Organ perfusion | Physiology | ADME simulation | Systemic | L/h |

Table 4. Integrated Frameworks and Strategic Initiatives in Predictive Toxicology. Global initiatives have provided the infrastructure for animal-free safety assessment, linking molecular initiating events to regulatory decisions (Berggren et al., 2017; Council, 2007).

| Initiative | Objective | Method Focus | Driver | Endpoint | Strategy | Outcome |
|---|---|---|---|---|---|---|
| SEURAT-1 | Replace animal testing | NAMs/IPT | EU policy | Systemic toxicity | Case studies | ToxBank, AOP integration |
| Tox21 | High-throughput screening | Robotics, assays | US government | Pathway activity | Automation | 10K chemical library |
| NRC Vision (2007) | Paradigm shift | Mechanistic toxicology | Scientific reform | Human risk | Systems approach | Strategic framework |
| COSMOS | Cosmetic safety | In silico tools | EU ban | Repeated-dose toxicity | KNIME workflows | Database generation |
| REACH | Chemical regulation | Non-animal methods | EU law | Hazard identification | Read-across | Data sharing |
| 7th Amendment | Animal testing ban | Alternatives | Ethical/regulatory | Cosmetics safety | Ban enforcement | Industry transformation |
| AOP Framework | Mechanistic mapping | Systems biology | OECD | Organ toxicity | MIE → AO | AOP-Wiki |
| QIVIVE | Dose extrapolation | PBPK modeling | Regulatory need | Safe exposure | Reverse dosimetry | OED estimation |
| TTC | Risk thresholding | Decision tree | Food/cosmetics | Low-risk exposure | Structural rules | Screening tool |
| 3Rs Principle | Animal welfare | Replace/refine/reduce | Ethics | Welfare endpoints | Alternative methods | Global policy |

Still, there is a subtle tension embedded within these advances. As predictive accuracy improves, interpretability can diminish. The so-called “black box” nature of certain machine learning models raises legitimate concerns, particularly in regulatory contexts where transparency is essential. Consequently, there is a growing emphasis on aligning computational predictions with mechanistic frameworks—an effort that echoes throughout the broader IPT paradigm.

4.2 Data as Infrastructure: Integration, Standardization, and Trust

If algorithms represent the analytical core of predictive toxicology, then data serves as its lifeblood. The diversity and scale of datasets now available to researchers are, in many ways, unprecedented. As summarized in Table 2, databases such as ChEMBL and PubChem provide extensive bioactivity information, while resources like DrugBank and SIDER add clinical and pharmacological context (Wishart et al., 2018). These repositories enable the mapping of molecular interactions to real-world adverse outcomes, bridging a gap that once seemed difficult to traverse.
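Many of these repositories expose programmatic interfaces. As one small example, the snippet below queries PubChem's documented PUG REST service for basic properties of a single compound (CID 2244, aspirin); the exact response fields should be checked against the current PubChem documentation, and error handling, rate limiting, and caching are omitted for brevity.

```python
# Minimal sketch: retrieving basic chemical properties from PubChem's PUG REST API.
# CID 2244 corresponds to aspirin; response parsing assumes the standard PropertyTable layout.
import requests

cid = 2244
url = (
    "https://pubchem.ncbi.nlm.nih.gov/rest/pug/compound/cid/"
    f"{cid}/property/MolecularFormula,MolecularWeight/JSON"
)
record = requests.get(url, timeout=30).json()
props = record["PropertyTable"]["Properties"][0]
print(props["MolecularFormula"], props["MolecularWeight"])
```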

However, the mere availability of data does not guarantee its utility. Historically, toxicological datasets have been fragmented, inconsistently formatted, and often difficult to access. Initiatives such as ToxCast and ToxBank have sought to address these challenges by standardizing experimental protocols and promoting data harmonization (Dix et al., 2007; Hardy et al., 2012). As indicated in Table 2, such infrastructures are not simply repositories—they are enabling platforms that support reproducibility, interoperability, and collaborative analysis.

This standardization has practical implications. For instance, read-across approaches, which infer the toxicity of untested chemicals based on structurally similar compounds, depend heavily on the reliability and comparability of existing data (Berggren et al., 2017). Without consistent data governance, such methods would remain speculative at best. Thus, the integration of data is not merely a technical challenge; it is a prerequisite for building scientific confidence.
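A minimal computational ingredient of read-across is a structural similarity search against analogues with existing data. The sketch below scores Tanimoto similarity between Morgan fingerprints with RDKit; the source compounds, their toxicity calls, and the similarity cutoff are all illustrative, and real read-across additionally demands expert justification of the analogue category and weight-of-evidence reasoning.

```python
# Minimal sketch: similarity-based read-across screening via Tanimoto similarity of
# Morgan fingerprints. Compounds, labels, and the cutoff are placeholders.
from rdkit import Chem, DataStructs
from rdkit.Chem import AllChem

source = {  # analogues with existing toxicity calls (placeholders)
    "CC(=O)Oc1ccccc1C(=O)O": "non-genotoxic",
    "Nc1ccc(cc1)S(=O)(=O)N": "non-genotoxic",
    "O=[N+]([O-])c1ccccc1":  "genotoxic alert",
}
target_smiles = "CC(=O)Oc1ccccc1"  # untested target compound (placeholder)

def fp(smi):
    return AllChem.GetMorganFingerprintAsBitVect(Chem.MolFromSmiles(smi), 2, nBits=2048)

target_fp = fp(target_smiles)
for smi, label in source.items():
    sim = DataStructs.TanimotoSimilarity(target_fp, fp(smi))
    if sim >= 0.5:  # illustrative similarity cutoff
        print(f"Analogue {smi}: Tanimoto {sim:.2f} -> read-across suggests '{label}'")
```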

4.3 Bridging the Translational Divide: Biokinetics and Dosimetric Anchoring

Despite the advances in computational modeling and data integration, a persistent question remains: how do these insights translate into human-relevant risk? The answer, increasingly, lies in the domain of biokinetics, where experimental observations are contextualized within physiological systems.

As detailed in Table 3, parameters such as intrinsic clearance (Clint), fraction unbound (fu), and tissue partition coefficients (Kp) form the basis of Physiologically Based Pharmacokinetic (PBPK) models (Bessems et al., 2014). These models simulate the movement of chemicals through the body, capturing processes of absorption, distribution, metabolism, and excretion. When combined with Quantitative In Vitro to In Vivo Extrapolation (QIVIVE), they allow researchers to convert in vitro concentrations into oral equivalent doses (OEDs) (Rotroff et al., 2010; Wetmore, 2015).

This process, often described as reverse dosimetry, is critical for translating laboratory findings into meaningful exposure metrics. Without it, in vitro data remains disconnected from real-world scenarios. Encouragingly, studies have shown that PBPK models, when properly parameterized, can predict human pharmacokinetics with reasonable accuracy, even in the absence of animal data (Gajewska et al., 2015; Paini et al., 2017).

Yet, challenges persist. The accurate modeling of extra-hepatic metabolism, transporter dynamics, and inter-individual variability remains an area of ongoing research. These uncertainties do not undermine the value of biokinetic modeling, but they do suggest that its application must be approached with careful consideration.
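To ground these parameters in a concrete calculation, the sketch below scales an assay-level intrinsic clearance (µL/min per 10⁶ hepatocytes, cf. Table 3) up to the whole liver and applies the well-stirred liver model to estimate hepatic clearance. The physiological constants are typical adult values assumed here purely for illustration, not values taken from the cited studies.

```python
# Minimal sketch: scaling an in vitro intrinsic clearance to whole-liver hepatic clearance
# with the well-stirred liver model. All values below are assumed, illustrative adult defaults.
cl_int_ul_min_per_Mcells = 20.0  # hepatocyte assay result (µL/min/10^6 cells), assumed
hepatocellularity = 120.0e6      # hepatocytes per gram of liver, assumed
liver_weight_g = 1500.0          # liver weight (g), assumed
fu = 0.1                         # fraction unbound in plasma, assumed
Q_h = 90.0                       # hepatic blood flow (L/h), assumed

# Scale the assay value to a whole-body intrinsic clearance in L/h
cl_int_L_h = (cl_int_ul_min_per_Mcells      # µL/min per 10^6 cells
              * (hepatocellularity / 1e6)   # -> µL/min per gram of liver
              * liver_weight_g              # -> µL/min for the whole liver
              * 60.0 / 1e6)                 # -> L/h

# Well-stirred model: hepatic clearance limited by blood flow and plasma binding
cl_hepatic = Q_h * fu * cl_int_L_h / (Q_h + fu * cl_int_L_h)
print(f"Scaled CLint ≈ {cl_int_L_h:.0f} L/h; hepatic clearance ≈ {cl_hepatic:.1f} L/h")
```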

4.4 Strategic Alignment: Frameworks, Policy, and Global Coordination

The scientific advances described thus far do not exist in isolation. They are embedded within a broader landscape of regulatory frameworks and collaborative initiatives that shape their development and application. As outlined in Table 4, programs such as SEURAT-1 and Tox21 have played a central role in advancing integrative predictive toxicology, fostering collaboration across disciplines and institutions (Whelan & Schwarz, 2011; Dix et al., 2007).

Equally significant is the influence of conceptual frameworks such as the Adverse Outcome Pathway (AOP). By linking molecular initiating events to adverse outcomes through a series of key biological events, AOPs provide a structured means of organizing mechanistic knowledge (Ankley et al., 2010). This structure is essential for aligning experimental data with computational predictions, effectively serving as the connective tissue of modern toxicology.

Regulatory drivers have further accelerated this transition. Policies such as REACH and restrictions on animal testing have transformed ethical considerations into enforceable standards, promoting the adoption of alternative methodologies (Berggren et al., 2017). In parallel, the principles of the 3Rs—Reduction, Refinement, and Replacement—continue to guide the development of more humane and scientifically robust approaches (Russell & Burch, 1960).

Taken together, these initiatives suggest that the shift toward IPT is not merely scientific, but systemic. It reflects a coordinated effort to align methodology, policy, and ethics in pursuit of a more predictive and human-relevant toxicology.

4.5 Synthesis and Future Directions

When viewed collectively, the elements presented across Tables 1–4 begin to coalesce into a coherent framework. Computational models provide predictive capability; databases supply the necessary data; biokinetic models ensure physiological relevance; and strategic initiatives establish the conditions for implementation. This integration, while still evolving, represents a significant departure from the fragmented approaches of the past.

There is, however, a certain humility required in interpreting these advances. While predictive toxicology has made remarkable progress, it is not without limitations. Model uncertainty, data gaps, and regulatory challenges remain. Moreover, the expansion of chemical space—particularly in areas such as environmental toxicants and complex mixtures—continues to test the boundaries of existing methodologies.

And yet, despite these challenges, the direction of travel seems clear. Toxicology is moving toward a framework that is not only more predictive, but also more transparent, more efficient, and more aligned with human biology. It is, perhaps, less about replacing one paradigm with another, and more about weaving together multiple strands of evidence into a more coherent whole.

In this sense, Integrative Predictive Toxicology does not simply represent the future of drug safety—it reflects a broader reimagining of how science can anticipate, rather than merely observe, the consequences of chemical exposure (Council, 2007; Paini et al., 2017).

5. Limitations

Despite its integrative scope, this review is not without limitations—some methodological, others perhaps more conceptual. As a narrative synthesis, it does not follow formal systematic review protocols, which means that study selection, while deliberate, may still reflect a degree of interpretative bias. The reliance on pre-2019 literature, although intended to maintain consistency with foundational frameworks, inevitably excludes more recent advances in deep learning architectures and emerging omics technologies. Additionally, while the review emphasizes integration across computational and biological domains, the heterogeneity of available data presents challenges. Differences in experimental design, assay conditions, and reporting standards complicate direct comparisons. There is also an inherent uncertainty in extrapolating in vitro findings to human scenarios, even when supported by PBPK and QIVIVE models. Finally, regulatory acceptance of these approaches remains uneven, limiting their immediate translation into standardized safety assessment pipelines.

6. Conclusion

The trajectory of toxicology, while not entirely linear, is undeniably shifting toward a more predictive and human-centered framework. Integrative Predictive Toxicology, by combining computational intelligence with mechanistic biology and physiological modeling, offers a compelling alternative to traditional paradigms. Yet, its promise lies not in replacing existing methods outright, but in refining how evidence is generated and interpreted. There is still work to be done—particularly in validation, standardization, and regulatory alignment. Even so, the field appears to be moving toward something more anticipatory than reactive, where toxicity is not merely observed, but understood in advance, and, ideally, avoided altogether.

Author Contributions

M.S. conceptualized the study, designed the review framework, and drafted the original manuscript. S.Y. contributed to literature analysis, interpretation of findings, and critically reviewed and edited the manuscript for important intellectual content.  All authors read and approved the final version of the manuscript.

References


Ankley, G. T., Bennett, R. S., Erickson, R. J., Hoff, D. J., Hornung, M. W., Johnson, R. D., Mount, D. R., Nichols, J. W., Russom, C. L., Schmieder, P. K., Serrrano, J. A., Tietge, J. E., & Villeneuve, D. L. (2010). Adverse outcome pathways: A conceptual framework to support ecotoxicology research and risk assessment. Environmental Toxicology and Chemistry, 29(3), 730–741. https://doi.org/10.1002/etc.34


Berggren, E., White, A., Ouedraogo, G., Paini, A., Richarz, A. N., Bois, F. Y., Exner, T., Leite, S., van Grunsven, L. A., Worth, A., & Mahony, C. (2017). Ab initio chemical safety assessment: A workflow based on exposure considerations and non-animal methods. Computational Toxicology, 4, 31–44.

Bessems, J. G., Loizou, G., Krishnan, K., Clewell, H. J., Bernasconi, C., Bois, F., Coecke, S., Collnot, E. M., Diembeck, W., Farcal, L., Geraets, L., Gundert-Remy, U., Kramer, N., Küsters, G., Leite, S. B., Pelkonen, O., Schröder, K., Testai, E., Wilk-Zasadna, I., & Zaldívar-Comenges, J. M. (2014). PBTK modelling platforms and parameter estimation tools to enable animal-free risk assessment. Regulatory Toxicology and Pharmacology, 68(1), 119–139. https://doi.org/10.1016/j.yrtph.2013.11.008

Breiman, L. (2001). Random forests. Machine Learning, 45(1), 5–32. https://doi.org/10.1023/A:1010933404324

Council, N. R. (2007). Toxicity testing in the 21st century: A vision and a strategy. National Academies Press. https://doi.org/10.17226/11970

Cramer, G. M., Ford, R. A., & Hall, R. L. (1978). Estimation of toxic hazard—a decision tree approach. Food and Cosmetics Toxicology, 16(3), 255–276.

Dix, D. J., Houck, K. A., Martin, M. T., Richard, A. M., Setzer, R. W., & Kavlock, R. J. (2007). The ToxCast program for prioritizing toxicity testing of environmental chemicals. Toxicological Sciences, 95(1), 5–12.

Enoch, S. J., Cronin, M. T., & Ellison, C. M. (2011). The use of a chemistry-based profiler for covalent DNA binding in the development of chemical categories for read-across for genotoxicity. ATLA - Alternatives to Laboratory Animals, 39(2), 131–145.

Gajewska, M., Paini, A., Sala Benito, J. V., Burton, J., Worth, A., Urani, C., Briesen, H., & Schramm, K. W. (2015). In vitro to in vivo correlation of the skin penetration, liver clearance and hepatotoxicity of caffeine. Food and Chemical Toxicology, 75, 39–49. https://doi.org/10.1016/j.fct.2014.10.018

Ganter, B., Snyder, R. D., Halbert, D. N., & Lee, M. D. (2006). Toxicogenomics in drug discovery and development: Mechanistic analysis of compound/class-dependent effects using the DrugMatrix database. Pharmacogenomics, 7(7), 1025–1044.

Gaulton, A., Bellis, L. J., Bento, A. P., Chambers, J., Davies, M., Hersey, A., Light, Y., McGlinchey, S., Michalovich, D., Al-Lazikani, B., & Overington, J. P. (2012). ChEMBL: a large-scale bioactivity database for drug discovery. Nucleic Acids Research, 40(D1), D1100–D1107.

Hardy, B., Apic, G., Carthew, P., Clark, D., Cook, D., Dix, I., Escher, S., Hastings, J., Heard, D. J., Jeliazkova, N., Judson, P., Matis-Mitchell, S., Mitic, D., Myatt, G., Shah, I., Spjuth, O., Tcheremenskaia, O., Toldo, L., Watson, D., ... & Yang, C. (2012). A toxicology ontology roadmap. ALTEX, 29(2), 129–137. https://doi.org/10.14573/altex.2012.2.129

Hartung, T. (2010). Lessons learned from alternative methods and their validation for a new toxicology in the 21st century. Journal of Toxicology and Environmental Health, Part B, 13(2-4), 277–290. https://doi.org/10.1080/10937401003734435

Igarashi, Y., Nakatsu, N., Yamashita, T., Yamada, H., & Urushidani, T. (2015). Open TG-GATEs: a large-scale toxicogenomics database. Nucleic Acids Research, 43(D1), D921–D927.

Ingle, B. L., Veber, B. C., Nichols, J. W., & Tornero-Velez, R. (2016). Informing the human plasma protein binding of environmental chemicals by machine learning in the pharmaceutical space: Applicability domain and limits of predictability. Journal of Chemical Information and Modeling, 56(11), 2243–2252.

Judson, R. S., Martin, M. T., Egeghy, P., Gangwal, S., Reif, D. M., Kothiya, P., Wolf, M., Cathey, T., Transue, T., Smith, D., ... & Richard, A. M. (2012). Aggregating data for computational toxicology applications: The U.S. Environmental Protection Agency (EPA) Aggregated Computational Toxicology Resource (ACToR) system. International Journal of Molecular Sciences, 13(2), 1805–1831.

Kavlock, R., Chandler, K., Houck, K., Hunter, S., Judson, R., Kleinstreuer, N., Knudsen, T., Martin, M., Padilla, S., Reif, D., Richard, A. M., Rotroff, D., Sipes, N. S., & Dix, D. (2012). Update on EPA’s ToxCast program: Providing high throughput decision support tools for chemical risk management. Chemical Research in Toxicology, 25(7), 1287–1302. https://doi.org/10.1021/tx3000939

Kuhn, M., Letunic, I., Jensen, L. J., & Bork, P. (2015). The SIDER database of drugs and side effects. Nucleic Acids Research, 44(D1), D1075–D1079.

Leist, M., Hasiwa, N., Rovida, C., Daneshian, M., Basketter, D., Kimber, I., Clewell, H., Gocht, T., Goldberg, A., Busquet, F., Rossi, A. M., Schwarz, M., Stephens, M., Taalman, R., Knudsen, T. B., McKim, J., Harris, G., Pamies, D., & Hartung, T. (2014). Consensus report on the future of animal-free systemic toxicity testing. ALTEX, 31(3), 341–356. https://doi.org/10.14573/altex.1406091

Lipscomb, J. C., Haddad, S., Poet, T., & Krishnan, K. (2012). Physiologically-based pharmacokinetic (PBPK) models in toxicity testing and risk assessment. Advances in Experimental Medicine and Biology, 745, 76–95.

National Research Council (NRC). (1983). Risk assessment in the federal government: Managing the process. National Academies Press.


Paini, A., Sala Benito, J. V., Bessems, J., & Worth, A. P. (2017). From in vitro to in vivo: Integration of the virtual cell based assay with physiologically based kinetic modelling. Toxicology in Vitro, 45, 241–248.

Pihan, E., Colliandre, L., Guichard, G., & Bonnet, P. (2012). e-Drug3D: 3D structures and physicochemical properties of FDA-approved drugs. Journal of Computer-Aided Molecular Design, 26(11), 1273–1280.

Raies, A. B., & Bajic, V. B. (2016). In silico toxicology: Computational methods for the prediction of chemical toxicity. WIREs Computational Molecular Science, 6(2), 147–172.

Rodgers, T., & Rowland, M. (2006). Physiologically based pharmacokinetic modelling 2: Predicting the tissue distribution of acids, very weak bases, neutrals and zwitterions. Journal of Pharmaceutical Sciences, 95(6), 1238–1257.

Rotroff, D. M., Wetmore, B. A., Dix, D. J., Ferguson, S. S., Clewell, H. J., Houck, K. A., Lecluyse, E. L., Andersen, M. E., Judson, R. S., Smith, C. M., Sochaski, M. A., Kavlock, R. J., Boellmann, F., Martin, M. T., Reif, D. M., Wambaugh, J. F., & Thomas, R. S. (2010). Incorporating human dosimetry and exposure into high-throughput in vitro toxicity screening. Toxicological Sciences, 117(2), 348–358. https://doi.org/10.1093/toxsci/kfq220


Russell, W. M. S., & Burch, R. L. (1960). The principles of humane experimental technique. Methuen.

Schmitt, W. (2008). General approach for the calculation of tissue to plasma partition coefficients. Toxicology in Vitro, 22(2), 457–467.

Tibshirani, R. (1996). Regression shrinkage and selection via the lasso. Journal of the Royal Statistical Society: Series B (Methodological), 58(1), 267–288.

Tice, R. R., Austin, C. P., Kavlock, R. J., & Bucher, J. R. (2013). Improving the human hazard characterization of chemicals: A Tox21 update. Environmental Health Perspectives, 121(7), 756–765. https://doi.org/10.1289/ehp.1205784

Wang, Y., Xiao, J., Suzek, T. O., Zhang, J., Wang, J., & Bryant, S. H. (2009). PubChem: a public information system for analyzing bioactivities of small molecules. Nucleic Acids Research, 37(suppl_2), W623–W633.

Wetmore, B. A. (2015). Quantitative in vitro-to-in vivo extrapolation in a high-throughput environment. Toxicology, 332, 94–101.

Whelan, M., & Schwarz, M. (2011). SEURAT: Vision, research strategy and execution. Towards the Replacement of in vivo Repeated Dose Systemic Toxicity, 1, 47–57.

Wishart, D. S., Feunang, Y. D., Guo, A. C., Lo, E. J., Marcu, A., Grant, J. R., Sajed, T., Johnson, D., Li, C., Sayeeda, Z., ... & Wilson, M. (2018). DrugBank 5.0: a major update to the DrugBank database for 2018. Nucleic Acids Research, 46(D1), D1074–D1082.

Zaldívar Comenges, J. M., & Baraibar, J. (2011). A virtual cell based assay model for the prediction of in vitro toxicity. Toxicology in Vitro, 25(8), 1670–1679.

Zaldívar-Comenges, J. M., Joossens, E., Sala Benito, J. V., Worth, A., & Paini, A. (2016). Theoretical and mathematical foundation of the virtual cell based assay – A review. Toxicology in Vitro, 45, 209–221. https://doi.org/10.1016/j.tiv.2016.07.013

Zhang, J. H., Fraczkiewicz, R., Bolger, M. B., Waldman, M., Woltosz, W. S., & Enslein, K. (2008). Predicting kinetic parameters Km and Vmax for substrates of human cytochrome P450 1A2, 2C9, 2C19, 2D6, and 3A4. Drug Metabolism Reviews, 40(S1), 119.

