Microbial Bioactives | Online ISSN 2209-2161
REVIEWS (Open Access)
Illuminating Biological Dark Matter: Integrating Metagenomics, Synthetic Biology, and AI to Unlock Microbial and Genomic Potential for Therapeutics and Biotechnology
Yue Li 1, Shunqi Liu 2 *
Microbial Bioactives 9 (1) 1-8 https://doi.org/10.25163/microbbioacts.9110627
Submitted: 13 February 2026 Revised: 01 April 2026 Accepted: 08 April 2026 Published: 10 April 2026
Abstract
Biological “dark matter,” encompassing uncultured microorganisms and poorly characterized regions of microbial and human genomes, represents a vast and largely untapped resource for therapeutic discovery and sustainable biotechnology. Traditional cultivation-based approaches have accessed only a small fraction of this diversity, limiting innovation in drug development, biomanufacturing, and precision medicine. Recent advances in metagenomics, synthetic biology, proteogenomics, and artificial intelligence (AI) offer powerful tools to overcome these constraints. This study presents a systematic review and quantitative meta-analysis conducted in accordance with PRISMA guidelines. Peer-reviewed literature published between 2000 and 2025 was retrieved from PubMed, Web of Science, Scopus, and Google Scholar. Eligible studies employed function-based, sequencing-based, or single-cell metagenomics; synthetic biology chassis systems; AI- or machine learning–driven bioprocess optimization; proteogenomic integration; HIV-1 genomic surveillance; or gut microbiota interventions. Effect sizes were extracted and synthesized using random-effects models, with heterogeneity, publication bias, and sensitivity analyses performed to ensure robustness. Meta-analytic synthesis revealed that metagenomic approaches significantly enhance the discovery of structurally novel and bioactive natural products compared with conventional methods. Engineered microbial chassis, particularly yeast and cyanobacteria, demonstrated consistent improvements in biomass accumulation and metabolite yield. AI-driven models achieved high predictive accuracy across bioprocessing applications, while proteogenomic integration revealed reproducible genotype–phenotype associations in cancer. Microbiome-based interventions showed moderate but consistent improvements in microbial diversity and metabolic function. 
Collectively, the findings demonstrate that integrating metagenomics, synthetic biology, proteogenomics, and AI enables scalable, reproducible, and functionally relevant exploration of biological dark matter. This convergence provides a robust framework for advancing therapeutic innovation, sustainable biotechnology, and precision medicine.
Keywords: Metagenomics; Synthetic Biology; Artificial Intelligence; Microbial Dark Matter; Proteogenomics; HIV-1; Gut Microbiota; Natural Products
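The abstract states that effect sizes were pooled with random-effects models and checked for heterogeneity, but it does not name the estimator or software. As an illustration only, the sketch below assumes the widely used DerSimonian-Laird estimator and reports Cochran's Q and I² alongside the pooled effect. The function name `random_effects_pool` and the input values are hypothetical and are not taken from the review.

```python
import math


def random_effects_pool(effects, variances):
    """Pool per-study effect sizes with a DerSimonian-Laird random-effects model.

    Assumption: the review does not specify its estimator; DerSimonian-Laird
    is used here only as a common default.
    """
    if len(effects) != len(variances) or len(effects) < 2:
        raise ValueError("need at least two studies with matching variances")

    k = len(effects)
    # Fixed-effect (inverse-variance) weights and pooled estimate.
    w = [1.0 / v for v in variances]
    sum_w = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sum_w

    # Cochran's Q measures dispersion of study effects around the fixed estimate.
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))

    # Between-study variance (tau^2), truncated at zero.
    c = sum_w - sum(wi ** 2 for wi in w) / sum_w
    tau2 = max(0.0, (q - (k - 1)) / c) if c > 0 else 0.0

    # Random-effects weights add tau^2 to each within-study variance.
    w_star = [1.0 / (v + tau2) for v in variances]
    pooled = sum(wi * yi for wi, yi in zip(w_star, effects)) / sum(w_star)
    se = math.sqrt(1.0 / sum(w_star))

    # I^2: share of total variability attributed to heterogeneity, in percent.
    i2 = max(0.0, (q - (k - 1)) / q) * 100.0 if q > 0 else 0.0

    return {
        "pooled": pooled,
        "se": se,
        "ci95": (pooled - 1.96 * se, pooled + 1.96 * se),
        "tau2": tau2,
        "q": q,
        "i2": i2,
    }


if __name__ == "__main__":
    # Hypothetical standardized mean differences and variances, for illustration.
    example_effects = [0.42, 0.61, 0.18, 0.75]
    example_variances = [0.04, 0.09, 0.02, 0.12]
    print(random_effects_pool(example_effects, example_variances))
```

A sensitivity analysis of the kind mentioned in the abstract could be approximated by calling the function repeatedly with one study left out each time and comparing the pooled estimates.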