De Novo Molecular Generation Augmentation for Drug Discovery Using Deep Learning Approaches: A Comparative Study of Variational Autoencoders
Muzaffar Ahmad Sofi1, Dhanpratap Singh1, Tawseef Ahmed Teli2*
Journal of Angiotherapy 8(10) 1-13 https://doi.org/10.25163/angiotherapy.8109996
Submitted: 26 August 2024 Revised: 13 October 2024 Published: 14 October 2024
This study demonstrated how deep learning enhances drug discovery, streamlining molecular design processes and improving accuracy and chemical validity.
Abstract
Background: Drugs are chemicals that induce physiological effects when ingested. Their development spans multiple stages, including discovery, design, and development, which are often complex and resource-intensive. Machine learning (ML) and deep learning (DL) techniques have emerged as powerful tools for optimizing this pipeline.
Methods: This study used two distinct variational autoencoders: a convolutional encoder-decoder model and a convolutional-GRU-based encoder-decoder model. Both employ a reparameterization technique, with the aim of improving the efficiency of de novo molecular generation. Both models were trained and evaluated on the ZINC dataset to assess their ability to generate chemically valid and syntactically accurate molecules.
Results: The convolutional-GRU model achieved a synthesis accuracy of 96.79%, matching the performance of the convolutional encoder-decoder model. The chemical validity of the generated compounds was also notable, with unique chemical validity scores of 90.71% for the convolutional encoder-decoder model and 90.42% for the convolutional-GRU model.
Conclusion: The results indicate that deep molecular generative models, especially the convolutional-GRU approach, significantly advance de novo molecular design. By achieving high accuracy and chemical validity, these models hold promise for enhancing drug discovery and expediting the introduction of new therapeutics to the market.
Keywords: Drug discovery, Deep learning, Variational autoencoders, Molecular generation, Chemical validity.
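The reparameterization technique named in the Methods can be illustrated with a minimal sketch. It assumes the standard Gaussian formulation used in variational autoencoders, where a latent vector is sampled as z = mu + sigma * eps with eps drawn from N(0, 1), and it is not taken from the authors' implementation. The function names, the three-dimensional latent space, and the example values are all hypothetical, and plain Python lists stand in for the tensors a real model would use.

```python
import math
import random


def reparameterize(mu, log_var, rng):
    """Sample z = mu + sigma * eps element-wise, with eps ~ N(0, 1).

    The randomness is isolated in eps, so z remains a deterministic,
    differentiable function of mu and log_var.
    """
    z = []
    for m, lv in zip(mu, log_var):
        sigma = math.exp(0.5 * lv)   # log-variance -> standard deviation
        eps = rng.gauss(0.0, 1.0)    # noise independent of the encoder output
        z.append(m + sigma * eps)
    return z


def kl_to_standard_normal(mu, log_var):
    """KL divergence from N(mu, sigma^2) to N(0, 1), summed over dimensions."""
    return -0.5 * sum(
        1.0 + lv - m * m - math.exp(lv) for m, lv in zip(mu, log_var)
    )


# Hypothetical encoder outputs for a 3-dimensional latent space.
rng = random.Random(0)
mu = [0.5, -1.0, 0.0]
log_var = [0.0, -2.0, 1.0]

z = reparameterize(mu, log_var, rng)
kl = kl_to_standard_normal(mu, log_var)
print("latent sample:", z)
print("KL term:", kl)
```

In a VAE trained on SMILES strings, the encoder would produce mu and log_var for each molecule, the decoder would reconstruct the string from z, and the KL term would be added to the reconstruction loss. How the two models in this study weight or schedule that term is not stated in the abstract.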