Title:
OCS-TGBM: Intelligent Analysis of Organic Chemical Synthesis Based on Topological Data Analysis and LightGBM
Authors:
Yanhui Guo, Lichao Peng, Zixin Li, Meng'en Qin, Xue Jiao, Yun Chai, Xiaohui Yang
doi:
Volume:
91
Issue:
3
Year:
2024
Pages:
557-592
Abstract:
Organic synthesis is widely used in drug discovery and development. The intelligent prediction and analysis of high-throughput coupling reaction yields is an important and challenging research topic in this field. However, existing methods focus on prediction rather than on studying and interpreting the internal relationship between reaction conditions and yield. To address this problem, an intelligent analysis model for organic chemical synthesis, named OCS-TGBM, is proposed. It combines topological data analysis (TDA) with the Light Gradient Boosting Machine (LightGBM) to explore the internal relationship between reaction conditions and yield in depth and to identify high-yield reaction conditions and combinations. To further improve the performance of the OCS-TGBM model, a stratified diversity sampling strategy is introduced. Experimental results show that the OCS-TGBM model outperforms other methods in analyzing and predicting the reaction performance of high-throughput organic chemical synthesis. It also provides intelligent assistance for the optimal design of reaction systems and the evaluation of reaction conditions, thus greatly accelerating drug discovery and development.
