Contents |
Effective molecular representation is key to the success of machine learning models for molecular data analysis. Machine learning models based on topological data analysis (TDA) have demonstrated great potential for drug design. In this talk, we will discuss a series of persistent models, including persistent homology, persistent spectral models, and persistent Ricci curvature, and their combination with deep learning models. Unlike traditional graph and network models, these filtration-induced persistent models can characterize multiscale topological and geometric information and thus significantly reduce the complexity and dimensionality of molecular data. Feature vectors are obtained from various persistent attributes derived from topological and geometric invariants, such as homology, cohomology, eigenvalues, and Ricci curvature. They are then fed into machine learning and deep learning models, in particular random forests, gradient boosting trees, and convolutional neural networks (CNNs). Molecular fingerprints based on our persistent representations can significantly boost the performance of learning models in drug design. |
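As a rough illustration of how a filtration can turn molecular geometry into a fixed-length feature vector, the sketch below computes only the simplest persistent attribute: the death times of connected components (dimension-0 persistent homology) in a Vietoris-Rips filtration, summarized as a Betti-0 curve. It is not the pipeline presented in the talk. The function names `h0_death_times` and `betti0_curve`, the atom coordinates, and the thresholds are all assumptions made for this example, and an edge is assumed to enter the filtration once the distance between its endpoints is at most the threshold.

```python
import itertools
import math


def h0_death_times(points):
    """Death times of the finite H0 bars of a Vietoris-Rips filtration.

    Every point is born at scale 0. A component dies when the shortest
    edge joining it to another component appears, so the death times are
    the edge lengths of a minimum spanning tree (Kruskal's algorithm).
    """
    n = len(points)
    parent = list(range(n))

    def find(i):
        # Union-find root lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    edges = sorted(
        (math.dist(points[i], points[j]), i, j)
        for i, j in itertools.combinations(range(n), 2)
    )
    deaths = []
    for length, i, j in edges:
        root_i, root_j = find(i), find(j)
        if root_i != root_j:
            parent[root_i] = root_j  # two components merge: one bar dies
            deaths.append(length)
    return deaths


def betti0_curve(deaths, thresholds):
    """Number of connected components at each filtration threshold."""
    n_points = len(deaths) + 1
    return [n_points - sum(d <= t for d in deaths) for t in thresholds]


# Hypothetical atom coordinates (assumed for illustration only).
atoms = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (5.0, 5.0, 5.0)]
deaths = h0_death_times(atoms)
features = betti0_curve(deaths, thresholds=[0.5, 1.5, 10.0])
print(features)  # [4, 2, 1]
```

A vector like `features` could then be passed to a random forest, a gradient boosting tree, or a CNN. A realistic fingerprint would also include higher-dimensional homology, spectral, and curvature attributes, typically computed with a dedicated TDA library.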