An Evolutionary Approach for Feature Selection applied to ADMET Prediction

Material type: Article
Description: 2008 12 (37) : 55-63
Summary: Feature selection methods seek the subset of features or variables in a data set that is most relevant for predicting a target value. In chemoinformatics, determining the most significant set of descriptors is of great importance because it contributes to improving ADMET prediction models. This paper presents an evolutionary approach to descriptor selection aimed at physicochemical property prediction. In particular, we propose a genetic algorithm with a fitness function based on decision trees, which evaluates the relevance of a set of descriptors. Other fitness functions, based on multivariate regression models, were also tested. The performance of the genetic algorithm as a feature selection technique was assessed for predicting logP (octanol-water partition coefficient), using an ensemble of neural networks for the prediction task. The results showed that the evolutionary approach using decision trees is a promising technique for this bioinformatics application.
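
The idea in the summary, a genetic algorithm whose fitness function scores a descriptor subset with a decision tree, can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and not the authors' implementation: the names `fit_tree`, `make_fitness` and `select_descriptors` are hypothetical, a small greedy regression tree written in plain Python stands in for whatever tree learner the paper used, fitness is taken to be negative validation error minus a small penalty per selected descriptor, and the data are synthetic rather than real molecular descriptors. The neural network ensemble used in the paper for the final logP prediction is not reproduced.

```python
import random

def sse(ys):
    """Sum of squared deviations of ys from their mean."""
    m = sum(ys) / len(ys)
    return sum((y - m) ** 2 for y in ys)

def fit_tree(rows, ys, cols, depth):
    """Greedy regression tree restricted to columns `cols`; returns predict(row)."""
    mean = sum(ys) / len(ys)
    best = None
    if depth > 0 and len(ys) >= 4:
        for c in cols:
            # Candidate thresholds: every distinct value except the largest,
            # so both sides of the split are always non-empty.
            for t in sorted({r[c] for r in rows})[:-1]:
                left = [i for i, r in enumerate(rows) if r[c] <= t]
                right = [i for i, r in enumerate(rows) if r[c] > t]
                score = sse([ys[i] for i in left]) + sse([ys[i] for i in right])
                if best is None or score < best[0]:
                    best = (score, c, t, left, right)
    if best is None:
        return lambda row: mean  # leaf: predict the mean target
    _, c, t, left, right = best
    lo = fit_tree([rows[i] for i in left], [ys[i] for i in left], cols, depth - 1)
    hi = fit_tree([rows[i] for i in right], [ys[i] for i in right], cols, depth - 1)
    return lambda row: lo(row) if row[c] <= t else hi(row)

def make_fitness(train, valid, depth=2, penalty=0.01):
    """Assumed fitness: negative validation MSE minus a penalty per descriptor."""
    cache = {}
    def fitness(mask):
        if mask not in cache:
            cols = [j for j, bit in enumerate(mask) if bit]
            predict = fit_tree(train[0], train[1], cols, depth)
            mse = sum((predict(r) - y) ** 2 for r, y in zip(*valid)) / len(valid[1])
            cache[mask] = -mse - penalty * len(cols)
        return cache[mask]
    return fitness

def select_descriptors(fitness, n_features, pop_size=20, generations=15,
                       p_mut=0.1, rng=None):
    """Bit-mask GA: binary tournament, one-point crossover, bit-flip mutation."""
    rng = rng or random.Random(0)
    pop = [tuple(rng.randint(0, 1) for _ in range(n_features))
           for _ in range(pop_size)]
    for _ in range(generations):
        nxt = [max(pop, key=fitness)]  # elitism: carry over the current best mask
        while len(nxt) < pop_size:
            parents = []
            for _ in range(2):
                a, b = rng.sample(pop, 2)
                parents.append(a if fitness(a) >= fitness(b) else b)
            cut = rng.randrange(1, n_features)
            child = parents[0][:cut] + parents[1][cut:]
            nxt.append(tuple(bit ^ (rng.random() < p_mut) for bit in child))
        pop = nxt
    return max(pop, key=fitness)

# Synthetic stand-in data: only columns 0 and 3 influence the target.
rng = random.Random(42)
X = [[rng.uniform(-1, 1) for _ in range(6)] for _ in range(60)]
y = [3 * r[0] - 2 * r[3] + rng.gauss(0, 0.05) for r in X]
fitness = make_fitness((X[:40], y[:40]), (X[40:], y[40:]))
best = select_descriptors(fitness, n_features=6, rng=rng)
print("selected descriptor indices:", [j for j, b in enumerate(best) if b])
```

Each individual is a tuple of 0/1 flags, one per descriptor, so crossover and mutation operate directly on the selection mask. Swapping in a different fitness function, for example one based on a multivariate regression model as mentioned in the summary, would only require replacing `make_fitness`.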
Holdings
Item type: Journal article
Home library: Biblioteca de la Facultad de Informática
URL: Link to resource
Status: Not applicable
