Statistical confidence for variable selection in QSAR models via Monte Carlo cross-validation
Konovalov, Dmitry A., Sim, Nigel, Deconinck, Eric, Vander Heyden, Yvan, and Coomans, Danny (2008) Statistical confidence for variable selection in QSAR models via Monte Carlo cross-validation. Journal of Chemical Information and Modeling, 48 (2). pp. 370-383.
View at Publisher Website: http://dx.doi.org/10.1021/ci700283s
A new variable selection wrapper method, the Monte Carlo variable selection (MCVS) method, was developed within the framework of the Monte Carlo cross-validation (MCCV) approach. The MCVS method reports variable selection results as P-values, the most conventional and common measure in statistical hypothesis testing, thus allowing for a clear and simple statistical interpretation of the results. The MCVS method is equally applicable to multiple-linear-regression (MLR)-based and non-MLR-based quantitative structure-activity relationship (QSAR) models. To demonstrate the workings of the new approach, the method was applied with MLR to blood-brain barrier (BBB) permeation and human intestinal absorption (HIA) QSAR problems. Starting from more than 1600 molecular descriptors, only two descriptors, TPSA(NO) and ALOGP, yielded acceptably low P-values for the BBB and HIA problems, respectively. The new method has been implemented in the QSAR-BENCH v2 program, which is freely available (including its Java source code) from www.dmitrykonovalov.org for academic use.
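The general idea of Monte Carlo cross-validation mentioned in the abstract can be illustrated with a minimal Python sketch. It is not the MCVS algorithm from the paper or the QSAR-BENCH implementation: the function names (`fit_line`, `mccv_fail_fraction`), the single-descriptor model, the 50% training fraction, the intercept-only baseline, and the synthetic data are all assumptions made for illustration. The sketch repeatedly splits the samples at random into training and validation subsets, fits a one-descriptor linear model on the training subset, and records how often that model fails to beat an intercept-only model on the validation subset. The resulting fraction is only a P-value-like empirical proportion, not the P-value defined in the paper.

```python
import random


def fit_line(xs, ys):
    """Ordinary least squares for y = a + b*x; returns (a, b)."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    if sxx == 0.0:
        return my, 0.0  # constant descriptor: fall back to the mean
    b = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    return my - b * mx, b


def mccv_fail_fraction(xs, ys, n_splits=500, train_fraction=0.5, seed=0):
    """Fraction of random splits in which the descriptor model does NOT
    beat the intercept-only model on the held-out samples.
    (Hypothetical helper, not the paper's P-value definition.)"""
    rng = random.Random(seed)
    n = len(xs)
    # Keep at least 2 training samples and at least 1 validation sample.
    n_train = min(n - 1, max(2, int(n * train_fraction)))
    fails = 0
    for _ in range(n_splits):
        idx = list(range(n))
        rng.shuffle(idx)
        train, valid = idx[:n_train], idx[n_train:]
        a, b = fit_line([xs[i] for i in train], [ys[i] for i in train])
        mean_y = sum(ys[i] for i in train) / n_train
        sse_model = sum((ys[i] - (a + b * xs[i])) ** 2 for i in valid)
        sse_null = sum((ys[i] - mean_y) ** 2 for i in valid)
        if sse_model >= sse_null:
            fails += 1
    return fails / n_splits


# Synthetic data (assumed, not from the paper): one informative descriptor
# and one pure-noise descriptor for the same activity values.
data_rng = random.Random(1)
informative = [data_rng.uniform(0, 10) for _ in range(40)]
noise_descriptor = [data_rng.uniform(0, 10) for _ in range(40)]
activity = [2.0 * x + data_rng.gauss(0, 1) for x in informative]

p_informative = mccv_fail_fraction(informative, activity)
p_noise = mccv_fail_fraction(noise_descriptor, activity)
print("informative descriptor:", p_informative)
print("noise descriptor:", p_noise)
```

In this toy setting the informative descriptor should fail far less often than the noise descriptor, which mirrors the kind of contrast a cross-validation-based selection criterion relies on.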
Item Type: Article (Refereed Research - C1)
FoR Codes: 01 MATHEMATICAL SCIENCES > 0104 Statistics > 010401 Applied Statistics @ 100%
SEO Codes: 97 EXPANDING KNOWLEDGE > 970101 Expanding Knowledge in the Mathematical Sciences @ 100%
Deposited On: 26 Feb 2010 08:50
Last Modified: 18 Oct 2013 00:57