The use of CART and multivariate regression trees for supervised and unsupervised feature selection

Questier, F., Put, R., Coomans, D., Walczak, B., and Vander Heyden, Y. (2005) The use of CART and multivariate regression trees for supervised and unsupervised feature selection. Chemometrics and Intelligent Laboratory Systems, 76 (1). pp. 45-54.


DOI: 10.1016/j.chemolab.2004.09.003

View at Publisher Website: http://dx.doi.org/10.1016/j.chemolab.200...

Abstract

Feature selection is a valuable technique in data analysis for information-preserving data reduction. This paper describes approaches based on Classification and Regression Trees (CART) and Multivariate Regression Trees (MRT) for both supervised and unsupervised feature selection. The well-known CART method performs supervised feature selection by modeling one response variable (y) with a set of explanatory variables (x). MRT, a recently proposed extension of CART, can handle more than one response variable (y), which allows supervised feature selection in the presence of several response variables. For unsupervised feature selection, where no response variables are available, we propose Auto-Associative Multivariate Regression Trees (AAMRT), in which the original variables (x) are used not only as explanatory variables (x) but also as response variables (y = x). Because (AA)MRT uses the explanatory variables to partition the objects into groups with similar response values, it identifies the variables that are most responsible for the cluster structure in the data. We demonstrate how these approaches can improve the cluster structure in data, or its detection, and how they can be used for knowledge discovery.
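
The auto-associative idea described in the abstract (fit a multi-response tree with y = x, then see which variables the tree splits on) can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes scikit-learn's `DecisionTreeRegressor`, whose multi-output squared-error criterion is used here as a stand-in for MRT, and it uses impurity-based importances as one possible selection measure. The function name `aamrt_importances`, the synthetic data, and the tree depth are all hypothetical choices made for illustration.

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor


def aamrt_importances(X, max_depth=2, random_state=0):
    """Fit a multi-output regression tree with y = x (auto-associative)
    and return the impurity-based importance of each variable.

    Assumption: a scikit-learn multi-output tree stands in for MRT.
    """
    # Autoscale so that every variable contributes equally to the
    # summed squared error over the responses.
    Xs = (X - X.mean(axis=0)) / X.std(axis=0)
    tree = DecisionTreeRegressor(max_depth=max_depth, random_state=random_state)
    # The same matrix serves as explanatory and as response variables.
    tree.fit(Xs, Xs)
    return tree.feature_importances_


# Synthetic example: two clusters that differ only in variables 0 and 1;
# variables 2 to 5 are pure noise.
rng = np.random.default_rng(0)
labels = np.repeat([0, 1], 100)
X = rng.normal(size=(200, 6))
X[:, 0] += 10.0 * labels
X[:, 1] += 10.0 * labels

importances = aamrt_importances(X)
ranking = np.argsort(importances)[::-1]  # most important variable first
print(ranking, importances.round(3))
```

In this sketch the top-ranked variable is one of the two that carry the cluster structure. Because variables 0 and 1 are strongly correlated, the tree may split on only one of them, so a low importance does not by itself show that a variable is irrelevant.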

ID Code: 4503
Item Type: Article (Refereed Research - C1)
Keywords: AAMRT; auto-associative; CART; clustering; feature selection; MRT; multivariate regression trees; supervised; unsupervised
FoR Codes: UNSPECIFIED
SEO Codes: 92 HEALTH > 9299 Other Health > 929999 Health not elsewhere classified @ 100%
Deposited On: 11 Jun 2009 11:46
Last Modified: 17 May 2013 00:40
Citation Counts with External Providers: Web of Science: 38