ABSTRACT:
This paper proposes a combined dimensionality reduction approach for the efficient classification of genes. It addresses two problems, highly correlated data and the selection of significant variables from a large feature set, by assessing which dimensionality reduction techniques contribute most to classification. One-way ANOVA is employed for feature selection to obtain an optimal number of genes. Principal Component Analysis (PCA) and Partial Least Squares (PLS) are then applied separately as feature extraction methods to further reduce the features selected from the microarray dataset. Experiments on a colon cancer dataset use a Support Vector Machine (SVM) as the classifier. Combining feature selection and feature extraction into a generalized model yields a robust and efficient low-dimensional space. Because redundant and irrelevant features are removed at each step, the classifier achieves an accuracy of about 98%, an improvement over the state of the art.
Keywords:
Dimensionality Reduction, Feature Selection, Feature Extraction, Classification
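
As an illustration of the feature selection step named in the abstract, the following minimal sketch ranks genes by their one-way ANOVA F-statistic and keeps the highest-scoring ones. It is not the authors' implementation: the function names `anova_f` and `select_top_genes`, the parameter `top_k`, and the toy expression matrix are all assumptions made for this example, and the subsequent PCA/PLS extraction and SVM classification stages are not shown.

```python
from statistics import mean


def anova_f(values, labels):
    """One-way ANOVA F-statistic for a single gene.

    Assumes at least two classes and more samples than classes.
    """
    groups = {}
    for v, y in zip(values, labels):
        groups.setdefault(y, []).append(v)
    n, k = len(values), len(groups)
    grand = mean(values)
    # Variation of the class means around the overall mean.
    ss_between = sum(len(g) * (mean(g) - grand) ** 2 for g in groups.values())
    # Variation of the samples around their own class mean.
    ss_within = sum((v - mean(g)) ** 2 for g in groups.values() for v in g)
    if ss_within == 0:
        return float("inf") if ss_between > 0 else 0.0
    return (ss_between / (k - 1)) / (ss_within / (n - k))


def select_top_genes(X, labels, top_k):
    """Return (indices of the top_k genes by F-statistic, all F-scores).

    X is a list of samples; each sample is a list of gene expression values.
    """
    n_genes = len(X[0])
    scores = [anova_f([row[j] for row in X], labels) for j in range(n_genes)]
    ranked = sorted(range(n_genes), key=lambda j: scores[j], reverse=True)
    return ranked[:top_k], scores


# Hypothetical toy data: 6 samples, 3 genes, 2 classes.
# Only gene 0 differs clearly between the two classes.
X = [
    [1.0, 5.0, 2.0],
    [1.2, 4.0, 2.1],
    [0.8, 6.0, 1.9],
    [3.0, 5.5, 2.0],
    [3.2, 4.5, 2.2],
    [2.8, 5.0, 1.8],
]
labels = [0, 0, 0, 1, 1, 1]

selected, scores = select_top_genes(X, labels, top_k=1)
print("selected gene indices:", selected)
```

In a full pipeline of the kind the abstract describes, the columns indexed by `selected` would then be passed to a feature extraction method and a classifier. How many genes to keep is a tuning choice that this sketch leaves to the caller.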