Dimension Reduction for High-Dimensional Data
With the advance of modern technologies, high-dimensional data have become prevalent in computational biology. The number of variables p is very large, and in many applications p exceeds the number of observational units n. Such high dimensionality, together with the unconventional small-n-large-p setting, poses new challenges to statistical analysis methods. Dimension reduction, which aims to reduce the predictor dimension prior to any modeling effort, offers a potentially useful avenue for tackling such high-dimensional regression. In this chapter, we review a number of commonly used dimension reduction approaches, including principal component analysis, partial least squares, and sliced inverse regression. For each method, we review its background and its applications in computational biology, discuss both its advantages and limitations, and offer enough operational details for implementation. A numerical example analyzing a microarray survival data set illustrates the application of the reviewed reduction methods.
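To make the small-n-large-p setting concrete, the following is a minimal sketch of one of the reviewed ideas, principal component analysis, and is not the chapter's own implementation. It relies on the standard fact that the nonzero eigenvalues of the p x p cross-product matrix of the centered data coincide with those of the much smaller n x n Gram matrix, so the first component can be found by working with an n x n matrix only. The function names `center_columns` and `first_pc`, the use of power iteration, and the simulated toy data are all assumptions made for illustration; in practice one would typically call an SVD routine from a numerical library.

```python
import random

def center_columns(X):
    """Subtract the column mean from every variable (column) of X."""
    n, p = len(X), len(X[0])
    means = [sum(row[j] for row in X) / n for j in range(p)]
    return [[row[j] - means[j] for j in range(p)] for row in X]

def first_pc(X, n_iter=500, seed=0):
    """First principal component scores and their variance.

    Hypothetical helper: runs power iteration on the n x n Gram matrix
    G = Xc Xc^T instead of the p x p matrix Xc^T Xc, which is cheaper
    when p is much larger than n.
    """
    Xc = center_columns(X)
    n = len(Xc)
    # n x n Gram matrix of the centered samples.
    G = [[sum(a * b for a, b in zip(Xc[i], Xc[k])) for k in range(n)]
         for i in range(n)]
    rng = random.Random(seed)
    u = [rng.gauss(0, 1) for _ in range(n)]
    for _ in range(n_iter):
        v = [sum(G[i][k] * u[k] for k in range(n)) for i in range(n)]
        norm = sum(x * x for x in v) ** 0.5
        u = [x / norm for x in v]
    # Rayleigh quotient approximates the leading eigenvalue of G.
    lam = sum(u[i] * sum(G[i][k] * u[k] for k in range(n)) for i in range(n))
    # Scores equal sqrt(lam) * u (sign is arbitrary); variance uses n - 1.
    scores = [lam ** 0.5 * x for x in u]
    return scores, lam / (n - 1)

# Simulated toy data (not real measurements): n = 6 samples, p = 200 variables.
rng = random.Random(1)
X = [[rng.gauss(0, 1) for _ in range(200)] for _ in range(6)]
scores, var1 = first_pc(X)
print(len(scores), round(var1, 3))
```

The 200 original variables are replaced here by a single derived predictor with one value per sample, which could then enter a conventional regression model. Further components, or the response-guided directions used by partial least squares and sliced inverse regression, would require additional steps not shown in this sketch.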