IMPROVED ESTIMATION OF COVARIANCE MATRIX IN HOTELLING'S T2 FOR MICROARRAY DATA
Keywords: Gene set analysis, Hotelling's T2, Microarray analysis, Shrinkage covariance matrix
In gene set analysis of microarray data, the relationships between genes can be analyzed using Hotelling's T2, but the test cannot be applied when the number of samples is smaller than the number of variables, which is common in microarray data. Thus, in this study, we propose shrinkage approaches to estimating the covariance matrix in Hotelling's T2, specifically to address the high-dimensionality problem in microarray data. Three shrinkage covariance methods are proposed and are referred to as Shrink A, Shrink B and Shrink C. The three proposed shrinkage methods were compared with the Regularized Covariance Matrix Approach (RCMAT) and Kong's Principal Component Analysis (KPCA). The performance of the proposed methods was assessed using several cases of simulated data sets. In many cases, the Shrink A method performed best, followed by the Shrink C and RCMAT methods. In contrast, both the Shrink B and KPCA methods showed relatively poor results. The study contributes to the establishment of a modified multivariate approach to differential gene expression analysis and is expected to be applicable in other areas with similar data characteristics.
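The general idea of replacing the sample covariance matrix in a two-sample Hotelling's T2 with a shrunken estimate can be sketched as follows. This is only an illustration under stated assumptions, not the Shrink A, Shrink B or Shrink C estimators, whose definitions are not given in the abstract. The diagonal shrinkage target, the fixed intensity `lam`, and all function names are hypothetical choices made for the example. Because the usual F reference distribution is not guaranteed once the covariance is shrunk, the sketch uses a permutation p-value instead.

```python
import numpy as np


def pooled_covariance(X, Y):
    """Pooled sample covariance of two groups (rows = samples, columns = genes)."""
    n1, n2 = X.shape[0], Y.shape[0]
    S1 = np.cov(X, rowvar=False)
    S2 = np.cov(Y, rowvar=False)
    return ((n1 - 1) * S1 + (n2 - 1) * S2) / (n1 + n2 - 2)


def shrink_covariance(S, lam):
    """Convex combination of S and its diagonal.

    Assumption: a diagonal target and a fixed intensity lam in (0, 1];
    the paper's own targets and intensities may differ.
    """
    target = np.diag(np.diag(S))
    return lam * target + (1.0 - lam) * S


def hotelling_t2_shrunk(X, Y, lam=0.5):
    """Two-sample Hotelling's T2 with the shrunken pooled covariance."""
    n1, n2 = X.shape[0], Y.shape[0]
    diff = X.mean(axis=0) - Y.mean(axis=0)
    S_star = shrink_covariance(pooled_covariance(X, Y), lam)
    # Solve S_star z = diff rather than forming the inverse explicitly.
    z = np.linalg.solve(S_star, diff)
    return float(n1 * n2 / (n1 + n2) * (diff @ z))


def permutation_pvalue(X, Y, lam=0.5, n_perm=200, seed=0):
    """Permutation p-value obtained by shuffling group labels."""
    rng = np.random.default_rng(seed)
    observed = hotelling_t2_shrunk(X, Y, lam)
    pooled = np.vstack([X, Y])
    n1 = X.shape[0]
    exceed = 0
    for _ in range(n_perm):
        idx = rng.permutation(pooled.shape[0])
        t2 = hotelling_t2_shrunk(pooled[idx[:n1]], pooled[idx[n1:]], lam)
        exceed += t2 >= observed
    return (1 + exceed) / (1 + n_perm)


# Simulated gene set: 20 genes, 5 samples per group, so samples < variables.
rng = np.random.default_rng(0)
X = rng.normal(size=(5, 20))
Y = rng.normal(size=(5, 20)) + 1.0  # shifted means in the second group

S = pooled_covariance(X, Y)
print("rank of pooled covariance:", np.linalg.matrix_rank(S), "of", S.shape[0])
print("shrunken T2:", hotelling_t2_shrunk(X, Y, lam=0.5))
print("permutation p-value:", permutation_pvalue(X, Y, lam=0.5))
```

With fewer samples than genes the pooled covariance is rank deficient and cannot be inverted, which is why the unmodified statistic is unavailable. Mixing in the diagonal with `lam > 0` gives a positive definite matrix whenever every gene has nonzero variance, so the linear solve is well defined.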
Wang Z, Zineddin B, Liang J, Zeng N, Li Y, Du M, et al. cDNA microarray adaptive segmentation. Neurocomputing 2014;142:408-18.
Zvara Á, Kitajka K, Faragó N, Puskás LG. Microarray technology. Acta Biologica Szegediensis 2015;59:51-67.
Cooper-Knock J, Kirby J, Ferraiuolo L, Heath PR, Rattray M, Shaw PJ. Gene expression profiling in human neurodegenerative disease. Nat Rev Neurol 2012;8:518-30.
Altenburger R, Scholz S, Schmitt-Jansen M, Busch W, Escher BI. Mixture toxicity revisited from a toxicogenomic perspective. Environ Sci Technol 2012;46:2508-22.
Tran B, Dancey JE, Kamel-Reid S, McPherson JD, Bedard PL, Brown AM, et al. Cancer genomics: technology, discovery, and translation. J Clin Oncol 2012;30:647-60.
Karjanto S, Ramli NM, Aripin R, Ghani NAM. Improved statistical test using shrinkage covariance matrix for identifying differential gene sets. J Appl Environ Biol Sci 2014;1:302-10.
Ledoit O, Wolf M. A well-conditioned estimator for large-dimensional covariance matrices. J Multivariate Anal 2001;88:365-411.
Ledoit O, Wolf M. Improved estimation of the covariance matrix of stock returns with an application to portfolio selection. J Empirical Finance 2003;10:603-21.
Ledoit O, Wolf M. Honey, I shrunk the sample covariance matrix. J Portfolio Management 2004;31:110-9.
Schäfer J, Strimmer K. A shrinkage approach to large-scale covariance matrix estimation and implications for functional genomics. Stat Appl Genet Mol Biol 2005;4:32.
Yates PD, Reimers MA. RCMAT: a regularized covariance matrix approach to testing gene sets. BMC Bioinf 2009;10:300.
Kong SW, Pu WT, Park PJ. A multivariate approach for integrating genome-wide expression data and biological knowledge. Bioinformatics 2006;22:2373-80.