Two-Way Analysis of High-Dimensional Collinear Data


Ilkka Huopaniemi, Tommi Suvitaival, Janne Nikkilä, Matej Orešič, and Samuel Kaski. Two-way analysis of high-dimensional collinear data. Data Mining and Knowledge Discovery, 19:261–276, 2009.


We present a Bayesian model for two-way ANOVA-type analysis of high-dimensional, small sample-size datasets with highly correlated groups of variables. Modern cellular measurement methods are a main application area; typically the task is differential analysis between diseased and healthy samples, complicated by additional covariates requiring a multi-way analysis. The main complication is the combination of high dimensionality and low sample size, which renders classical multivariate techniques useless. We introduce a hierarchical model which does dimensionality reduction by assuming that the input variables come in similarly-behaving groups, and performs an ANOVA-type decomposition for the set of reduced-dimensional latent variables. We apply the methods to study lipidomic profiles of a recent large-cohort human diabetes study.
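The two ideas in the abstract, grouping similarly-behaving variables into a few latent variables and then decomposing those latent variables into two-way effects, can be illustrated with a small simulation. This is a minimal sketch under strong simplifying assumptions and not the paper's Bayesian model: the group membership of each variable is taken as known (the paper's model infers the grouping), effects are estimated by plain cell-mean differences rather than posterior inference, and all names and effect sizes (`simulate`, `two_way_effects`, `ALPHA`, `BETA`, `INTER`) are hypothetical.

```python
import numpy as np

# Hypothetical effect sizes on three latent variables (one per variable group).
ALPHA = np.array([2.0, 0.0, 0.0])   # main effect of factor A (e.g. diseased vs. healthy)
BETA = np.array([0.0, -1.5, 0.0])   # main effect of factor B (an additional covariate)
INTER = np.array([0.0, 0.0, 1.0])   # A x B interaction effect


def simulate(n_per_cell=5, vars_per_group=20, noise_sd=0.5, seed=0):
    """Simulate collinear data: each group of observed variables copies one latent variable."""
    rng = np.random.default_rng(seed)
    # Two-by-two design: factor levels for every sample.
    a = np.repeat([0, 0, 1, 1], n_per_cell)
    b = np.repeat([0, 1, 0, 1], n_per_cell)
    # ANOVA-type mean structure on the latent variables.
    z_mean = np.outer(a, ALPHA) + np.outer(b, BETA) + np.outer(a * b, INTER)
    z = z_mean + rng.normal(size=z_mean.shape)
    # Group label of every observed variable (assumed known in this sketch).
    groups = np.repeat(np.arange(len(ALPHA)), vars_per_group)
    # Observed variables: noisy copies of their group's latent variable.
    x = z[:, groups] + noise_sd * rng.normal(size=(len(a), len(groups)))
    return x, a, b, groups


def two_way_effects(x, a, b, groups):
    """Reduce dimensionality by group averaging, then estimate two-way effects from cell means."""
    n_groups = groups.max() + 1
    # Latent score per sample and group: mean over the variables in that group.
    z_hat = np.stack(
        [x[:, groups == g].mean(axis=1) for g in range(n_groups)], axis=1
    )

    def cell_mean(i, j):
        return z_hat[(a == i) & (b == j)].mean(axis=0)

    m00, m01, m10, m11 = cell_mean(0, 0), cell_mean(0, 1), cell_mean(1, 0), cell_mean(1, 1)
    alpha_hat = m10 - m00               # effect of A relative to the reference cell
    beta_hat = m01 - m00                # effect of B relative to the reference cell
    inter_hat = m11 - m10 - m01 + m00   # interaction contrast
    return alpha_hat, beta_hat, inter_hat


if __name__ == "__main__":
    x, a, b, groups = simulate()
    print("data shape (samples, variables):", x.shape)
    for name, est in zip(("alpha", "beta", "interaction"), two_way_effects(x, a, b, groups)):
        print(name, np.round(est, 2))
```

With the default five samples per cell and 60 variables, the data have more variables than samples, yet the effects are estimated on only three latent variables. The estimates are noisy at this sample size, which is one motivation for treating the problem with a full probabilistic model that quantifies uncertainty.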

Suggested BibTeX entry:

    @article{Huopaniemi2009,
      author = {Ilkka Huopaniemi and Tommi Suvitaival and Janne Nikkil{\"{a}} and Matej Ore{\v{s}}i{\v{c}} and Samuel Kaski},
      journal = {Data Mining and Knowledge Discovery},
      pages = {261--276},
      title = {Two-Way Analysis of High-Dimensional Collinear Data},
      volume = {19},
      year = {2009},
    }

PDF (744 kB)