SDS Colloquium, Speaker Yue Wang

October 13, 2025, 2:30 p.m. in ENR2 S215

When

2:30 – 3:30 p.m., Oct. 13, 2025

Title: Robust unsupervised analysis of genetic pathways

Abstract: 
Gene pathway analysis is a powerful approach for understanding how groups of genes collectively influence complex traits and diseases. In particular, unsupervised analyses can capture high-level co-expression patterns, cluster pathways into functionally related modules, and identify independent modules for deeper biological insights. However, existing methods based on gene overlap, functional similarity, or pathway correlations often lack biological relevance and statistical robustness. We propose a latent factor–based top-down model to estimate pathway correlations, designed to capture population-specific variation while remaining robust to differences in pathway activity scoring algorithms. To ensure positive definiteness and improve accuracy, we introduce a shrinkage estimator for the correlation matrix, enabling effective pathway clustering. We also establish the asymptotic normality of our estimator, allowing efficient computation of P-values for constructing pathway co-expression networks. The utility of our method is demonstrated through comprehensive analyses of TCGA gene expression datasets, and we provide an R package, highcor, for implementation.