Research Group - Statistics and Machine Learning

Core Statistics Group

This group develops modern statistical methods for complex data, with applications in biomedicine, data mining, and high-dimensional analysis. Its work focuses on flexible modeling, accurate inference, and adaptation to changing data patterns.

Themes:

  • high-dimensional statistics
  • nonparametric statistics
  • change point analysis

Members

  • Ning Hao - High-dimensional data; Machine learning; Change point detection
  • Yue Selena Niu - Nonparametric statistics; Semiparametric modeling; Statistical genetics
  • Hao Helen Zhang - Nonparametric smoothing; Model selection; Data mining; Statistical applications in biosciences and biomedicine
 

Select Publications

Zou H, Zhang HH. On the Adaptive Elastic-Net with a Diverging Number of Parameters. Ann Stat. 2009;37(4):1733-1751. doi: 10.1214/08-AOS625. PMID: 20445770; PMCID: PMC2864037.

Jianqing Fan, Shaojun Guo, Ning Hao, Variance Estimation Using Refitted Cross-Validation in Ultrahigh Dimensional Regression, Journal of the Royal Statistical Society Series B: Statistical Methodology, Volume 74, Issue 1, January 2012, Pages 37–65, https://doi.org/10.1111/j.1467-9868.2011.01005.x

Fan, J. and Niu, Y. (2007). Selection and validation of normalization methods for c-DNA microarrays using within-array replications. Bioinformatics, 23, 2391-2398.

Bayesian Statistics

Bayesian statistics provides a flexible framework for learning from data by combining prior knowledge with observed evidence. It is especially useful for complex models, uncertainty quantification, and real-time decision-making.

Themes:

  • biostatistics
  • ecological applications
  • psychometrics

Members

  • Edward Bedrick - Analysis of observational data; Bayesian methods; Generalized linear and mixed models
  • Dean Billheimer - Measurement and normalization; Quantitative proteomics; Statistical methods for compositional data
  • Lifeng Lin - Meta-analysis; Network meta-analysis of multiple-treatment comparisons; Publication bias; Bayesian methods; Applications of statistical methods to real-world problems
  • Henry Scharf - Spatiotemporal statistics; Bayesian statistics; Ecological applications
  • Xueying Tang - High-dimensional Bayesian inference; Latent variable models; Small area estimation; Psychometrics

Select Publications

Bedrick, E. J., Christensen, R., & Johnson, W. (1996). A New Perspective on Priors for Generalized Linear Models. Journal of the American Statistical Association, 91(436), 1450–1460. https://doi.org/10.1080/01621459.1996.10476713

Scharf, H. R. (2021). Statistical analysis of animal movement: Understanding behavior through hierarchical parametric models. Notices of the American Mathematical Society, 68(6), 911–924.

Tang, X. (2024). A Latent Hidden Markov Model for Process Data. Psychometrika, 89, 205–240.

Machine Learning Group

Machine learning focuses on developing algorithms that learn from data to make predictions, uncover patterns, and drive decision-making across domains like healthcare, finance, and technology.

Themes: 

  • reinforcement learning
  • learning theory
  • probabilistic graphical models

Members

  • Kwang-Sung Jun - Reinforcement learning; Active learning; Bayesian optimization
  • Chicheng Zhang - Machine learning; Learning theory
  • Jason Pacheco - Statistical machine learning; Probabilistic graphical models; Approximate inference algorithms; Information-theoretic decision making
     

Artificial Intelligence Group

Artificial intelligence aims to create systems that can reason, learn, and act intelligently. Research in this area spans from developing decision-making algorithms to building models that mimic human perception and behavior.

Members

  • Clayton Morrison - Machine learning; Causal inference; Activity recognition and understanding; Automated planning; Knowledge representation; Computational cognitive science
  • Kobus Barnard - Machine learning; Mathematical modeling of geometric form; Multi-modal data; Statistical applications in computer vision