Statistics Department Seminar Series: Yufeng Liu, Professor, Department of Statistics, Operations Research, Genetics, and Biostatistics, University of North Carolina at Chapel Hill

"Statistical Significance of Clustering for High Dimensional Data"
Friday, November 10, 2023
10:00-11:00 AM
340 West Hall
Abstract: Clustering serves as a fundamental tool for exploratory data analysis, but a key challenge lies in determining the reliability of the clusters identified by these methods, differentiating them from artifacts resulting from natural sampling variations. In this talk, I will present statistical significance of clustering (SigClust) as a cluster evaluation tool for high dimensional data. To begin, we define a cluster as data originating from a single Gaussian distribution and frame the assessment of statistical significance of clustering as a formal testing procedure. Addressing the challenge of high-dimensional covariance estimation in SigClust, we employ a combination of invariance principles and a factor analysis model. I'll also discuss an enhanced SigClust using multidimensional scaling (MDS) on dissimilarity matrices. SigClust for hierarchical clustering will be presented as well. Simulations and real data, including cancer subtype analysis, validate SigClust's effectiveness in assessing clustering significance.
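As a rough illustration of the testing idea in the abstract, the sketch below compares the 2-means cluster index of a data set (within-cluster sum of squares divided by total sum of squares) with the same index computed on data simulated from a single Gaussian. It is not the speaker's implementation. The function names are hypothetical, the null covariance is a diagonal stand-in built from per-coordinate sample variances rather than the invariance and factor-model estimate described in the talk, and the add-one Monte Carlo p-value is an assumption made for simplicity.

```python
import random


def two_means_cluster_index(points, n_init=3, n_iter=20, rng=None):
    """Return the 2-means cluster index: within-cluster SS / total SS.

    Smaller values indicate a stronger two-cluster split.
    """
    rng = rng or random.Random(0)
    n, d = len(points), len(points[0])
    mean = [sum(p[j] for p in points) / n for j in range(d)]
    total_ss = sum((p[j] - mean[j]) ** 2 for p in points for j in range(d))
    best_within = float("inf")
    for _ in range(n_init):
        # Plain Lloyd iterations from two randomly chosen data points.
        centers = [list(c) for c in rng.sample(points, 2)]
        for _ in range(n_iter):
            groups = ([], [])
            for p in points:
                d0 = sum((p[j] - centers[0][j]) ** 2 for j in range(d))
                d1 = sum((p[j] - centers[1][j]) ** 2 for j in range(d))
                groups[0 if d0 <= d1 else 1].append(p)
            if not groups[0] or not groups[1]:
                break  # degenerate start; keep the current centers
            new_centers = [
                [sum(p[j] for p in g) / len(g) for j in range(d)] for g in groups
            ]
            if new_centers == centers:
                break  # converged
            centers = new_centers
        within = sum(
            min(sum((p[j] - c[j]) ** 2 for j in range(d)) for c in centers)
            for p in points
        )
        best_within = min(best_within, within)
    return best_within / total_ss


def sigclust_sketch(points, n_sim=100, seed=0):
    """Monte Carlo test of 'one Gaussian' (null) against 'more than one cluster'.

    ASSUMPTION: the null Gaussian uses a diagonal covariance made of the
    per-coordinate sample variances, a simplification of the covariance
    estimation discussed in the abstract.
    """
    rng = random.Random(seed)
    n, d = len(points), len(points[0])
    mean = [sum(p[j] for p in points) / n for j in range(d)]
    sd = [
        (sum((p[j] - mean[j]) ** 2 for p in points) / (n - 1)) ** 0.5
        for j in range(d)
    ]
    observed = two_means_cluster_index(points, rng=rng)
    null_indices = []
    for _ in range(n_sim):
        # The cluster index does not depend on location, so simulate around 0.
        sim = [[rng.gauss(0.0, sd[j]) for j in range(d)] for _ in range(n)]
        null_indices.append(two_means_cluster_index(sim, rng=rng))
    # ASSUMPTION: add-one empirical p-value; small values suggest real clusters.
    p_value = (1 + sum(ci <= observed for ci in null_indices)) / (n_sim + 1)
    return observed, p_value
```

Calling `sigclust_sketch(data)` on a list of equal-length numeric rows returns the observed cluster index and the Monte Carlo p-value. A small p-value indicates that the observed split is tighter than what a single Gaussian with the assumed covariance would typically produce.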
Building: West Hall
Event Type: Workshop / Seminar
Tags: seminar
Source: Happening @ Michigan from Department of Statistics, Department of Statistics Seminar Series