Department Seminar Series: Jia Li, Professor, Department of Statistics, Pennsylvania State University
Clustering under the Wasserstein Metric
In a variety of research areas such as multimedia retrieval, computer vision, and document analysis, the bag of weighted vectors and the histogram are widely used descriptors for complex objects. Both can be treated as discrete distributions. D2-clustering seeks the minimum total within-cluster variation for a set of discrete distributions under the Kantorovich-Wasserstein metric. The Wasserstein distance, closely related to the mass transportation problem, is robust against quantization and readily allows sparse representation of distributions. In this talk, I will first introduce the basic D2-clustering algorithm, which exploits large-scale linear programming. To improve the scalability of the algorithm, approaches have been developed based respectively on an ad hoc divide-and-conquer strategy and on the theoretically sound ADMM optimization technique. Finally, applications to image annotation, document clustering, and protein sequence clustering will be presented.
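For readers unfamiliar with the metric, the following is a minimal sketch of how the Wasserstein distance between two discrete distributions can be posed as a linear program over a transport plan. It is not the speaker's implementation: the function name `wasserstein_sq` is hypothetical, and the sketch assumes a squared Euclidean ground cost (order 2) and uses SciPy's general-purpose `linprog` solver.

```python
import numpy as np
from scipy.optimize import linprog


def wasserstein_sq(x, wx, y, wy):
    """Squared 2-Wasserstein distance between two discrete distributions.

    x: (m, d) array of support points, wx: (m,) weights summing to 1.
    y: (n, d) array of support points, wy: (n,) weights summing to 1.
    Assumption: squared Euclidean ground cost between support points.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    m, n = len(x), len(y)
    # Pairwise squared Euclidean distances, shape (m, n).
    cost = ((x[:, None, :] - y[None, :, :]) ** 2).sum(axis=2)
    # Marginal constraints on the transport plan, flattened row-major.
    a_eq = np.zeros((m + n, m * n))
    for i in range(m):
        a_eq[i, i * n:(i + 1) * n] = 1.0  # row i of the plan sums to wx[i]
    for j in range(n):
        a_eq[m + j, j::n] = 1.0           # column j of the plan sums to wy[j]
    b_eq = np.concatenate([np.asarray(wx, dtype=float),
                           np.asarray(wy, dtype=float)])
    # Minimize total transport cost over nonnegative plans.
    res = linprog(cost.ravel(), A_eq=a_eq, b_eq=b_eq,
                  bounds=(0, None), method="highs")
    return res.fun
```

For example, `wasserstein_sq([[0.0], [1.0]], [0.5, 0.5], [[0.0], [2.0]], [0.5, 0.5])` moves half of the mass from 1 to 2 at unit squared cost. The number of plan variables grows as m × n for each pair of distributions, which gives some intuition for why scalability is a concern in the clustering setting.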
Building: | West Hall |
---|---|
Event Type: | Workshop / Seminar |
Tags: | seminar |
Source: | Happening @ Michigan from Department of Statistics, Department of Statistics Seminar Series |