Statistics Department Seminar Series: Simon Mak, H. Milton Stewart School of Industrial & Systems Engineering (ISyE), Georgia Institute of Technology
"Support points – a new way to reduce big and high-dimensional data"
Tuesday, January 15, 2019
411 West Hall
This talk presents a new method for reducing big and high-dimensional data into a smaller dataset, called support points (SPs). In an era where data is plentiful but downstream analysis is often expensive, SPs can be used to tackle many big data challenges in statistics, engineering and machine learning. SPs have two key advantages over existing methods. First, SPs provide optimal and model-free reduction of big data for a broad range of downstream analyses. Second, SPs can be efficiently computed via parallelized difference-of-convex optimization; this allows us to reduce millions of data points to a representative dataset in mere seconds. SPs also enjoy appealing theoretical guarantees, including distributional convergence and improved reduction over random sampling and clustering-based methods. The effectiveness of SPs is then demonstrated in two real-world applications: the first reduces long Markov chain Monte Carlo (MCMC) chains for rocket engine design, and the second performs data reduction for computationally intensive predictive modeling.
Event Type: Workshop / Seminar
Source: Happening @ Michigan from Department of Statistics, Department of Statistics Seminar Series