The Michael Woodroofe Seminar Series is sponsored by the Department of Statistics in recognition of Emeritus Professor Michael B. Woodroofe’s profound contributions to the profession and to the Department of Statistics at the University of Michigan.

Michael Woodroofe retired from his distinguished career as the L. J. Savage Professor of Statistics and Mathematics at the University of Michigan in 2009. An eminent scientist in both statistics and probability, Professor Woodroofe made deep and pioneering contributions in many areas, including nonparametric inference, biased sampling, isotonic inference, sequential analysis, and statistics in astronomy. His seminal work in sequential analysis and non-linear renewal theory influenced an entire generation of researchers. He has published more than 100 research articles, written a SIAM monograph, and authored a book. A total of 40 Ph.D. students have completed dissertations under his direction, and he remains active in research and teaching today.

Professor Woodroofe has received many honors and has served our professional community in many ways, including his roles as Editor of the Annals of Statistics (1992–1994), Advisory Editor of JSPI since 1994, and Associate Editor of the Annals of Statistics, the Annals of Probability, JSPI, and Sequential Analysis. He has been a member of the IMS Council (two terms), a member of numerous federal granting and advisory panels, and an invited/distinguished lecturer at multiple universities and national and international conferences. He is a Fellow of the Institute of Mathematical Statistics and an elected member of the International Statistical Institute.

Michael Woodroofe received his B.S. in Mathematics from Stanford in 1962, his M.S. in Mathematics from the University of Oregon in 1964, and his Ph.D. in Mathematics from the University of Oregon in 1965. He joined the Department of Statistics at Carnegie Mellon University as an assistant professor in 1966 and moved to the Department of Mathematics at the University of Michigan in 1968. He was a founding member of the Department of Statistics at the University of Michigan in 1969, retaining a joint appointment with Mathematics, and served as Chair of the Department of Statistics from 1977 to 1983.

## Past Woodroofe Lecturers

### Cun-Hui Zhang - September 27, 2019

**Power-One Test for Higher Criticism, Multiple Isotonic Regression, and Second Order Stein Methods**

We consider several problems in areas to which Michael Woodroofe has made seminal contributions. In higher criticism, we develop a one-sided sequential probability ratio test based on the ordered p-values to achieve optimal detection of rare and weak signals. This makes an interesting connection to the test of power one and the nonlinear renewal theorem. In multiple isotonic regression, a block estimator is developed to attain the minimax rate for a wide range of signal-to-noise ratios, to achieve adaptation to the parametric root-n rate up to a logarithmic factor in the case where the unknown mean is piecewise constant, and to achieve adaptation in variable selection. In uncertainty quantification, we develop second order Stein formulas for statistical inference in nonparametric and high-dimensional problems. Applications of the second order Stein method include exact formulas and upper bounds for the variance of risk estimators, risk bounds for regularized or shape-constrained estimators, and related degrees-of-freedom adjustments and confidence regions.

### Emmanuel Candès - March 9, 2018

**A New Read on the Knockoff Filter: Statistical Tools for Replicable Selections**

A common problem in modern statistical applications is to select, from a large set of candidates, a subset of variables which are important for determining an outcome of interest. For instance, the outcome may be disease status and the variables may be hundreds of thousands of single nucleotide polymorphisms on the genome. In this talk, we develop an entirely new read of the knockoffs framework of Barber and Candès (2015), which proposes a general solution to perform variable selection under rigorous type-I error control, without relying on strong modeling assumptions. We show how to apply this solution to a rich family of problems where the distribution of the covariates can be described by a hidden Markov model (HMM). In particular, we develop an exact and efficient algorithm to sample knockoff copies of an HMM, and then argue that, combined with the knockoffs selective framework, they provide a natural and powerful tool for performing principled inference in genome-wide association studies with guaranteed FDR control. Finally, our methodology is applied to several datasets aimed at studying Crohn's disease and several continuous phenotypes, e.g. levels of cholesterol. Time permitting, we will discuss the robustness of our methods.

This is joint work with many people.

### Iain Johnstone - October 28, 2016

**Low rank structure in highly multivariate models**

We start with an overview of some high-dimensional phenomena seen in principal components analysis. More generally, back in 1964 Alan James gave a remarkable classification of many of the eigenvalue distribution problems of multivariate statistics, including PCA. We show how the classification readily adapts to contemporary "spiked models": high-dimensional data with low rank structure. In particular we approximate likelihood ratios when the number of variables grows proportionately with sample size or degrees of freedom. High dimensions bring phase transition phenomena, with quite different likelihood ratio behavior for small and large spike strengths. James' framework allows a unified approach to problems such as signal detection, matrix denoising, regression and canonical correlations.

### Tze Leung Lai - September 15, 2015

**Multi-armed Bandits with Covariates: Theory and Applications**

In the past five years, multi-armed bandits with covariates, also called "contextual bandits" in machine learning, have become an active area of research in data science, stochastic optimization, and statistical modeling because of their applications to the development of personalized strategies in translational medicine and in recommender systems for web-based marketing and electronic business. After a brief review of classical (context-free) bandit theory, we describe a corresponding theory, covering both parametric and nonparametric approaches, for contextual bandits and illustrate their applications to personalized strategies.

### Larry Wasserman - February 13, 2015

**Topological Data Analysis**

Topological Data Analysis (TDA) refers to a collection of techniques for extracting interesting features from point clouds and images. I will discuss two different approaches to TDA. The first is persistent homology, which is a multiscale method for finding voids of different dimensions. The second is ridge estimation, which is aimed at finding high-density filaments and walls. This is joint work with members of TopStat (www.stat.cmu.edu/topstat).

### Bin Yu - September 6, 2013

**The Relative Size of Big Data: Perspectives from an Interdisciplinary Statistician**

Big data problems occur when available computing resources (CPU, communication bandwidth, and memory) cannot accommodate the computing demand on the data at hand. In this introductory overview talk, we provide perspectives on big data motivated by a collaborative project on coded aperture imaging with the advanced light source (ALS) group at the Lawrence Berkeley National Lab. In particular, we emphasize the key role in big data computing played by memory and communication bandwidth. Moreover, we briefly review available resources to monitor the memory and time efficiency of algorithms in R and discuss active big data research topics. We conclude that the bottleneck between statisticians and big data is human resources, which include interpersonal, leadership, and programming skills. As a community, we either "compute" or we "concede".

### Elizabeth Thompson - October 19, 2012

**Assessing the Significance and Uncertainty of Identity by Descent in Pedigrees and Populations**

Identity by descent (ibd) underlies all similarities among related individuals. Modern genetic data allow the inference of ibd from recent ancestors, even when the individuals are not known to be related. If ibd were observable, it could be used to infer a causal effect on a trait of DNA in a specific region of the genome, but inferences of ibd are uncertain. We show the similarities and differences in the use of ibd to resolve genetic effects on traits as we move from data on defined pedigrees to population-based approaches, and how uncertainty in the inferred ibd may be translated into a measure of uncertainty in the resulting tests of trait-related effects.

### Peter Bühlmann - April 13, 2012

**Predicting Causal Effects in High-Dimensional Settings**

Understanding cause-effect relationships between variables is of great interest in many fields of science. An ambitious but highly desirable goal is to infer causal effects from observational data obtained by observing a system of interest without subjecting it to interventions. This would allow us to circumvent severe experimental constraints or to substantially lower experimental costs. Our main motivation to study this goal comes from applications in biology.

We present recent progress on the prediction of causal effects, with direct implications for designing new intervention experiments, particularly in high-dimensional, sparse settings with thousands of variables but only a few dozen observations. We highlight exciting possibilities and fundamental limitations. In view of the latter, statistical modeling needs to be complemented with experimental validation: we discuss this in the context of molecular biology for yeast (Saccharomyces cerevisiae) and the model plant Arabidopsis thaliana.

### Jon A. Wellner - April 1, 2011

**Chernoff’s Distribution is log-concave. But why? (And why does it matter?)**

Chernoff's distribution, which apparently arose first in connection with nonparametric estimation of the mode of a unimodal density, is the distribution of the location of the maximum of two-sided Brownian motion minus a parabola. By Groeneboom's switching relation this random variable has the same distribution as the slope at zero of the least concave majorant of two-sided Brownian motion minus a parabola, and it is in this connection that it arises naturally as the limit distribution (up to multiplicative constants) of nonparametric estimators of monotone functions. It was studied further by Daniels and Skyrme (1985) and by Groeneboom (1985), (1989), and computed by Groeneboom and Wellner (2001). It appears “Gaussian” in shape, and it is natural to conjecture that Chernoff's distribution is log-concave.

In this talk I will review some results concerning log-concave distributions on one and higher dimensional Euclidean spaces, with emphasis on preservation results and connections with questions arising from statistics. I will indicate why Chernoff's distribution is log-concave, and briefly mention further problems.

Some background material for this talk is given in the book “Unimodality, Convexity and Applications” by S. Dharmadhikari and K. Joag-dev, Academic Press, 1988, and in the papers mentioned above.

### Zhiliang Ying - October 30, 2009

**Some results on modeling and inference for spatially sampled data**

Correlated observations with spatial dependence appear in many applications. Their asymptotic properties, however, are notoriously difficult to obtain. In this talk, I will present some results on spatially sampled data. I will show how these results are related to spatial Gaussian process models, to spatially correlated survival models and to certain local likelihood methods. The presentation is based on joint work with Jane Paik and Gongjun Xu.

### Ruth J. Williams - September 28, 2007

**Stochastic Networks with Resource Sharing**

Stochastic networks are used as models for complex systems involving dynamic interactions subject to uncertainty. Application domains include manufacturing, the service industry, telecommunications, and computer systems. Networks arising in modern applications are often highly complex and heterogeneous, with network features that transcend those of conventional queuing models. The control and analysis of such networks present challenging mathematical problems.

In this talk, a concrete application will be used to illustrate a general approach to the study of stochastic networks using more tractable approximate models. Specifically, we consider a connection-level model of Internet congestion control that represents the randomly varying number of flows present in a network where bandwidth is shared fairly amongst elastic documents. This model, introduced by Massoulié and Roberts, can be viewed as a stochastic network with simultaneous resource possession. Elegant fluid and diffusion approximations will be used to study the stability and performance of this model.

The talk will conclude with a summary of the current status and a description of open problems associated with the further development of approximate models for general stochastic networks.

This talk is based in part on joint work with W. Kang, F. P. Kelly, and N. H. Lee.

### Ioannis Karatzas - October 3, 2006

**Recent Approaches to Optimal Stopping, with Applications**

The problem of Optimal Stopping is of fundamental importance for sequential analysis, for the detection of signals in a background of noise, and also for pricing American-type options in the modern theory of finance. We shall review in this talk two relatively recent approaches to this problem: the deterministic or “pathwise” approach of Davis & Karatzas (1994), and the “integral representation” approach of Bank & El Karoui (2003). Each of these approaches has its own mathematical and methodological interest, and is almost tailor-made for particular applications. We shall discuss in some detail various aspects of the two approaches. Whether one of them can be obtained directly from the other remains a tantalizing open question.