Dissertation Defense: Selecting and Evaluating Models to Reflect Underlying Scientific Principles: Using Basis Sets to Parameterize Hypotheses

Karen Nielsen
Tuesday, March 21, 2017
1:30-3:00 PM
438 West Hall

Selecting an appropriate representation for a given dataset is a critical first step in the analysis process. By choosing a particular model, the researcher places an often-unstated set of assumptions on the shape, or functional form, of the data. Sometimes the chosen model and its assumptions lead to incorrect conclusions, or fail to answer the underlying research question at all. This dissertation explores how model specification is done in practice, the effects it has, and what we can do to address problems with current approaches.

Time series data, particularly biological time series, are becoming increasingly common as we explore the relationship between biology and behavior. Event-Related Potentials (ERPs), which are brain responses to time-locked stimuli measured using Electroencephalography (EEG), are one example of such data. The goal in ERP research is to make inferences about the neural circuits and mechanisms used when responding to stimuli. We first discuss a methodological divide in this context that leads both to differences in interpretation and to differences in the underlying distributional theory for testing. Through both analytic work and a simulation study, we explore the properties of two competing metrics for ERP component amplitude. This study leads to the suggestion that treating the data-generating model as an analysis framework could be a major step toward a unifying framework that facilitates reproducible research.

Our framework can draw from the substantive expectations researchers have about the shape of individuals' waveforms, particularly in local regions of interest, and translate these verbalized assumptions into mathematical basis sets. These assumptions allow us to derive properties of the representative waveform and implement them as parameters of a statistical model. We then test hypotheses on landmark parameters of the basis sets via multilevel modeling, which allows us to account for temporal patterns, patterns across channels, individual differences, and differences across experimental conditions.
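
As a rough illustration of what it can mean for landmarks to appear as parameters, the sketch below writes a local quadratic peak in vertex form, so that peak latency and peak amplitude are read directly from the parameters. The quadratic shape, the function names, and the numeric values are assumptions chosen for illustration; they are not taken from the dissertation.

```python
def local_peak(t, peak_latency, peak_amplitude, curvature):
    """Quadratic in vertex form: the landmarks are the parameters themselves."""
    return peak_amplitude - curvature * (t - peak_latency) ** 2


def to_polynomial_coeffs(peak_latency, peak_amplitude, curvature):
    """Same curve written as c0 + c1*t + c2*t**2.

    The coefficients describe the identical shape, but none of them
    is directly a latency or an amplitude.
    """
    c2 = -curvature
    c1 = 2.0 * curvature * peak_latency
    c0 = peak_amplitude - curvature * peak_latency ** 2
    return c0, c1, c2


# Hypothetical values: peak at 0.3 s with amplitude 5.0 (arbitrary units).
latency, amplitude, curvature = 0.3, 5.0, 40.0
c0, c1, c2 = to_polynomial_coeffs(latency, amplitude, curvature)
```

Both parameterizations trace the same curve; the difference is that a hypothesis about latency or amplitude becomes a hypothesis about a single parameter in the first form, while in the second it involves a combination of coefficients.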

Biological contexts are not the only areas of application for this basis set approach. Using an example from statistics education, we show that Item Characteristic Curves (ICCs) from Item Response Theory (IRT) can also be conceptualized as basis sets, with interpretable parameters that are reflected in the shape of the resulting curve. This context also provides a venue, introductory statistics courses, for shifting the current paradigm of using established models without considering the underlying assumptions that they represent. This work is incorporated into a more general paper on statistics education in which we contrast a traditional method of analysis with the basis set approach presented here. We believe that by emphasizing the process of selecting a correct model early in methodological training, we can encourage scientists to be receptive to models that test their hypotheses directly and to begin incorporating the process of creating these models into their standard practice.
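
To make the idea of interpretable curve parameters concrete, here is a minimal sketch of an Item Characteristic Curve under the two-parameter logistic model. The choice of this particular IRT model, the function name, and the item values are assumptions for illustration; the abstract does not say which model the dissertation uses.

```python
import math


def icc_2pl(theta, discrimination, difficulty):
    """Probability of a correct response at ability level theta.

    difficulty: the ability at which the probability equals 0.5.
    discrimination: controls how steeply the curve rises at that point.
    """
    return 1.0 / (1.0 + math.exp(-discrimination * (theta - difficulty)))


# Hypothetical item: discrimination 1.5, difficulty 0.5.
a, b = 1.5, 0.5

# At theta equal to the difficulty, the curve passes through 0.5.
p_mid = icc_2pl(b, a, b)

# The slope at that point is discrimination / 4 (checked numerically here).
h = 1e-6
slope_mid = (icc_2pl(b + h, a, b) - icc_2pl(b - h, a, b)) / (2 * h)
```

Each parameter corresponds to a visible feature of the curve: the difficulty locates the midpoint, and the discrimination sets the steepness there.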

Overall, the collection of three papers assembled in this dissertation makes the following contributions: identification of a methodological divide in ERP research and exploration of the properties of two competing metrics; description and demonstration of how basis sets can be designed with meaningful landmarks as parameters; highlighting of additional uses for such basis sets; and emphasis on the value of methodological training for interdisciplinary awareness of the importance of model selection.
Building: West Hall
Event Type: Lecture / Discussion
Tags: Dissertation
Source: Happening @ Michigan from Department of Statistics