Monday, November 18, 2013
12:00 PM - 1:00 PM
Annenberg 105

Applied Mathematics Colloquium

Stability
Professor Bin Yu, Departments of Statistics and EECS, University of California at Berkeley
Speaker's Bio:
Bin Yu is Chancellor's Professor in the Departments of Statistics and Electrical Engineering & Computer Science at UC Berkeley. She was Chair of the Statistics Department at UC Berkeley from 2009 to 2012. She has published over 100 scientific papers in premier journals and conferences in statistics, EECS, remote sensing, and neuroscience. These papers cover a wide range of research topics, including empirical process theory, information theory (MDL), MCMC methods, signal processing, machine learning, high-dimensional data inference (boosting, Lasso, and sparse modeling in general), and interdisciplinary data problems. She has served on many editorial boards for journals such as the Annals of Statistics, the Journal of the American Statistical Association, and the Journal of Machine Learning Research. She is a member of the American Academy of Arts and Sciences. She was a Guggenheim Fellow in 2006, an ICIAM Invited Speaker in 2011, and the 2012 Tukey Memorial Lecturer of the Bernoulli Society. She is a Fellow of AAAS, IEEE, IMS (Institute of Mathematical Statistics), and ASA (American Statistical Association). She is currently the President-Elect of IMS, and is serving on the Scientific Advisory Board of IPAM (Institute for Pure and Applied Mathematics) and on the Governing Board of ICERM (Institute for Computational and Experimental Research in Mathematics). She was co-chair of the National Scientific Committee of SAMSI (Statistical and Applied Mathematical Sciences Institute), and was on the Board of Governors of the IEEE-IT Society and the Board of Mathematical Sciences and Applications (BMSA) of NAS.
Reproducibility is imperative for any scientific discovery. More often than not, modern scientific findings rely on statistical analysis of high-dimensional data. At a minimum, reproducibility manifests itself in the stability of statistical results relative to "reasonable" perturbations of the data and of the model used. The jackknife, bootstrap, and cross-validation are based on perturbations of the data, while robust statistics methods deal with perturbations of the model.
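The data-perturbation idea can be illustrated with a toy sketch: refit an estimator on bootstrap resamples of the data and measure how much the resulting estimates spread; a small spread indicates a stable result. This is a minimal stdlib-only sketch (function and variable names are illustrative, not from the talk):

```python
import random
import statistics

def bootstrap_stability(data, estimator, n_boot=1000, seed=0):
    """Assess stability of an estimator under bootstrap perturbations:
    refit on resamples of the data and report the center and spread
    of the resulting estimates."""
    rng = random.Random(seed)
    n = len(data)
    estimates = []
    for _ in range(n_boot):
        # Resample the data with replacement (one "reasonable" perturbation).
        resample = [data[rng.randrange(n)] for _ in range(n)]
        estimates.append(estimator(resample))
    return statistics.mean(estimates), statistics.stdev(estimates)

# Toy data: the sample mean is a stable estimator here, and its bootstrap
# standard deviation approximates the usual standard error.
data = [2.1, 1.9, 2.4, 2.0, 2.2, 1.8, 2.3, 2.1, 2.0, 2.2]
center, spread = bootstrap_stability(data, statistics.mean)
```

The same loop works for any estimator (median, regression coefficients, a selected variable set), which is what makes bootstrap-style perturbation a generic stability check.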
 
In this talk, a case is made for the importance of stability in statistics. First, we motivate the necessity of stability of interpretable encoding models for movie reconstruction from brain fMRI signals. Second, we find strong evidence in the literature for the central role of stability in statistical inference. Third, a smoothing-parameter selector based on estimation stability (ES), ES-CV, is proposed for the Lasso, in order to bring stability to bear on cross-validation (CV). ES-CV is then applied in the encoding models to reduce the number of predictors by 60% with almost no loss (1.3%) of prediction performance across over 2,000 voxels. Lastly, a novel "stability" argument is seen to drive new results that shed light on the intriguing interactions between sample-to-sample variability and heavier-tailed error distributions (e.g., double-exponential) in high-dimensional regression models with p predictors and n independent samples. In particular, when p/n → κ ∈ (0.3, 1) and the error is double-exponential, OLS is a better estimator than LAD.
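Reading the abstract literally, an estimation-stability selector works as follows: for each candidate smoothing parameter, fit the model on several perturbed versions of the data (e.g., cross-validation folds), measure how much the fitted prediction vectors vary relative to their average, and prefer parameters whose fits are stable. The sketch below is a simplified reading of that idea, not the authors' exact ES-CV criterion; it assumes the per-fold fitted values have already been computed, and all names are illustrative:

```python
def es_statistic(fold_fits):
    """Estimation-stability statistic for one smoothing parameter:
    average squared deviation of the per-fold fitted prediction
    vectors from their mean, normalized by the squared norm of
    that mean. Small values = stable fits across perturbations."""
    V = len(fold_fits)           # number of data perturbations (folds)
    n = len(fold_fits[0])        # number of fitted values per fold
    m_bar = [sum(f[i] for f in fold_fits) / V for i in range(n)]
    var = sum(sum((f[i] - m_bar[i]) ** 2 for i in range(n))
              for f in fold_fits) / V
    norm_sq = sum(x * x for x in m_bar)
    return var / norm_sq

def es_cv_select(lambdas, fits_by_lambda, lambda_cv):
    """Among smoothing parameters at least as large (i.e., no less
    regularized) as the plain-CV choice, pick the one whose fits
    are most stable."""
    candidates = [lam for lam in lambdas if lam >= lambda_cv]
    return min(candidates, key=lambda lam: es_statistic(fits_by_lambda[lam]))

# Toy example (hypothetical fitted values: 2 folds, 2 samples each).
fits_by_lambda = {
    0.5: [[2.0, 2.0], [2.1, 2.1]],   # fits barely move across folds
    1.0: [[1.0, 3.0], [3.0, 1.0]],   # fits swing wildly across folds
}
chosen = es_cv_select([0.5, 1.0], fits_by_lambda, lambda_cv=0.5)
```

Restricting the search to parameters at least as regularized as the CV choice is how a selector of this kind can shrink the predictor set (here, the reported 60% reduction) while giving up almost nothing in prediction accuracy.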
For more information, please contact Carmen Nemer-Sirois by phone at (626) 395-4561 or by email at [email protected].