Tuesday, February 19, 2019
4:00 PM - 5:00 PM
Annenberg 105

Special Seminar in Computing and Mathematical Sciences

Genomes and AI: From Packing to Regulation
Caroline Uhler, Associate Professor, Electrical Engineering and Computer Science and Institute for Data, Systems and Society, MIT
Speaker's Bio:
I joined the MIT faculty in 2015 and am currently the Henry L. and Grace Doherty associate professor in EECS (Electrical Engineering & Computer Science) and IDSS (Institute for Data, Systems and Society). I am a member of LIDS (Laboratory for Information and Decision Systems), the Center for Statistics, Machine Learning at MIT, and the ORC (Operations Research Center). I hold an MSc in mathematics, a BSc in biology, and an MEd in mathematics education from the University of Zurich, and a PhD in statistics from UC Berkeley. Before joining MIT, I spent a semester in the "Big Data" program at the Simons Institute at UC Berkeley, held postdoctoral positions at the IMA and at ETH Zurich, and spent 3 years as an assistant professor at IST Austria. I am an elected member of the International Statistical Institute and a recipient of the Sloan Research Fellowship, an NSF CAREER Award, a Sofja Kovalevskaja Award from the Humboldt Foundation, and a START Award from the Austrian Science Foundation. My research focuses on mathematical statistics and computational biology, in particular on graphical models, causal inference, and algebraic statistics, and on applications to learning gene regulatory networks and developing geometric models for the organization of chromosomes.

The spatial organization of the genome is an important regulator of gene expression, and alterations to it are associated with various diseases. A recent breakthrough in genomics makes it possible to perform perturbation experiments at a very large scale. This motivates the development of a causal inference framework based on both observational and interventional data. We characterize the causal relationships that are identifiable and present the first provably consistent algorithm for learning a causal network from such data. I will then link gene expression with the 3D organization of the genome. In particular, we will discuss approaches for integrating different data modalities and analyze alterations in the spatial organization of the genome via autoencoders and optimal transport. We end with a theoretical analysis of autoencoders that links overparameterization to memorization. In particular, we will show that overparameterized single-layer fully connected autoencoders, as well as deep convolutional autoencoders, memorize images, i.e., they produce outputs in the span of the training images. Collectively, this talk will highlight the symbiosis between genomics and AI and show how biology can lead to new theorems, which in turn can guide biological experiments.

For more information, please visit Seminars & Events in CMS.