Rice University Department of Mathematics Colloquium

A 3D Code in the Human Genome

4:00 pm Thursday, March 9th, 2017
Erez Lieberman Aiden (Baylor College of Medicine)

Abstract: Stretched out from end to end, the human genome – a sequence of 3 billion chemical letters inscribed in a molecule called DNA – is over 2 meters long. Famously, short stretches of DNA fold into a double helix, which winds around histone proteins to form the 10 nm fiber. But what about longer pieces? Does the genome’s fold influence function? How does the information contained in such an ultra-dense packing even remain accessible?

In this talk, I will describe our work developing ‘Hi-C’ (Lieberman-Aiden et al., Science, 2009; Aiden, Science, 2011) and, more recently, ‘in situ Hi-C’ (Rao & Huntley et al., Cell, 2014), which use proximity ligation to transform pairs of physically adjacent DNA loci into chimeric DNA sequences. Sequencing a library of such chimeras makes it possible to create genome-wide maps of physical contacts between pairs of loci, revealing features of genome folding in 3D.

Next, I will describe recent work using in situ Hi-C to construct 3D maps of many cell types. The densest, in human lymphoblastoid cells, contains 4.9 billion contacts, achieving 1 kb resolution. We use these maps to identify ∼10,000 loops that form as the human genome folds inside the cell nucleus. These loops correlate with gene activation and are conserved across cell types and species. Most loops lie between convergent DNA motifs (i.e., the asymmetric motifs are “facing” one another) which bind a complex containing CTCF and cohesin. By modifying CTCF motifs using CRISPR, we can reliably add, move, and delete loops in the human genome. Thus, it is possible not only to “read” the genome’s 3D architecture, but also to “write” it.

I will then discuss the biophysical mechanism that underlies chromatin looping. Specifically, our data are consistent with the formation of loops by extrusion (Sanborn & Rao et al., PNAS, 2015). In fact, in many cases, the local structure of Hi-C maps can be predicted in silico from patterns of CTCF binding and an extrusion-based model.

Finally, I will discuss the surprisingly deep connections between genome folding and the study of fractal curves. We recently proved that, for any self-similar fractal curve f([0,1]) of dimension d, dim f(X) = d × dim(X), where X ⊆ [0,1] (Sanborn & Rao et al., PNAS, 2015). This result provides a deterministic analog to McKean’s 1955 dimension-doubling theorem for Brownian motion, and it played an important role in the confirmation of the extrusion model.
