Manifold learning uncovers hidden structure in complex cellular state space

David van Dijk (Yale University)

Colloquium

Thursday, March 14, 2019, 3:30 pm

Amazon Auditorium

Abstract

In the era of big biological data, there is a pressing need for methods that visualize, integrate, and interpret high-throughput, high-dimensional data to enable biological discovery. Analyzing such data poses several major challenges, including the curse of (high) dimensionality, noise, sparsity, missing values, bias, and collection artifacts. In my work, I try to solve these problems using computational methods that are based on manifold learning. A manifold is a smoothly varying low-dimensional structure embedded within the high-dimensional ambient measurement space. In my talk, I will present a number of recently completed and ongoing projects that use the manifold, implemented using graph signal processing and deep learning, to understand large biomedical datasets. These include MAGIC, a data denoising and imputation method designed to ‘fix’ single-cell RNA-sequencing data; PHATE, a dimensionality reduction and visualization method specifically designed to reveal continuous progression structure; and two deep learning methods that use specially designed constraints to allow for deep, interpretable representations of heterogeneous systems such as the gut microbiome and tumor-infiltrating lymphocytes.

Bio

David van Dijk received his PhD in Computer Science from the University of Amsterdam. He carried out his graduate research on predicting gene expression from DNA sequence at the University of Amsterdam and the Weizmann Institute of Science under the supervision of Jaap Kaandorp and Eran Segal. As a postdoc in the departments of Genetics and Computer Science at Yale University, David develops machine learning tools, using graph signal processing and deep learning, to uncover complex biological signals from high-throughput, high-dimensional biomedical data.