Mini-course: The Topology of Random Geometric Complexes

Omer Bobrowski, Duke University,


Given a set of points P in a metric space and a positive number r, the geometric graph G(P,r) is a graph whose set of vertices is P, and where two vertices are connected by an edge if the distance between them is less than r. Geometric simplicial complexes extend the idea of this construction and use the geometry of the set P to generate an abstract simplicial complex - a combinatorial structure made of vertices,edges, triangles and higher dimensional faces. Our main interest will be in the homology of these complexes. Briefly, homology is an algebraic structure that provides information about the connected components, loops, cavities, and higher dimensional 'holes' or 'cycles' in a space.

Geometric complexes arise often in topological data analysis (TDA), and are at the heart of many computational topology algorithms. The two most common ones are the Čech and the Vietoris-Rips complexes. These complexes allow us to convert abstract topological problems into simpler algebraic/combinatorial questions due to the following. If the set P lies in a topological space X then the celebrated `nerve lemma' states that under some conditions the Čech complex built from P, and X are homotopy equivalent, and in particular have the same homology. The Rips complex does not possess this property, but has been proven to serve as a good approximation for X. 

Thus, in order to develop a statistical framework for TDA, it would be highly useful to understand the behavior of random geometric complexes first. In a random geometric complex we have a random point process P (e.g. a spatial Poisson process), and we use this random set to  construct either a Čech or a Rips complex. Both the Čech and the Rips complexes are controlled by a parameter r, analogously to geometric graphs, such that the number of simplexes is increasing with r. Our goal will be to study the probabilistic properties of the homology of such complexes, and to examine the interaction between the point process model, the number of points, and the parameter r, in determining the behavior of homology.

Some of the main questions we will try to answer are:

(-) Starting from r=0 with a set of disconnected points, and letting r increase (adding simplexes to the complex) - when would non-trivial homology first appear? Will it vanish at some point, and if so when?

(-) What is the distribution of the Betti numbers (the ranks of the homology groups)? How does it depend on the number of points, the distribution of the point process, and the parameter r?

(-) Assuming that the point process P is generated from a distribution supported on a space of interest X. Under what conditions will a geometric complex recover the homology of X? How can we handle "noise" in such problems?

(-) What probabilistic statements can we make about persistent homology?

Much of the study on random geometric complexes has been done in the past decade, and there are many fascinating and challenging open problems that we will discuss as well.