Networks are mathematical formalisms that capture relations within and between data sets. Over the past several decades, much progress has been made in explaining real-world phenomena in the biological, physical, and social sciences using network theory, and in particular, graph theory. The complexity of data has increased rapidly in recent years, partly because of the following reasons: (1) It is now easier to produce large volumes of data, e.g. in social networks, and (2) Technological progress has enabled better data capture and extraction, e.g. in biological networks. The objective of the growing field of Network Data Analysis is to devise methods for analyzing such complex network data. Some aspects of these methods are listed below.

Our research is supported by the following grants: NSF-DMS-1723003, NSF-IIS-1901360, NSF-DMS-1547357, NSF-CCF-1740761, NSF-CCF-1839358 and NSF-CCF-1839356.

Persistent Cup-Length
We lift the cup-length into the Persistent Cup-Length invariant for the purpose of extracting non-trivial information about the evolution of the cohomology ring structure across a filtration. We show that the Persistent Cup-Length can be computed from a family of representative cocycles and devise a polynomial time algorithm for the computation of the Persistent Cup-Length invariant. We furthermore show that this invariant is stable under suitable interleaving-type distances. Along the way, we identify an invariant which we call the Cup-Length Diagram, which is stronger than persistent cup-length but can still be computed efficiently. Read more in [Contessoto, Mémoli, Stefanou and Zhou, 2021]

The Gromov-Hausdorff distance between spheres
We provide general upper and lower bounds for the Gromov-Hausdorff distance between m-sphere and n-sphere (endowed with the round metric) for m and n between zero and infinity. Some of these lower bounds are based on certain topological ideas related to the Borsuk-Ulam theorem. Via explicit constructions of (optimal) correspondences we prove that our lower bounds are tight in some cases. We also formulate a number of open questions. Read more in [Lim, Mémoli and Smith, 2021]

Curvature sets over persistence diagrams
Inspired by Gromov's idea of the curvature sets, we define the (n,k)-Persistence Set of a compact metric space M as the collection of k-dimensional persistence diagrams of all samples X with n points. When M is finite, Persistence Sets provide a computationally cheaper alternative to the Vietoris-Rips complex of M, as the boundary matrices are significantly smaller whenever n<<|M|. In the special case that n=2k+2, it can be shown that the Vietoris-Rips persistent homology of X only has one interval in dimension k. This allows us to visuallize the (2k+2,k)-Persistence Set by overlaying the diagrams of all samples X into one set of axes. The image on the side shows the (4,1)-Persistence Set of the unit circle equipped with the shortest path distance. The configurations shown point to their respective persistence diagram inside of the Persistence Set. These invariants can detect the homotopy type of a certain family of metric graphs. [Gómez and Mémoli, 2021]

Persistent Laplacians
In this work we present a thorough study of the theoretical properties and devise efficient algorithms for the persistent Laplacian, an extension of the standard combinatorial Laplacian to the setting of simplicial pairs: pairs of simplicial complexes related by an inclusion, which was recently introduced by Wang et al. in [1]. In analogy with the non-persistent case, we establish that the nullity of the q-th persistent Laplacian equals the q-th persistent Betti number of any given simplicial pair which provides an interesting connection between spectral graph theory and TDA. We further exhibit a novel relationship between the persistent Laplacian and the notion of Schur complement of a matrix. This relation permits us to uncover a link with the notion of effective resistance from network circuit theory and leads to a persistent version of the Cheeger inequality. This relationship also leads to a novel and fundamentally different algorithm for computing the persistent Betti number for a pair of simplicial complexes which can be significantly more efficient than standard algorithms. See [ Mémoli, Wan and Wang, 2020] for our work.

Vietoris-Rips Persistent Homology, Injective Metric Spaces, and The Filling Radius
In the applied algebraic topology community, the persistent homology induced by the Vietoris-Rips simplicial filtration is a standard method for capturing topological information from metric spaces. In this paper, we consider a different, more geometric way of generating persistent homology of metric spaces which arises by first embedding a given metric space into a larger space and then considering thickenings of the original space inside this ambient metric space. In the course of doing this, we construct an appropriate category for studying this notion of persistent homology and show that, in a category theoretic sense, the standard persistent homology of the Vietoris-Rips filtration is isomorphic to our geometric persistent homology provided that the ambient metric space satisfies a property called injectivity. Read more in [Lim, Mémoli and Okutan, 2020]

Elder-Rule-Staircodes for Augmented Metric Spaces
An augmented metric space \((X, d_X, f_X)\) is a metric space \((X, d_X)\) equipped with a function \(f_X: X \to \mathbb{R}\). It arises commonly in practice, e.g, a point cloud \(X\) in \(\mathbb{R}^d\) where each point \(x\in X\) has a density function value \(f_X(x)\) associated to it. Such an augmented metric space naturally gives rise to a 2-parameter filtration \(\mathcal{K}\). However, the resulting 2-parameter persistence module \(H_0(\mathcal{K})\) could still be of wild representation type, and may not have simple indecomposables. Motivated by the elder-rule for the zeroth homology of a 1-parameter filtration, we propose a barcode-like summary, called the elder-rule-staircode, as a way to encode \(H_0(\mathcal{K})\). Specifically, given a finite \((X, d_X, f_X)\), its elder-rule-staircode consists of \(n = |X|\) number of staircase-like blocks in the plane. We show that if \(H_0(\mathcal{K})\) is interval decomposable, then the barcode of \(H_0(\mathcal{K})\) is equal to the elder-rule-staircode. Furthermore, regardless of the interval decomposability, the fibered barcode, the dimension function (a.k.a. the Hilbert function), and the graded Betti numbers of \(H_0(\mathcal{K})\) can all be efficiently computed once the elder-rule-staircode is given. Finally, we develop and implement an efficient algorithm to compute the elder-rule-staircode in \(O(n^2\log n)\) time, which can be improved to \(O(n^2\alpha(n))\) if \(X\) is from a fixed dimensional Euclidean space \(\mathbb{R}^d\), where \(\alpha(n)\) is the inverse Ackermann function. [ Proceedings of the 36th International Symposium on Computational Geometry (SoCG 2020) ]

The Wasserstein transform
We introduce the Wasserstein transform (WT), a family of methods for enhancing and denoising datasets defined on general metric spaces. The construction draws inspiration from Optimal Transportation ideas and also connects with the mean shift family of algorithms. We exhibit the usefulness of WT in ameliorating the chaining effect, a well-known obstacle for clustering tasks and also other application scenarios such as image classification, image segmentation and NLP. See [Mémoli, Smith and Wan, 2019 ] and [Jin, Mémoli and Wan, 2020] for our work on the topic.

Spatiotemporal Persistent Homology for Dynamic Metric Spaces
Characterizing the dynamics of time-evolving data within the framework of topological data analysis (TDA) has been attracting increasingly more attention. Popular instances of time-evolving data include flocking/swarming behaviors in animals and social networks in the human sphere. A natural mathematical model for such collective behaviors is a dynamic point cloud, or more generally a dynamic metric space (DMS). In this paper we extend the Rips filtration stability result for (static) metric spaces to the setting of DMSs. We do this by devising a certain three-parameter “spatiotemporal” filtration of a DMS. Applying the homology functor to this filtration gives rise to multidimensional persistence module derived from the DMS. We show that this multidimensional module enjoys stability under a suitable generalization of the Gromov–Hausdorff distance which permits metrization of the collection of all DMSs. On the other hand, it is recognized that, in general, comparing two multidimensional persistence modules leads to intractable computational problems. For the purpose of practical comparison of DMSs, we focus on both the rank invariant or the dimension function of the multidimensional persistence module that is derived from a DMS. We specifically propose to utilize a certain metric d for comparing these invariants: In our work this d is either (1) a certain generalization of the erosion distance by Patel, or (2) a specialized version of the well-known interleaving distance. In either case, the metric d can be computed in polynomial time. [Discrete & Computational Geometry 2020]

Persistent Homotopy Groups of Metric Spaces
With the aim of enlarging the toolset of persistence theory, which has traditionally dealt with homology groups and their persistent versions, we study notions of persistent homotopy groups of compact metric spaces together with their stability properties in the Gromov-Hausdorff sense. We pay particular attention to the case of persistent fundamental groups for which we obtain a more precise description. Under fairly mild assumptions on the spaces, the classical fundamental group, isomorphic to the limit of the persistent fundamental group, has an underlying tree-like structure (i.e. a dendrogram) and an associated ultrametric. See [Mémoli and Zhou, 2019].

Interleaving by Parts for Persistence In a Poset
Metrics in computational topology are often either (i) themselves in the form of the interleaving distance d(F,G) between certain order-preserving maps F and G between posets or (ii) admit d(F,G) as a tractable lower bound. Assuming that the target poset of F and G admits a join-dense subset, we propose certain join decompositions of F and G which facilitate the computation of d(F,G). We leverage this result in order to (i) elucidate the structure and computational complexity of the interleaving distance for poset-indexed clusterings (i.e. poset-indexed subpartition-valued functors), (ii) to clarify the relationship between the erosion distance by Patel and the graded rank function by Betthauser, Bubenik, and Edwards, and (iii) to reformulate and generalize the tripod distance by the second author.[Kim, Mémoli and Stefanou, 2019]

Computing Gromov-Hausdorff distances between ultrametric spaces
In recent years, the Gromov-Hausdorff distance has been widely used for shape analysis and for establishing stability results for topological invariants in TDA. However, the distance is well-known to be NP-hard to compute, even when restricted to ultrametric spaces. Ultrametric spaces naturally arise in many application areas such as hierarchical clustering, phylogenetics, genomics, and even linguistics. In this project, we (1) found a polynomial-time computable variant of the Gromov-Hausdorff distance on the collection of ultrametric spaces, and (2) identified subcollections of ultrametric space on which the Gromov-Hausdorff distance can be computed in polynomial time. See [Mémoli, Smith and Wan, 2019] for details. We also extend our study of Gromov-Hausdorff distance to ultrametric measure spaces and consider applications to phylogenetic trees; see [Mémoli, Munk, Wan and Weitkamp, 2021] for details.

Network metrics
A natural question to ask when comparing two networks is: are these networks the same, or are they different? The difference between two networks can be formally quantified by developing metrics on the collections of all networks. Understanding the behavior of such metrics is one of our ongoing projects. See the related papers [Chowdhury and Mémoli, 2017], [Chowdhury and Mémoli, 2018], and [Chowdhury and Mémoli, 2018].

Clustering networks
Clustering methods, in particular hierarchical clustering methods, can be very useful in exploratory data analysis. Classical hierarchical clustering reveals the presence of clusters across a range of scales for metric datasets (i.e. "well-behaved" datasets), from which researchers are often able to find hidden groupings and patterns. However, general network data tends to be more "wild." We are interested in being able to provide hierachical clustering schemes that simplify the visualization of even very general network data, while providing theoretical guarantees on the worst-case distortion that can occur from applying such clustering methods. See [Chowdhury, Mémoli and Smith, 2016] and [Smith, Chowdhury and Mémoli, 2016].

Persistent Homology of Networks
An ongoing project in our group is to better understand the persistent homology of a general network. In addition to devising robust theoretical methods for producing persistence diagrams from networks, we are interested in using our methods to glean new insights into the structures of a wide range of network datasets. See the related works [Chowdhury and Mémoli, 2018], [Czumaj, 2018], and [Poster].

Zigzag Persistent Homology of Dynamic Networks
When studying flocking/swarming behaviors in animals one is interested in quantifying and comparing the dynamics of the clustering induced by the coalescence and disbanding of animals in different groups. Motivated by this, we study the problem of obtaining persistent homology based summaries of time-dependent metric data. Our research has produced the notions of formigrams, and distances between dynamic metric spaces, and dynamic graphs. Fot the interphase with the data anlysis part, we use concepts from zigzag persistent homology. Read more in See for an interatctive demo which can compute formigrams of user specified data.

Zigzag Persistent Homology for Neural Data
We apply Zigzag Persistent Homology towards understanding the amount of information contained in the spike trains of hippocampal place cells. Previous work has established that simply knowing which groups of place cells fire together in an animal's hippocampus is sufficient to extract the global topology of the animal's physical environment. We model a system where collections of place cells group and ungroup according to short-term plasticity rules. We obtain the surprising result that in experiments with spurious firing, the accuracy of the extracted topological information decreases with the persistence (beyond a certain regime) of the cell groups. This suggests that synaptic transience, or forgetting, is a mechanism by which the brain counteracts the effects of spurious place cell activity. PLoS ONE link.

Metric graph approximation of geodesic spaces
A standard result in metric geometry is that every compact geodesic metric space can be approximated arbitrarily well by finite metric graphs in the Gromov-Hausdorff sense. It is well known that the first Betti number of the approximating graphs may blow up as the approximation gets finer. In our work, given a compact geodesic metric \(X\), we define a sequence \((\delta^X_n)_{n \geq 0}\) of non-negative real numbers by $$\delta^X_n:=\inf \{d_{GH}(X,G): G \text{ a finite metric graph, } \beta_1(G)\leq n \} .$$ By construction, and the above result, this is a non-increasing sequence with limit \(0\). We study this sequence and its rates of decay with \(n\). We also identify a precise relationship between the sequence and the first Vietoris-Rips persistence barcode of \(X\). Furthermore, we specifically analyze \(\delta_0^X\) and find upper and lower bounds based on hyperbolicity and other metric invariants. As a consequence of the tools we develop, our work also provides a Gromov-Hausdorff stability result for the Reeb construction on geodesic metric spaces with respect to the function given by distance to a reference point. [Mémoli and Okutan, 2018].

Optimal Transport and Persistent Homology based distances between shapes
Our group has studied different methods for discriminating shapes. In many cases we see shapes/datasets as metric spaces or metric-measure spaces and utilize either the Gromov-Hausdorff or the Gromov-Wasserstein distance as bases for our studies. These distances admit several poly-time computable lower bounds which can be used in practical applications. For instance, given a metric measure space \((X,d_X,\mu_X)\) one can consider the so called local shape distribution \(h_X(x,t):=\mu_X(\overline(B_t(x)))\) as a feature of the space \(X\). Now, given two such spaces on sets up an optimal transport problem involving both \(h_X\) and \(h_Y\) whose solution turns out to be a lower bound for the Gromov-Wasserstein distance. See this paper and Ying Yin's MMS CPS thesis work demo.