TY - JOUR AB - Various kinds of data are routinely represented as discrete probability distributions. Examples include text documents summarized by histograms of word occurrences and images represented as histograms of oriented gradients. Viewing a discrete probability distribution as a point in the standard simplex of the appropriate dimension, we can understand collections of such objects in geometric and topological terms. Importantly, instead of using the standard Euclidean distance, we look into dissimilarity measures with information-theoretic justification, and we develop the theory needed for applying topological data analysis in this setting. In doing so, we emphasize constructions that enable the usage of existing computational topology software in this context. AU - Edelsbrunner, Herbert AU - Virk, Ziga AU - Wagner, Hubert ID - 9630 IS - 2 JF - Journal of Computational Geometry TI - Topological data analysis in information space VL - 11 ER - TY - CONF AB - Various kinds of data are routinely represented as discrete probability distributions. Examples include text documents summarized by histograms of word occurrences and images represented as histograms of oriented gradients. Viewing a discrete probability distribution as a point in the standard simplex of the appropriate dimension, we can understand collections of such objects in geometric and topological terms. Importantly, instead of using the standard Euclidean distance, we look into dissimilarity measures with information-theoretic justification, and we develop the theory needed for applying topological data analysis in this setting. In doing so, we emphasize constructions that enable the usage of existing computational topology software in this context. AU - Edelsbrunner, Herbert AU - Virk, Ziga AU - Wagner, Hubert ID - 6648 SN - 9783959771047 T2 - 35th International Symposium on Computational Geometry TI - Topological data analysis in information space VL - 129 ER - TY - JOUR AB - We study the topology generated by the temperature fluctuations of the cosmic microwave background (CMB) radiation, as quantified by the number of components and holes, formally given by the Betti numbers, in the growing excursion sets. We compare CMB maps observed by the Planck satellite with a thousand simulated maps generated according to the ΛCDM paradigm with Gaussian distributed fluctuations. The comparison is multi-scale, being performed on a sequence of degraded maps with mean pixel separation ranging from 0.05 to 7.33°. The survey of the CMB over 𝕊2 is incomplete due to obfuscation effects by bright point sources and other extended foreground objects like our own galaxy. To deal with such situations, where analysis in the presence of “masks” is of importance, we introduce the concept of relative homology. The parametric χ2-test shows differences between observations and simulations, yielding p-values at percent to less than permil levels roughly between 2 and 7°, with the difference in the number of components and holes peaking at more than 3σ sporadically at these scales. The highest observed deviation between the observations and simulations for b0 and b1 is approximately between 3σ and 4σ at scales of 3–7°. There are reports of mildly unusual behaviour of the Euler characteristic at 3.66° in the literature, computed from independent measurements of the CMB temperature fluctuations by Planck’s predecessor, the Wilkinson Microwave Anisotropy Probe (WMAP) satellite. The mildly anomalous behaviour of the Euler characteristic is phenomenologically related to the strongly anomalous behaviour of components and holes, or the zeroth and first Betti numbers, respectively. Further, since these topological descriptors show consistent anomalous behaviour over independent measurements of Planck and WMAP, instrumental and systematic errors may be an unlikely source. These are also the scales at which the observed maps exhibit low variance compared to the simulations, and approximately the range of scales at which the power spectrum exhibits a dip with respect to the theoretical model. Non-parametric tests show even stronger differences at almost all scales. Crucially, Gaussian simulations based on power-spectrum matching the characteristics of the observed dipped power spectrum are not able to resolve the anomaly. Understanding the origin of the anomalies in the CMB, whether cosmological in nature or arising due to late-time effects, is an extremely challenging task. Regardless, beyond the trivial possibility that this may still be a manifestation of an extreme Gaussian case, these observations, along with the super-horizon scales involved, may motivate the study of primordial non-Gaussianity. Alternative scenarios worth exploring may be models with non-trivial topology, including topological defect models. AU - Pranav, Pratyush AU - Adler, Robert J. AU - Buchert, Thomas AU - Edelsbrunner, Herbert AU - Jones, Bernard J.T. AU - Schwartzman, Armin AU - Wagner, Hubert AU - Van De Weygaert, Rien ID - 6756 JF - Astronomy and Astrophysics SN - 00046361 TI - Unexpected topology of the temperature fluctuations in the cosmic microwave background VL - 627 ER - TY - CONF AB - Smallest enclosing spheres of finite point sets are central to methods in topological data analysis. Focusing on Bregman divergences to measure dissimilarity, we prove bounds on the location of the center of a smallest enclosing sphere. These bounds depend on the range of radii for which Bregman balls are convex. AU - Edelsbrunner, Herbert AU - Virk, Ziga AU - Wagner, Hubert ID - 188 TI - Smallest enclosing spheres and Chernoff points in Bregman geometry VL - 99 ER - TY - CONF AB - We show that the framework of topological data analysis can be extended from metrics to general Bregman divergences, widening the scope of possible applications. Examples are the Kullback - Leibler divergence, which is commonly used for comparing text and images, and the Itakura - Saito divergence, popular for speech and sound. In particular, we prove that appropriately generalized čech and Delaunay (alpha) complexes capture the correct homotopy type, namely that of the corresponding union of Bregman balls. Consequently, their filtrations give the correct persistence diagram, namely the one generated by the uniformly growing Bregman balls. Moreover, we show that unlike the metric setting, the filtration of Vietoris-Rips complexes may fail to approximate the persistence diagram. We propose algorithms to compute the thus generalized čech, Vietoris-Rips and Delaunay complexes and experimentally test their efficiency. Lastly, we explain their surprisingly good performance by making a connection with discrete Morse theory. AU - Edelsbrunner, Herbert AU - Wagner, Hubert ID - 688 SN - 18688969 TI - Topological data analysis with Bregman divergences VL - 77 ER - TY - JOUR AB - Phat is an open-source C. ++ library for the computation of persistent homology by matrix reduction, targeted towards developers of software for topological data analysis. We aim for a simple generic design that decouples algorithms from data structures without sacrificing efficiency or user-friendliness. We provide numerous different reduction strategies as well as data types to store and manipulate the boundary matrix. We compare the different combinations through extensive experimental evaluation and identify optimization techniques that work well in practical situations. We also compare our software with various other publicly available libraries for persistent homology. AU - Bauer, Ulrich AU - Kerber, Michael AU - Reininghaus, Jan AU - Wagner, Hubert ID - 1433 JF - Journal of Symbolic Computation SN - 07477171 TI - Phat - Persistent homology algorithms toolbox VL - 78 ER - TY - CONF AB - We present an efficient algorithm to compute Euler characteristic curves of gray scale images of arbitrary dimension. In various applications the Euler characteristic curve is used as a descriptor of an image. Our algorithm is the first streaming algorithm for Euler characteristic curves. The usage of streaming removes the necessity to store the entire image in RAM. Experiments show that our implementation handles terabyte scale images on commodity hardware. Due to lock-free parallelism, it scales well with the number of processor cores. Additionally, we put the concept of the Euler characteristic curve in the wider context of computational topology. In particular, we explain the connection with persistence diagrams. AU - Heiss, Teresa AU - Wagner, Hubert ED - Felsberg, Michael ED - Heyden, Anders ED - Krüger, Norbert ID - 833 SN - 03029743 TI - Streaming algorithm for Euler characteristic curves of multidimensional images VL - 10424 ER -