Kyle Kloster Wrangling data, algorithms, and code in the SF Bay

PhD Dissertation Defense, Purdue University

April 22, 2016

PhD Advisor: David F. Gleich

Graph diffusions and matrix functions: fast algorithms and localization results

Network analysis provides tools for addressing fundamental applications in graphs such as webpage ranking, protein-function prediction, and product categorization and recommendation. As real-world networks grow to have millions of nodes and billions of edges, the scalability of network analysis algorithms becomes increasingly important. Whereas many standard graph algorithms rely on matrix-vector operations that require exploring the entire graph, this thesis is concerned with graph algorithms that are local (that explore only the graph region near the nodes of interest) as well as the localized behavior of global algorithms. We prove that two well-studied matrix functions for graph analysis, PageRank and the matrix exponential, stay localized on networks that have a skewed degree sequence related to the power-law degree distribution common to many real-world networks. Our results give the first theoretical explanation of a localization phenomenon that has long been observed in real-world networks. We prove our novel method for the matrix exponential converges in sublinear work on graphs with the specified degree sequence, and we adapt our method to produce the first deterministic algorithm for computing the related heat kernel diffusion in constant-time. Finally, we generalize this framework to compute any graph diffusion in constant time.