# publications

Publications sorted in reverse chronological order.

## 2024

- COLT 2024Finding Super-spreaders in Network CascadesElchanan Mossel, and
*Anirudh Sridhar**Proceedings of the Conference on Learning Theory*, Jul 2024Suppose that a cascade (e.g., an epidemic) spreads on an unknown graph, and only the infection times of vertices are observed. What can be learned about the graph from the infection times caused by multiple distinct cascades? Most of the literature on this topic focuses on the task of recovering the \emphentire graph, which requires Ω( \log n) cascades for an n-vertex bounded degree graph. Here we ask a different question: can the important parts of the graph be estimated from just a few (i.e., constant number) of cascades, even as n grows large? In this work, we focus on identifying super-spreaders (i.e., high-degree vertices) from infection times caused by a Susceptible-Infected process on a graph. Our first main result shows that vertices of degree greater than n^3/4 can indeed be estimated from a constant number of cascades. Our algorithm for doing so leverages a novel connection between vertex degrees and the second derivative of the cumulative infection curve. Conversely, we show that estimating vertices of degree smaller than n^1/2 requires at least \log(n) / \log \log (n) cascades. Surprisingly, this matches (up to \log \log n factors) the number of cascades needed to learn the \emphentire graph if it is a tree.

## 2023

- PreprintAverage-case and Smoothed Analysis of Graph IsomorphismJulia Gaudio, Miklós Z. Rácz, and
*Anirudh Sridhar**Preprint*, Sep 2023We propose a simple and efficient local algorithm for graph isomorphism which succeeds for a large class of sparse graphs. This algorithm produces a low-depth canonical labeling, which is a labeling of the vertices of the graph that identifies its isomorphism class using vertices’ local neighborhoods. Prior work by Czajka and Pandurangan showed that the degree profile of a vertex (i.e., the sorted list of the degrees of its neighbors) gives a canonical labeling with high probability when $np_n = ω( \log^4 (n) / \log \log n) (and p_n≤1/2); subsequently, Mossel and Ross showed that the same holds when np_n=ω(log2(n)). We first show that their analysis essentially cannot be improved: we prove that when npn=o(log2(n)/(loglogn)3), with high probability there exist distinct vertices with isomorphic 2-neighborhoods. Our first main result is a positive counterpart to this, showing that 3-neighborhoods give a canonical labeling when np_≥(1 + δ) log n (and p_n≤1/2$); this improves a recent result of Ding, Ma, Wu, and Xu, completing the picture above the connectivity threshold. Our second main result is a smoothed analysis of graph isomorphism, showing that for a large class of deterministic graphs, a small random perturbation ensures that 3-neighborhoods give a canonical labeling with high probability. While the worst-case complexity of graph isomorphism is still unknown, this shows that graph isomorphism has polynomial smoothed complexity.

- ISIT 2023Quickest Inference of Susceptible-Infected Cascades in Sparse Networks
*Anirudh Sridhar*, Tirza Routtenberg, and H. Vincent Poor*In Proceedings of the 2023 IEEE International Symposium on Information Theory*, Jul 2023We consider the task of estimating a network cascade as fast as possible. The cascade is assumed to spread according to a general Susceptible-Infected process with heterogeneous transmission rates from an unknown source in the network. While the propagation is not directly observable, noisy information about its spread can be gathered through multiple rounds of error-prone diagnostic testing. We propose a novel adaptive procedure which quickly outputs an estimate for the cascade source and the full spread under this observation model. Remarkably, under mild conditions on the network topology, our procedure is able to estimate the full spread of the cascade in an n-vertex network, before polylog(n) vertices are affected by the cascade. We complement our theoretical analysis with simulation results illustrating the effectiveness of our methods.

- ISIT 2023Matching Correlated Inhomogeneous Random Graphs using the k-core EstimatorMiklós Z. Rácz, and
*Anirudh Sridhar**In Proceedings of the 2023 IEEE International Symposium on Information Theory*, Jul 2023We consider the task of estimating the latent vertex correspondence between two edge-correlated random graphs with generic, inhomogeneous structure. We study the so-called \emphk-core estimator, which outputs a vertex correspondence that induces a large, common subgraph of both graphs which has minimum degree at least k. We derive sufficient conditions under which the k-core estimator exactly or partially recovers the latent vertex correspondence. Finally, we specialize our general framework to derive new results on exact and partial recovery in correlated stochastic block models, correlated Chung-Lu graphs, and correlated random geometric graphs.

- AAAI 2023Recovering the Graph Underlying Networked Dynamical Systems under Partial Observability: A Deep Learning ApproachSérgio Machado,
*Anirudh Sridhar*, Paulo Gil, Jorge Henriques, José M. F. Moura, and Augusto Santos*In Proceedings of the 37th AAAI Conference on Artificial Intelligence*, Feb 2023We study the problem of graph structure identification, i.e., of recovering the graph of dependencies among time series. We model these time series data as components of the state of linear stochastic networked dynamical systems. We assume partial observability, where the state evolution of only a subset of nodes comprising the network is observed. We propose a new feature-based paradigm: to each pair of nodes, we compute a feature vector from the observed time series. We prove that these features are linearly separable, i.e., there exists a hyperplane that separates the cluster of features associated with connected pairs of nodes from those of disconnected pairs. This renders the features amenable to train a variety of classifiers to perform causal inference. In particular, we use these features to train Convolutional Neural Networks (CNNs). The resulting causal inference mechanism outperforms state-of-the-art counterparts w.r.t. sample-complexity. The trained CNNs generalize well over structurally distinct networks (dense or sparse) and noise-level profiles. Remarkably, they also generalize well to real-world networks while trained over a synthetic network – namely, a particular realization of a random graph.

- SICONMean-field Approximations for Stochastic Population Processes with Heterogeneous Interactions
*Anirudh Sridhar*, and Soummya Kar*SIAM Journal on Control and Optimization*, Nov 2023This paper studies a general class of stochastic population processes in which agents interact with one another over a network. Agents update their behaviors in a random and decentralized manner according to a policy that depends only on the agent’s current state and an estimate of the macroscopic population state, given by a weighted average of the neighboring states. When the number of agents is large and the network is a complete graph (has all-to-all information access), the macroscopic behavior of the population can be well-approximated by a set of deterministic differential equations called a \it mean-field approximation. For incomplete networks such characterizations remained previously unclear, i.e., in general whether a suitable mean-field approximation exists for the macroscopic behavior of the population. The paper addresses this gap by establishing a generic theory describing when various mean-field approximations are accurate for \empharbitrary interaction structures. Our results are threefold. Letting W be the matrix describing agent interactions, we first show that a simple mean-field approximation that incorrectly assumes a homogeneous interaction structure is accurate provided W has a large spectral gap. Second, we show that a more complex mean-field approximation which takes into account agent interactions is accurate as long as the Frobenius norm of W is small. Finally, we compare the predictions of the two mean-field approximations through simulations, highlighting cases where using mean-field approximations that assume a homogeneous interaction structure can lead to inaccurate qualitative and quantitative predictions.

- PhysRevEThe Role of Masks in Mitigating Viral Spread on NetworksYurun Tian,
*Anirudh Sridhar*, Chai Wah Wu, Simon A. Levin, Kathleen M. Carley, H. Vincent Poor, and Osman Yagan*Physical Review E*, Jul 2023Masks have remained an important mitigation strategy in the fight against COVID-19 due to their ability to prevent the transmission of respiratory droplets between individuals. In this work, we provide a comprehensive quantitative analysis of the impact of mask-wearing. To this end, we propose a novel agent-based model of viral spread on networks where agents may either wear no mask or wear one of several types of masks with different properties (e.g., cloth or surgical). We derive analytical expressions for three key epidemiological quantities: The probability of emergence, the epidemic threshold, and the expected epidemic size. In particular, we show how the aforementioned quantities depend on the structure of the contact network, viral transmission dynamics, and the distribution of the different types of masks within the population. Through extensive simulations, we then investigate the impact of different allocations of masks within the population and tradeoffs between the outward efficiency and inward efficiency of the masks. Interestingly, we find that masks with high outward efficiency and low inward efficiency are most useful for controlling the spread in the early stages of an epidemic, while masks with high inward efficiency but low outward efficiency are most useful in reducing the size of an already large spread. Last, we study whether degree-based mask allocation is more effective in reducing the probability of epidemic as well as epidemic size compared to random allocation. The result echoes the previous findings that mitigation strategies should differ based on the stage of the spreading process, focusing on source control before the epidemic emerges and on self-protection after the emergence.

- PNASSpreading Processes with Mutations over Multi-Layer NetworksMansi Sood,
*Anirudh Sridhar*, Rashad Eletreby, Chai Wah Wu, Simon A. Levin, H. Vincent Poor, and Osman Yagan*Proceedings of the National Academy of Sciences*, Jun 2023A key scientific challenge during the outbreak of novel infectious diseases is to predict how the course of the epidemic changes under countermeasures that limit interaction in the population. Most epidemiological models do not consider the role of mutations and heterogeneity in the type of contact events. However, pathogens have the capacity to mutate in response to changing environments, especially caused by the increase in population immunity to existing strains, and the emergence of new pathogen strains poses a continued threat to public health. Further, in the light of differing transmission risks in different congregate settings (e.g., schools and offices), different mitigation strategies may need to be adopted to control the spread of infection. We analyze a multilayer multistrain model by simultaneously accounting for i) pathways for mutations in the pathogen leading to the emergence of new pathogen strains, and ii) differing transmission risks in different settings, modeled as network layers. Assuming complete cross-immunity among strains, namely, recovery from any infection prevents infection with any other (an assumption that will need to be relaxed to deal with COVID-19 or influenza), we derive the key epidemiological parameters for the multilayer multistrain framework. We demonstrate that reductions to existing models that discount heterogeneity in either the strain or the network layers may lead to incorrect predictions. Our results highlight that the impact of imposing/lifting mitigation measures concerning different contact network layers (e.g., school closures or work-from-home policies) should be evaluated in connection with their effect on the likelihood of the emergence of new strains.

- IEEE-ITQuickest Inference of Network Cascades with Noisy Information
*Anirudh Sridhar*, and H. Vincent Poor*IEEE Transactions on Information Theory*, Apr 2023We study the problem of estimating the source of a network cascade given a time series of noisy information about the spread. Initially, there is a single vertex affected by the cascade (the source) and the cascade spreads in discrete time steps across the network. Although the cascade evolution is hidden, one observes a noisy measurement of the evolution at each time step. Given this information, we aim to reliably estimate the cascade source as fast as possible. We investigate Bayesian and minimax formulations of the source estimation problem, and derive near-optimal estimators for simple cascade dynamics and network topologies. In the Bayesian setting, samples are taken until the error of the Bayes-optimal estimator falls below a threshold. For the minimax setting, we design a novel multi-hypothesis sequential probability ratio test. These optimal estimators require \log \log n / \log (k- 1) observations for a k-regular tree network, and (\log n)^1 / (\ell + 1) observations for a \ell-dimensional lattice. We then discuss conjectures on source estimation in general topologies. Finally, we provide simulations which validate our theoretical results on trees and lattices, and illustrate the effectiveness of our methods for estimating the sources of cascades on Erdős-Rényi graphs.

## 2022

- COLT 2022Exact Community Recovery in Correlated Stochastic Block ModelsJulia Gaudio, Miklós Z. Rácz, and
*Anirudh Sridhar**In Proceedings of the 35th Annual Conference on Learning Theory*, Jul 2022We consider the problem of learning latent community structure from multiple correlated networks. We study edge-correlated stochastic block models with two balanced communities, focusing on the regime where the average degree is logarithmic in the number of vertices. Our main result derives the precise information-theoretic threshold for exact community recovery using multiple correlated graphs. This threshold captures the interplay between the community recovery and graph matching tasks. In particular, we uncover and characterize a region of the parameter space where exact community recovery is possible using multiple correlated graphs, even though (1) this is information-theoretically impossible using a single graph and (2) exact graph matching is also information-theoretically impossible. In this regime, we develop a novel algorithm that carefully synthesizes algorithms from the community recovery and graph matching literatures.

- AAPCorrelated Randomly Growing GraphsMiklós Z. Rácz, and
*Anirudh Sridhar**The Annals of Applied Probability*, Apr 2022We introduce a new model of correlated randomly growing graphs and study the fundamental questions of detecting correlation and estimating aspects of the correlated structure. The model is simple and starts with any model of randomly growing graphs, such as uniform attachment (UA) or preferential attachment (PA). Given such a model, a pair of graphs (G_1, G_2) is grown in two stages: until time t_⋆they are grown together (i.e., G_1 = G_2), after which they grow independently according to the underlying growth model. We show that whenever the seed graph has an influence in the underlying graph growth model—this has been shown for PA and UA trees and is conjectured to hold broadly—then correlation can be detected in this model, even if the graphs are grown together for just a single time step. We also give a general sufficient condition (which holds for PA and UA trees) under which detection is possible with probability going to 1 as t_⋆\to ∞. Finally, we show for PA and UA trees that the amount of correlation, measured by t_⋆, can be estimated with vanishing relative error as t_⋆\to ∞.

## 2021

- ICASSP 2021Leveraging a Multiple-Strain Model with Mutations in Analyzing the Spread of COVID-19
*Anirudh Sridhar*, Osman Yagan, Rashad Eletreby, Simon A. Levin, Joshua B. Plotkin, and H. Vincent Poor*In Proceedings of the 46th IEEE International Conference on Acoustics, Speech, and Signal Processing*, Jun 2021The spread of COVID-19 has been among the most devastating events affecting the health and well-being of humans worldwide since World War II. A key scientific goal concerning COVID-19 is to develop mathematical models that help us to understand and predict its spreading behavior, as well as to provide guidelines on what can be done to limit its spread. In this paper, we discuss how our recent work on a multiple-strain spreading model with mutations can help address some key questions concerning the spread of COVID-19. We highlight the recent reports on a mutation of SARS-CoV-2 that is thought to be more transmissible than the original strain and discuss the importance of incorporating mutation and evolutionary adaptations (together with the network structure) in epidemic models. We also demonstrate how the multiple-strain transmission model can be used to assess the effectiveness of mask-wearing in limiting the spread of COVID-19. Finally, we present simulation results to demonstrate our ideas and the utility of the multiple-strain model in the context of COVID-19.

- ICASSP 2021Bayes-Optimal Methods for Finding the Source of a Cascade
*Anirudh Sridhar*, and H. Vincent Poor*In Proceedings of the 46th IEEE International Conference on Acoustics, Speech, and Signal Processing*, Jun 2021We study the problem of estimating the source of a network cascade. The cascade initially starts from a single vertex and spreads deterministically over time, but only a noisy version of the propagation is observable. The goal is then to design a stopping time and estimator that will estimate the source well while ensuring the number of affected vertices is not too large. We rigorously formulate a Bayesian approach to the problem. If vertices can be labelled by vectors in Euclidean space (which is natural in spatial networks), the optimal estimator is the conditional mean estimator, and we derive an explicit form for the optimal stopping time under minimal assumptions on the cascade dynamics. We study the performance of the optimal stopping time on lattices, and show that a computationally efficient but suboptimal stopping time which compares the posterior variance to a threshold has near-optimal performance.

- ACC 2021Analysis of the Impact of Mask-wearing in Viral Spread: Implications for COVID-19.Yurun Tian,
*Anirudh Sridhar*, Osman Yagan, and H. Vincent Poor*In Proceedings of the 2021 American Control Conference*, May 2021Masks are used as part of a comprehensive strategy of measures to limit transmission and save lives during the COVID-19 pandemic. Research about the impact of mask-wearing in the COVID-19 pandemic has raised formidable interest across multiple disciplines. In this paper, we investigate the impact of mask-wearing in spreading processes over complex networks. This is done by studying a heterogeneous bond percolation process over a multi-type network model, where nodes can be one of two types (mask-wearing, and not-mask-wearing). We provide analytical results that accurately predict the expected epidemic size and probability of emergence as functions of the characteristics of the spreading process (e.g., transmission probabilities, inward and outward efficiency of the masks, etc.), the proportion of mask-wearers in the population, and the structure of the underlying contact network. In addition to the theoretical analysis, we also conduct extensive simulations on random networks. We also comment on the analogy between the mask-model studied here and the multiple-strain viral spreading model with mutations studied recently by Eletreby et al.

- NeurIPS 2021Correlated Stochastic Block Models: Exact Graph Matching with Applications to Recovering CommunitiesMiklós Z. Rácz, and
*Anirudh Sridhar**In Proceedings of the 35th Conference on Neural Information Processing Systems*, Dec 2021We consider the task of learning latent community structure from multiple correlated networks. First, we study the problem of learning the latent vertex correspondence between two edge-correlated stochastic block models, focusing on the regime where the average degree is logarithmic in the number of vertices. We derive the precise information-theoretic threshold for exact recovery: above the threshold there exists an estimator that outputs the true correspondence with probability close to 1, while below it no estimator can recover the true correspondence with probability bounded away from 0. As an application of our results, we show how one can exactly recover the latent communities using \emphmultiple correlated graphs in parameter regimes where it is information-theoretically impossible to do so using just a single graph.

- HDSRModeling and Analysis of the Spread of COVID-19 under a Multiple-Strain Model with MutationsOsman Yagan,
*Anirudh Sridhar*, Simon A. Levin, Joshua A. Plotkin, and H. Vincent Poor*Harvard Data Science Review*, Apr 2021Since December 2019, COVID-19 has caused worldwide devastation. To devise effective countermeasures, it is important to develop mathematical models that help us to understand and predict the spreading of COVID-19, as well as to provide guidelines on what can be done to limit its spread. To this end, we leverage recent work of (Eletreby et al, 2020), which studies a model where multiple strains of a virus propagate through a network while also undergoing mutations. Highlighting the recent reports on a mutation of SARS-CoV-2 that is thought to be more transmissible than the original strain, we discuss the importance of incorporating mutations and evolutionary adaptations in epidemic models. We also demonstrate how the results of (Eletreby et al, 2020) can be used to assess the effectiveness of mask-wearing in limiting the spread of COVID-19. These are supported by simulation results showing the impact of various mutation and mask-wearing possibilities.

## 2020

- Asilomar 2020Sequential Estimation of Network Cascades
*Anirudh Sridhar*, and H. Vincent Poor*In Proceedings of the 54th Asilomar Conference on Signals, Systems and Computers*, Nov 2020We consider the problem of locating the source of a network cascade, given a noisy time-series of network data. Initially, the cascade starts with one unknown, affected vertex and spreads deterministically at each time step. The goal is to find an adaptive procedure that outputs an estimate for the source as fast as possible, subject to a bound on the estimation error. For a general class of graphs, we describe a family of matrix sequential probability ratio tests (MSPRTs) that are first-order asymptotically optimal up to a constant factor as the estimation error tends to zero. We apply our results to lattices and regular trees, and show that MSPRTs are asymptotically optimal for regular trees. We support our theoretical results with simulations.

- ICASSP 2020On Distributed Stochastic Gradient Algorithms for Global OptimizationBrian Swenson,
*Anirudh Sridhar*, and H. Vincent Poor*In Proceedings of the 45th IEEE International Conference on Acoustics, Speech, and Signal Processing*, May 2020The paper considers the problem of network-based computation of global minima in smooth nonconvex optimization problems. It is known that distributed gradient-descent-type algorithms can achieve convergence to the set of global minima by adding slowly decaying Gaussian noise in order to escape local minima. However, the technical assumptions under which convergence is known to occur can be restrictive in practice. In particular, in known convergence results, the local objective functions possessed by agents are required to satisfy a highly restrictive bounded-gradient-dissimilarity condition. The paper demonstrates convergence to the set of global minima while relaxing this key assumption.

## 2016

- AsiaCCS 2016Client-CASH: Protecting Master Passwords Against Offline AttacksJeremiah Blocki, and
*Anirudh Sridhar**In Proceedings of the 11th ACM Asia Conference on Computer and Communications Security*, May 2016Offline attacks on passwords are increasingly commonplace and dangerous. An offline adversary is limited only by the amount of computational resources he or she is willing to invest to crack a user’s password. The danger is compounded by the existence of authentication servers who fail to adopt proper password storage practices like key-stretching. Password managers can help mitigate these risks by adopting key stretching procedures like hash iteration or memory hard functions to derive site specific passwords from the user’s master password on the client-side. While key stretching can reduce the offline adversary’s success rate, these procedures also increase computational costs for a legitimate user. Motivated by the observation that most of the password guesses of the offline adversary will be incorrect, we propose a client side cost asymmetric secure hashing scheme (clientcash). clientcash randomizes the runtime of client-side key stretching procedure in a way that the expected computational cost of our key derivation function is greater when run with an incorrect master password. We make several contributions. First, we show how to introduce randomness into a client-side key stretching algorithms through the use of halting predicates which are selected randomly at the time of account creation. Second, we formalize the problem of finding the optimal running time distribution subject to certain cost constraints for the client and certain security constrains on the halting predicates. Finally, we demonstrate that Client-CASH can reduce the adversary’s success rate by up to 21%. These results demonstrate the promise of the Client-CASH mechanism.