Identifying diffusion sources in large networks: A community structure based approach

J Jiang, A Zhou, KM Yazdi, S Wen… - 2015 IEEE Trustcom …, 2015 - ieeexplore.ieee.org
2015 IEEE Trustcom/BigDataSE/ISPA, 2015 (ieeexplore.ieee.org)
The global diffusion of epidemics, rumors and computer viruses causes great damage to our society. It is critical to identify the diffusion sources and promptly quarantine them. However, most methods proposed so far are unsuitable for large networks because of their computational cost and the complex spatiotemporal diffusion processes. In this paper, we develop a community structure based approach to efficiently identify diffusion sources in large networks. We first detect the community structure of a network and place sensors on community bridge nodes to record diffusion dynamics. From the infection times of the bridge sensors, we can determine the very first infected community, from which the diffusion started and spread out to other communities. This overcomes the scalability issue in source identification problems by narrowing the set of suspects down to the first infected community. Then, to accurately locate the diffusion source among the suspects, we utilize an intrinsic feature of diffusion sources: the relative infection time of any node is linear in its effective distance from the diffusion source. Thus, for each suspect, we compute the correlation coefficient to measure the degree of linear dependence between the sensors' relative infection times and their effective distances from the suspect, and consider the suspect with the greatest correlation coefficient to be the source. We evaluate our approach on two large networks collected from Twitter containing more than 300,000 nodes. The experimental results show that our method can identify diffusion sources with a very high degree of accuracy. In particular, as the average community size shrinks, the accuracy of our approach increases dramatically.
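The two stages described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes that communities are already known, that sensors sit on bridge nodes, and, as a loud simplification, that plain hop count stands in for the effective distance, which the abstract does not define. All function names, the toy graph, and the sensor times are hypothetical.

```python
from collections import deque
from math import sqrt

def bfs_distances(adj, source):
    """Hop distances from source (assumed stand-in for effective distance)."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist

def pearson(xs, ys):
    """Pearson correlation coefficient; returns 0.0 if either series is constant."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / sqrt(vx * vy)

def first_infected_community(sensor_times, community_of):
    """Community of the bridge sensor with the earliest infection time."""
    earliest = min(sensor_times, key=sensor_times.get)
    return community_of[earliest]

def rank_suspects(adj, suspects, sensor_times):
    """Score each suspect by correlating sensor distances with relative times."""
    t0 = min(sensor_times.values())
    sensors = sorted(sensor_times)
    relative = [sensor_times[s] - t0 for s in sensors]
    scores = {}
    for suspect in suspects:
        dist = bfs_distances(adj, suspect)
        scores[suspect] = pearson([dist[s] for s in sensors], relative)
    return max(scores, key=scores.get), scores

# Hypothetical toy network: communities A = {0, 1, 2}, B = {3, 4}, C = {5, 6}.
edges = [(0, 1), (1, 2), (2, 3), (3, 4), (0, 5), (5, 6)]
adj = {n: [] for n in range(7)}
for u, v in edges:
    adj[u].append(v)
    adj[v].append(u)
community_of = {0: "A", 1: "A", 2: "A", 3: "B", 4: "B", 5: "C", 6: "C"}

# Bridge sensors 0, 2, 3, 5 with made-up infection times
# (one time unit per hop from a true source at node 1).
sensor_times = {0: 1, 2: 1, 3: 2, 5: 2}

community = first_infected_community(sensor_times, community_of)
suspects = [n for n, c in community_of.items() if c == community]
best, scores = rank_suspects(adj, suspects, sensor_times)
print(community, best, scores)
```

In this toy case the earliest sensors both lie in community A, so only its three nodes are scored instead of all seven, which is the scalability argument made above. A faithful reproduction would replace `bfs_distances` with the effective distance used in the paper.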