Modeling blogger influence in a community

  • Original Article
  • Social Network Analysis and Mining

Abstract

Blogging has become a popular and convenient way to communicate, publish information, share preferences, voice opinions, provide suggestions, report news, and form virtual communities in the blogosphere. The blogosphere follows a power-law distribution: very few blogs are extremely influential, while a huge number remain largely unknown. Whether or not a (multi-author) blog is itself influential, it can have influential bloggers. However, the sheer number of such blogs makes it extremely challenging to study each one. One way to analyze these blogs is to find their influential bloggers and treat them as community representatives. Influential bloggers can affect fellow bloggers in various ways. In this paper, we study the problem of identifying influential bloggers. We define influential bloggers, investigate their characteristics, discuss the challenges of identifying them, develop a model to quantify their influence, and pave the way for more sophisticated models that can categorize different types of influential bloggers. To highlight these issues, we conduct experiments on blog data, evaluate multiple facets of the problem, and present a unique, objective evaluation strategy that accounts for the subjectivity inherent in defining influence, along with various other analytical capabilities. We conclude with our findings and directions for future work.

Notes

  1. http://weblogs.macromedia.com/.

  2. http://www.sifry.com/alerts/archives/000436.html.

  3. http://www.blogpulse.com.

  4. More details on identifying and measuring these indicators are provided in Sect. 3.

  5. Note that K is a user specified parameter.

  6. http://www.blogpulse.com.

  7. http://royal.pingdom.com/2011/01/12/internet-2010-in-numbers/.

  8. One reason we did not adopt any of these is that their computation is beyond the scope of this work. We use a simpler measure to examine its effect in determining influence.

  9. http://technorati.com/developers/api/cosmos.html.

  10. http://www.nielsenbuzzmetrics.com/cgm.asp.

  11. http://www.tuaw.com/.

  12. http://technorati.com/developers/api/cosmos.html.

  13. TUAW was set up in February 2004.

  14. This dataset will be made available upon request for research purposes.

  15. http://www.tuaw.com/2007/01/09/iphone-will-not-allow-user-installable-applications/.

  16. http://www.tuaw.com/2007/01/09/macworld-2007-keynote-liveblog/.

  17. http://www.maczot.com/.

  18. http://www.tuaw.com/2007/01/04/xpad-developer-says-maczot-and-brian-ball-ripped-him-off/.

  19. http://www.digg.com/.

  20. We obtain this data using the Digg API.

  21. On average, 70–80 blog posts from TUAW are submitted to Digg every month, so we pick the 20 most “digged” (i.e., most influential) posts to avoid under-sampling or over-sampling.

  22. In the early stage of the blog site, there were a few months with little blogging activity, such as Feb-04, Oct-04, and Nov-04, resulting in fewer than five influentials.

  23. http://www.engadget.com.

  24. http://blogtrackers.fulton.asu.edu/.

  25. http://kdl.cs.umass.edu/data/dblp/dblp-info.html.

Acknowledgments

This research was funded in part by the National Science Foundation's Social-Computational Systems (SoCS) Program within the Directorate for Computer and Information Science and Engineering's Division of Information and Intelligent Systems (Award numbers: IIS-1110868 and IIS-1110649), the US Office of Naval Research (Grant number: N000141010091), and the US Air Force Office of Scientific Research (Grant number: FA95500810132). We gratefully acknowledge this support.

Author information

Corresponding author

Correspondence to Nitin Agarwal.

About this article

Cite this article

Agarwal, N., Liu, H., Tang, L. et al. Modeling blogger influence in a community. Soc. Netw. Anal. Min. 2, 139–162 (2012). https://doi.org/10.1007/s13278-011-0039-3

  • DOI: https://doi.org/10.1007/s13278-011-0039-3
