DOI: 10.3724/SP.J.1001.2009.00054

Journal of Software (软件学报) 2009/20:1 PP.54-66

Complex Network Clustering Algorithms

Network community structure is one of the most fundamental and important topological properties of complex networks, within which the links between nodes are very dense, but between which they are quite sparse. Network clustering algorithms which aim to discover all natural network communities from given complex networks are fundamentally important for both theoretical researches and practical applications, and can be used to analyze the topological structures, understand the functions, recognize the hidden patterns, and predict the behaviors of complex networks including social networks, biological networks, World Wide Webs and so on. This paper reviews the background, the motivation, the state of arts as well as the main issues of existing works related to discovering network communities, and tries to draw a comprehensive and clear outline for this new and active research area. This work is hopefully beneficial to the researchers from the communities of complex network analysis, data mining, intelligent Web and bioinformatics.

Key words:complex network,network clustering,network community structure

ReleaseDate:2014-07-21 14:29:16

Funds:Supported by the National Natural Science Foundation of China under Grant Nos.60496321, 60503016, 60573073, 60873149 (国家自然科学基金); the National High-Tech Research and Development Plan of China under Grant No.2006AA10Z245 (国家高技术研究发展计划(863))

[1] Watts DJ, Strogatz SH. Collective dynamics of Small-World networks. Nature, 1998,393(6638):440-442.

[2] Barabási AL, Albert R. Emergence of scaling in random networks. Science, 1999,286(5439):509-512.

[3] Barabási AL, Albert R, Jeong H, Bianconi G. Power-Law distribution of the World Wide Web. Science, 2000,287(5461):2115a.

[4] Albert R, Barabási AL, Jeong H. The Internet's Achilles heel: Error and attack tolerance of complex networks. Nature, 2000, 406(2115):378-382.

[5] Girvan M, Newman MEJ. Community structure in social and biological networks. Proc. of the National Academy of Science, 2002,9(12):7821-7826.

[6] Guimera R, Amaral LAN. Functional cartography of complex metabolic networks. Nature, 2005,433(7028):895-900.

[7] Palla G, Derenyi I, Farkas I, Vicsek T. Uncovering the overlapping community structures of complex networks in nature and society. Nature, 2005,435(7043):814-818.

[8] Wilkinson DM, Huberman BA. A method for finding communities of related genes. Proc. of the National Academy of Science, 2004,101(Suppl.1):5241-5248.

[9] Radicchi F, Castellano C, Cecconi F, Loreto V, Parisi D. Defining and identifying communities in networks. Proc. of the National Academy of Science, 2004,101(9):2658-2663.

[10] Palla G, Barabási AL, Vicsek T. Quantifying social group evolution. Nature, 2007,446(7136):664-667.

[11] Newman MEJ. Coauthorship networks and patterns of scientific collaboration. Proc. of the National Academy of Science, 2004,101(1):5200-5205.

[12] Tyler JR, Wilkinson DM, Huberman BA. Email as spectroscopy: Automated discovery of community structure within organizations. In: Huysman M, Wenger E, Wulf V, eds. Proc. of the 1st Int'l Conf. on Communities and Technologies. Dordrecht: Kluwer Academic Publishers, 2003. 81-96.

[13] Ravasz E, Somera AL, Mongru DA. Hierarchical organization of modularity in metabolic networks. Science, 2002,297(5586): 1551-1555.

[14] Wang Z, Zhang J. In serach of the biological significance of modular structures in protein networks. PLOS Computational Biology, 2007,3(6):e107.

[15] Spirin V, Mirny LA. Protein complexes and functional modules in molecular networks. Proc. of the National Academy of Science, 2003,100(21):12123-12128.

[16] Farutin V, Robison K, Lightcap E, Dancik V, Ruttenberg A, Letovsky S, Pradines J. Edge-Count probabilities for the identification of local protein communities and their organization. Proteins: Structure, Function, and Bioinformatics, 2006,62(3):800-818.

[17] Flake GW, Lawrence S, Giles CL, Coetzee FM. Self-Organization and identification of Web communities. IEEE Computer, 2002,35(3):66-71.

[18] Li X, Liu B, Yu PS. Discovering overlapping communities of named entities. In: Fürnkranz J, Scheffer T, Spiliopoulou M, eds. Proc. of the 10th European Conf. on Principles and Practice of Knowledge Discovery in Databases. Berlin: Springer-Verlag, 2006. 593-600.

[19] Ino H, Kudo M, Nakamura A. Partitioning of Web graphs by community topology. In: Ellis A, Hagino T, eds. Proc. of the 14th Int'l Conf. on World Wide Web. New York: ACM Press, 2005. 661-669.

[20] Kleinberg JM. Authoritative sources in a hyperlinked environment. Journal of the ACM, 1999,46(5):604-632.

[21] Almeida RB, Almeida VAF. A community-aware search engine. In: Feldman SI, Uretsky M, Najork M, Wills CE, eds. Proc. of the 13th Int'l Conf. on World Wide Web. New York: ACM Press, 2004. 413-421.

[22] Sidiropoulos A, Pallis G, Katsaros D, Stamos K, Vakali A, Manolopoulos Y. Prefetching in content distribution networks via Web communities identification and outsourcing. World Wide Web, 2008,11(1):39-70.

[23] Newman MEJ. Modularity and communities structure in networks. Proc. of the National Academy of Science, 2006,103(23): 8577-8582.

[24] Fortunato S, Barthelemy M. Resolution limit in community detection. Proc. of the National Academy of Science, 2007,104(1): 36-41.

[25] Hopcroft J, Khan O, Kulis B, Selman B. Tracking evolving communities in large linked networks. Proc. of the National Academy of Science, 2004,101(1):5249-5253.

[26] Reichardt J, Bornholdt S. Detecting fuzzy community structures in complex networks with a potts model. Physical Review Letters, 2004,93(19):218701.

[27] Garlaschelli D, Loffredo MI. Patterns of link reciprocity in directed networks. Physical Review Letters, 2004,93(26):268701.

[28] Yang B, Cheung WK, Liu J. Community mining from signed social networks. IEEE Trans. on Knowledge and Data Engineering, 2007,19(10):1333-1348.

[29] Brandes U, Delling D, Gaertler M, Görke R, Hoefer M, Nikoloski Z, Wagner D. On modularity clustering. IEEE Trans. on Knowledge and Data Engineering, 2008,20(2):172-188.

[30] Cartozo CC, Rios PDL, Piazza F, Lio P. Bottleneck genes and community structure in the cell cycle network of S.pombe. PLOS Computational Biology, 2007,3(6):e103.

[31] Shiga M, Takigawa I, Mamitsuka H. A spectral clustering approach to optimally combining numerical vectors with a modular network. In: Berkhin P, Caruana R, Wu X, eds. Proc. of the 13th ACM SIGKDD Int'l Conf. on Knowledge Discovery and Data Mining. New York: ACM Press, 2007. 647-656.

[32] Zhou D, Councill I, Zha H, Giles CL. Discovering temporal communities from social network documents. In: Shi Y, Clifton CW, eds. Proc. of the 7th IEEE Int'l Conf. on Data Mining. New York: IEEE Society, 2007. 745-750.

[33] White S, Smyth P. A spectral clustering approach to finding communities in graphs. In: Kamath C, Goodman A, eds. Proc. of the 5th SIAM Int'l Conf. on Data Mining. Philadelphia: SIAM, 2005. 76-84.

[34] Donetti L, Munoz MA. Improved spectral algorithm for the detection of network communities. In: Garrido PL, Muñoz MA, Marro J, eds. Proc. of the 8th Int'l Conf. on Modeling Cooperative Behavior in the Social Sciences. New York: American Institute of Physics, 2005,779:104-107.

[35] Fiedler M. Algebraic connectivity of graphs. Czechoslovakian Mathematical Journal, 1973,23(2):298-305.

[36] Fiedler M. A property of eigenvectors of nonnegative symmetric matrices and its application to graph theory. Czechoslovakian Mathematical Journal, 1975,25(4):619-637.

[37] Wei YC, Cheng CK. Ratio cut partitioning for hierarchical designs. IEEE Trans. on Computer-Aided Design, 1991,10(7):911-921.

[38] Hagen L, Kahng AB. New spectral methods for ratio cut partition and clustering. IEEE Trans. on Computer-Aided Design, 1992, 11(9):1074-1085.

[39] Shi J, Malik J. Normalized cuts and image segmentation. IEEE Trans. on Pattern Analysis and Machine Intelligent, 2000,22(8): 888-904.

[40] Garey MR, Johnson DS. Computers and Intractability: A Guide to the Theory of NP-Completeness. New York: W.H. Freeman Co., 1990. 60-63.

[41] Golub GH, Loan CFV. Matrix Computations. Baltimore: Johns Hopkins University Press, 1989.

[42] Newman MEJ. Detecting community structure in networks. European Physical Journal (B), 2004,38(2):321-330.

[43] Newman MEJ. Fast algorithm for detecting community structure in networks. Physical Review E, 2004,69(6):066133.

[44] Newman MEJ, Girvan M. Finding and evaluating community structure in networks. Physical Review E, 2004,69(2):026113.

[45] Pujol JM, Béjar J, Delgado J. Clustering algorithm for determining community structure in large networks. Physical Review E, 2006,74(1):016107.

[46] Duch J, Arenas A. Community detection in complex networks using extreme optimization. Physical Review E, 2005,72(2):027104.

[47] Zachary WW. An information flow model for conflict and fission in small groups. Journal of Anthropological Research, 1977, 33(4):452-473.

[48] Guimera R, Sales M, Amaral LAN. Modularity from fluctuations in random graphs and complex networks. Physical Review E, 2004,70(2):025101.

[49] Wu F, Huberman BA. Finding communities in linear time. A physics approach. European Physical Journal B, 2004,38(2):331-338.

[50] Goldberg AV. Recent developments in maximum flow algorithms. In: Arnborg S, Ivansson L, eds. Proc. of the 6th Scandinavian Workshop on Algorithm Theory. Berlin: Springer-Verlag, 1998. 1-10.

[51] Wasserman S, Faust K. Social Network Analysis. Cambridge: Cambridge University Press, 1994.

[52] Pons P, Latapy M. Computing communities in large networks using random walks. In: Yolum P, ed. Proc. of the 20th Int'l Symp. on Computer and Information Sciences. Berlin: Springer-Verlag, 2005. 284-293.

[53] Yang B, Liu J. Discovering global network communities based on local centralities. ACM Trans. on the Web, 2008,2(1):article 9:1-32.

[54] Hall KM. An r-dimensional quadratic placement algorithm. Management Science, 1970,17(3):219-229.

[55] Donetti L, Munoz MA. Detecting network communities: A new systematic and efficient algorithm. Journal of Statistical Mechanics: Theory and Experiment, 2004,10:P10012.

[56] Liu J, Jin XL, Tsui KC. Autonomy Oriented Computing-From Problem Solving to Complex Systems Modeling. Boston: Kluwer Academic Publishers, 2004.