DOI: 10.3724/SP.J.1249.2019.01018

Journal of Shenzhen University Science and Engineering (深圳大学学报理工版) 2019/36:1 PP.18-23

Fast clustering based on bipartite graph

Spectral clustering algorithm can effectively learn the data manifold distribution and non-convex distribution of data.However, the spectral clustering process which involves the graph construction and eigen-decomposition has the high computational complexity. It is difficult to apply the spectral clustering to deal with the large-scale data directly.The fast clustering based on bipartite graph (FCBG) algorithm reduces the size of original data structure by using the sampling method and learns the relationship between the selection data and original data. The algorithm can optimize the weights of bipartite graph edge mean while maintaining the cluster structure of bipartite graph. The computational complexity of proposed algorithm increases linearly with the increase of data size. The experimental analysis shows that the algorithm can effectively learn the data relationship and obtain the better clustering results with less time consumption.

Key words:technology of computer application,clustering,large-scale,spectral graph theory,bipartite graph,rank constraint

ReleaseDate:2019-01-28 09:56:34

[1] JAIN A K, MURTY M N, FLYNN P J. Data clustering:a review[J]. ACM Computing Surveys, 1999, 31(3):264-323.

[2] HASTIE T, TIBSHIRANI R, FREDMAN J. The elements of statistical learning[M]. New York, USA:Springer series in statistics, 2001.

[3] Von UXBURG U. A tutorial on spectral clustering[J]. Statistics and Computing, 2007, 17(4):395-416.

[4] HAGEN L, KAHNG A B. New spectral methods for ratio cut partitioning and clustering[J]. IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, 1992, 11(9):1074-1085.

[5] SHI Jianbo, MALIK J. Normalized cuts and image segmentation[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2000, 22(8):888-905.

[6] NG A Y, JORDAN M I, WEISS Y. On spectral clustering:analysis and an algorithm[C]//Proceedings of the 14th International Conference on Neural Information Processing Systems:Natural and Synthetic. Cambridge, USA:MIT Press, 2001:849-856.

[7] LIU Wei, HE Junfeng, CHANG S. Large graph construction for scalable semi-supervised learning[C]//Proceedings of the 27th International Conference on Machine Learning. Haifa, Israel:[s.n.], 2010:679-686.

[8] CAI Deng, CHEN Xinlei. Large scale spectral clustering via Landmark-based sparse representation[J]. IEEE Transactions on Cybernetics, 2015, 45(8):1669-1680.

[9] NIE Feiping, WANG Xiaoqian, HUANG Heng. Clustering and projected clustering with adaptive neighbors[C]//Proceedings of the 20th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. New York, USA:ACM, 2014:977-986.

[10] HUANG Jin, NIE Feiping, HUANG Heng. A new simplex sparse learning model to measure data similarity for clustering[C]//Proceedings of the 24th International Conference on Artificial Intelligence. Buenos, Argentina:AAAI Press, 2015:3569-3575.

[11] YAN Donghui, Huang Ling, JORDAN M I. Fast approximate spectral clustering[C]//Proceedings of the 15th ACM SIGKDD international conference on Knowledge discovery and data mining. New York, USA:ACM, 2009:907-916.

[12] ZHU Wei, NIE Feiping, LI Xuelong. Fast spectral clustering with efficient large graph construction[C]//IEEE International Conference on Acoustics, Speech and Signal Processing. New Orleans, USA:IEEE, 2017:2492-2496.

[13] NIE Feiping, WANG Xiaoqian, DENG Cheng, et al. Learning a structured optimal bipartite graph for coclustering[C]//Advances in Neural Information Processing Systems. Long Beach, USA:[s. n.], 2017:4129-4138.

[14] MHOAR B. The Laplacian spectrum of graphs[J]. Graph Theory, Combinatorics, and Applications, 1991, 2(871/898):12.

[15] FAN K. On a theorem of Weyl concerning eigenvalues of linear transformations I[J]. Proceedings of the National Academy of Sciences of the United States of America, 1949, 35(11):652-655.