DOI: 10.3724/SP.J.1047.2013.00854

Journal of Geo-information Science (地球信息科学学报) 2013/15:6 PP.854-861

Fuzzy C-means Clustering for GIS Data Based on Spatial Weighted Distance

Fuzzy C-means (FCM) clustering usually measures similarity with the ordinary Euclidean distance, but in the distance formula different attribute features should carry different weights according to their importance. Moreover, clustering of geospatial objects should consider not only the similarity of attribute features but also the spatial proximity of the objects. Building on the ordinary Euclidean distance, this paper proposes several forms of spatial weighted distance. Each form assigns its own weights to the two coordinate directions and to each attribute feature. The weight vector measures the effect sizes of spatial location features and attribute features in similarity-based clustering, and also the degree of isotropy or anisotropy along the X and Y coordinate directions. A fuzzy evaluation function derived from the similarity matrix of the spatial objects serves as the optimization objective, and the weight vector is learned by a gradient-descent algorithm with a dynamic learning rate. The spatial weighted distance then replaces the ordinary Euclidean distance in fuzzy C-means clustering. The Meuse dataset, a spatial dataset, serves as the application example: it is analyzed by FCM clustering with the number of clusters set from 2 to 10. The clustering results are evaluated and compared using cluster validity indices, including the partition coefficient (PC), partition entropy (PE) and the Xie-Beni index. The analysis indicates that clustering based on the spatial weighted distance performs better than clustering based on the ordinary Euclidean distance or the spatial common distance. Furthermore, the spatial distribution of the clustering results shows that, besides attribute features, spatial features such as location also play an important role in spatial data clustering.
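To make the idea of replacing the Euclidean distance in FCM concrete, the following is a minimal Python sketch and not the paper's implementation. It assumes one plausible form of weighted distance, d(p, q) = sqrt(sum_k w_k (p_k - q_k)^2), where the first two components of each point are the X and Y coordinates and the rest are attribute features. The weight vector `w` is supplied by hand here, whereas the paper learns it by gradient descent. The function names (`weighted_distance`, `fcm`, `partition_coefficient`, `partition_entropy`) and the toy data are hypothetical and chosen only for illustration.

```python
import math
import random


def weighted_distance(p, q, w):
    """Weighted Euclidean distance sqrt(sum_k w_k * (p_k - q_k)^2) (assumed form)."""
    return math.sqrt(sum(wk * (a - b) ** 2 for a, b, wk in zip(p, q, w)))


def fcm(points, w, c=2, m=2.0, iters=100, seed=0):
    """Standard fuzzy C-means iterations using the weighted distance above."""
    rng = random.Random(seed)
    n, d = len(points), len(points[0])
    # Random initial memberships, each row normalised to sum to 1.
    u = []
    for _ in range(n):
        row = [rng.random() + 1e-9 for _ in range(c)]
        total = sum(row)
        u.append([v / total for v in row])
    centers = []
    for _ in range(iters):
        # Cluster centres: means of the points weighted by membership^m.
        centers = []
        for j in range(c):
            um = [u[i][j] ** m for i in range(n)]
            total = sum(um)
            centers.append(
                [sum(um[i] * points[i][k] for i in range(n)) / total for k in range(d)]
            )
        # Membership update: u_ij proportional to d_ij^(-2/(m-1)).
        for i in range(n):
            dist = [max(weighted_distance(points[i], centers[j], w), 1e-12)
                    for j in range(c)]
            inv = [dj ** (-2.0 / (m - 1.0)) for dj in dist]
            total = sum(inv)
            u[i] = [v / total for v in inv]
    return centers, u


def partition_coefficient(u):
    """PC: mean of squared memberships; closer to 1 means a crisper partition."""
    return sum(v * v for row in u for v in row) / len(u)


def partition_entropy(u):
    """PE: mean membership entropy; closer to 0 means a crisper partition."""
    return -sum(v * math.log(v) for row in u for v in row if v > 0) / len(u)


# Toy data (made up): each point is (x, y, attribute).
points = [(0.0, 0.0, 1.0), (0.1, 0.0, 1.1), (0.0, 0.1, 0.9),
          (5.0, 5.0, 3.0), (5.1, 5.0, 3.1), (5.0, 5.1, 2.9)]
w = [1.0, 1.0, 0.5]  # hand-picked weights for X, Y and the attribute
centers, u = fcm(points, w, c=2)
labels = [max(range(2), key=lambda j: row[j]) for row in u]
pc, pe = partition_coefficient(u), partition_entropy(u)
```

Setting unequal weights for the first two components would model anisotropy between the X and Y directions, while setting all weights to 1 recovers the ordinary Euclidean distance. Because the weights here are positive and fixed, the centre update keeps its usual membership-weighted mean form.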

Key words: spatial weighted distance, GIS data, fuzzy C-means clustering, gradient-descent learning algorithm

ReleaseDate:2015-04-17 13:34:29

