Journal of Computer Research and Development (计算机研究与发展), 2009, No. 12, pp. 2101-2110
Abstract:
In wireless sensor networks, loss of sensor data is inevitable because of the networks' inherent characteristics, and it causes difficulties in many applications. The most effective remedy is to estimate the missing data as accurately as possible. This paper proposes a missing-value imputation algorithm based on multiple regression models. The algorithm first uses a multiple linear regression model to estimate the missing data along both the temporal dimension and the spatial dimension. It then assigns weight coefficients to the two estimates according to their goodness of fit and takes their weighted average as the final estimate. Because the algorithm estimates the missing data jointly from the data of multiple neighbor nodes rather than from each neighbor independently, its estimation performance is more stable and reliable. Experimental results on two real-world datasets show that the proposed algorithm estimates the missing data accurately.
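The two-stage idea described in the abstract can be sketched roughly as follows. This is only an illustrative sketch, not the paper's algorithm: the abstract does not say how the temporal predictors are chosen or how the weights are derived from the goodness of fit, so the code assumes lagged readings of the same node as temporal predictors and weights proportional to each model's R². All function and parameter names (`fit`, `impute`, `lags`, and so on) are hypothetical.

```python
def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting.
    Assumes the system is non-singular (no collinear predictors)."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(m[r][c]))
        m[c], m[p] = m[p], m[c]
        for r in range(c + 1, n):
            f = m[r][c] / m[c][c]
            for k in range(c, n + 1):
                m[r][k] -= f * m[c][k]
    x = [0.0] * n
    for r in reversed(range(n)):
        tail = sum(m[r][k] * x[k] for k in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def fit(rows, y):
    """Least-squares fit of y on the predictors in rows (with intercept).
    Returns (coefficients, R^2); R^2 is clipped below at 0."""
    X = [[1.0] + list(r) for r in rows]
    k = len(X[0])
    xtx = [[sum(r[i] * r[j] for r in X) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(X, y)) for i in range(k)]
    beta = solve(xtx, xty)
    pred = [sum(b * v for b, v in zip(beta, r)) for r in X]
    mean = sum(y) / len(y)
    ss_res = sum((yi - pi) ** 2 for yi, pi in zip(y, pred))
    ss_tot = sum((yi - mean) ** 2 for yi in y)
    r2 = 1.0 - ss_res / ss_tot if ss_tot > 0 else 0.0
    return beta, max(r2, 0.0)


def predict(beta, x):
    """Evaluate the fitted linear model at predictor vector x."""
    return beta[0] + sum(b * v for b, v in zip(beta[1:], x))


def impute(history, neighbor_history, neighbors_now, lags=2):
    """Estimate the missing current reading of one node.

    history          -- past readings of the node itself
    neighbor_history -- one row of neighbor readings per past time step
    neighbors_now    -- neighbor readings at the missing time step
    lags             -- number of past readings used as temporal predictors
                        (an assumption of this sketch)
    """
    # Temporal estimate: regress each reading on its `lags` predecessors.
    t_rows = [history[i - lags:i] for i in range(lags, len(history))]
    beta_t, r2_t = fit(t_rows, history[lags:])
    est_t = predict(beta_t, history[-lags:])

    # Spatial estimate: regress the node's readings on its neighbors' readings.
    beta_s, r2_s = fit(neighbor_history, history)
    est_s = predict(beta_s, neighbors_now)

    # Assumed weighting: each estimate is weighted by its share of total R^2.
    total = r2_t + r2_s
    w_t = r2_t / total if total > 0 else 0.5
    return w_t * est_t + (1.0 - w_t) * est_s
```

Fitting the spatial model on all neighbors at once, rather than averaging separate per-neighbor predictions, mirrors the abstract's point about using neighbor data jointly. A production version would need to handle collinear predictors and missing values in the neighbors' data, which this sketch does not.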
ReleaseDate: 2014-07-21 15:00:30