DOI: 10.3724/SP.J.1146.2012.01670

Journal of Electronics & Information Technology (电子与信息学报) 2013/35:12 PP.2901-2907

An Algorithm for Belief States Space Compression Using Non-negative Matrix Factorization Updating Rules in Partially Observable Markov Decision Processes

For the curse of dimensionality encountered in solving the planning in Partially Observable Markov Decision Processes (POMDP), this paper presents a novel approach to compress belief states space using Non- negative Matrix Factorization (NMF) updating rules, which reduces high dimensional belief states space by two steps. First, the algorithm adopts factored representations of states, observations and actions by exploiting the structure of factored POMDP, then decomposes and compresses transition functions by exploiting conditional independence of dynamic Bayesian network, and then removes the zero probability to lower the sparsity of belief states space. Second, it adopts value-directed compression approach to make the obtained approximate belief states after dimension reduction be consistent with the original optimal, and exploits NMF updating rules instead of Krylov iterations to accelerate the dimension reduction. The proposed algorithm not only guarantees the value function and reward function of the belief states unchanged after reducing dimensions, but also keeps the piecewise linear and convex property to compute the optimal policy by using dynamic programming. Experiments demonstrate that the proposed belief compression algorithm has lower error rates and higher convergence.

Key words:Information Processing,Partially Observable Markov Decision Processes (POMDP),Belief states space,Non-negative Matrix Factorization (NMF),Value-directed compression,Curse of dimensionality

ReleaseDate:2014-07-21 17:04:31