Regularized fitted Q-iteration: application to planning

Farahmand, Amir massoud and Ghavamzadeh, Mohammad and Szepesvári, Csaba and Mannor, Shie (2008) Regularized fitted Q-iteration: application to planning. In: EWRL 2008. 8th European workshop on recent advances in reinforcement learning. Villeneuve d'Ascq, 2008. (Lecture notes in computer science 5323.).

Official URL: http://www.sztaki.hu/~szcsaba/papers/RegFQI-Plan-E...

Abstract

We consider planning in a Markovian decision problem, i.e., the problem of finding a good policy given access to a generative model of the environment. We propose to use fitted Q-iteration with penalized (or regularized) least-squares regression as the regression subroutine to address the problem of controlling model complexity. The algorithm is presented in detail for the case when the function space is a reproducing kernel Hilbert space underlying a user-chosen kernel function. We derive bounds on the quality of the solution and argue that data-dependent penalties can lead to almost optimal performance. A simple example is used to illustrate the benefits of using a penalized procedure.
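
The following is a minimal sketch of the idea described in the abstract, not the authors' implementation: fitted Q-iteration in which each regression step is a penalized least-squares fit in the RKHS of a Gaussian kernel (kernel ridge regression). The toy generative model, the kernel bandwidth, the fixed penalty `lam`, and all function names are assumptions chosen for illustration. In particular, the penalty is a fixed constant here, whereas the abstract argues for data-dependent penalties.

```python
import numpy as np

def rbf_kernel(A, B, bandwidth=0.5):
    """Gaussian kernel matrix between the rows of A and the rows of B."""
    sq = ((A[:, None, :] - B[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-sq / (2.0 * bandwidth ** 2))

def krr_fit(X, y, lam, bandwidth=0.5):
    """Penalized least squares in the RKHS: solve (K + n*lam*I) alpha = y."""
    n = X.shape[0]
    K = rbf_kernel(X, X, bandwidth)
    return np.linalg.solve(K + n * lam * np.eye(n), y)

def krr_predict(X_train, alpha, X_new, bandwidth=0.5):
    """Evaluate the fitted function sum_i alpha_i k(., x_i) at X_new."""
    return rbf_kernel(X_new, X_train, bandwidth) @ alpha

def toy_model(s, a, rng):
    """Hypothetical generative model: state in [-1, 1], actions in {-1, +1}."""
    noise = 0.05 * rng.standard_normal(s.shape)
    s_next = np.clip(s + 0.2 * a + noise, -1.0, 1.0)
    reward = -s_next ** 2  # reward is highest near the origin
    return reward, s_next

def regularized_fqi(n_samples=200, n_iters=30, gamma=0.9, lam=1e-2, seed=0):
    rng = np.random.default_rng(seed)
    actions = np.array([-1.0, 1.0])

    # Draw one batch of transitions from the generative model.
    s = rng.uniform(-1.0, 1.0, n_samples)
    a = rng.choice(actions, n_samples)
    r, s_next = toy_model(s, a, rng)
    X = np.column_stack([s, a])

    def q_values(alpha, states):
        """Q estimates for every action (rows) at the given states (columns)."""
        return np.stack([
            krr_predict(X, alpha, np.column_stack([states, np.full(states.shape, b)]))
            for b in actions
        ])

    alpha = np.zeros(n_samples)  # start from Q_0 = 0
    for _ in range(n_iters):
        # Regression targets: r + gamma * max_b Q_k(s', b).
        targets = r + gamma * q_values(alpha, s_next).max(axis=0)
        # Regularized regression step produces Q_{k+1}.
        alpha = krr_fit(X, targets, lam)

    def greedy(states):
        states = np.atleast_1d(np.asarray(states, dtype=float))
        return actions[np.argmax(q_values(alpha, states), axis=0)]

    return greedy

policy = regularized_fqi()
print(policy([0.7, -0.7]))
```

In this toy problem the greedy policy should push the state toward the origin. Varying `lam` trades off fit against smoothness of the Q-estimate, which is the model-complexity control the abstract refers to.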

Item Type:	Conference or Workshop Item (Paper)
Subjects:	Q Science > QA Mathematics and Computer Science > QA75 Electronic computers. Computer science
Depositing User:	Eszter Nagy
Date Deposited:	11 Dec 2012 15:32
Last Modified:	11 Dec 2012 15:32
URI:	https://eprints.sztaki.hu/id/eprint/5599
