Stochastic dynamic production control by neurodynamic programming
Monostori, László and Csáji, Balázs Csanád (2006) Stochastic dynamic production control by neurodynamic programming. CIRP Annals - Manufacturing Technology, 55 (1). pp. 473-478. ISSN 0007-8506. DOI: 10.1016/S0007-8506(07)60462-4
Abstract
The paper proposes Markov Decision Processes (MDPs) to model production control systems that work in uncertain and changing environments. In an MDP, finding an optimal control policy can be traced back to computing the optimal value function, which is the unique solution of the Bellman equation. Reinforcement learning methods, such as Q-learning, can be used for estimating this function; however, the value estimations are often available only for a few states of the environment, typically generated by simulation. The paper suggests the application of a new type of support vector regression model, called ν-SVR, which can effectively fit a smooth function to the available data and has good generalization properties. The effectiveness of the approach is demonstrated by experimental results on both benchmark and industry-related data.
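The sketch below is a minimal illustration of the idea summarized in the abstract, not the paper's actual implementation: tabular Q-learning produces value estimates only for the states visited during simulation, and a ν-SVR regressor (scikit-learn's NuSVR is used here as a stand-in) is then fitted to those samples to obtain a smooth, generalizing value function. The toy one-dimensional MDP, reward structure, and all hyperparameters are assumptions chosen only for illustration.

```python
# Hypothetical example: Q-learning on a toy MDP, then nu-SVR fit of V(s) = max_a Q(s, a).
import numpy as np
from sklearn.svm import NuSVR

rng = np.random.default_rng(0)

# Toy MDP: states 0..N-1 on a line, actions 0 (left) and 1 (right).
# Reaching the right end gives reward 1; every other step costs 0.01.
N, GAMMA, ALPHA, EPS = 50, 0.95, 0.1, 0.2
Q = np.zeros((N, 2))  # tabular Q-values, updated only for visited states

def step(s, a):
    s_next = int(np.clip(s + (1 if a == 1 else -1), 0, N - 1))
    reward = 1.0 if s_next == N - 1 else -0.01
    return s_next, reward, s_next == N - 1

for episode in range(200):
    s = int(rng.integers(0, N // 2))  # start somewhere in the left half
    for _ in range(200):
        a = int(rng.integers(0, 2)) if rng.random() < EPS else int(Q[s].argmax())
        s_next, r, done = step(s, a)
        # Q-learning update toward the Bellman optimality target.
        Q[s, a] += ALPHA * (r + GAMMA * (0.0 if done else Q[s_next].max()) - Q[s, a])
        s = s_next
        if done:
            break

# Value estimates exist only for visited states; fit nu-SVR to those samples
# so the value function generalizes smoothly to the remaining states.
visited = np.flatnonzero(Q.max(axis=1) != 0.0)
X = visited.reshape(-1, 1).astype(float)
y = Q[visited].max(axis=1)
svr = NuSVR(nu=0.5, C=10.0, kernel="rbf", gamma="scale").fit(X, y)

all_states = np.arange(N, dtype=float).reshape(-1, 1)
V_smooth = svr.predict(all_states)  # smoothed value estimate for every state
print(V_smooth[:5], V_smooth[-5:])
```

In this sketch the ν parameter bounds the fraction of support vectors and of margin errors, which is the property that makes ν-SVR convenient for fitting a smooth surface to noisy, simulation-generated value samples.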
Item Type: | Article |
---|---|
Subjects: | Q Science > QA Mathematics and Computer Science > QA75 Electronic computers. Computer science / computing, computer science |
Divisions: | Research Laboratory on Engineering & Management Intelligence |
SWORD Depositor: | MTMT Injector |
Depositing User: | MTMT Injector |
Date Deposited: | 19 Jan 2022 07:28 |
Last Modified: | 19 Jan 2022 07:28 |
URI: | https://eprints.sztaki.hu/id/eprint/10278 |