A kernel-based approximate dynamic programming approach: Theory and application

Baccari S.
2024

Abstract

This article proposes a novel kernel-based Dynamic Programming (DP) approximation method to tackle the curse of dimensionality that typically affects stochastic DP problems over a finite time horizon. The method combines kernel functions with Support Vector Machine (SVM) regression to determine an approximate cost function over the entire state space of the underlying Markov Decision Process (MDP), leveraging cost function values computed at selected representative states. The kernel functions define the so-called kernel matrix, while the parameter vector of the kernel-based cost function approximation is computed by moving backwards in time from the terminal condition and applying SVM regression. In this way, the difficulty of selecting a proper set of features is also addressed. The proposed method is then extended to the infinite time horizon case. To show the effectiveness of the approach, the resulting Recursive Residual Approximate Dynamic Programming (RR-ADP) algorithm is applied to sensor scheduling design in multi-process remote state estimation problems.
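The backward recursion described in the abstract can be illustrated with a minimal sketch. It is not the RR-ADP algorithm of the article: the toy one-dimensional MDP, the Gaussian kernel, and every name below (`backward_kernel_dp`, `fit_kernel_weights`, `REPS`, and so on) are assumptions made for illustration. To keep the example dependency-free, kernel ridge regression stands in for SVM regression. What it shares with the method is the overall structure: a kernel matrix built on representative states, and a parameter vector refitted at each stage while moving backwards from the terminal condition.

```python
import math

# Hypothetical toy MDP (not from the article): a walker on states 0..20
# that should reach GOAL; actions shift the state by -1, 0 or +1.
N_STATES, GOAL, HORIZON = 21, 10, 5
ACTIONS = (-1, 0, 1)
REPS = list(range(0, N_STATES, 2))  # representative states (assumed choice)

def gaussian_kernel(x, y, width=1.5):
    return math.exp(-((x - y) ** 2) / (2.0 * width ** 2))

def solve_linear(A, b):
    """Solve A z = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    z = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(M[i][j] * z[j] for j in range(i + 1, n))
        z[i] = (M[i][n] - tail) / M[i][i]
    return z

def fit_kernel_weights(reps, targets, ridge=1e-6):
    """Kernel ridge fit (a stand-in for SVM regression): (K + ridge*I) w = targets."""
    K = [[gaussian_kernel(a, b) for b in reps] for a in reps]  # kernel matrix
    for i in range(len(reps)):
        K[i][i] += ridge
    return solve_linear(K, targets)

def approx_cost(x, reps, weights):
    """Kernel expansion of the cost function, defined for any state x."""
    return sum(w * gaussian_kernel(x, r) for w, r in zip(weights, reps))

def stage_cost(x, u):
    return 0.1 * (x - GOAL) ** 2 + 0.5 * abs(u)

def transitions(x, u):
    """(probability, next_state) pairs: the move succeeds with probability 0.8."""
    moved = min(max(x + u, 0), N_STATES - 1)
    return [(0.8, moved), (0.2, x)]

def backward_kernel_dp(reps):
    """Return one weight vector per stage, index 0 = first stage, last = terminal."""
    # Terminal condition: fit the terminal cost at the representative states.
    weights = fit_kernel_weights(reps, [float((x - GOAL) ** 2) for x in reps])
    history = [weights]
    for _ in range(HORIZON):
        targets = []
        for x in reps:
            # Bellman backup using the approximate cost of the next stage.
            q_values = [
                stage_cost(x, u)
                + sum(p * approx_cost(nx, reps, weights) for p, nx in transitions(x, u))
                for u in ACTIONS
            ]
            targets.append(min(q_values))
        weights = fit_kernel_weights(reps, targets)
        history.append(weights)
    history.reverse()
    return history

history = backward_kernel_dp(REPS)
print("approximate cost at stage 0, state 3:", approx_cost(3, REPS, history[0]))
```

Only the representative states enter the Bellman backup, yet `approx_cost` can be evaluated at any state, including state 3, which is not in `REPS`. Replacing `fit_kernel_weights` with an SVM regression solver would bring the sketch closer to the approach summarized above.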

Use this identifier to cite or link to this document: https://hdl.handle.net/11591/545259