Entropic Risk Measure in Policy Search

Nass, David; Belousov, Boris; Peters, Jan


arXiv:1906.09090v1 (cs)

[Submitted on 21 Jun 2019 (this version), latest version 31 Aug 2019 (v2)]


Abstract: With the increasing pace of automation, modern robotic systems need to act in stochastic, non-stationary, partially observable environments. A range of algorithms for finding parameterized policies that optimize for long-term average performance have been proposed in the past. However, the majority of the proposed approaches do not explicitly take into account the variability of the performance metric, which may lead to finding policies that, although performing well on average, can perform spectacularly badly in a particular run or over a period of time. To address this shortcoming, we study an approach to policy optimization that explicitly takes into account higher-order statistics of the reward function. In this paper, we extend policy gradient methods to include the entropic risk measure in the objective function and evaluate their performance in simulation experiments and on a real-robot task of learning a hitting motion in robot badminton.
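The abstract does not state the formula for the entropic risk measure, so the following is only a minimal sketch under a common textbook convention, J(beta) = (1/beta) * log E[exp(beta * R)], which for small beta behaves like E[R] + (beta/2) * Var[R]. The sign convention, the function name `entropic_risk`, the parameter `beta`, and the sample returns are all assumptions made for illustration and are not taken from the paper. The sketch shows how two sets of returns with the same average can receive different scores once variability is penalized (beta < 0).

```python
import numpy as np

def entropic_risk(returns, beta):
    """Sample estimate of (1/beta) * log E[exp(beta * R)].

    Assumed convention (not from the paper): beta < 0 penalizes
    variability of the return, beta -> 0 recovers the plain mean.
    """
    returns = np.asarray(returns, dtype=float)
    if beta == 0.0:
        return returns.mean()
    z = beta * returns
    m = z.max()  # shift by the max for a numerically stable log-mean-exp
    return (m + np.log(np.mean(np.exp(z - m)))) / beta

# Two hypothetical policies with the same average return of 1.0.
steady = np.array([1.0, 1.0, 1.0, 1.0])
erratic = np.array([4.0, 4.0, 4.0, -8.0])

for name, r in [("steady", steady), ("erratic", erratic)]:
    print(name, "mean:", r.mean(), "entropic risk:", entropic_risk(r, beta=-1.0))
```

Under this convention the steady returns keep their score, while the erratic returns are scored well below their mean because of the single very bad run, which is the kind of behavior the abstract says average-performance objectives fail to capture.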

Comments:	Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS)
Subjects:	Machine Learning (cs.LG); Robotics (cs.RO)
Cite as:	arXiv:1906.09090 [cs.LG]
	(or arXiv:1906.09090v1 [cs.LG] for this version)
	https://doi.org/10.48550/arXiv.1906.09090

Submission history

From: Boris Belousov
[v1] Fri, 21 Jun 2019 12:38:05 UTC (1,576 KB)
[v2] Sat, 31 Aug 2019 05:49:37 UTC (1,575 KB)
