Online Inverse Reinforcement Learning via Bellman Gradient Iteration

Li, Kun; Burdick, Joel W.

arXiv:1707.09393 (cs)

[Submitted on 28 Jul 2017]

Abstract: This paper develops an online inverse reinforcement learning algorithm aimed at efficiently recovering a reward function from ongoing observations of an agent's actions. To reduce computation time and storage space in reward estimation, the work assumes that each observed action implies a change in the Q-value distribution, and relates that change to the reward function through the gradient of the Q-value with respect to the reward function parameters. The gradients are computed with a novel Bellman Gradient Iteration method, which allows the reward function to be updated whenever a new observation becomes available. The method's convergence to a local optimum is proved. The method is tested in two simulated environments, and its performance is evaluated under both a linear and a non-linear reward function. The results show that the algorithm requires only limited computation time and storage space, while its accuracy increases as the number of observations grows. A potential application to home cleaning robots is also presented.
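
The abstract does not give the update equations, so the following is only a minimal sketch of the general idea, under stated assumptions: a tabular MDP with deterministic transitions, a log-sum-exp soft maximum standing in for the max in the Bellman backup (so that Q-values are differentiable in the reward parameters), a Boltzmann model of the observed agent's action choice, and a linear reward. The four-state chain, the constants GAMMA, K and BETA, and all function names are hypothetical and are not taken from the authors' code.

```python
import math

# Assumed constants: discount, soft-max sharpness, action-model temperature.
GAMMA, K, BETA = 0.9, 5.0, 2.0

# Hypothetical 4-state chain: action 0 moves left, action 1 moves right.
NEXT = [[0, 1], [0, 2], [1, 3], [2, 3]]
# One-hot state features, so theta[s] is the reward of state s (linear reward).
PHI = [[1.0 if i == s else 0.0 for i in range(4)] for s in range(4)]


def softmax(xs, scale):
    """Numerically stable softmax of scale * xs."""
    m = max(xs)
    es = [math.exp(scale * (x - m)) for x in xs]
    z = sum(es)
    return [e / z for e in es]


def bellman_gradient_iteration(theta, iters=200):
    """Jointly iterate Q and dQ/dtheta under a soft-max Bellman backup."""
    n_s, n_a, n_p = len(NEXT), len(NEXT[0]), len(theta)
    reward = [sum(p * t for p, t in zip(PHI[s], theta)) for s in range(n_s)]
    q = [[0.0] * n_a for _ in range(n_s)]
    dq = [[[0.0] * n_p for _ in range(n_a)] for _ in range(n_s)]
    for _ in range(iters):
        v, dv = [], []
        for s in range(n_s):
            m = max(q[s])
            # Soft value: V(s) = (1/K) * log sum_a exp(K * Q(s, a)).
            v.append(m + math.log(sum(math.exp(K * (x - m)) for x in q[s])) / K)
            # Its gradient is the softmax-weighted average of dQ(s, a)/dtheta.
            w = softmax(q[s], K)
            dv.append([sum(w[a] * dq[s][a][j] for a in range(n_a))
                       for j in range(n_p)])
        # Backup: Q(s, a) = r(s') + GAMMA * V(s'), and the same for gradients.
        q = [[reward[NEXT[s][a]] + GAMMA * v[NEXT[s][a]] for a in range(n_a)]
             for s in range(n_s)]
        dq = [[[PHI[NEXT[s][a]][j] + GAMMA * dv[NEXT[s][a]][j]
                for j in range(n_p)] for a in range(n_a)] for s in range(n_s)]
    return q, dq


def log_likelihood(theta, s, a):
    """log P(a | s) under the assumed Boltzmann action model."""
    q, _ = bellman_gradient_iteration(theta)
    m = max(q[s])
    return BETA * (q[s][a] - m) - math.log(
        sum(math.exp(BETA * (x - m)) for x in q[s]))


def online_update(theta, s, a, lr=0.1):
    """One gradient-ascent step on log P(a | s) for a single observation."""
    q, dq = bellman_gradient_iteration(theta)
    p = softmax(q[s], BETA)
    grad = [BETA * (dq[s][a][j] - sum(p[b] * dq[s][b][j] for b in range(len(p))))
            for j in range(len(theta))]
    return [t + lr * g for t, g in zip(theta, grad)], grad


# Usage: the observed agent keeps moving right; update after each observation.
theta = [0.0] * 4
for s, a in [(0, 1), (1, 1), (2, 1), (1, 1)]:
    theta, _ = online_update(theta, s, a)
print(theta)
```

Each call to `online_update` uses only the newest state-action pair, which is what makes the scheme online: no history of observations has to be stored. Recomputing the Q-values and their gradients from scratch at every step is a simplification for clarity; a practical version would likely warm-start the iteration from the previous result.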

Comments:	The code and video are available at this https URL. arXiv admin note: substantial text overlap with arXiv:1707.07767
Subjects:	Robotics (cs.RO)
Cite as:	arXiv:1707.09393 [cs.RO]
	(or arXiv:1707.09393v1 [cs.RO] for this version)
	https://doi.org/10.48550/arXiv.1707.09393

Submission history

From: Kun Li
[v1] Fri, 28 Jul 2017 19:51:38 UTC (1,120 KB)
