Improvements on Hindsight Learning

Deshpande, Ameet; Sarma, Srikanth; Jha, Ashutosh; Ravindran, Balaraman

arXiv:1809.06719 (cs)

[Submitted on 16 Sep 2018 (v1), last revised 4 Nov 2018 (this version, v2)]

Abstract: Sparse-reward problems are among the biggest challenges in Reinforcement Learning. Goal-directed tasks are one such class of problems: a reward signal is received only when the goal is reached. One promising way to train an agent on goal-directed tasks is to use Hindsight Learning approaches. In these approaches, even when an agent fails to reach the desired goal, it learns to reach the goal it actually achieved instead. By doing this over multiple trajectories while generalizing the policy learned from the achieved goals, the agent learns a goal-conditioned policy that can reach any goal. One such approach is Hindsight Experience Replay, which uses an off-policy Reinforcement Learning algorithm to learn a goal-conditioned policy. In this approach, past transitions are replayed uniformly at random. Another approach is to use a hindsight version of policy gradients to learn a policy directly. In this work, we discuss different ways to replay past transitions to improve learning in Hindsight Experience Replay, focusing on prioritized variants in particular. We also apply Hindsight Policy Gradient methods to robotic tasks.
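
The relabeling idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the grid-style states, the 0/-1 reward convention, the "final achieved state" relabeling choice, and every name below (`Transition`, `sparse_reward`, `relabel_with_final_goal`, `sample_uniform`) are assumptions made purely for illustration. It shows a failed episode being stored a second time with the goal swapped for the state the agent actually reached, followed by uniform sampling from the buffer.

```python
import random
from typing import List, NamedTuple, Tuple

State = Tuple[int, int]  # hypothetical 2-D grid position


class Transition(NamedTuple):
    state: State
    action: int
    next_state: State
    goal: State
    reward: float


def sparse_reward(achieved: State, goal: State) -> float:
    """Assumed convention: 0.0 when the goal is reached, -1.0 otherwise."""
    return 0.0 if achieved == goal else -1.0


def relabel_with_final_goal(trajectory: List[Transition]) -> List[Transition]:
    """Copy a trajectory, replacing the desired goal with the final achieved state."""
    achieved_goal = trajectory[-1].next_state
    return [
        t._replace(goal=achieved_goal,
                   reward=sparse_reward(t.next_state, achieved_goal))
        for t in trajectory
    ]


def sample_uniform(buffer: List[Transition], batch_size: int,
                   rng: random.Random) -> List[Transition]:
    """Draw transitions uniformly at random, with replacement."""
    return [rng.choice(buffer) for _ in range(batch_size)]


# A failed episode: the agent wanted (5, 5) but ended at (1, 2).
desired_goal: State = (5, 5)
visited: List[State] = [(0, 0), (0, 1), (1, 1), (1, 2)]
trajectory = [
    Transition(s, 0, s_next, desired_goal, sparse_reward(s_next, desired_goal))
    for s, s_next in zip(visited, visited[1:])
]

# Store both the original and the hindsight-relabeled transitions.
relabeled = relabel_with_final_goal(trajectory)
replay_buffer = trajectory + relabeled
batch = sample_uniform(replay_buffer, batch_size=4, rng=random.Random(0))
```

In this sketch every original transition carries a reward of -1.0, while the last relabeled transition carries 0.0, so the buffer contains at least one successful example even though the episode failed. A prioritized variant would replace `sample_uniform` with a non-uniform sampling rule; the abstract does not specify which rule, so none is assumed here.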

Subjects: Machine Learning (cs.LG); Machine Learning (stat.ML)
Cite as: arXiv:1809.06719 [cs.LG] (or arXiv:1809.06719v2 [cs.LG] for this version)
DOI: https://doi.org/10.48550/arXiv.1809.06719

Submission history

From: Ameet Deshpande
[v1] Sun, 16 Sep 2018 17:07:33 UTC (1,526 KB)
[v2] Sun, 4 Nov 2018 19:40:31 UTC (1,526 KB)