Gradient Estimation Using Stochastic Computation Graphs

Schulman, John; Heess, Nicolas; Weber, Theophane; Abbeel, Pieter

arXiv:1506.05254 (cs)

[Submitted on 17 Jun 2015 (v1), last revised 5 Jan 2016 (this version, v3)]

Abstract: In a variety of problems originating in supervised, unsupervised, and reinforcement learning, the loss function is defined by an expectation over a collection of random variables, which might be part of a probabilistic model or the external world. Estimating the gradient of this loss function, using samples, lies at the core of gradient-based learning algorithms for these problems. We introduce the formalism of stochastic computation graphs---directed acyclic graphs that include both deterministic functions and conditional probability distributions---and describe how to easily and automatically derive an unbiased estimator of the loss function's gradient. The resulting algorithm for computing the gradient estimator is a simple modification of the standard backpropagation algorithm. The generic scheme we propose unifies estimators derived in a variety of prior work, along with variance-reduction techniques therein. It could assist researchers in developing intricate models involving a combination of stochastic and deterministic operations, enabling, for example, attention, memory, and control actions.
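To make the idea of a sample-based, unbiased gradient estimator concrete, here is a minimal sketch, not taken from the paper, on a toy problem chosen purely for illustration: the loss is E[x^2] with x ~ N(theta, 1), whose exact gradient with respect to theta is 2 * theta. It compares two standard estimators of the kind such a framework covers, the score-function (likelihood-ratio) estimator and the pathwise (reparameterization) estimator. The function names, the toy loss, and the optional `baseline` argument are all assumptions of this example.

```python
import random


def score_function_grad(theta, n, rng, baseline=0.0):
    """Score-function estimate of d/dtheta E[x^2] for x ~ N(theta, 1).

    Uses the identity grad E[f(x)] = E[f(x) * d/dtheta log p(x; theta)].
    Subtracting a constant baseline keeps the estimate unbiased because
    E[d/dtheta log p(x; theta)] = 0.
    """
    total = 0.0
    for _ in range(n):
        x = rng.gauss(theta, 1.0)
        score = x - theta  # d/dtheta log N(x; theta, 1)
        total += (x * x - baseline) * score
    return total / n


def pathwise_grad(theta, n, rng):
    """Pathwise estimate: write x = theta + eps and differentiate through x."""
    total = 0.0
    for _ in range(n):
        eps = rng.gauss(0.0, 1.0)
        x = theta + eps
        total += 2.0 * x  # d/dtheta (theta + eps)^2
    return total / n


rng = random.Random(0)  # fixed seed so the run is repeatable
theta = 1.5
exact = 2.0 * theta
sf = score_function_grad(theta, 200_000, rng)
pd = pathwise_grad(theta, 200_000, rng)
print(f"exact={exact:.3f}  score-function={sf:.3f}  pathwise={pd:.3f}")
```

Both estimates should land close to the exact value, with the pathwise estimate typically less noisy on this toy problem. The pathwise form needs the sampled variable to be a differentiable function of theta, while the score-function form only needs the log-density to be differentiable in theta.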

Comments:	Advances in Neural Information Processing Systems 28 (NIPS 2015)
Subjects:	Machine Learning (cs.LG)
Cite as:	arXiv:1506.05254 [cs.LG]
	(or arXiv:1506.05254v3 [cs.LG] for this version)
	https://doi.org/10.48550/arXiv.1506.05254

Submission history

From: John Schulman
[v1] Wed, 17 Jun 2015 09:32:31 UTC (277 KB)
[v2] Fri, 13 Nov 2015 03:19:18 UTC (277 KB)
[v3] Tue, 5 Jan 2016 19:56:22 UTC (277 KB)
