RNN and LSTM: YANG Jiancheng


RNN and LSTM

(Oct 12, 2016)

YANG Jiancheng
Outline

• I. Vanilla RNN
• II. LSTM
• III. GRU and Other Structures
• I. Vanilla RNN
GREAT Intro: Understanding LSTM Networks

"In theory, RNNs are absolutely capable of handling such “long-term dependencies.” A human could carefully pick parameters for them to solve toy problems of this form. Sadly, in practice, RNNs don’t seem to be able to learn them."
• I. Vanilla RNN

WILDML has a series of articles introducing RNNs (4 articles, 2 GitHub repos).
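To make the recurrence those articles cover concrete, here is a minimal sketch of a vanilla RNN forward pass in NumPy; the names (W_xh, W_hh, b_h) and toy shapes are illustrative assumptions, not taken from the slides.

import numpy as np

def rnn_forward(xs, W_xh, W_hh, b_h):
    # Vanilla RNN: h_t = tanh(W_xh @ x_t + W_hh @ h_{t-1} + b_h)
    h = np.zeros(W_hh.shape[0])                  # initial hidden state h_0
    hs = []
    for x in xs:
        h = np.tanh(W_xh @ x + W_hh @ h + b_h)   # one recurrent step
        hs.append(h)
    return hs

# Toy usage: hidden size 4, input size 3, sequence length 5.
rng = np.random.default_rng(0)
W_xh = rng.normal(size=(4, 3))
W_hh = rng.normal(size=(4, 4))
b_h = np.zeros(4)
xs = [rng.normal(size=3) for _ in range(5)]
print(rnn_forward(xs, W_xh, W_hh, b_h)[-1])      # final hidden state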


• I. Vanilla RNN
• Back Prop Through Time (BPTT)
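As a reminder of what BPTT actually computes (the notation roughly follows [2]; treat the exact symbols as my assumption), the gradient of the loss at step t with respect to the shared recurrent weights sums contributions from every earlier step, chained through the hidden states:

\frac{\partial E_t}{\partial W} = \sum_{k=0}^{t} \frac{\partial E_t}{\partial h_t} \left( \prod_{j=k+1}^{t} \frac{\partial h_j}{\partial h_{j-1}} \right) \frac{\partial h_k}{\partial W},
\qquad
\frac{\partial h_j}{\partial h_{j-1}} = \mathrm{diag}\!\left(1 - h_j^{2}\right) W_{hh}
\quad \text{for } h_j = \tanh(W_{hh} h_{j-1} + W_{xh} x_j + b).

The long product of Jacobians is exactly the term behind the vanishing-gradient discussion on the next slide: each tanh-derivative factor lies in (0, 1], so the product tends to shrink unless W_{hh} compensates.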
• I. Vanilla RNN
• Gradient Vanishing Problem
Unrolled through time, RNNs become very deep networks

(Figure: tanh and its derivative. Source: http://nn.readthedocs.org/en/rtd/transfer/)
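A tiny numerical sketch of the effect (the matrices and sizes here are made up purely for illustration): pushing a gradient backwards through many tanh steps multiplies it repeatedly by diag(1 - h^2) W_hh, and with saturating activations and modest recurrent weights its norm decays quickly.

import numpy as np

rng = np.random.default_rng(0)
hidden = 20
W_hh = rng.normal(scale=0.1, size=(hidden, hidden))   # smallish recurrent weights
grad = rng.normal(size=hidden)                        # gradient arriving at the last step

for step in range(1, 21):
    h = np.tanh(rng.normal(size=hidden))              # stand-in hidden activation
    jac = np.diag(1.0 - h**2) @ W_hh                  # d h_j / d h_{j-1} for a tanh RNN
    grad = jac.T @ grad                               # push the gradient one step back
    if step % 5 == 0:
        print(step, np.linalg.norm(grad))             # the norm shrinks roughly geometrically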


• II. LSTM
• Differences of LSTM and Vanilla RNN
• II. LSTM
• Core Idea Behind LSTMs

• Cell state
• Gates
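The core idea in one line (using the notation of [1], with ⊙ for elementwise multiplication): the cell state is modified only by two gated, elementwise operations, which is what lets information and gradients flow along it largely unchanged:

C_t = f_t \odot C_{t-1} + i_t \odot \tilde{C}_t, \qquad f_t, i_t \in (0, 1),

where f_t (forget gate) and i_t (input gate) are sigmoid layers and \tilde{C}_t is the candidate update.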


• II. LSTM
• Step-by-Step Walk Through

(Figures: each gate is a sigmoid layer, so its outputs lie in the range 0~1.)
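As a companion to the walk-through, here is a minimal single LSTM step in NumPy following the equations of [1]; the weight names and the use of a concatenated [h, x] input are illustrative assumptions.

import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x, h_prev, c_prev, W_f, W_i, W_c, W_o, b_f, b_i, b_c, b_o):
    # One LSTM step; each W_* acts on the concatenation [h_prev, x].
    hx = np.concatenate([h_prev, x])
    f = sigmoid(W_f @ hx + b_f)          # forget gate, values in 0~1
    i = sigmoid(W_i @ hx + b_i)          # input gate, values in 0~1
    c_tilde = np.tanh(W_c @ hx + b_c)    # candidate cell values, in -1~1
    c = f * c_prev + i * c_tilde         # new cell state (elementwise)
    o = sigmoid(W_o @ hx + b_o)          # output gate, values in 0~1
    h = o * np.tanh(c)                   # new hidden state
    return h, c

# Toy shapes: with hidden size 4 and input size 3, each W_* is (4, 7) and each b_* is (4,).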
• III. GRU and Other Structures
• Gated Recurrent Unit (GRU)

• Combines the forget and input gates into a single “update gate.”
• Merges the cell state and hidden state
• Other changes (the standard GRU equations are sketched after this list)
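For reference, the standard GRU equations (in the notation of [1], which omits biases; z_t is the update gate and r_t the reset gate):

z_t = \sigma(W_z\,[h_{t-1}, x_t])
r_t = \sigma(W_r\,[h_{t-1}, x_t])
\tilde{h}_t = \tanh(W\,[r_t \odot h_{t-1}, x_t])
h_t = (1 - z_t) \odot h_{t-1} + z_t \odot \tilde{h}_t

The single update gate z_t takes over the roles of both the forget and input gates, and there is no separate cell state: the hidden state h_t carries everything.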
• III. GRU and Other Structures
• Variants on Long Short Term Memory

Greff, et al. (2015) do a nice comparison of popular variants, finding that they’re all about the same.
Bibliography

• [1] Understanding LSTM Networks


• [2] Back Propagation Through Time and Vanishing Gradients
Thanks for listening!
