Long short-term memory

Primarily used for time-series or sequential data<br>
Previously the state of the art for NLP tasks, but it has since been surpassed by the [[Transformer (machine learning model)]]
See this video for an explanation:<br>
[https://www.youtube.com/watch?v=XymI5lluJeU https://www.youtube.com/watch?v=XymI5lluJeU]
==Architecture==
[[File:The_LSTM_cell.png | thumb | 400px | LSTM picture from Wikipedia]]
The LSTM architecture has two memory components:
* A long term memory <math>c</math>, also called the cell state
* A short term memory <math>h</math>, also called the hidden state
In addition to the structure of a traditional RNN, the architecture has the following gates (see the equations and code sketch below):
* A forget gate for the long term memory (sigmoid 1)
* An input gate for the long term memory (sigmoid 2)
* An output gate for the short term memory/output (sigmoid 3)
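In the standard formulation, with <math>\sigma</math> the logistic sigmoid, <math>\odot</math> elementwise multiplication, and <math>[h_{t-1}, x_t]</math> the concatenation of the previous short term memory with the current input, one step of the cell computes:
:<math>
\begin{align}
f_t &= \sigma(W_f [h_{t-1}, x_t] + b_f) \\
i_t &= \sigma(W_i [h_{t-1}, x_t] + b_i) \\
o_t &= \sigma(W_o [h_{t-1}, x_t] + b_o) \\
\tilde{c}_t &= \tanh(W_c [h_{t-1}, x_t] + b_c) \\
c_t &= f_t \odot c_{t-1} + i_t \odot \tilde{c}_t \\
h_t &= o_t \odot \tanh(c_t)
\end{align}
</math>
The forget gate <math>f_t</math> and input gate <math>i_t</math> decide what to erase from and write to <math>c</math>, while the output gate <math>o_t</math> decides how much of <math>c</math> to expose as <math>h</math>.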

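Below is a minimal NumPy sketch of a single step, matching the equations above. The names <code>lstm_cell</code>, <code>W</code>, and <code>b</code> (with the four gate blocks stacked into one weight matrix) are illustrative choices for this sketch, not from any particular library.
<syntaxhighlight lang="python">
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_cell(x, h_prev, c_prev, W, b):
    # Illustrative layout: W has shape (4n, n + d) and b has shape (4n,),
    # stacking the forget, input, output, and candidate blocks;
    # n = len(h_prev) is the hidden size, d = len(x) the input size.
    n = h_prev.shape[0]
    z = W @ np.concatenate([h_prev, x]) + b
    f = sigmoid(z[0 * n:1 * n])  # forget gate (sigmoid 1): what to drop from c
    i = sigmoid(z[1 * n:2 * n])  # input gate (sigmoid 2): what to write to c
    o = sigmoid(z[2 * n:3 * n])  # output gate (sigmoid 3): what to expose as h
    g = np.tanh(z[3 * n:4 * n])  # candidate values for the long term memory
    c = f * c_prev + i * g       # update the long term memory (cell state)
    h = o * np.tanh(c)           # update the short term memory (output)
    return h, c

# Example: run a short random sequence through one cell.
rng = np.random.default_rng(0)
n, d = 8, 4                      # hidden size, input size
W = rng.normal(0.0, 0.1, (4 * n, n + d))
b = np.zeros(4 * n)
h, c = np.zeros(n), np.zeros(n)
for x in rng.normal(size=(5, d)):
    h, c = lstm_cell(x, h, c, W, b)
print(h.shape, c.shape)          # (8,) (8,)
</syntaxhighlight>
Because <math>c_t</math> is updated additively (scaled by the forget gate) rather than squashed through a nonlinearity at every step, gradients flow along it more easily than through a plain RNN's hidden state, which is what makes it the "long term" memory.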