Long short-term memory

Primarily used for time-series or sequential data.<br>
Previously state-of-the-art for NLP tasks, but it has since been surpassed by the [[Transformer (machine learning model)]].<br>
See this video for an explanation:<br>
[https://www.youtube.com/watch?v=XymI5lluJeU https://www.youtube.com/watch?v=XymI5lluJeU]
==Architecture==
[[File:The_LSTM_cell.png | thumb | 400px | LSTM picture from Wikipedia]]
The LSTM architecture has two memory components:
* A long-term memory <math>c</math> (the cell state)
* A short-term memory <math>h</math> (the hidden state, which also serves as the output)
The architecture adds the following gates to a traditional RNN (see the equations and code sketch below):
* A forget gate for the long-term memory (sigmoid 1 in the figure)
* An input gate for the long-term memory (sigmoid 2 in the figure)
* An output gate for the short-term memory/output
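Written out, one common formulation of the update at time step <math>t</math> is as follows (here <math>\sigma</math> is the sigmoid function, <math>\odot</math> is element-wise multiplication, <math>x_t</math> is the input, and the weights <math>W</math>, <math>U</math> and biases <math>b</math> are learned):
* Forget gate: <math>f_t = \sigma(W_f x_t + U_f h_{t-1} + b_f)</math>
* Input gate: <math>i_t = \sigma(W_i x_t + U_i h_{t-1} + b_i)</math>
* Output gate: <math>o_t = \sigma(W_o x_t + U_o h_{t-1} + b_o)</math>
* Candidate memory: <math>\tilde{c}_t = \tanh(W_c x_t + U_c h_{t-1} + b_c)</math>
* Long-term memory update: <math>c_t = f_t \odot c_{t-1} + i_t \odot \tilde{c}_t</math>
* Short-term memory/output: <math>h_t = o_t \odot \tanh(c_t)</math>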
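Below is a minimal NumPy sketch of a single LSTM step that mirrors the equations above. The function and parameter names (<code>lstm_step</code>, <code>W_f</code>, and so on) are illustrative, not taken from any particular library.
<syntaxhighlight lang="python">
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_step(x, h_prev, c_prev, params):
    """One LSTM time step.

    x: input vector; h_prev: previous short-term memory (hidden state);
    c_prev: previous long-term memory (cell state);
    params: dict of weight matrices W_*, U_* and bias vectors b_*.
    """
    # Forget gate (sigmoid 1): how much of the long-term memory to keep
    f = sigmoid(params["W_f"] @ x + params["U_f"] @ h_prev + params["b_f"])
    # Input gate (sigmoid 2): how much new information to write
    i = sigmoid(params["W_i"] @ x + params["U_i"] @ h_prev + params["b_i"])
    # Output gate: how much of the updated memory to expose as output
    o = sigmoid(params["W_o"] @ x + params["U_o"] @ h_prev + params["b_o"])
    # Candidate values for the long-term memory
    c_tilde = np.tanh(params["W_c"] @ x + params["U_c"] @ h_prev + params["b_c"])
    # Update long-term memory c, then derive short-term memory h from it
    c = f * c_prev + i * c_tilde
    h = o * np.tanh(c)
    return h, c

# Example usage with random weights (hidden size 4, input size 3)
rng = np.random.default_rng(0)
n_h, n_x = 4, 3
params = {}
for g in ("f", "i", "o", "c"):
    params[f"W_{g}"] = rng.standard_normal((n_h, n_x))
    params[f"U_{g}"] = rng.standard_normal((n_h, n_h))
    params[f"b_{g}"] = np.zeros(n_h)
h, c = np.zeros(n_h), np.zeros(n_h)
for t in range(5):  # run the cell over a short random sequence
    h, c = lstm_step(rng.standard_normal(n_x), h, c, params)
</syntaxhighlight>
Note how the cell state <math>c</math> is only ever scaled (by the forget gate) and added to, which is what lets gradients flow over long sequences.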