CompPhysics
diff --git a/doc/pub/week8/html/week8-bs.html b/doc/pub/week8/html/week8-bs.html
Lines changed: 12 additions & 7 deletions
diff --git a/doc/pub/week8/html/week8-reveal.html b/doc/pub/week8/html/week8-reveal.html
Lines changed: 14 additions & 7 deletions
diff --git a/doc/pub/week8/html/week8-solarized.html b/doc/pub/week8/html/week8-solarized.html
Lines changed: 12 additions & 7 deletions
diff --git a/doc/pub/week8/html/week8.html b/doc/pub/week8/html/week8.html
Lines changed: 12 additions & 7 deletions
diff --git a/doc/pub/week8/ipynb/ipynb-week8-src.tar.gz b/doc/pub/week8/ipynb/ipynb-week8-src.tar.gz
0 Bytes
@@ -420,12 +420,12 @@ <h2 id="lstm-details" class="anchor">LSTM details </h2>
 element-wise multiplication, denoted by \( \odot \).
 </p>
 
-<p>It follows </p>
+<p>Mathematically we have (see also figure below)</p>
 $$
-\mathbf{f}^{(t)} = \sigma(W_f\mathbf{x}^{(t)} + U_f\mathbf{h}^{(t-1)} + \mathbf{b}_f)
+\mathbf{f}^{(t)} = \sigma(W_{fx}\mathbf{x}^{(t)} + W_{fh}\mathbf{h}^{(t-1)} + \mathbf{b}_f)
 $$
 
-<p>where \( W \) and \( U \) are the weights respectively.</p>
+<p>where the $W$s are the weights to be trained.</p>
 
 <!-- !split -->
 <h2 id="comparing-with-a-standard-rnn" class="anchor">Comparing with a standard  RNN  </h2>
@@ -481,6 +481,11 @@ <h2 id="the-forget-gate" class="anchor">The forget gate </h2>
 control the amount of information we want to take from the long-term
 memory.
 </p>
+$$
+\mathbf{f}^{(t)} = \sigma(W_{fx}\mathbf{x}^{(t)} + W_{fh}\mathbf{h}^{(t-1)} + \mathbf{b}_f)
+$$
+
+<p>where the $W$s are the weights to be trained.</p>
 
 <!-- !split -->
 <h2 id="basic-layout" class="anchor">Basic layout </h2>
@@ -505,15 +510,15 @@ <h2 id="input-gate" class="anchor">Input gate </h2>
 
 <p>We have</p>
 $$
-\mathbf{i}^{(t)} = \sigma_g(W_i\mathbf{x}^{(t)} + U_i\mathbf{h}^{(t-1)} + \mathbf{b}_i),
+\mathbf{i}^{(t)} = \sigma_g(W_{ix}\mathbf{x}^{(t)} + W_{ih}\mathbf{h}^{(t-1)} + \mathbf{b}_i),
 $$
 
 <p>and</p>
 $$
-\mathbf{\tilde{c}}^{(t)} = \tanh(W_c\mathbf{x}^{(t)} + U_c\mathbf{h}^{(t-1)} + \mathbf{b}_c),
+\mathbf{g}^{(t)} = \tanh(W_{gx}\mathbf{x}^{(t)} + W_{gh}\mathbf{h}^{(t-1)} + \mathbf{b}_g),
 $$
 
-<p>again the \( W \) and \( U \) are the weights.</p>
+<p>again the $W$s are the weights to train.</p>
 
 <!-- !split -->
 <h2 id="short-summary" class="anchor">Short summary  </h2>
@@ -529,7 +534,7 @@ <h2 id="forget-and-input" class="anchor">Forget and input </h2>
 
 <p>The forget gate and the input gate together also update the cell state with the following equation, </p>
 $$
-\mathbf{c}^{(t)} = \mathbf{f}^{(t)} \otimes \mathbf{c}^{(t-1)} + \mathbf{i}^{(t)} \otimes \mathbf{\tilde{c}}^{(t)},
+\mathbf{c}^{(t)} = \mathbf{f}^{(t)} \otimes \mathbf{c}^{(t-1)} + \mathbf{i}^{(t)} \otimes \mathbf{g}^{(t)},
 $$
 
 <p>where \( f^{(t)} \) and \( i^{(t)} \) are the outputs of the forget gate and the input gate, respectively.</p>
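Reviewer note: to check that the renamed symbols in this hunk fit together, the forget gate, input gate, candidate \( \mathbf{g}^{(t)} \) and cell-state update can be sketched in NumPy as below. This is only an illustrative sketch, not code from the course material. The function name, the dictionary keys, the dimensions and the random initialization are all assumptions made for the example, and \( \otimes \) is read as the element-wise product described in the text.

```python
import numpy as np

def sigmoid(z):
    """Logistic function used by the forget and input gates."""
    return 1.0 / (1.0 + np.exp(-z))

def lstm_cell_state_update(x, h_prev, c_prev, params):
    """One cell-state update following the equations in the diff.

    `params` is a hypothetical dict holding W_fx, W_fh, b_f, W_ix, W_ih, b_i,
    W_gx, W_gh and b_g; the key names are chosen only for this sketch.
    """
    # Forget gate: f = sigma(W_fx x + W_fh h_prev + b_f)
    f = sigmoid(params["W_fx"] @ x + params["W_fh"] @ h_prev + params["b_f"])
    # Input gate: i = sigma(W_ix x + W_ih h_prev + b_i)
    i = sigmoid(params["W_ix"] @ x + params["W_ih"] @ h_prev + params["b_i"])
    # Candidate values: g = tanh(W_gx x + W_gh h_prev + b_g)
    g = np.tanh(params["W_gx"] @ x + params["W_gh"] @ h_prev + params["b_g"])
    # Cell state: c = f * c_prev + i * g (element-wise products)
    c = f * c_prev + i * g
    return f, i, g, c

# Small example with assumed sizes: 3 input features, 4 hidden units.
rng = np.random.default_rng(0)
n_in, n_hid = 3, 4
params = {}
for gate in "fig":
    params[f"W_{gate}x"] = rng.normal(size=(n_hid, n_in))
    params[f"W_{gate}h"] = rng.normal(size=(n_hid, n_hid))
    params[f"b_{gate}"] = np.zeros(n_hid)

x = rng.normal(size=n_in)
h_prev = np.zeros(n_hid)
c_prev = rng.normal(size=n_hid)
f, i, g, c = lstm_cell_state_update(x, h_prev, c_prev, params)
```

Each gate has its own pair of weight matrices, one acting on \( \mathbf{x}^{(t)} \) and one on \( \mathbf{h}^{(t-1)} \), which is what the double subscripts \( W_{fx} \), \( W_{fh} \) and so on make explicit.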
 
@@ -321,14 +321,14 @@ <h2 id="lstm-details">LSTM details </h2>
 element-wise multiplication, denoted by \( \odot \).
 </p>
 
-<p>It follows </p>
+<p>Mathematically we have (see also figure below)</p>
 <p>&nbsp;<br>
 $$
-\mathbf{f}^{(t)} = \sigma(W_f\mathbf{x}^{(t)} + U_f\mathbf{h}^{(t-1)} + \mathbf{b}_f)
+\mathbf{f}^{(t)} = \sigma(W_{fx}\mathbf{x}^{(t)} + W_{fh}\mathbf{h}^{(t-1)} + \mathbf{b}_f)
 $$
 <p>&nbsp;<br>
 
-<p>where \( W \) and \( U \) are the weights respectively.</p>
+<p>where the $W$s are the weights to be trained.</p>
 </section>
 
 <section>
@@ -390,6 +390,13 @@ <h2 id="the-forget-gate">The forget gate </h2>
 control the amount of information we want to take from the long-term
 memory.
 </p>
+<p>&nbsp;<br>
+$$
+\mathbf{f}^{(t)} = \sigma(W_{fx}\mathbf{x}^{(t)} + W_{fh}\mathbf{h}^{(t-1)} + \mathbf{b}_f)
+$$
+<p>&nbsp;<br>
+
+<p>where the $W$s are the weights to be trained.</p>
 </section>
 
 <section>
@@ -417,18 +424,18 @@ <h2 id="input-gate">Input gate </h2>
 <p>We have</p>
 <p>&nbsp;<br>
 $$
-\mathbf{i}^{(t)} = \sigma_g(W_i\mathbf{x}^{(t)} + U_i\mathbf{h}^{(t-1)} + \mathbf{b}_i),
+\mathbf{i}^{(t)} = \sigma_g(W_{ix}\mathbf{x}^{(t)} + W_{ih}\mathbf{h}^{(t-1)} + \mathbf{b}_i),
 $$
 <p>&nbsp;<br>
 
 <p>and</p>
 <p>&nbsp;<br>
 $$
-\mathbf{\tilde{c}}^{(t)} = \tanh(W_c\mathbf{x}^{(t)} + U_c\mathbf{h}^{(t-1)} + \mathbf{b}_c),
+\mathbf{g}^{(t)} = \tanh(W_{gx}\mathbf{x}^{(t)} + W_{gh}\mathbf{h}^{(t-1)} + \mathbf{b}_g),
 $$
 <p>&nbsp;<br>
 
-<p>again the \( W \) and \( U \) are the weights.</p>
+<p>again the $W$s are the weights to train.</p>
 </section>
 
 <section>
@@ -447,7 +454,7 @@ <h2 id="forget-and-input">Forget and input </h2>
 <p>The forget gate and the input gate together also update the cell state with the following equation, </p>
 <p>&nbsp;<br>
 $$
-\mathbf{c}^{(t)} = \mathbf{f}^{(t)} \otimes \mathbf{c}^{(t-1)} + \mathbf{i}^{(t)} \otimes \mathbf{\tilde{c}}^{(t)},
+\mathbf{c}^{(t)} = \mathbf{f}^{(t)} \otimes \mathbf{c}^{(t-1)} + \mathbf{i}^{(t)} \otimes \mathbf{g}^{(t)},
 $$
 <p>&nbsp;<br>
 
 
@@ -353,12 +353,12 @@ <h2 id="lstm-details">LSTM details </h2>
 element-wise multiplication, denoted by \( \odot \).
 </p>
 
-<p>It follows </p>
+<p>Mathematically we have (see also figure below)</p>
 $$
-\mathbf{f}^{(t)} = \sigma(W_f\mathbf{x}^{(t)} + U_f\mathbf{h}^{(t-1)} + \mathbf{b}_f)
+\mathbf{f}^{(t)} = \sigma(W_{fx}\mathbf{x}^{(t)} + W_{fh}\mathbf{h}^{(t-1)} + \mathbf{b}_f)
 $$
 
-<p>where \( W \) and \( U \) are the weights respectively.</p>
+<p>where the $W$s are the weights to be trained.</p>
 
 <!-- !split --><br><br><br><br><br><br><br><br><br><br>
 <h2 id="comparing-with-a-standard-rnn">Comparing with a standard  RNN  </h2>
@@ -414,6 +414,11 @@ <h2 id="the-forget-gate">The forget gate </h2>
 control the amount of information we want to take from the long-term
 memory.
 </p>
+$$
+\mathbf{f}^{(t)} = \sigma(W_{fx}\mathbf{x}^{(t)} + W_{fh}\mathbf{h}^{(t-1)} + \mathbf{b}_f)
+$$
+
+<p>where the $W$s are the weights to be trained.</p>
 
 <!-- !split --><br><br><br><br><br><br><br><br><br><br>
 <h2 id="basic-layout">Basic layout </h2>
@@ -438,15 +443,15 @@ <h2 id="input-gate">Input gate </h2>
 
 <p>We have</p>
 $$
-\mathbf{i}^{(t)} = \sigma_g(W_i\mathbf{x}^{(t)} + U_i\mathbf{h}^{(t-1)} + \mathbf{b}_i),
+\mathbf{i}^{(t)} = \sigma_g(W_{ix}\mathbf{x}^{(t)} + W_{ih}\mathbf{h}^{(t-1)} + \mathbf{b}_i),
 $$
 
 <p>and</p>
 $$
-\mathbf{\tilde{c}}^{(t)} = \tanh(W_c\mathbf{x}^{(t)} + U_c\mathbf{h}^{(t-1)} + \mathbf{b}_c),
+\mathbf{g}^{(t)} = \tanh(W_{gx}\mathbf{x}^{(t)} + W_{gh}\mathbf{h}^{(t-1)} + \mathbf{b}_g),
 $$
 
-<p>again the \( W \) and \( U \) are the weights.</p>
+<p>again the $W$s are the weights to train.</p>
 
 <!-- !split --><br><br><br><br><br><br><br><br><br><br>
 <h2 id="short-summary">Short summary  </h2>
@@ -462,7 +467,7 @@ <h2 id="forget-and-input">Forget and input </h2>
 
 <p>The forget gate and the input gate together also update the cell state with the following equation, </p>
 $$
-\mathbf{c}^{(t)} = \mathbf{f}^{(t)} \otimes \mathbf{c}^{(t-1)} + \mathbf{i}^{(t)} \otimes \mathbf{\tilde{c}}^{(t)},
+\mathbf{c}^{(t)} = \mathbf{f}^{(t)} \otimes \mathbf{c}^{(t-1)} + \mathbf{i}^{(t)} \otimes \mathbf{g}^{(t)},
 $$
 
 <p>where \( f^{(t)} \) and \( i^{(t)} \) are the outputs of the forget gate and the input gate, respectively.</p>
 
@@ -430,12 +430,12 @@ <h2 id="lstm-details">LSTM details </h2>
 element-wise multiplication, denoted by \( \odot \).
 </p>
 
-<p>It follows </p>
+<p>Mathematically we have (see also figure below)</p>
 $$
-\mathbf{f}^{(t)} = \sigma(W_f\mathbf{x}^{(t)} + U_f\mathbf{h}^{(t-1)} + \mathbf{b}_f)
+\mathbf{f}^{(t)} = \sigma(W_{fx}\mathbf{x}^{(t)} + W_{fh}\mathbf{h}^{(t-1)} + \mathbf{b}_f)
 $$
 
-<p>where \( W \) and \( U \) are the weights respectively.</p>
+<p>where the $W$s are the weights to be trained.</p>
 
 <!-- !split --><br><br><br><br><br><br><br><br><br><br>
 <h2 id="comparing-with-a-standard-rnn">Comparing with a standard  RNN  </h2>
@@ -491,6 +491,11 @@ <h2 id="the-forget-gate">The forget gate </h2>
 control the amount of information we want to take from the long-term
 memory.
 </p>
+$$
+\mathbf{f}^{(t)} = \sigma(W_{fx}\mathbf{x}^{(t)} + W_{fh}\mathbf{h}^{(t-1)} + \mathbf{b}_f)
+$$
+
+<p>where the $W$s are the weights to be trained.</p>
 
 <!-- !split --><br><br><br><br><br><br><br><br><br><br>
 <h2 id="basic-layout">Basic layout </h2>
@@ -515,15 +520,15 @@ <h2 id="input-gate">Input gate </h2>
 
 <p>We have</p>
 $$
-\mathbf{i}^{(t)} = \sigma_g(W_i\mathbf{x}^{(t)} + U_i\mathbf{h}^{(t-1)} + \mathbf{b}_i),
+\mathbf{i}^{(t)} = \sigma_g(W_{ix}\mathbf{x}^{(t)} + W_{ih}\mathbf{h}^{(t-1)} + \mathbf{b}_i),
 $$
 
 <p>and</p>
 $$
-\mathbf{\tilde{c}}^{(t)} = \tanh(W_c\mathbf{x}^{(t)} + U_c\mathbf{h}^{(t-1)} + \mathbf{b}_c),
+\mathbf{g}^{(t)} = \tanh(W_{gx}\mathbf{x}^{(t)} + W_{gh}\mathbf{h}^{(t-1)} + \mathbf{b}_g),
 $$
 
-<p>again the \( W \) and \( U \) are the weights.</p>
+<p>again the $W$s are the weights to train.</p>
 
 <!-- !split --><br><br><br><br><br><br><br><br><br><br>
 <h2 id="short-summary">Short summary  </h2>
@@ -539,7 +544,7 @@ <h2 id="forget-and-input">Forget and input </h2>
 
 <p>The forget gate and the input gate together also update the cell state with the following equation, </p>
 $$
-\mathbf{c}^{(t)} = \mathbf{f}^{(t)} \otimes \mathbf{c}^{(t-1)} + \mathbf{i}^{(t)} \otimes \mathbf{\tilde{c}}^{(t)},
+\mathbf{c}^{(t)} = \mathbf{f}^{(t)} \otimes \mathbf{c}^{(t-1)} + \mathbf{i}^{(t)} \otimes \mathbf{g}^{(t)},
 $$
 
 <p>where \( f^{(t)} \) and \( i^{(t)} \) are the outputs of the forget gate and the input gate, respectively.</p>