|
46 | 46 | 2, |
47 | 47 | None, |
48 | 48 | 'from-nns-and-cnns-to-recurrent-neural-networks-rnns'), |
| 49 | + ('What is a recurrent NN?', 2, None, 'what-is-a-recurrent-nn'), |
| 50 | + ('Why RNNs?', 2, None, 'why-rnns'), |
49 | 51 | ('Feedback connections', 2, None, 'feedback-connections'), |
50 | 52 | ('Vanishing gradients', 2, None, 'vanishing-gradients'), |
51 | 53 | ('Recurrent neural networks (RNNs): Overarching view', |
|
152 | 154 | 2, |
153 | 155 | None, |
154 | 156 | 'summary-of-a-typical-rnn'), |
| 157 | + ('The mathematics of RNNs, the basic architecture', |
| 158 | + 2, |
| 159 | + None, |
| 160 | + 'the-mathematics-of-rnns-the-basic-architecture'), |
155 | 161 | ('Four effective ways to learn an RNN and preparing for next ' |
156 | 162 | 'week', |
157 | 163 | 2, |
|
195 | 201 | <!-- navigation toc: --> <li><a href="#reading-recommendations" style="font-size: 80%;">Reading recommendations</a></li> |
196 | 202 | <!-- navigation toc: --> <li><a href="#tensorflow-examples" style="font-size: 80%;">TensorFlow examples</a></li> |
197 | 203 | <!-- navigation toc: --> <li><a href="#from-nns-and-cnns-to-recurrent-neural-networks-rnns" style="font-size: 80%;">From NNs and CNNs to recurrent neural networks (RNNs)</a></li> |
| 204 | + <!-- navigation toc: --> <li><a href="#what-is-a-recurrent-nn" style="font-size: 80%;">What is a recurrent NN?</a></li> |
| 205 | + <!-- navigation toc: --> <li><a href="#why-rnns" style="font-size: 80%;">Why RNNs?</a></li> |
198 | 206 | <!-- navigation toc: --> <li><a href="#feedback-connections" style="font-size: 80%;">Feedback connections</a></li> |
199 | 207 | <!-- navigation toc: --> <li><a href="#vanishing-gradients" style="font-size: 80%;">Vanishing gradients</a></li> |
200 | 208 | <!-- navigation toc: --> <li><a href="#recurrent-neural-networks-rnns-overarching-view" style="font-size: 80%;">Recurrent neural networks (RNNs): Overarching view</a></li> |
|
230 | 238 | <!-- navigation toc: --> <li><a href="#gradients-of-loss-functions" style="font-size: 80%;">Gradients of loss functions</a></li> |
231 | 239 | <!-- navigation toc: --> <li><a href="#summary-of-rnns" style="font-size: 80%;">Summary of RNNs</a></li> |
232 | 240 | <!-- navigation toc: --> <li><a href="#summary-of-a-typical-rnn" style="font-size: 80%;">Summary of a typical RNN</a></li> |
| 241 | + <!-- navigation toc: --> <li><a href="#the-mathematics-of-rnns-the-basic-architecture" style="font-size: 80%;">The mathematics of RNNs, the basic architecture</a></li> |
233 | 242 | <!-- navigation toc: --> <li><a href="#four-effective-ways-to-learn-an-rnn-and-preparing-for-next-week" style="font-size: 80%;">Four effective ways to learn an RNN and preparing for next week</a></li> |
234 | 243 |
|
235 | 244 | </ul> |
@@ -312,6 +321,40 @@ <h2 id="from-nns-and-cnns-to-recurrent-neural-networks-rnns" class="anchor">From |
312 | 321 | inputs. |
313 | 322 | </p> |
314 | 323 |
|
| 324 | +<!-- !split --> |
| 325 | +<h2 id="what-is-a-recurrent-nn" class="anchor">What is a recurrent NN? </h2> |
| 326 | + |
| 327 | +<p>A recurrent neural network (RNN), in contrast to a regular fully |
| 328 | +connected neural network (FCNN), or plain neural network (NN), has |
| 329 | +layers whose nodes are connected to themselves. |
| 330 | +</p> |
| 331 | + |
| 332 | +<p>In an FCNN there are no connections between nodes in a single |
| 333 | +layer. For instance, \( h_1^1 \) is not connected to \( h_2^1 \). In |
| 334 | +addition, the input and output are always of a fixed length. |
| 335 | +</p> |
| 336 | + |
| 337 | +<p>In an RNN, however, this is no longer the case. Nodes in the hidden |
| 338 | +layers are connected to themselves. |
| 339 | +</p> |
| 340 | + |
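| | +<p>The self-connection can be made concrete with a few lines of code. |
| | +The sketch below (with arbitrary layer sizes, not taken from the |
| | +notes) applies the same hidden-to-hidden weight matrix at every time |
| | +step, feeding the hidden state back into itself. |
| | +</p> |
| | + |
| | +<pre><code>import numpy as np |
| | + |
| | +rng = np.random.default_rng(0) |
| | +W_xh = rng.normal(size=(2, 4))  # input -> hidden |
| | +W_hh = rng.normal(size=(4, 4))  # hidden -> hidden: the feedback |
| | +b_h = np.zeros(4) |
| | + |
| | +x = rng.normal(size=(5, 2))     # 5 time steps, 2 features each |
| | +h = np.zeros(4)                 # initial hidden state |
| | +for x_t in x:                   # same weights reused at every step |
| | +    h = np.tanh(x_t @ W_xh + h @ W_hh + b_h) |
| | +print(h)                        # final hidden state |
| | +</code></pre> |
| | + |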
| 341 | +<!-- !split --> |
| 342 | +<h2 id="why-rnns" class="anchor">Why RNNs? </h2> |
| 343 | + |
| 344 | +<p>Recurrent neural networks work well with sequential data, that |
| 345 | +is, data where the order matters. In a regular fully connected |
| 346 | +network, the order of the inputs plays no role. |
| 347 | +</p> |
| 348 | + |
| 349 | +<p>Another property of RNNs is that they can handle inputs and |
| 350 | +outputs of variable length. Consider again the simplified breast |
| 351 | +cancer dataset. If you have trained a regular FCNN on the two |
| 352 | +features, it makes no sense to suddenly add a third feature: the |
| 353 | +network would not know what to do with it, and would reject any |
| 354 | +input whose length differs from two. An RNN, in contrast, consumes |
| 355 | +its input one step at a time, so a longer input just means more steps. |
| 356 | +</p> |
| 357 | + |
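| | +<p>A minimal TensorFlow/Keras sketch (an assumed setup, with arbitrary |
| | +layer sizes) of the same point: leaving the time dimension |
| | +unspecified lets one model accept sequences of different lengths. |
| | +</p> |
| | + |
| | +<pre><code>import numpy as np |
| | +import tensorflow as tf |
| | + |
| | +model = tf.keras.Sequential([ |
| | +    tf.keras.layers.Input(shape=(None, 2)),  # any number of steps |
| | +    tf.keras.layers.SimpleRNN(8), |
| | +    tf.keras.layers.Dense(1, activation="sigmoid"), |
| | +]) |
| | + |
| | +short_seq = np.random.rand(1, 3, 2).astype("float32")   # 3 steps |
| | +long_seq = np.random.rand(1, 10, 2).astype("float32")   # 10 steps |
| | +print(model(short_seq).shape)  # (1, 1) |
| | +print(model(long_seq).shape)   # (1, 1), same model |
| | +</code></pre> |
| | + |
| | +<p>Note that the number of features per time step is still fixed at |
| | +two; what the recurrence buys us is freedom in the number of steps. |
| | +</p> |
| | + |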
315 | 358 | <!-- !split --> |
316 | 359 | <h2 id="feedback-connections" class="anchor">Feedback connections </h2> |
317 | 360 |
|
@@ -1019,6 +1062,9 @@ <h2 id="summary-of-a-typical-rnn" class="anchor">Summary of a typical RNN </h2> |
1019 | 1062 | The parameters are trained through the so-called back-propagation through time (BPTT) algorithm. |
1020 | 1063 | </p> |
1021 | 1064 |
|
| 1065 | +<!-- !split --> |
| 1066 | +<h2 id="the-mathematics-of-rnns-the-basic-architecture" class="anchor">The mathematics of RNNs, the basic architecture </h2> |
| 1067 | + |
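| | +<p>In its most basic form, an RNN carries a hidden state \( \mathbf{h}_t \) |
| | +that is updated from the current input \( \mathbf{x}_t \) and the |
| | +previous hidden state (the weight names below follow one common |
| | +convention), |
| | +</p> |
| | +$$ \mathbf{h}_t = f\left(\mathbf{W}_{hh}\mathbf{h}_{t-1} + \mathbf{W}_{xh}\mathbf{x}_t + \mathbf{b}_h\right), $$ |
| | + |
| | +$$ \mathbf{y}_t = g\left(\mathbf{W}_{hy}\mathbf{h}_t + \mathbf{b}_y\right), $$ |
| | + |
| | +<p>where \( f \) is typically \( \tanh \) and \( g \) depends on the |
| | +output task. The same weight matrices are reused at every time step; |
| | +this weight sharing is what allows inputs of variable length. |
| | +</p> |
| | + |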
1022 | 1068 | <!-- !split --> |
1023 | 1069 | <h2 id="four-effective-ways-to-learn-an-rnn-and-preparing-for-next-week" class="anchor">Four effective ways to learn an RNN and preparing for next week </h2> |
1024 | 1070 | <ol> |
|