Commit 245bc0d

committed
update
1 parent 77c0e0d commit 245bc0d

8 files changed

+230
-180
lines changed

doc/pub/week8/html/week8-bs.html

Lines changed: 12 additions & 7 deletions
Original file line number | Diff line number | Diff line change
@@ -420,12 +420,12 @@ <h2 id="lstm-details" class="anchor">LSTM details </h2>
420 420   element-wise multiplication, denoted by \( \odot \).
421 421   </p>
422 422
423     - <p>It follows </p>
    423 + <p>Mathematically we have (see also figure below)</p>
424 424   $$
425     - \mathbf{f}^{(t)} = \sigma(W_f\mathbf{x}^{(t)} + U_f\mathbf{h}^{(t-1)} + \mathbf{b}_f)
    425 + \mathbf{f}^{(t)} = \sigma(W_{fx}\mathbf{x}^{(t)} + W_{fh}\mathbf{h}^{(t-1)} + \mathbf{b}_f)
426 426   $$
427 427
428     - <p>where \( W \) and \( U \) are the weights respectively.</p>
    428 + <p>where the $W$s are the weights to be trained.</p>
429 429
430 430   <!-- !split -->
431 431   <h2 id="comparing-with-a-standard-rnn" class="anchor">Comparing with a standard RNN </h2>
@@ -481,6 +481,11 @@ <h2 id="the-forget-gate" class="anchor">The forget gate </h2>
481 481   control the amount of information we want to take from the long-term
482 482   memory.
483 483   </p>
    484 + $$
    485 + \mathbf{f}^{(t)} = \sigma(W_{fx}\mathbf{x}^{(t)} + W_{fh}\mathbf{h}^{(t-1)} + \mathbf{b}_f)
    486 + $$
    487 +
    488 + <p>where the $W$s are the weights to be trained.</p>
484 489
485 490   <!-- !split -->
486 491   <h2 id="basic-layout" class="anchor">Basic layout </h2>
@@ -505,15 +510,15 @@ <h2 id="input-gate" class="anchor">Input gate </h2>
505 510
506 511   <p>We have</p>
507 512   $$
508     - \mathbf{i}^{(t)} = \sigma_g(W_i\mathbf{x}^{(t)} + U_i\mathbf{h}^{(t-1)} + \mathbf{b}_i),
    513 + \mathbf{i}^{(t)} = \sigma_g(W_{ix}\mathbf{x}^{(t)} + W_{ih}\mathbf{h}^{(t-1)} + \mathbf{b}_i),
509 514   $$
510 515
511 516   <p>and</p>
512 517   $$
513     - \mathbf{\tilde{c}}^{(t)} = \tanh(W_c\mathbf{x}^{(t)} + U_c\mathbf{h}^{(t-1)} + \mathbf{b}_c),
    518 + \mathbf{g}^{(t)} = \tanh(W_{gx}\mathbf{x}^{(t)} + W_{gh}\mathbf{h}^{(t-1)} + \mathbf{b}_g),
514 519   $$
515 520
516     - <p>again the \( W \) and \( U \) are the weights.</p>
    521 + <p>again the $W$s are the weights to train.</p>
517 522
518 523   <!-- !split -->
519 524   <h2 id="short-summary" class="anchor">Short summary </h2>
@@ -529,7 +534,7 @@ <h2 id="forget-and-input" class="anchor">Forget and input </h2>
529 534
530 535   <p>The forget gate and the input gate together also update the cell state with the following equation, </p>
531 536   $$
532     - \mathbf{c}^{(t)} = \mathbf{f}^{(t)} \otimes \mathbf{c}^{(t-1)} + \mathbf{i}^{(t)} \otimes \mathbf{\tilde{c}}^{(t)},
    537 + \mathbf{c}^{(t)} = \mathbf{f}^{(t)} \otimes \mathbf{c}^{(t-1)} + \mathbf{i}^{(t)} \otimes \mathbf{g}^{(t)},
533 538   $$
534 539
535 540   <p>where \( f^{(t)} \) and \( i^{(t)} \) are the outputs of the forget gate and the input gate, respectively.</p>

doc/pub/week8/html/week8-reveal.html

Lines changed: 14 additions & 7 deletions
@@ -321,14 +321,14 @@ <h2 id="lstm-details">LSTM details </h2>
321 321   element-wise multiplication, denoted by \( \odot \).
322 322   </p>
323 323
324     - <p>It follows </p>
    324 + <p>Mathematically we have (see also figure below)</p>
325 325   <p>&nbsp;<br>
326 326   $$
327     - \mathbf{f}^{(t)} = \sigma(W_f\mathbf{x}^{(t)} + U_f\mathbf{h}^{(t-1)} + \mathbf{b}_f)
    327 + \mathbf{f}^{(t)} = \sigma(W_{fx}\mathbf{x}^{(t)} + W_{fh}\mathbf{h}^{(t-1)} + \mathbf{b}_f)
328 328   $$
329 329   <p>&nbsp;<br>
330 330
331     - <p>where \( W \) and \( U \) are the weights respectively.</p>
    331 + <p>where the $W$s are the weights to be trained.</p>
332 332   </section>
333 333
334 334   <section>
@@ -390,6 +390,13 @@ <h2 id="the-forget-gate">The forget gate </h2>
390 390   control the amount of information we want to take from the long-term
391 391   memory.
392 392   </p>
    393 + <p>&nbsp;<br>
    394 + $$
    395 + \mathbf{f}^{(t)} = \sigma(W_{fx}\mathbf{x}^{(t)} + W_{fh}\mathbf{h}^{(t-1)} + \mathbf{b}_f)
    396 + $$
    397 + <p>&nbsp;<br>
    398 +
    399 + <p>where the $W$s are the weights to be trained.</p>
393 400   </section>
394 401
395 402   <section>
@@ -417,18 +424,18 @@ <h2 id="input-gate">Input gate </h2>
417 424   <p>We have</p>
418 425   <p>&nbsp;<br>
419 426   $$
420     - \mathbf{i}^{(t)} = \sigma_g(W_i\mathbf{x}^{(t)} + U_i\mathbf{h}^{(t-1)} + \mathbf{b}_i),
    427 + \mathbf{i}^{(t)} = \sigma_g(W_{ix}\mathbf{x}^{(t)} + W_{ih}\mathbf{h}^{(t-1)} + \mathbf{b}_i),
421 428   $$
422 429   <p>&nbsp;<br>
423 430
424 431   <p>and</p>
425 432   <p>&nbsp;<br>
426 433   $$
427     - \mathbf{\tilde{c}}^{(t)} = \tanh(W_c\mathbf{x}^{(t)} + U_c\mathbf{h}^{(t-1)} + \mathbf{b}_c),
    434 + \mathbf{g}^{(t)} = \tanh(W_{gx}\mathbf{x}^{(t)} + W_{gh}\mathbf{h}^{(t-1)} + \mathbf{b}_g),
428 435   $$
429 436   <p>&nbsp;<br>
430 437
431     - <p>again the \( W \) and \( U \) are the weights.</p>
    438 + <p>again the $W$s are the weights to train.</p>
432 439   </section>
433 440
434 441   <section>
@@ -447,7 +454,7 @@ <h2 id="forget-and-input">Forget and input </h2>
447 454   <p>The forget gate and the input gate together also update the cell state with the following equation, </p>
448 455   <p>&nbsp;<br>
449 456   $$
450     - \mathbf{c}^{(t)} = \mathbf{f}^{(t)} \otimes \mathbf{c}^{(t-1)} + \mathbf{i}^{(t)} \otimes \mathbf{\tilde{c}}^{(t)},
    457 + \mathbf{c}^{(t)} = \mathbf{f}^{(t)} \otimes \mathbf{c}^{(t-1)} + \mathbf{i}^{(t)} \otimes \mathbf{g}^{(t)},
451 458   $$
452 459   <p>&nbsp;<br>
453 460

doc/pub/week8/html/week8-solarized.html

Lines changed: 12 additions & 7 deletions
@@ -353,12 +353,12 @@ <h2 id="lstm-details">LSTM details </h2>
353 353   element-wise multiplication, denoted by \( \odot \).
354 354   </p>
355 355
356     - <p>It follows </p>
    356 + <p>Mathematically we have (see also figure below)</p>
357 357   $$
358     - \mathbf{f}^{(t)} = \sigma(W_f\mathbf{x}^{(t)} + U_f\mathbf{h}^{(t-1)} + \mathbf{b}_f)
    358 + \mathbf{f}^{(t)} = \sigma(W_{fx}\mathbf{x}^{(t)} + W_{fh}\mathbf{h}^{(t-1)} + \mathbf{b}_f)
359 359   $$
360 360
361     - <p>where \( W \) and \( U \) are the weights respectively.</p>
    361 + <p>where the $W$s are the weights to be trained.</p>
362 362
363 363   <!-- !split --><br><br><br><br><br><br><br><br><br><br>
364 364   <h2 id="comparing-with-a-standard-rnn">Comparing with a standard RNN </h2>
@@ -414,6 +414,11 @@ <h2 id="the-forget-gate">The forget gate </h2>
414 414   control the amount of information we want to take from the long-term
415 415   memory.
416 416   </p>
    417 + $$
    418 + \mathbf{f}^{(t)} = \sigma(W_{fx}\mathbf{x}^{(t)} + W_{fh}\mathbf{h}^{(t-1)} + \mathbf{b}_f)
    419 + $$
    420 +
    421 + <p>where the $W$s are the weights to be trained.</p>
417 422
418 423   <!-- !split --><br><br><br><br><br><br><br><br><br><br>
419 424   <h2 id="basic-layout">Basic layout </h2>
@@ -438,15 +443,15 @@ <h2 id="input-gate">Input gate </h2>
438 443
439 444   <p>We have</p>
440 445   $$
441     - \mathbf{i}^{(t)} = \sigma_g(W_i\mathbf{x}^{(t)} + U_i\mathbf{h}^{(t-1)} + \mathbf{b}_i),
    446 + \mathbf{i}^{(t)} = \sigma_g(W_{ix}\mathbf{x}^{(t)} + W_{ih}\mathbf{h}^{(t-1)} + \mathbf{b}_i),
442 447   $$
443 448
444 449   <p>and</p>
445 450   $$
446     - \mathbf{\tilde{c}}^{(t)} = \tanh(W_c\mathbf{x}^{(t)} + U_c\mathbf{h}^{(t-1)} + \mathbf{b}_c),
    451 + \mathbf{g}^{(t)} = \tanh(W_{gx}\mathbf{x}^{(t)} + W_{gh}\mathbf{h}^{(t-1)} + \mathbf{b}_g),
447 452   $$
448 453
449     - <p>again the \( W \) and \( U \) are the weights.</p>
    454 + <p>again the $W$s are the weights to train.</p>
450 455
451 456   <!-- !split --><br><br><br><br><br><br><br><br><br><br>
452 457   <h2 id="short-summary">Short summary </h2>
@@ -462,7 +467,7 @@ <h2 id="forget-and-input">Forget and input </h2>
462 467
463 468   <p>The forget gate and the input gate together also update the cell state with the following equation, </p>
464 469   $$
465     - \mathbf{c}^{(t)} = \mathbf{f}^{(t)} \otimes \mathbf{c}^{(t-1)} + \mathbf{i}^{(t)} \otimes \mathbf{\tilde{c}}^{(t)},
    470 + \mathbf{c}^{(t)} = \mathbf{f}^{(t)} \otimes \mathbf{c}^{(t-1)} + \mathbf{i}^{(t)} \otimes \mathbf{g}^{(t)},
466 471   $$
467 472
468 473   <p>where \( f^{(t)} \) and \( i^{(t)} \) are the outputs of the forget gate and the input gate, respectively.</p>

doc/pub/week8/html/week8.html

Lines changed: 12 additions & 7 deletions
@@ -430,12 +430,12 @@ <h2 id="lstm-details">LSTM details </h2>
430 430   element-wise multiplication, denoted by \( \odot \).
431 431   </p>
432 432
433     - <p>It follows </p>
    433 + <p>Mathematically we have (see also figure below)</p>
434 434   $$
435     - \mathbf{f}^{(t)} = \sigma(W_f\mathbf{x}^{(t)} + U_f\mathbf{h}^{(t-1)} + \mathbf{b}_f)
    435 + \mathbf{f}^{(t)} = \sigma(W_{fx}\mathbf{x}^{(t)} + W_{fh}\mathbf{h}^{(t-1)} + \mathbf{b}_f)
436 436   $$
437 437
438     - <p>where \( W \) and \( U \) are the weights respectively.</p>
    438 + <p>where the $W$s are the weights to be trained.</p>
439 439
440 440   <!-- !split --><br><br><br><br><br><br><br><br><br><br>
441 441   <h2 id="comparing-with-a-standard-rnn">Comparing with a standard RNN </h2>
@@ -491,6 +491,11 @@ <h2 id="the-forget-gate">The forget gate </h2>
491 491   control the amount of information we want to take from the long-term
492 492   memory.
493 493   </p>
    494 + $$
    495 + \mathbf{f}^{(t)} = \sigma(W_{fx}\mathbf{x}^{(t)} + W_{fh}\mathbf{h}^{(t-1)} + \mathbf{b}_f)
    496 + $$
    497 +
    498 + <p>where the $W$s are the weights to be trained.</p>
494 499
495 500   <!-- !split --><br><br><br><br><br><br><br><br><br><br>
496 501   <h2 id="basic-layout">Basic layout </h2>
@@ -515,15 +520,15 @@ <h2 id="input-gate">Input gate </h2>
515 520
516 521   <p>We have</p>
517 522   $$
518     - \mathbf{i}^{(t)} = \sigma_g(W_i\mathbf{x}^{(t)} + U_i\mathbf{h}^{(t-1)} + \mathbf{b}_i),
    523 + \mathbf{i}^{(t)} = \sigma_g(W_{ix}\mathbf{x}^{(t)} + W_{ih}\mathbf{h}^{(t-1)} + \mathbf{b}_i),
519 524   $$
520 525
521 526   <p>and</p>
522 527   $$
523     - \mathbf{\tilde{c}}^{(t)} = \tanh(W_c\mathbf{x}^{(t)} + U_c\mathbf{h}^{(t-1)} + \mathbf{b}_c),
    528 + \mathbf{g}^{(t)} = \tanh(W_{gx}\mathbf{x}^{(t)} + W_{gh}\mathbf{h}^{(t-1)} + \mathbf{b}_g),
524 529   $$
525 530
526     - <p>again the \( W \) and \( U \) are the weights.</p>
    531 + <p>again the $W$s are the weights to train.</p>
527 532
528 533   <!-- !split --><br><br><br><br><br><br><br><br><br><br>
529 534   <h2 id="short-summary">Short summary </h2>
@@ -539,7 +544,7 @@ <h2 id="forget-and-input">Forget and input </h2>
539 544
540 545   <p>The forget gate and the input gate together also update the cell state with the following equation, </p>
541 546   $$
542     - \mathbf{c}^{(t)} = \mathbf{f}^{(t)} \otimes \mathbf{c}^{(t-1)} + \mathbf{i}^{(t)} \otimes \mathbf{\tilde{c}}^{(t)},
    547 + \mathbf{c}^{(t)} = \mathbf{f}^{(t)} \otimes \mathbf{c}^{(t-1)} + \mathbf{i}^{(t)} \otimes \mathbf{g}^{(t)},
543 548   $$
544 549
545 550   <p>where \( f^{(t)} \) and \( i^{(t)} \) are the outputs of the forget gate and the input gate, respectively.</p>
0 Bytes
Binary file not shown.
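
For readers checking the renamed symbols, here is a minimal sketch of the three gate equations and the cell-state update as they read after this commit. It is not code from the repository. It assumes that \( \sigma \) and \( \sigma_g \) both denote the logistic sigmoid and that \( \otimes \) in the cell-state equation is the same element-wise product the surrounding text writes as \( \odot \). The function names and the dictionary keys (`W_fx`, `W_fh`, and so on) are hypothetical and chosen only to mirror the subscripts in the diff.

```python
import math


def sigmoid(z):
    """Logistic sigmoid (assumed meaning of sigma and sigma_g)."""
    return 1.0 / (1.0 + math.exp(-z))


def matvec(W, v):
    """Multiply a matrix (list of rows) by a vector (list)."""
    return [sum(w * x for w, x in zip(row, v)) for row in W]


def affine(W_x, x, W_h, h, b):
    """Return W_x x + W_h h + b, element by element."""
    return [a + c + d for a, c, d in zip(matvec(W_x, x), matvec(W_h, h), b)]


def lstm_cell_update(x, h_prev, c_prev, p):
    """Forget gate f, input gate i, candidate g and new cell state c.

    p is a hypothetical parameter dictionary keyed by the subscripts
    used in the updated equations.
    """
    f = [sigmoid(z) for z in affine(p["W_fx"], x, p["W_fh"], h_prev, p["b_f"])]
    i = [sigmoid(z) for z in affine(p["W_ix"], x, p["W_ih"], h_prev, p["b_i"])]
    g = [math.tanh(z) for z in affine(p["W_gx"], x, p["W_gh"], h_prev, p["b_g"])]
    # c^(t) = f^(t) * c^(t-1) + i^(t) * g^(t), with * taken element-wise
    c = [fk * ck + ik * gk for fk, ck, ik, gk in zip(f, c_prev, i, g)]
    return f, i, g, c


# Toy check with all weights and biases set to zero (illustrative values only):
# both gates are sigmoid(0) = 0.5 and the candidate is tanh(0) = 0,
# so the new cell state is half of the previous one.
zeros = [[0.0, 0.0], [0.0, 0.0]]
params = {k: zeros for k in ("W_fx", "W_fh", "W_ix", "W_ih", "W_gx", "W_gh")}
params.update({"b_f": [0.0, 0.0], "b_i": [0.0, 0.0], "b_g": [0.0, 0.0]})

f, i, g, c = lstm_cell_update([1.0, -1.0], [0.5, 0.5], [2.0, -4.0], params)
print(c)  # [1.0, -2.0]
```

The sketch keeps one weight matrix per input, matching the \( W_{fx} \), \( W_{fh} \) naming introduced here in place of the earlier \( W \) and \( U \) pair. Only the symbol names change; the computation is the same as before the commit.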
