|
132 | 132 | ('Diffusion models', 2, None, 'diffusion-models'), |
133 | 133 | ('Original idea', 2, None, 'original-idea'), |
134 | 134 | ('Diffusion learning', 2, None, 'diffusion-learning'), |
135 | | - ('Diffusion models, basics', 2, None, 'diffusion-models-basics'), |
136 | | - ('Problems with probabilistic models', |
137 | | - 2, |
138 | | - None, |
139 | | - 'problems-with-probabilistic-models'), |
140 | | - ('Diffusion models', 2, None, 'diffusion-models'), |
141 | | - ('Original idea', 2, None, 'original-idea'), |
142 | | - ('Diffusion learning', 2, None, 'diffusion-learning'), |
143 | 135 | ('Mathematics of diffusion models', |
144 | 136 | 2, |
145 | 137 | None, |
|
273 | 265 | <!-- navigation toc: --> <li><a href="#diffusion-models" style="font-size: 80%;">Diffusion models</a></li> |
274 | 266 | <!-- navigation toc: --> <li><a href="#original-idea" style="font-size: 80%;">Original idea</a></li> |
275 | 267 | <!-- navigation toc: --> <li><a href="#diffusion-learning" style="font-size: 80%;">Diffusion learning</a></li> |
276 | | - <!-- navigation toc: --> <li><a href="#diffusion-models-basics" style="font-size: 80%;">Diffusion models, basics</a></li> |
277 | | - <!-- navigation toc: --> <li><a href="#problems-with-probabilistic-models" style="font-size: 80%;">Problems with probabilistic models</a></li> |
278 | | - <!-- navigation toc: --> <li><a href="#diffusion-models" style="font-size: 80%;">Diffusion models</a></li> |
279 | | - <!-- navigation toc: --> <li><a href="#original-idea" style="font-size: 80%;">Original idea</a></li> |
280 | | - <!-- navigation toc: --> <li><a href="#diffusion-learning" style="font-size: 80%;">Diffusion learning</a></li> |
281 | 268 | <!-- navigation toc: --> <li><a href="#mathematics-of-diffusion-models" style="font-size: 80%;">Mathematics of diffusion models</a></li> |
282 | 269 | <!-- navigation toc: --> <li><a href="#chains-of-vaes" style="font-size: 80%;">Chains of VAEs</a></li> |
283 | 270 | <!-- navigation toc: --> <li><a href="#mathematical-representation" style="font-size: 80%;">Mathematical representation</a></li> |
@@ -1316,72 +1303,6 @@ <h2 id="code-in-pytorch-for-vaes" class="anchor">Code in PyTorch for VAEs </h2> |
1316 | 1303 | </div> |
1317 | 1304 |
|
1318 | 1305 |
|
1319 | | -<!-- !split --> |
1320 | | -<h2 id="diffusion-models-basics" class="anchor">Diffusion models, basics </h2> |
1321 | | - |
1322 | | -<p>Diffusion models are inspired by non-equilibrium thermodynamics. They |
1323 | | -define a Markov chain of diffusion steps to slowly add random noise to |
1324 | | -data and then learn to reverse the diffusion process to construct |
1325 | | -desired data samples from the noise. Unlike VAE or flow models, |
1326 | | -diffusion models are learned with a fixed procedure and the latent |
1327 | | -variable has high dimensionality (same as the original data). |
1328 | | -</p> |
1329 | | - |
1330 | | -<!-- !split --> |
1331 | | -<h2 id="problems-with-probabilistic-models" class="anchor">Problems with probabilistic models </h2> |
1332 | | - |
1333 | | -<p>Historically, probabilistic models suffer from a tradeoff between two |
1334 | | -conflicting objectives: \textit{tractability} and |
1335 | | -\textit{flexibility}. Models that are \textit{tractable} can be |
1336 | | -analytically evaluated and easily fit to data (e.g. a Gaussian or |
1337 | | -Laplace). However, these models are unable to aptly describe structure |
1338 | | -in rich datasets. On the other hand, models that are \textit{flexible} |
1339 | | -can be molded to fit structure in arbitrary data. For example, we can |
1340 | | -define models in terms of any (non-negative) function \( \phi(\boldsymbol{x}) \) |
1341 | | -yielding the flexible distribution \( p\left(\boldsymbol{x}\right) = |
1342 | | -\frac{\phi\left(\boldsymbol{x} \right)}{Z} \), where \( Z \) is a normalization |
1343 | | -constant. However, computing this normalization constant is generally |
1344 | | -intractable. Evaluating, training, or drawing samples from such |
1345 | | -flexible models typically requires a very expensive Monte Carlo |
1346 | | -process. |
1347 | | -</p> |
1348 | | - |
1349 | | -<!-- !split --> |
1350 | | -<h2 id="diffusion-models" class="anchor">Diffusion models </h2> |
1351 | | -<p>Diffusion models have several interesting features</p> |
1352 | | -<ul> |
1353 | | -<li> extreme flexibility in model structure,</li> |
1354 | | -<li> exact sampling,</li> |
1355 | | -<li> easy multiplication with other distributions, e.g. in order to compute a posterior, and</li> |
1356 | | -<li> the model log likelihood, and the probability of individual states, to be cheaply evaluated.</li> |
1357 | | -</ul> |
1358 | | -<!-- !split --> |
1359 | | -<h2 id="original-idea" class="anchor">Original idea </h2> |
1360 | | - |
1361 | | -<p>In the original formulation, one uses a Markov chain to gradually |
1362 | | -convert one distribution into another, an idea used in non-equilibrium |
1363 | | -statistical physics and sequential Monte Carlo. Diffusion models build |
1364 | | -a generative Markov chain which converts a simple known distribution |
1365 | | -(e.g. a Gaussian) into a target (data) distribution using a diffusion |
1366 | | -process. Rather than use this Markov chain to approximately evaluate a |
1367 | | -model which has been otherwise defined, one can explicitly define the |
1368 | | -probabilistic model as the endpoint of the Markov chain. Since each |
1369 | | -step in the diffusion chain has an analytically evaluable probability, |
1370 | | -the full chain can also be analytically evaluated. |
1371 | | -</p> |
1372 | | - |
1373 | | -<!-- !split --> |
1374 | | -<h2 id="diffusion-learning" class="anchor">Diffusion learning </h2> |
1375 | | - |
1376 | | -<p>Learning in this framework involves estimating small perturbations to |
1377 | | -a diffusion process. Estimating small, analytically tractable, |
1378 | | -perturbations is more tractable than explicitly describing the full |
1379 | | -distribution with a single, non-analytically-normalizable, potential |
1380 | | -function. Furthermore, since a diffusion process exists for any |
1381 | | -smooth target distribution, this method can capture data distributions |
1382 | | -of arbitrary form. |
1383 | | -</p> |
1384 | | - |
1385 | 1306 | <!-- !split --> |
1386 | 1307 | <h2 id="diffusion-models-basics" class="anchor">Diffusion models, basics </h2> |
1387 | 1308 |
|
|
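Note on the removed text: the deleted "Diffusion models, basics" paragraph describes a Markov chain of diffusion steps that slowly adds random noise to data. A minimal sketch of that forward (noising) chain is given here, under assumptions that are not in the diff: a Gaussian transition x_t = sqrt(1 - beta_t) x_{t-1} + sqrt(beta_t) eps, a linear noise schedule, and one-dimensional toy data. The names `linear_beta_schedule`, `forward_diffusion` and `mean_and_std` are hypothetical helpers chosen for illustration, and the learned reverse process is not shown.

```python
import math
import random


def linear_beta_schedule(num_steps, beta_start=1e-3, beta_end=0.1):
    """Assumed schedule: noise variances increasing linearly from beta_start to beta_end."""
    if num_steps == 1:
        return [beta_start]
    step = (beta_end - beta_start) / (num_steps - 1)
    return [beta_start + i * step for i in range(num_steps)]


def forward_diffusion(x0, betas, rng):
    """Apply one Markov step per beta to every sample:
    x_t = sqrt(1 - beta_t) * x_{t-1} + sqrt(beta_t) * eps, with eps ~ N(0, 1)."""
    x = list(x0)
    for beta in betas:
        keep = math.sqrt(1.0 - beta)   # shrinks the current signal
        noise = math.sqrt(beta)        # scale of the fresh Gaussian noise
        x = [keep * xi + noise * rng.gauss(0.0, 1.0) for xi in x]
    return x


def mean_and_std(values):
    """Sample mean and (population) standard deviation."""
    n = len(values)
    mean = sum(values) / n
    var = sum((v - mean) ** 2 for v in values) / n
    return mean, math.sqrt(var)


rng = random.Random(0)

# Toy "data" distribution (an assumption for illustration): two sharp modes at -3 and +3.
x0 = [rng.choice([-3.0, 3.0]) for _ in range(2000)]

betas = linear_beta_schedule(200)
xT = forward_diffusion(x0, betas, rng)

mean_0, std_0 = mean_and_std(x0)
mean_T, std_T = mean_and_std(xT)
print(f"data:       mean={mean_0:.3f}, std={std_0:.3f}")
print(f"after noise: mean={mean_T:.3f}, std={std_T:.3f}")
```

With this schedule the product of the (1 - beta_t) factors is very small, so the final samples should be close to a standard Gaussian regardless of the starting data. That is the "simple known distribution" from which, per the removed "Original idea" text, the generative chain starts.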