Commit 6c478b6

misc
1 parent 0558b02 commit 6c478b6

1 file changed

Lines changed: 40 additions & 71 deletions

lectures/numba.md

@@ -48,11 +48,13 @@ import matplotlib.pyplot as plt
4848
In an {doc}`earlier lecture <need_for_speed>` we discussed vectorization,
4949
which can improve execution speed by sending array processing operations in batch to efficient low-level code.
5050

51-
However, as {ref}`discussed previously <numba-p_c_vectorization>`, traditional vectorization schemes, such as those found in MATLAB, Julia, and NumPy, have several weaknesses.
51+
However, as {ref}`discussed in that lecture <numba-p_c_vectorization>`,
52+
traditional vectorization schemes, such as those found in MATLAB, Julia, and NumPy, have weaknesses.
5253

53-
For example, they can be highly memory-intensive and, for some algorithms, vectorization is ineffective or impossible.
54+
* Highly memory-intensive for compound array operations
55+
* Ineffective or impossible for some algorithms
5456

55-
One way around these problems is through [Numba](https://numba.pydata.org/), a
57+
One way to circumvent these problems is by using [Numba](https://numba.pydata.org/), a
5658
**just in time (JIT) compiler** for Python that is oriented towards numerical work.
5759

5860
Numba compiles functions to native machine code instructions during runtime.
@@ -62,6 +64,14 @@ When it succeeds, Numba will be on par with machine code from low-level language
6264
In addition, Numba can do other useful tricks, such as {ref}`multithreading` or
6365
interfacing with GPUs (through `numba.cuda`).
6466

67+
Numba's JIT compiler is in many ways similar to the JIT compiler in Julia.
68+
69+
The main difference is that it is less ambitious, attempting to compile a smaller subset of the language.
70+
71+
Although this might sound like a deficiency, it is in some ways an advantage.
72+
73+
Numba is lean, easy to use, and very good at what it does.
74+
6575
This lecture introduces the core ideas.
6676

6777
(numba_link)=
@@ -74,10 +84,10 @@ This lecture introduces the core ideas.
7484
(quad_map_eg)=
7585
### An Example
7686

77-
Let's consider a problem that's difficult to vectorize: generating the
78-
trajectory of a difference equation given an initial condition.
87+
Let's consider a problem that's difficult to vectorize (i.e., hand off to array
88+
processing operations).
7989

80-
We will take the difference equation to be the quadratic map
90+
The problem involves generating the trajectory via the quadratic map
8191

8292
$$
8393
x_{t+1} = \alpha x_t (1 - x_t)
@@ -168,22 +178,23 @@ The basic idea is this:
168178
* Python is very flexible and hence we could call the function qm with many
169179
types.
170180
* e.g., `x0` could be a NumPy array or a list, `n` could be an integer or a float, etc.
171-
* This makes it hard to *pre*-compile the function (i.e., compile before runtime).
172-
* However, when we do actually call the function, say by running `qm(0.5, 10)`,
173-
the types of `x0` and `n` become clear.
181+
* This makes it very difficult to generate efficient machine code *ahead of time* (i.e., before runtime).
182+
* However, when we do actually *call* the function, say by running `qm(0.5, 10)`,
183+
the types of `x0` and `n` become clear.
174184
* Moreover, the types of *other variables* in `qm` *can be inferred once the input types are known*.
175-
* So the strategy of Numba and other JIT compilers is to wait until this
176-
moment, and then compile the function.
185+
* So the strategy of Numba and other JIT compilers is to *wait until the function is called*, and then compile.
177186

178-
That's why it is called "just-in-time" compilation.
187+
That's why it is called "just-in-time" compilation.
179188

180-
Note that, if you make the call `qm(0.5, 10)` and then follow it with `qm(0.9, 20)`, compilation only takes place on the first call.
189+
Note that, if you make the call `qm(0.5, 10)` and then follow it with `qm(0.9,
190+
20)`, compilation only takes place on the first call.
181191

182192
This is because compiled code is cached and reused as required.
183193

184194
This is why, in the code above, `time3` is smaller than `time2`.
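
The "compile on first call, then reuse" behavior described above can be imitated in plain Python. The following is a toy analogy only, not how Numba is implemented: `toy_jit`, `qm_last`, and `compile_log` are hypothetical names introduced for illustration, and no machine code is generated. The sketch assumes that a compiled version is stored per combination of argument types.

```python
def toy_jit(func):
    """Toy stand-in for a JIT decorator (illustrative only, not Numba's design)."""
    cache = {}        # maps a type signature to its "compiled" callable
    compile_log = []  # records each signature that triggered a "compile"

    def wrapper(*args):
        signature = tuple(type(a) for a in args)
        if signature not in cache:
            # A real JIT compiler would generate machine code here;
            # this toy version simply stores the original function.
            compile_log.append(signature)
            cache[signature] = func
        return cache[signature](*args)

    wrapper.compile_log = compile_log
    return wrapper


@toy_jit
def qm_last(x0, n, α=4.0):
    """Return the final value of the quadratic map after n steps."""
    x = x0
    for _ in range(n):
        x = α * x * (1 - x)
    return x


qm_last(0.5, 10)  # first call with (float, int): triggers a "compile"
qm_last(0.9, 20)  # same argument types: the cached version is reused
qm_last(1, 10)    # new signature (int, int): triggers another "compile"

print(len(qm_last.compile_log))  # number of "compilations" that occurred
```

Three calls produce only two entries in `compile_log`, because the second call has the same argument types as the first.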
185195

186196

197+
187198
## Decorator Notation
188199

189200
In the code above we created a JIT compiled version of `qm` via the call
@@ -226,30 +237,22 @@ with qe.Timer(precision=4):
226237

227238
Numba also provides several arguments for decorators to accelerate computation and cache functions -- see [here](https://numba.readthedocs.io/en/stable/user/performance-tips.html).
228239

240+
229241
## Type Inference
230242

231243
Successful type inference is a key part of JIT compilation.
232244

233-
As you can imagine, inferring types is easier for simple Python objects (e.g., simple scalar data types such as floats and integers).
245+
As you can imagine, inferring types is easier for simple Python objects (e.g.,
246+
simple scalar data types such as floats and integers).
234247

235248
Numba also plays well with NumPy arrays, which have well-defined types.
236249

237250
In an ideal setting, Numba can infer all necessary type information.
238251

239-
This allows it to generate native machine code, without having to call the Python runtime environment.
252+
This allows it to generate efficient native machine code, without having to call the Python runtime environment.
240253

241254
When Numba cannot infer all type information, it will raise an error.
242255

243-
```{note}
244-
In older versions of Numba, the `@jit` decorator would silently fall back
245-
to "object mode" when it could not infer all types, which provided little or
246-
no speed gain. Current versions of Numba use `nopython` mode by default,
247-
meaning the compiler insists on full type inference and raises an error if
248-
it fails. You will often see `@njit` used in other code, which is simply
249-
an alias for `@jit(nopython=True)`. Since nopython mode is now the default,
250-
`@jit` and `@njit` are equivalent.
251-
```
252-
253256
For example, in the (artificial) setting below, Numba is unable to determine the type of function `mean` when compiling the function `bootstrap`
254257

255258
```{code-cell} ipython3
@@ -297,9 +300,7 @@ Let's add some cautionary notes.
297300
As we've seen, Numba needs to infer type information on
298301
all variables to generate fast machine-level instructions.
299302

300-
For simple routines, Numba infers types very well.
301-
302-
For larger ones, or for routines using external libraries, it can easily fail.
303+
For large routines or those using external libraries, this process can easily fail.
303304

304305
Hence, it's best to focus on speeding up small, time-critical snippets of code.
305306

@@ -333,32 +334,14 @@ function.
333334

334335
When Numba compiles machine code for functions, it treats global variables as constants to ensure type stability.
335336

336-
### Caching Compiled Code
337-
338-
By default, Numba recompiles functions each time a new Python session starts.
339-
340-
To avoid this overhead, you can pass `cache=True` to the decorator:
341-
342-
```{code-cell} ipython3
343-
@jit(cache=True)
344-
def qm(x0, n):
345-
x = np.empty(n+1)
346-
x[0] = x0
347-
for t in range(n):
348-
x[t+1] = α * x[t] * (1 - x[t])
349-
return x
350-
```
351-
352-
This stores the compiled code on disk so that subsequent sessions can skip
353-
the compilation step.
354337

355338
(multithreading)=
356339
## Multithreaded Loops in Numba
357340

358-
In addition to JIT compilation, Numba provides support for parallel computing on CPUs.
341+
In addition to JIT compilation, Numba provides support for parallel computing on CPUs and GPUs.
359342

360-
The key tool for parallelization in Numba is the `prange` function, which tells
361-
Numba to execute loop iterations in parallel across available CPU cores.
343+
The key tool for parallelization on CPUs in Numba is the `prange` function, which tells
344+
Numba to execute loop iterations in parallel across available cores.
362345

363346
To illustrate, let's look first at a simple, single-threaded (i.e., non-parallelized) piece of code.
364347

@@ -418,27 +401,10 @@ Now let's suppose that we have a large population of households and we want to
418401
know what median wealth will be.
419402

420403
This is not easy to solve with pencil and paper, so we will use simulation
421-
instead.
422-
423-
In particular, we will simulate a large number of households and then
424-
calculate median wealth for this group.
425-
426-
Suppose we are interested in the long-run average of this median over time.
427-
428-
For the specification that we've chosen above, we can
429-
calculate this by taking a one-period cross-sectional snapshot of median
430-
wealth of the group at the end of a long simulation.
431-
432-
Moreover, provided the simulation period is long enough, initial conditions don't matter.
433-
434-
(This is due to [ergodicity](https://python.quantecon.org/finite_markov.html#id15).)
435-
436-
So, in summary, we are going to simulate 50,000 households by
437-
438-
1. arbitrarily setting initial wealth to 1 and
439-
1. simulating forward in time for 1,000 periods.
404+
instead:
440405

441-
Then we'll calculate median wealth at the end period.
406+
1. Simulate a large number of households forward in time
407+
2. Calculate median wealth
442408

443409
Here's the code:
444410

@@ -492,6 +458,8 @@ with qe.Timer():
492458

493459
The speed-up is significant.
494460

461+
Notice that we parallelize across households rather than over time -- updates of
462+
an individual household across time periods are inherently sequential.
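
The loop structure behind this choice can be sketched in pure Python. This is a minimal illustration under stated assumptions, not the lecture's code: the function names, the update rule, and its parameter values are hypothetical, and the loops run serially here. Under Numba, the outer loop would be the natural candidate for `prange`.

```python
import math
import random
import statistics


def update(w, rng, r=0.1, s=0.3, v1=0.1, v2=1.0):
    """One-period wealth update (hypothetical rule and parameters)."""
    R = math.exp(v1 * rng.gauss(0, 1)) * (1 + r)  # gross return shock
    y = math.exp(v2 * rng.gauss(0, 1))            # income shock
    return R * s * w + y


def median_wealth(num_households=500, T=200, w0=1.0, seed=0):
    final_wealth = []
    # Outer loop: households do not interact, so iterations are independent
    # and could run in parallel.
    for i in range(num_households):
        rng = random.Random(seed + i)  # separate random stream per household
        w = w0
        # Inner loop: wealth at t+1 depends on wealth at t, so these steps
        # must run in order.
        for t in range(T):
            w = update(w, rng)
        final_wealth.append(w)
    return statistics.median(final_wealth)


print(median_wealth())
```

Each household gets its own random number generator so that the result does not depend on the order in which households are processed.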
495463

496464
## Exercises
497465

@@ -550,8 +518,9 @@ So we get a speed gain of 2 orders of magnitude by adding four characters.
550518
:label: speed_ex2
551519
```
552520

553-
In the [Introduction to Quantitative Economics with Python](https://intro.quantecon.org/intro.html) lecture series you can
554-
learn all about finite-state Markov chains.
521+
In the [Introduction to Quantitative Economics with
522+
Python](https://intro.quantecon.org/intro.html) lecture series you can learn all
523+
about finite-state Markov chains.
555524

556525
For now, let's just concentrate on simulating a very simple example of such a chain.
557526
