-
Notifications
You must be signed in to change notification settings - Fork 0
Expand file tree
/
Copy pathindex.html
More file actions
740 lines (535 loc) · 37.9 KB
/
index.html
File metadata and controls
740 lines (535 loc) · 37.9 KB
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
389
390
391
392
393
394
395
396
397
398
399
400
401
402
403
404
405
406
407
408
409
410
411
412
413
414
415
416
417
418
419
420
421
422
423
424
425
426
427
428
429
430
431
432
433
434
435
436
437
438
439
440
441
442
443
444
445
446
447
448
449
450
451
452
453
454
455
456
457
458
459
460
461
462
463
464
465
466
467
468
469
470
471
472
473
474
475
476
477
478
479
480
481
482
483
484
485
486
487
488
489
490
491
492
493
494
495
496
497
498
499
500
501
502
503
504
505
506
507
508
509
510
511
512
513
514
515
516
517
518
519
520
521
522
523
524
525
526
527
528
529
530
531
532
533
534
535
536
537
538
539
540
541
542
543
544
545
546
547
548
549
550
551
552
553
554
555
556
557
558
559
560
561
562
563
564
565
566
567
568
569
570
571
572
573
574
575
576
577
578
579
580
581
582
583
584
585
586
587
588
589
590
591
592
593
594
595
596
597
598
599
600
601
602
603
604
605
606
607
608
609
610
611
612
613
614
615
616
617
618
619
620
621
622
623
624
625
626
627
628
629
630
631
632
633
634
635
636
637
638
639
640
641
642
643
644
645
646
647
648
649
650
651
652
653
654
655
656
657
658
659
660
661
662
663
664
665
666
667
668
669
670
671
672
673
674
675
676
677
678
679
680
681
682
683
684
685
686
687
688
689
690
691
692
693
694
695
696
697
698
699
700
701
702
703
704
705
706
707
708
709
710
711
712
713
714
715
716
717
718
719
720
721
722
723
724
725
726
727
728
729
730
731
732
733
734
735
736
737
738
739
740
<!DOCTYPE html>
<!--[if IEMobile 7 ]><html class="no-js iem7"><![endif]-->
<!--[if lt IE 9]><html class="no-js lte-ie8"><![endif]-->
<!--[if (gt IE 8)|(gt IEMobile 7)|!(IEMobile)|!(IE)]><!--><html class="no-js" lang="en"><!--<![endif]-->
<head>
<meta charset="utf-8">
<title>DoubleMap Engineering</title>
<meta name="author" content="The DoubleMap team">
<meta name="description" content="Python is slow. It’s still faster than a lot of other languages, but
describing Python code as “fast” will probably get some weird …">
<!-- http://t.co/dKP3o1e -->
<meta name="HandheldFriendly" content="True">
<meta name="MobileOptimized" content="320">
<meta name="viewport" content="width=device-width, initial-scale=1">
<link rel="canonical" href="http://doublemap.github.io">
<link href="/favicon.png" rel="icon">
<link href="/stylesheets/screen.css" media="screen, projection" rel="stylesheet" type="text/css">
<link href="/atom.xml" rel="alternate" title="DoubleMap Engineering" type="application/atom+xml">
<script src="/javascripts/modernizr-2.0.js"></script>
<script src="//ajax.googleapis.com/ajax/libs/jquery/1.9.1/jquery.min.js"></script>
<script>!window.jQuery && document.write(unescape('%3Cscript src="./javascripts/libs/jquery.min.js"%3E%3C/script%3E'))</script>
<script src="/javascripts/octopress.js" type="text/javascript"></script>
<!--Fonts from Google"s Web font directory at http://google.com/webfonts -->
<link href="//fonts.googleapis.com/css?family=PT+Serif:regular,italic,bold,bolditalic" rel="stylesheet" type="text/css">
<link href="//fonts.googleapis.com/css?family=PT+Sans:regular,italic,bold,bolditalic" rel="stylesheet" type="text/css">
<script type="text/javascript">
var _gaq = _gaq || [];
_gaq.push(['_setAccount', 'UA-50542806-1']);
_gaq.push(['_trackPageview']);
(function() {
var ga = document.createElement('script'); ga.type = 'text/javascript'; ga.async = true;
ga.src = ('https:' == document.location.protocol ? 'https://ssl' : 'http://www') + '.google-analytics.com/ga.js';
var s = document.getElementsByTagName('script')[0]; s.parentNode.insertBefore(ga, s);
})();
</script>
</head>
<body >
<header role="banner"><hgroup>
<h1><a href="/">DoubleMap Engineering</a></h1>
<h2>The official tech blog of DoubleMap</h2>
</hgroup>
</header>
<nav role="navigation"><ul class="subscription" data-subscription="rss">
<li><a href="/atom.xml" rel="subscribe-rss" title="subscribe via RSS">RSS</a></li>
</ul>
<form action="https://www.google.com/search" method="get">
<fieldset role="search">
<input type="hidden" name="q" value="site:doublemap.github.io" />
<input class="search" type="text" name="q" results="0" placeholder="Search"/>
</fieldset>
</form>
<ul class="main-navigation">
<li><a href="/">Blog</a></li>
<li><a href="/blog/archives">Archives</a></li>
</ul>
</nav>
<div id="main">
<div id="content">
<div class="blog-index">
<article>
<header>
<h1 class="entry-title"><a href="/blog/2015/05/29/optimizing-python/">Optimizing Python With Cython</a></h1>
<p class="meta">
<time datetime="2015-05-29T11:09:00-04:00" pubdate data-updated="true">May 29<span>th</span>, 2015</time>
</p>
</header>
<div class="entry-content"><p>Python is slow. It’s still faster than a lot of other languages, but
describing Python code as “fast” will probably get some weird looks from
people who use compiled or JITted languages.</p>
<p>Still, I love Python for its design and speed at which I can write good code.
Most of the time, when writing Python for distributed or web applications, it
doesn’t matter. Your runtime is usually dominated by network traffic or data
accesses than it is by raw computational speed. In those cases, the correct
optimization is often reducing network requests or indexing your data.</p>
<p>Sometimes—just occasionally—we actually do need fast computation.
Traditional Python wisdom has been to profile your code (perhaps using
cProfile), identify the hot spot, and rewrite it in C as a Python extension.
I’ve had to do that before for a handwriting recognition algorithm with great
success (it turned a 2–3 second computation into a practically
real-time one), but writing C code and then figuring out the FFI was a pain.
I had to rewrite the algorithm in C, verify that my C version was correct and
didn’t have memory errors, and then figure out the right incantations to get
it to cooperate with Python.</p>
<p>Let’s try something different this time.</p>
<h2>Problem</h2>
<p>At DoubleMap, geographic distance calculation is something that gets used a
lot, and we use the haversine function to calculate distance between two
points on Earth. In some of our data analysis, we might run the haversine
function millions of times in a tight loop, and it was causing some reports to
take way too long.</p>
<p>Profiling the code showed that the two heaviest operations were fetching data
from the data store and the haversine function.</p>
<p><strong>Play along at home:</strong> Clone the repo at
<a href="https://github.com/doublemap/haversine-optimization">https://github.com/doublemap/haversine-optimization</a> and start from the first
commit to see how the code changes step by step.</p>
<p>The commands you should know are <code>pip install cython</code> and <code>python setup.py
build_ext --inplace</code>.</p>
<h2>Original code</h2>
<figure class='code'><div class="highlight"><table><tr><td class="gutter"><pre class="line-numbers"><span class='line-number'>1</span>
<span class='line-number'>2</span>
<span class='line-number'>3</span>
<span class='line-number'>4</span>
<span class='line-number'>5</span>
<span class='line-number'>6</span>
<span class='line-number'>7</span>
<span class='line-number'>8</span>
<span class='line-number'>9</span>
<span class='line-number'>10</span>
<span class='line-number'>11</span>
<span class='line-number'>12</span>
<span class='line-number'>13</span>
<span class='line-number'>14</span>
<span class='line-number'>15</span>
<span class='line-number'>16</span>
<span class='line-number'>17</span>
<span class='line-number'>18</span>
<span class='line-number'>19</span>
<span class='line-number'>20</span>
<span class='line-number'>21</span>
<span class='line-number'>22</span>
</pre></td><td class='code'><pre><code class='python'><span class='line'><span class="kn">from</span> <span class="nn">math</span> <span class="kn">import</span> <span class="n">sin</span><span class="p">,</span> <span class="n">cos</span><span class="p">,</span> <span class="n">acos</span>
</span><span class='line'>
</span><span class='line'>
</span><span class='line'><span class="k">def</span> <span class="nf">haversine</span><span class="p">(</span><span class="n">coord1</span><span class="p">,</span> <span class="n">coord2</span><span class="p">):</span>
</span><span class='line'> <span class="sd">"""Given two (lat, lng) tuples, returns the distance between them in</span>
</span><span class='line'><span class="sd"> meters."""</span>
</span><span class='line'> <span class="n">lat1</span><span class="p">,</span> <span class="n">lng1</span> <span class="o">=</span> <span class="n">coord1</span>
</span><span class='line'> <span class="n">lat2</span><span class="p">,</span> <span class="n">lng2</span> <span class="o">=</span> <span class="n">coord2</span>
</span><span class='line'>
</span><span class='line'> <span class="k">if</span> <span class="n">lat1</span> <span class="o">></span> <span class="mi">90</span> <span class="ow">or</span> <span class="n">lat1</span> <span class="o"><</span> <span class="o">-</span><span class="mi">90</span> <span class="ow">or</span> <span class="n">lat2</span> <span class="o">></span> <span class="mi">90</span> <span class="ow">or</span> <span class="n">lat2</span> <span class="o"><</span> <span class="o">-</span><span class="mi">90</span><span class="p">:</span>
</span><span class='line'> <span class="k">raise</span> <span class="ne">ValueError</span><span class="p">(</span><span class="s">"Invalid latitude (should be between +/- 90)"</span><span class="p">)</span>
</span><span class='line'> <span class="k">if</span> <span class="n">lng1</span> <span class="o">></span> <span class="mi">180</span> <span class="ow">or</span> <span class="n">lng1</span> <span class="o"><</span> <span class="o">-</span><span class="mi">180</span> <span class="ow">or</span> <span class="n">lng2</span> <span class="o">></span> <span class="mi">180</span> <span class="ow">or</span> <span class="n">lng2</span> <span class="o"><</span> <span class="o">-</span><span class="mi">180</span><span class="p">:</span>
</span><span class='line'> <span class="k">raise</span> <span class="ne">ValueError</span><span class="p">(</span><span class="s">"Invalid longitude (should be between +/- 180)"</span><span class="p">)</span>
</span><span class='line'>
</span><span class='line'> <span class="n">phi1</span> <span class="o">=</span> <span class="p">(</span><span class="mf">90.0</span> <span class="o">-</span> <span class="n">lat1</span><span class="p">)</span> <span class="o">*</span> <span class="mf">0.0174532925</span>
</span><span class='line'> <span class="n">phi2</span> <span class="o">=</span> <span class="p">(</span><span class="mf">90.0</span> <span class="o">-</span> <span class="n">lat2</span><span class="p">)</span> <span class="o">*</span> <span class="mf">0.0174532925</span>
</span><span class='line'> <span class="n">theta1</span> <span class="o">=</span> <span class="n">lng1</span> <span class="o">*</span> <span class="mf">0.0174532925</span>
</span><span class='line'> <span class="n">theta2</span> <span class="o">=</span> <span class="n">lng2</span> <span class="o">*</span> <span class="mf">0.0174532925</span>
</span><span class='line'>
</span><span class='line'> <span class="n">c</span> <span class="o">=</span> <span class="p">(</span><span class="n">sin</span><span class="p">(</span><span class="n">phi1</span><span class="p">)</span> <span class="o">*</span> <span class="n">sin</span><span class="p">(</span><span class="n">phi2</span><span class="p">)</span> <span class="o">*</span> <span class="n">cos</span><span class="p">(</span><span class="n">theta1</span> <span class="o">-</span> <span class="n">theta2</span><span class="p">)</span> <span class="o">+</span> <span class="n">cos</span><span class="p">(</span><span class="n">phi1</span><span class="p">)</span> <span class="o">*</span> <span class="n">cos</span><span class="p">(</span><span class="n">phi2</span><span class="p">))</span>
</span><span class='line'> <span class="n">arc</span> <span class="o">=</span> <span class="n">acos</span><span class="p">(</span><span class="n">c</span><span class="p">)</span>
</span><span class='line'> <span class="k">return</span> <span class="n">arc</span> <span class="o">*</span> <span class="mf">6367444.7</span>
</span></code></pre></td></tr></table></div></figure>
<p>Also, here’s the benchmark code that is used throughout:</p>
<figure class='code'><div class="highlight"><table><tr><td class="gutter"><pre class="line-numbers"><span class='line-number'>1</span>
<span class='line-number'>2</span>
<span class='line-number'>3</span>
<span class='line-number'>4</span>
<span class='line-number'>5</span>
<span class='line-number'>6</span>
</pre></td><td class='code'><pre><code class='python'><span class='line'><span class="kn">from</span> <span class="nn">__future__</span> <span class="kn">import</span> <span class="n">print_function</span>
</span><span class='line'><span class="kn">from</span> <span class="nn">timeit</span> <span class="kn">import</span> <span class="n">timeit</span>
</span><span class='line'>
</span><span class='line'><span class="k">print</span><span class="p">(</span><span class="n">timeit</span><span class="p">(</span><span class="s">"haversine((39.132213, -86.12439), (38.55213, -86.94910))"</span><span class="p">,</span>
</span><span class='line'> <span class="n">setup</span><span class="o">=</span><span class="s">"from haversine import haversine"</span><span class="p">,</span>
</span><span class='line'> <span class="n">number</span><span class="o">=</span><span class="mi">300000</span><span class="p">))</span>
</span></code></pre></td></tr></table></div></figure>
<p>Pretty straightforward. Let’s try pre-compiling this code into a C extension
to see if that helps.</p>
<p>Following the
<a href="http://docs.cython.org/src/tutorial/cython_tutorial.html">Cython Basic Tutorial</a>,
I renamed <code>haversine.py</code> to <code>haversine.pyx</code> and created a <code>setup.py</code> file:</p>
<figure class='code'><div class="highlight"><table><tr><td class="gutter"><pre class="line-numbers"><span class='line-number'>1</span>
<span class='line-number'>2</span>
<span class='line-number'>3</span>
<span class='line-number'>4</span>
<span class='line-number'>5</span>
<span class='line-number'>6</span>
</pre></td><td class='code'><pre><code class='python'><span class='line'><span class="kn">from</span> <span class="nn">distutils.core</span> <span class="kn">import</span> <span class="n">setup</span>
</span><span class='line'><span class="kn">from</span> <span class="nn">Cython.Build</span> <span class="kn">import</span> <span class="n">cythonize</span>
</span><span class='line'>
</span><span class='line'><span class="n">setup</span><span class="p">(</span>
</span><span class='line'> <span class="n">ext_modules</span> <span class="o">=</span> <span class="n">cythonize</span><span class="p">(</span><span class="s">"haversine.pyx"</span><span class="p">)</span>
</span><span class='line'><span class="p">)</span>
</span></code></pre></td></tr></table></div></figure>
<p>Running <code>python setup.py build_ext --inplace</code> built <code>haversine.so</code> out of my
unmodified Python code. Easy-peasy.</p>
<p>Benchmark results (300,000 iterations):</p>
<p><strong>Original code:</strong> 2.85 seconds<br/>
<strong>Compiled:</strong> 2.01 seconds</p>
<p>So, 29% time savings without any modification to the original code.</p>
<h2>C math functions</h2>
<p>Currently, we’re still using Python’s math functions. Let’s see if we can cut
out some overhead by importing <code>math.h</code> instead.</p>
<p>Except, since this is Cython, all we need to do is:</p>
<figure class='code'><div class="highlight"><table><tr><td class="gutter"><pre class="line-numbers"><span class='line-number'>1</span>
<span class='line-number'>2</span>
<span class='line-number'>3</span>
<span class='line-number'>4</span>
<span class='line-number'>5</span>
</pre></td><td class='code'><pre><code class='diff'><span class='line'><span class="gd">--- a/haversine.pyx</span>
</span><span class='line'><span class="gi">+++ b/haversine.pyx</span>
</span><span class='line'><span class="gu">@@ -1,4 +1,4 @@</span>
</span><span class='line'><span class="gd">-from math import sin, cos, acos</span>
</span><span class='line'><span class="gi">+from libc.math cimport sin, cos, acos</span>
</span></code></pre></td></tr></table></div></figure>
<p>Benchmark results (300,000 iterations):</p>
<p><strong>Original code:</strong> 2.85 seconds<br/>
<strong>Compiled:</strong> 2.01 seconds<br/>
<strong>libc.math:</strong> 1.33 seconds</p>
<p>So far we’ve saved 53% of the time from the original code.</p>
<h2>C types</h2>
<p>Cython’s biggest extension of the Python language is its type annotations.
Speed ups can be had by telling Cython the types of each variable in advance.
We are dealing with geographic coordinates, so we’ll be using doubles for
pretty much everything:</p>
<figure class='code'><div class="highlight"><table><tr><td class="gutter"><pre class="line-numbers"><span class='line-number'>1</span>
<span class='line-number'>2</span>
<span class='line-number'>3</span>
<span class='line-number'>4</span>
<span class='line-number'>5</span>
<span class='line-number'>6</span>
<span class='line-number'>7</span>
<span class='line-number'>8</span>
<span class='line-number'>9</span>
<span class='line-number'>10</span>
<span class='line-number'>11</span>
<span class='line-number'>12</span>
<span class='line-number'>13</span>
<span class='line-number'>14</span>
<span class='line-number'>15</span>
<span class='line-number'>16</span>
<span class='line-number'>17</span>
<span class='line-number'>18</span>
<span class='line-number'>19</span>
<span class='line-number'>20</span>
<span class='line-number'>21</span>
<span class='line-number'>22</span>
<span class='line-number'>23</span>
<span class='line-number'>24</span>
<span class='line-number'>25</span>
<span class='line-number'>26</span>
<span class='line-number'>27</span>
<span class='line-number'>28</span>
<span class='line-number'>29</span>
<span class='line-number'>30</span>
</pre></td><td class='code'><pre><code class='python'><span class='line'><span class="k">def</span> <span class="nf">haversine</span><span class="p">(</span><span class="nb">tuple</span> <span class="n">coord1</span><span class="p">,</span> <span class="nb">tuple</span> <span class="n">coord2</span><span class="p">):</span>
</span><span class='line'> <span class="sd">"""Given two (lat, lng) tuples, returns the distance between them in</span>
</span><span class='line'><span class="sd"> meters."""</span>
</span><span class='line'> <span class="n">cdef</span> <span class="n">double</span> <span class="n">lat1</span>
</span><span class='line'> <span class="n">cdef</span> <span class="n">double</span> <span class="n">lng1</span>
</span><span class='line'> <span class="n">cdef</span> <span class="n">double</span> <span class="n">lat2</span>
</span><span class='line'> <span class="n">cdef</span> <span class="n">double</span> <span class="n">lng2</span>
</span><span class='line'> <span class="n">lat1</span><span class="p">,</span> <span class="n">lng1</span> <span class="o">=</span> <span class="n">coord1</span>
</span><span class='line'> <span class="n">lat2</span><span class="p">,</span> <span class="n">lng2</span> <span class="o">=</span> <span class="n">coord2</span>
</span><span class='line'>
</span><span class='line'> <span class="k">if</span> <span class="n">lat1</span> <span class="o">></span> <span class="mi">90</span> <span class="ow">or</span> <span class="n">lat1</span> <span class="o"><</span> <span class="o">-</span><span class="mi">90</span> <span class="ow">or</span> <span class="n">lat2</span> <span class="o">></span> <span class="mi">90</span> <span class="ow">or</span> <span class="n">lat2</span> <span class="o"><</span> <span class="o">-</span><span class="mi">90</span><span class="p">:</span>
</span><span class='line'> <span class="k">raise</span> <span class="ne">ValueError</span><span class="p">(</span><span class="s">"Invalid latitude (should be between +/- 90)"</span><span class="p">)</span>
</span><span class='line'> <span class="k">if</span> <span class="n">lng1</span> <span class="o">></span> <span class="mi">180</span> <span class="ow">or</span> <span class="n">lng1</span> <span class="o"><</span> <span class="o">-</span><span class="mi">180</span> <span class="ow">or</span> <span class="n">lng2</span> <span class="o">></span> <span class="mi">180</span> <span class="ow">or</span> <span class="n">lng2</span> <span class="o"><</span> <span class="o">-</span><span class="mi">180</span><span class="p">:</span>
</span><span class='line'> <span class="k">raise</span> <span class="ne">ValueError</span><span class="p">(</span><span class="s">"Invalid longitude (should be between +/- 180)"</span><span class="p">)</span>
</span><span class='line'>
</span><span class='line'> <span class="n">cdef</span> <span class="n">double</span> <span class="n">ph1</span>
</span><span class='line'> <span class="n">cdef</span> <span class="n">double</span> <span class="n">ph2</span>
</span><span class='line'> <span class="n">cdef</span> <span class="n">double</span> <span class="n">theta1</span>
</span><span class='line'> <span class="n">cdef</span> <span class="n">double</span> <span class="n">theta2</span>
</span><span class='line'> <span class="n">cdef</span> <span class="n">double</span> <span class="n">c</span>
</span><span class='line'> <span class="n">cdef</span> <span class="n">double</span> <span class="n">arc</span>
</span><span class='line'>
</span><span class='line'> <span class="n">phi1</span> <span class="o">=</span> <span class="p">(</span><span class="mf">90.0</span> <span class="o">-</span> <span class="n">lat1</span><span class="p">)</span> <span class="o">*</span> <span class="mf">0.0174532925</span>
</span><span class='line'> <span class="n">phi2</span> <span class="o">=</span> <span class="p">(</span><span class="mf">90.0</span> <span class="o">-</span> <span class="n">lat2</span><span class="p">)</span> <span class="o">*</span> <span class="mf">0.0174532925</span>
</span><span class='line'> <span class="n">theta1</span> <span class="o">=</span> <span class="n">lng1</span> <span class="o">*</span> <span class="mf">0.0174532925</span>
</span><span class='line'> <span class="n">theta2</span> <span class="o">=</span> <span class="n">lng2</span> <span class="o">*</span> <span class="mf">0.0174532925</span>
</span><span class='line'>
</span><span class='line'> <span class="n">c</span> <span class="o">=</span> <span class="p">(</span><span class="n">sin</span><span class="p">(</span><span class="n">phi1</span><span class="p">)</span> <span class="o">*</span> <span class="n">sin</span><span class="p">(</span><span class="n">phi2</span><span class="p">)</span> <span class="o">*</span> <span class="n">cos</span><span class="p">(</span><span class="n">theta1</span> <span class="o">-</span> <span class="n">theta2</span><span class="p">)</span> <span class="o">+</span> <span class="n">cos</span><span class="p">(</span><span class="n">phi1</span><span class="p">)</span> <span class="o">*</span> <span class="n">cos</span><span class="p">(</span><span class="n">phi2</span><span class="p">))</span>
</span><span class='line'> <span class="n">arc</span> <span class="o">=</span> <span class="n">acos</span><span class="p">(</span><span class="n">c</span><span class="p">)</span>
</span><span class='line'> <span class="k">return</span> <span class="n">arc</span> <span class="o">*</span> <span class="mf">6367444.7</span>
</span></code></pre></td></tr></table></div></figure>
<p><strong>Original code:</strong> 2.85 seconds<br/>
<strong>Compiled:</strong> 2.01 seconds<br/>
<strong>libc.math:</strong> 1.33 seconds<br/>
<strong>C types:</strong> 0.466 seconds</p>
<p>In three easy steps, we’ve reduced the run time of this function by 84%. More
importantly, we’re still writing what’s basically Python. We don’t have to
remember to free our variables or check our array bounds. That additional
safety is not free (and there are various flags to disable Cython safety
checks) but with Cython, optimizing a Python function doesn’t have to be a
rewrite-everything task.</p>
<p>We probably could have gone further in optimizing this, but these speed-ups
got us into “good enough” territory, and that’s good enough.</p>
<h2>Distribution</h2>
<p>Our optimized haversine function is available on PyPI under the name
“cHaversine”. One key difference is that the distributed tarball includes the
C code generated by Cython so that you, as the package user, don’t need Cython
to download and build the C extension.</p>
<p><a href="https://news.ycombinator.com/item?id=9625737">Comments on Hacker News</a></p>
</div>
</article>
<article>
<header>
<h1 class="entry-title"><a href="/blog/2014/05/29/everyone-panic/">Everyone Panic! Almost Free Downtime Phone Alerts</a></h1>
<p class="meta">
<time datetime="2014-05-29T11:24:46-04:00" pubdate data-updated="true">May 29<span>th</span>, 2014</time>
</p>
</header>
<div class="entry-content"><p><a href="https://github.com/doublemap/everyonepanic">Uptime Robot (free) + App Engine (free) + Twilio (cheap) = almost free
downtime alerts!</a></p>
<ol>
<li>Uptime Robot notices that your website is not responding.</li>
<li>Everyone Panic! checks Uptime Robot and sees that something is down.</li>
<li>You get a call through Twilio telling you to panic.</li>
<li>You get another call 15 minutes later because you still haven’t fixed the
problem.</li>
</ol>
<p>Sure, it’s easy to get emailed when your website goes down, but real
honest-to-goodness phone calls have an immediacy that’s hard to beat.</p>
<p>Since our GPS bus-tracking system is used at all hours of the day, early on we
introduced automated monitoring. But when everyone is asleep, emails and text
messages go unnoticed until the morning and it sucks to not know where your
bus is at 1am.</p>
<p>We quickly whipped an app together using Python to tie
Uptime Robot and Twilio together. If one of the items in Uptime Robot goes
down then the Python app will ask Twilio to call us. We made it, put it on App
Engine, and haven’t touched it since. Even though our automated monitoring has
grown since then, this little app, with zero maintenance, still dutifully
watches Uptime Robot for any websites that are down.</p>
<p>Literally, there have been zero commits to the original repo since the first
day. Now the code’s been cleaned up and
<a href="https://github.com/doublemap/everyonepanic">put on GitHub</a>. You can download
it, configure it, and throw it onto App Engine or Heroku in a few minutes.</p>
<p>It’s not entirely free – you’ll need to pay Twilio for voice calls, but the
price is $0.02 per call and it’s hard to think of a situation where knowing
your website is down isn’t worth two cents. And, of course, you should keep
your Twilio account balance in the black.</p>
<p>But for the hobbyist who doesn’t mind trying out a new app, this is a really
simple way to have your phone ring when Hacker News tramples your latest pet
project.</p>
<p><a href="https://github.com/doublemap/everyonepanic">Everyone Panic! on GitHub</a></p>
<p><a href="https://news.ycombinator.com/item?id=7816241">Comments on Hacker News</a></p>
</div>
</article>
<article>
<header>
<h1 class="entry-title"><a href="/blog/2014/04/29/porting-600k-map-views-to-openstreetmap/">Porting 600k Map Views to OpenStreetMap/MapBox</a></h1>
<p class="meta">
<time datetime="2014-04-29T10:25:45-04:00" pubdate data-updated="true">Apr 29<span>th</span>, 2014</time>
</p>
</header>
<div class="entry-content"><p><img class="right" src="/images/doublemap-plus-mapbox.png">
This past week, we here at DoubleMap made a big change that users probably
<em>won’t</em> notice. Instead of Google Maps, all of our public maps are now powered
by MapBox with OpenStreetMap data.</p>
<p>We have hundreds of thousands of hits per month to our online maps from people
using our real-time bus tracking to find out where their bus is. This number
doesn’t even include transit riders using their agency’s custom app or public
TV displays.</p>
<p>For a company whose main feature is showing buses move on a map, it was a big
deal for everyone – we had to make sure that data at each of our client
transit agencies was good, update some of our sales material that specifically
mentioned Google Maps, decide on new providers, and figure out the overall
costs of switching and not
switching. We didn’t want to disrupt our transit agencies and the riders that
use our service to catch the bus every day. The actual coding changes were
relatively straightforward.</p>
<p>Thanks to MapBox, were able to switch everything over almost seamlessly
without interrupting our users or creating a jarring change.</p>
<h2>Evaluation</h2>
<p>It shouldn’t be a surprise that the main motivation was simply cost. Google
doesn’t publish its enterprise maps pricing, but it’s orders of magnitude more
expensive than MapBox.</p>
<p>We took a careful look at what we would have lost by switching away from
Google. The main feature that we can’t get anywhere else is Street View,
something that nobody has come close to replicating. We also lost out on
angled aerial imagery. Both of these are cool, but ultimately not worth the
price difference.</p>
<p><video width=592 style="width:592px" loop autoplay>
<source src="/images/iubdoublemap.mp4" type="video/mp4">
<source src="/images/iubdoublemap.webm" type="video/webm">
</video></p>
<h2>What’s in a map?</h2>
<p>Part of the thinking change that most people have to go through when switching
is that maps don’t always come in one giant bundle.
“Google Maps” can refer to everything from the JavaScript library to
the pop-ups with user reviews to the navigation to Street View. With
OpenStreetMap, you get your choice of JavaScript mapping engine and different
tile sets.</p>
<p>For example, you might choose to use <a href="http://leafletjs.com/">Leaflet</a> with
<a href="http://www.opencyclemap.org/">OpenCycleMap</a> and <a href="https://www.mapbox.com/blog/mapbox-satellite/">MapBox
Satellite</a> as your layers, or
you might use <a href="http://openlayers.org/">OpenLayers</a> with your own
<a href="http://tilestache.org/">tileserver</a>
plus the <a href="http://www.thunderforest.com/landscape/">Thunderforest Landscape</a>
map.</p>
<p><em>Everything is swappable.</em> When we were testing, it was a one-line change to
switch between MapQuest Open, CloudMade, and MapBox tiles.</p>
<p>We ended up using Leaflet with a custom MapBox style plus MapBox Satellite,
and it looks pretty awesome.</p>
<h2>Map styling</h2>
<p>One of the complaints levied against OpenStreetMap is that their map style
looks ugly. That’s not really fair, because OSM is there to provide the map
data, and most people who see OSM data see it through a custom map style, like
on FourSquare or Pinterest (or DoubleMap).</p>
<p>The power and curse of OSM is that you have total control over how the map
looks. The power is that if you want a bubble-gum-pink map with 36pt Comic Sans
labels and no roads, that’s easily doable.
The curse is that it can get complicated, such as when you decide that
you need to label each US highway using its state-specific shield image.
As TileMill matures, there will hopefully be more and more styles to build on
rather than starting from scratch.</p>
<p>One of the reasons why we initially went with CloudMade was because they had
an easy online map styler that let you point and click a feature, and then
change that feature class’s colors. This let us mostly duplicate our custom
style we were using with Google Maps. CloudMade, however, decided to
<a href="http://notes.ericjiang.com/posts/741">get out of non-enterprise services</a>
not too long after I had gotten the map to look the way I wanted it to.</p>
<p><em>Enter MapBox.</em> MapBox has a really slick desktop program, TileMill, that lets
you generate map tiles using a CSS-ish language called CartoCSS. The downside
of TileMill is that you have to download raw map data, render the tiles you
need (which easily gets into the gigabytes), and then upload them somewhere.
We have clients internationally and we would have had to download each
region’s OSM data and render new tiles every time we got another customer.</p>
<p>This is all fixed in <a href="https://github.com/mapbox/tm2">TM2</a>, which downloads
vector map tiles from MapBox
on the fly so you can upload just the compiled map style to MapBox and
they’ll handle all the data and rendering as needed.</p>
<p>Here’s our result:</p>
<p><img src="/images/osm-google-comparison.png"></p>
<p>Our custom style is designed to be muted with a limited color palette to let
our agencies’ custom route colors pop when overlaid. I hope to write about
TileMill 2 in the future, but right now it’s in “experimental” status.</p>
<h2>Porting the source code</h2>
<p>Our original front-end code was written to use v2 of the Google Maps JS API.
Once that was deprecated, we eventually rewrote it to use v3 of the Google
Maps API. When we started investigating OSM as a replacement, we noticed how
polished the Leaflet library was. The Leaflet API is based on v2 of the
Google Maps API so we were essentially doing the reverse of what we had
done in the past. All of the concepts mapped almost 1-to-1, and there was
nothing that Leaflet lacked in terms of functionality.</p>
<figure class='code'><div class="highlight"><table><tr><td class="gutter"><pre class="line-numbers"><span class='line-number'>1</span>
<span class='line-number'>2</span>
<span class='line-number'>3</span>
<span class='line-number'>4</span>
<span class='line-number'>5</span>
<span class='line-number'>6</span>
<span class='line-number'>7</span>
<span class='line-number'>8</span>
<span class='line-number'>9</span>
<span class='line-number'>10</span>
<span class='line-number'>11</span>
<span class='line-number'>12</span>
<span class='line-number'>13</span>
<span class='line-number'>14</span>
<span class='line-number'>15</span>
<span class='line-number'>16</span>
<span class='line-number'>17</span>
<span class='line-number'>18</span>
<span class='line-number'>19</span>
<span class='line-number'>20</span>
<span class='line-number'>21</span>
</pre></td><td class='code'><pre><code class='diff'><span class='line'><span class="gd">- r.polyline = new google.maps.Polyline({</span>
</span><span class='line'><span class="gd">- path: points,</span>
</span><span class='line'><span class="gd">- map: r.visible ? map : null,</span>
</span><span class='line'><span class="gd">- strokeColor: "#"+r.color,</span>
</span><span class='line'><span class="gd">- strokeOpacity: 0.6,</span>
</span><span class='line'><span class="gd">- strokeWeight: 4,</span>
</span><span class='line'><span class="gd">- zIndex: 1</span>
</span><span class='line'><span class="gd">- })</span>
</span><span class='line'><span class="gd">- google.maps.event.addListener(r.polyline, 'mouseover',</span>
</span><span class='line'><span class="gd">- function(routeId) {</span>
</span><span class='line'><span class="gi">+ r.polyline = new L.Polyline(points, {</span>
</span><span class='line'><span class="gi">+ color: "#"+r.color,</span>
</span><span class='line'><span class="gi">+ opacity: 0.6,</span>
</span><span class='line'><span class="gi">+ weight: 4</span>
</span><span class='line'><span class="gi">+ });</span>
</span><span class='line'><span class="gi">+ if(r.visible)</span>
</span><span class='line'><span class="gi">+ map.addLayer(r.polyline);</span>
</span><span class='line'><span class="gi">+</span>
</span><span class='line'><span class="gi">+ r.polyline.on('mouseover', function(routeId) {</span>
</span><span class='line'> return function () { highlightRoute(routeId, true); };
</span><span class='line'> }(r.id));
</span></code></pre></td></tr></table></div></figure>
<p>This above snippet shows how a bit of our polyline code changed.</p>
<p>One happy side effect of this conversion was that the mobile web page (for
users who can’t or don’t use our iPhone and Android apps) runs better.
Animating markers on Google Maps with a Chrome-on-Android user agent would
cause unbearable flickering. Leaflet runs smoothly on mobile.</p>
<h2>OpenStreetMap for the future</h2>
<p>We’re excited about joining the OpenStreetMap ecosystem. There’s an incredible
amount of work that’s been put into OSM by its contributors, and we hope to
contribute back what we can. One of our iPhone apps is already using MapBox on
top of Apple Maps, and OSM might be what bring sanity back to mobile maps.
We’re also looking at how we can use OSM data for geocoding, routing, and any
other geo needs we might have – saving money while helping an open community
grow is a win-win for the years to come.</p>
<p>You can check out one of live bus-tracking maps at
<a href="http://iub.doublemap.com/map/">iub.doublemap.com</a>.</p>
<p><a href="https://news.ycombinator.com/item?id=7675799">Comments on Hacker News</a></p>
</div>
</article>
<div class="pagination">
<a href="/blog/archives">Blog Archives</a>
</div>
</div>
<aside class="sidebar">
<section>
<h1><a href="http://www.doublemap.com/"><img style="border: 0; box-shadow: 0; border-radius: 0;-webkit-box-shadow:0" src="/images/doublemap-logo.png" /></a></h1>
<p>
DoubleMap provides real-time GPS systems for transit, including mobile apps,
passenger counters, voice announcements, and more. We’re a small, polyglot shop
located in Indianapolis, Indiana.
</p>
<p><a href="http://www.doublemap.com/">www.doublemap.com</a></p>
</section>
<section>
<h1>Recent Posts</h1>
<ul id="recent_posts">
<li class="post">
<a href="/blog/2015/05/29/optimizing-python/">Optimizing Python With Cython</a>
</li>
<li class="post">
<a href="/blog/2014/05/29/everyone-panic/">Everyone Panic! Almost Free Downtime Phone Alerts</a>
</li>
<li class="post">
<a href="/blog/2014/04/29/porting-600k-map-views-to-openstreetmap/">Porting 600k Map Views to OpenStreetMap/MapBox</a>
</li>
</ul>
</section>
</aside>
</div>
</div>
<footer role="contentinfo"><p>
Copyright © 2015 - The DoubleMap team -
<span class="credit">Powered by <a href="http://octopress.org">Octopress</a></span>
</p>
</footer>
<script type="text/javascript">
(function(){
var twitterWidgets = document.createElement('script');
twitterWidgets.type = 'text/javascript';
twitterWidgets.async = true;
twitterWidgets.src = '//platform.twitter.com/widgets.js';
document.getElementsByTagName('head')[0].appendChild(twitterWidgets);
})();
</script>
</body>
</html>