Lecture 3

Lecture 3 Python Performance Tips

Language Traits

Python is a dynamic, interpreted language.
The flexibility of dynamic languages comes at a cost in performance.
There are ways to improve the performance of Python, but pure Python code generally remains slower than equivalent code in a compiled language.

Timing Python Code

timeit module: timeit.timeit(my_function, number=100000). You only need to pass the function itself, i.e. its name without parentheses.
In IPython, use the %timeit -n <loops> (single line) or %%timeit -n <loops> (whole cell) magic functions. These require the function call itself, my_function().
%timeit my_function is different from %timeit my_function(): the first only looks up the name my_function, while the second actually calls the function.
Demo

In [1]: import timeit

In [2]: def my_function():
   ...:     y = 3.1415
   ...:     for x in range(100):
   ...:         y = y ** 0.7
   ...:     return y
   ...:

In [3]: print( timeit.timeit(my_function, number=100000) / 100000)
4.44680348346673e-05

In [4]: %timeit -n 100000 my_function()
100000 loops, best of 3: 21.8 µs per loop

Why do the two results differ? Hint: read the documentation.
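One documented difference is that timeit.timeit returns the total time of a single run of `number` calls, while the %timeit output above reports the best of 3 runs. The sketch below reproduces a "best of N" measurement in a plain script with timeit.repeat. The helper name best_per_call and the loop counts are illustrative choices, not part of the lecture demo.

```python
import timeit


def my_function():
    y = 3.1415
    for x in range(100):
        y = y ** 0.7
    return y


def best_per_call(func, number=10000, repeat=3):
    """Return the smallest average time per call, in seconds (hypothetical helper)."""
    # timeit.repeat runs the timing loop `repeat` times and returns a list
    # holding the total time of each run.
    totals = timeit.repeat(func, number=number, repeat=repeat)
    return min(totals) / number


if __name__ == "__main__":
    single_run = timeit.timeit(my_function, number=10000) / 10000
    print(f"single run, mean per call: {single_run:.3e} s")
    print(f"best of 3, mean per call:  {best_per_call(my_function):.3e} s")
```

Taking the minimum filters out runs that were disturbed by other activity on the machine, which is why the two numbers rarely match exactly.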

Built-in Functions

How can I speed up my function?
- Use built-in functions. Built-in functions are written in C, so they are generally very fast.
- Another performance improvement is to use intrinsic operators (+, -, *, etc.) instead of a user-defined function.
- Example 1: my_min(values) vs. min(values)
- Example 2: map(lambda x,y: x+y, random_numbers, random_numbers2) vs. map(operator.add, random_numbers, random_numbers2)
- Notice: Why do we convert the map to a list in the code below? In Python 3, the map function returns an iterator that does not evaluate its arguments until a value is needed. By converting the iterator to a list, we force map to compute every value.

In [7]: %timeit -n100000 my_min(random_numbers)
100000 loops, best of 3: 45.6 µs per loop

In [8]: %timeit -n100000 min(random_numbers)
100000 loops, best of 3: 28.4 µs per loop

In [12]: %timeit -n1 list(map(lambda x,y: x+y, random_numbers, random_numbers2))
1 loop, best of 3: 476 ms per loop

In [13]: import operator

In [14]: %timeit -n1 list(map(operator.add, random_numbers, random_numbers2))
1 loop, best of 3: 343 ms per loop
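The demo does not show how my_min or the random lists were defined, so the following is only a sketch of one possible setup. The body of my_min, the list sizes, and the seed are assumptions. It checks that both variants give the same answer before timing them.

```python
import operator
import random
import timeit


def my_min(values):
    """Hand-written minimum; an assumed stand-in for the my_min in the demo."""
    smallest = values[0]
    for value in values:
        if value < smallest:
            smallest = value
    return smallest


random.seed(0)  # fixed seed so repeated runs use the same data
random_numbers = [random.random() for _ in range(1000)]
random_numbers2 = [random.random() for _ in range(1000)]

# list(...) forces the lazy map iterator to compute every value.
lambda_sums = list(map(lambda x, y: x + y, random_numbers, random_numbers2))
operator_sums = list(map(operator.add, random_numbers, random_numbers2))

if __name__ == "__main__":
    candidates = {
        "my_min": lambda: my_min(random_numbers),
        "min": lambda: min(random_numbers),
        "map + lambda": lambda: list(
            map(lambda x, y: x + y, random_numbers, random_numbers2)),
        "map + operator.add": lambda: list(
            map(operator.add, random_numbers, random_numbers2)),
    }
    for label, func in candidates.items():
        seconds = timeit.timeit(func, number=1000) / 1000
        print(f"{label:>20}: {seconds:.3e} s per call")
```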

Function Call Overhead

Calling a function written in Python is relatively expensive, and noticeably more expensive than calling a built-in function. Much of the cost comes from work the interpreter does at run time on every call, such as the dynamic handling of the function arguments before and after the call.
It is possible to reduce this overhead without sacrificing readability.
Demo: See how changing a function-call-in-a-loop into a loop-in-a-function can influence performance.

In [16]: x = 0
    ...: def inner(i):
    ...:     global x
    ...:     x = x + i
    ...:
    ...: def outer_1():
    ...:     for i in range(100000):
    ...:         inner(i)
    ...:

In [17]: %timeit outer_1()
100 loops, best of 3: 22.8 ms per loop

In [19]: x = 0
    ...: def aggregate(values):
    ...:     global x
    ...:     for i in values:
    ...:         x = x + i
    ...:
    ...: def outer_2():
    ...:     aggregate(range(100000))
    ...:

In [20]: %timeit -n100 outer_2()
100 loops, best of 3: 13.9 ms per loop
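The same comparison can be sketched without a global variable, which makes the two variants easy to check against each other. The function names add, call_in_loop, and loop_in_function are hypothetical and do not appear in the demo above.

```python
import timeit


def add(total, i):
    return total + i


def call_in_loop(n):
    """One Python-level function call per iteration."""
    total = 0
    for i in range(n):
        total = add(total, i)
    return total


def loop_in_function(n):
    """The same work with the addition inlined into the loop."""
    total = 0
    for i in range(n):
        total = total + i
    return total


if __name__ == "__main__":
    n = 100000
    for func in (call_in_loop, loop_in_function):
        seconds = timeit.timeit(lambda: func(n), number=20) / 20
        print(f"{func.__name__:>18}: {seconds * 1e3:.2f} ms per call")
```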

Membership Testing

What is the best data type (dict, set, list, tuple, etc.) to use when searching for elements?
The in and not in operators are very fast on dict and set, because both are implemented with a hash table.
Checking for membership in a list or tuple is less efficient, because the elements are compared one by one.
Conclusion: if you need to check membership very often, use a dict or set as the container rather than a list.
Does this mean we need to convert a list to a set every time we search? Of course not. The conversion itself has to visit every element, so it only pays off when the set is built once and reused for many lookups.
Demo

In [26]: letters = [x for x in 'abcdefghijklmnopqrstuvwxyz'*1000+'%']

In [35]: %timeit -n1000 'a' in letters
1000 loops, best of 3: 163 ns per loop

In [36]: %timeit -n1000 '%' in letters
1000 loops, best of 3: 979 µs per loop

In [38]: letters = set('abcdefghijklmnopqrstuvwxyz'*1000+'%')

In [40]: %timeit -n1000 'a' in letters
1000 loops, best of 3: 190 ns per loop

In [41]: %timeit -n1000 '%' in letters
1000 loops, best of 3: 190 ns per loop
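To illustrate the point about not converting on every search, here is a minimal sketch that builds the set once and reuses it, next to the anti-pattern of rebuilding it for each query. The helper names and the query list are assumptions made for this example.

```python
import timeit

letters_list = list('abcdefghijklmnopqrstuvwxyz' * 1000 + '%')
letters_set = set(letters_list)  # one-time conversion, reused for every lookup


def count_found(queries, container):
    """Count how many queries are present in the container."""
    return sum(1 for q in queries if q in container)


def count_found_converting_each_time(queries, items):
    # Anti-pattern: set(items) visits every element again for each query.
    return sum(1 for q in queries if q in set(items))


queries = ['a', '%', '#', 'z'] * 25  # '#' is never present

if __name__ == "__main__":
    timings = {
        "list": lambda: count_found(queries, letters_list),
        "set built once": lambda: count_found(queries, letters_set),
        "set rebuilt per query": lambda: count_found_converting_each_time(
            queries, letters_list),
    }
    for label, func in timings.items():
        seconds = timeit.timeit(func, number=5) / 5
        print(f"{label:>22}: {seconds * 1e3:.3f} ms per batch")
```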

String Concatenation

Strings in Python are immutable, so each concatenation creates a new string.
As always, the built-in is faster. E.g. building a string with str_a += str_b inside a loop is slower than a single call to ''.join(list_of_strings). Since join takes an iterable as its argument, it effectively does the same thing as the loop.
In this case, a built-in method outperforms a built-in operator.
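A small sketch of the two approaches follows. The function names and the sample data are illustrative, and the size of the measured gap varies between interpreters and versions.

```python
import timeit


def concat_with_plus(parts):
    """Build the result by repeated concatenation."""
    result = ""
    for part in parts:
        result += part
    return result


def concat_with_join(parts):
    """Build the result with a single join over the iterable."""
    return "".join(parts)


parts = [str(i) for i in range(10000)]

if __name__ == "__main__":
    for func in (concat_with_plus, concat_with_join):
        seconds = timeit.timeit(lambda: func(parts), number=100) / 100
        print(f"{func.__name__:>18}: {seconds * 1e6:.1f} µs per call")
```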

Decorator Caching

Python decorators are commonly used for tracing, locking, or logging, among a variety of other purposes.
They can also be used to cache results that will be needed again later.
- The example below builds its wrapper with functools.wraps: https://docs.python.org/2/library/functools.html#functools.wraps

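from functools import wraps

def cache(f):
    results = {}  # maps argument tuples to previously computed results
    @wraps(f)     # keep the name and docstring of the wrapped function
    def wrap(*args):
        # Only positional, hashable arguments are supported.
        if args not in results:
            results[args] = f(*args)
        return results[args]
    return wrap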
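As a usage sketch, the decorator can be applied to a recursive function. The fib function and the calls counter are assumptions added here to make the effect of caching visible. The decorator is repeated so that the example runs on its own.

```python
from functools import wraps


def cache(f):
    results = {}

    @wraps(f)
    def wrap(*args):
        if args not in results:
            results[args] = f(*args)
        return results[args]
    return wrap


calls = 0  # counts how often the undecorated body actually runs


@cache
def fib(n):
    global calls
    calls += 1
    return n if n < 2 else fib(n - 1) + fib(n - 2)


print(fib(30))       # each value from fib(0) to fib(30) is computed once
print(calls)         # number of real evaluations
print(fib.__name__)  # preserved by functools.wraps
```

The standard library offers functools.lru_cache for the same purpose. The hand-written version is shown only because it makes the mechanism visible.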

Optimizing Loops

The dot operator (.), i.e. the attribute reference operator, does a lot of work to support dynamic attributes as well as multiple namespaces
- So, if possible, save the method reference in a variable, then use that variable to call the function inside the loop
Use local variables
- Another optimization for the loop version is to use local variables rather than global variables, as locals can be accessed much more efficiently in Python
- Notice: Optimizing code may result in performance improvements, but it often comes at the cost of readability
Use the map function to replace a loop
Use a list comprehension
- A list comprehension is faster because the interpreter can recognize the predictable looping pattern and optimize for it.
- Besides their syntactic benefit, list comprehensions are often as fast as or faster than the equivalent use of map
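The four loop styles listed above can be sketched as follows. The task (upper-casing a list of words) and all function names are assumptions chosen for illustration. Which variant is fastest depends on the Python version, so measure before committing to one.

```python
import timeit


def upper_loop(words):
    """Plain loop: result.append and str.upper are looked up on every iteration."""
    result = []
    for word in words:
        result.append(str.upper(word))
    return result


def upper_loop_local(words):
    """Same loop with the method references saved in local variables."""
    result = []
    append = result.append  # bound method looked up once
    upper = str.upper       # attribute looked up once
    for word in words:
        append(upper(word))
    return result


def upper_map(words):
    """map replaces the explicit loop; list() forces evaluation."""
    return list(map(str.upper, words))


def upper_comprehension(words):
    """List comprehension version."""
    return [word.upper() for word in words]


words = ["python", "performance", "tips"] * 1000

if __name__ == "__main__":
    for func in (upper_loop, upper_loop_local, upper_map, upper_comprehension):
        seconds = timeit.timeit(lambda: func(words), number=200) / 200
        print(f"{func.__name__:>20}: {seconds * 1e6:.1f} µs per call")
```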

Import Overhead

import statements can be used anywhere in a program.
Sometimes it's useful to place them inside functions to restrict their visibility and/or reduce initial startup time.
Although Python’s interpreter is optimized to not import the same module multiple times, repeatedly executing an import statement can seriously affect performance in some circumstances.

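# Note: string.lower() exists only in Python 2. Under Python 3, use the
# str.lower() method shown in func3.
def func1():
    import string  # the import statement is executed on every call
    string.lower('Python')

import string  # the import statement is executed once, at module load
def func2():
    string.lower('Python')

# Note that using string methods avoids the need to import at all, and runs even faster.
def func3():
    'Python'.lower()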
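Because string.lower is not available in Python 3, the comparison can be sketched with a module function that does exist there. The choice of math.sqrt and the function names are assumptions for this example. The module is loaded only once in both variants, so the measured difference is the cost of executing the import statement on every call.

```python
import math
import timeit


def sqrt_import_inside(x):
    import math  # statement runs on every call; the module is already loaded
    return math.sqrt(x)


def sqrt_import_outside(x):
    return math.sqrt(x)  # uses the module-level import


if __name__ == "__main__":
    for func in (sqrt_import_inside, sqrt_import_outside):
        seconds = timeit.timeit(lambda: func(16.0), number=100000) / 100000
        print(f"{func.__name__:>20}: {seconds * 1e9:.0f} ns per call")
```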
