
Lecture 3

JunjieW edited this page Feb 8, 2017 · 21 revisions

Lecture 3 Python Performance Tips

Language Traits

  • Python is a dynamic interpreted language.
  • The flexibility of dynamic languages comes at a cost in terms of performance.
  • There are ways to improve the performance of Python, but pure Python code will generally not be as fast as code written in a compiled language.

Timing Python Code

  • timeit module: timeit.timeit(my_function, number=100000). Pass the function object itself (its name, without parentheses). The return value is the total time in seconds for all number calls.
  • In IPython, use the %timeit -n <loops> or %%timeit -n <loops> magic functions. These require the function call itself, i.e. my_function().
  • %timeit my_function is different from %timeit my_function(): the first only looks up the name my_function, while the second actually calls the function.
  • Demo
In [1]: import timeit

In [2]: def my_function():
   ...:     y = 3.1415
   ...:     for x in range(100):
   ...:         y = y ** 0.7
   ...:     return y
   ...:

In [3]: print( timeit.timeit(my_function, number=100000) / 100000)
4.44680348346673e-05

In [4]: %timeit -n 100000 my_function()
100000 loops, best of 3: 21.8 µs per loop
  • Why do the two results differ? Hint: read the documentation.
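One way to explore this question outside IPython is to compare the mean of a single timing run with the best of several runs, using only the timeit module. This is a minimal sketch: my_function is copied from the demo above, and the number and repeat values are arbitrary choices made for this example.

```python
import timeit

def my_function():
    y = 3.1415
    for x in range(100):
        y = y ** 0.7
    return y

number = 2000  # calls per timing run (arbitrary; smaller than in the demo)

# timeit.timeit returns the TOTAL time for `number` calls, so divide to get a mean.
mean_per_call = timeit.timeit(my_function, number=number) / number

# timeit.repeat performs the whole measurement `repeat` times
# and returns a list with one total per run.
totals = timeit.repeat(my_function, number=number, repeat=3)
best_per_call = min(totals) / number

print("mean of one run: %.3e s per call" % mean_per_call)
print("best of 3 runs:  %.3e s per call" % best_per_call)
```

The two numbers answer slightly different questions, so they need not agree exactly.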

Built-in Functions

  • How can I speed up my function?
    • Use built-in functions. Built-in functions are written in C, and so are generally very fast
    • Another performance improvement is to use intrinsic operators (+, -, *, etc.) instead of a user-defined function
    • Example 1: my_min(values) vs. min(values)
    • Example 2: map(lambda x,y: x+y, random_numbers, random_numbers2) vs. map(operator.add, random_numbers, random_numbers2)
    • Note: why is the map converted to a list in the timings below? In Python 3, map returns an iterator that does not evaluate anything until values are requested. Converting the iterator to a list forces map to compute every value, so the timing covers the actual work.
In [7]: %timeit -n100000 my_min(random_numbers)
100000 loops, best of 3: 45.6 µs per loop

In [8]: %timeit -n100000 min(random_numbers)
100000 loops, best of 3: 28.4 µs per loop
In [12]: %timeit -n1 list(map(lambda x,y: x+y, random_numbers, random_numbers2))
1 loop, best of 3: 476 ms per loop

In [13]: import operator

In [14]: %timeit -n1 list(map(operator.add, random_numbers, random_numbers2))
1 loop, best of 3: 343 ms per loop
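The definitions of my_min, random_numbers and random_numbers2 are not shown in these notes. The following sketch fills them in with hypothetical versions (a hand-written loop and two lists of 1000 random floats) so the comparison can be reproduced with the plain timeit module; absolute timings will differ from the ones above.

```python
import operator
import random
import timeit

def my_min(values):
    # Hypothetical hand-written minimum; the lecture's own my_min is not shown.
    smallest = values[0]
    for v in values[1:]:
        if v < smallest:
            smallest = v
    return smallest

# Assumed test data: the sizes are arbitrary choices for this sketch.
random_numbers = [random.random() for _ in range(1000)]
random_numbers2 = [random.random() for _ in range(1000)]

# Both approaches must agree before their speed is compared.
sum_lambda = list(map(lambda x, y: x + y, random_numbers, random_numbers2))
sum_operator = list(map(operator.add, random_numbers, random_numbers2))

candidates = {
    "my_min": lambda: my_min(random_numbers),
    "min": lambda: min(random_numbers),
    "map + lambda": lambda: list(map(lambda x, y: x + y, random_numbers, random_numbers2)),
    "map + operator.add": lambda: list(map(operator.add, random_numbers, random_numbers2)),
}
for name, call in candidates.items():
    print("%-20s %.4f s" % (name, timeit.timeit(call, number=200)))
```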

Function Call Overhead

  • Function call overhead in Python is relatively high, and calls to Python-level functions cost noticeably more than calls to built-in functions. Much of the cost comes from the work the interpreter does dynamically on every call, such as processing the arguments and setting up a new frame

  • It is possible to reduce this overhead without sacrificing readability.

  • Demo: see how changing a function call inside a loop into a loop inside a function influences performance.

In [16]: x = 0
    ...: def inner(i):
    ...:     global x
    ...:     x = x + i
    ...:
    ...: def outer_1():
    ...:     for i in range(100000):
    ...:         inner(i)
    ...:

In [17]: %timeit outer_1()
100 loops, best of 3: 22.8 ms per loop
In [19]: x = 0
    ...: def aggregate(values):
    ...:     global x
    ...:     for i in values:
    ...:         x = x + i
    ...:
    ...: def outer_2():
    ...:     aggregate(range(100000))
    ...:

In [20]: %timeit -n100 outer_2()
100 loops, best of 3: 13.9 ms per loop

Membership Testing

  • What is the best datatype (dicts, sets, lists, and tuples, etc.) to use when searching for elements?
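  • The in and not in operators are very fast on dict and set, because both are implemented using a hash table and a lookup takes roughly constant time
  • Checking for membership in a list or tuple is not as efficient, because the elements are scanned one by one until a match is found
  • Conclusion: if membership is checked very often, use a dict or set as the container rather than a list.
  • Does this mean a list should be converted to a set every time we search? No. The conversion itself visits every element, so it only pays off when the set is built once and reused for many lookups.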
  • Demo
In [26]: letters = [x for x in 'abcdefghijklmnopqrstuvwxyz'*1000+'%']

In [35]: %timeit -n1000 'a' in letters
1000 loops, best of 3: 163 ns per loop

In [36]: %timeit -n1000 '%' in letters
1000 loops, best of 3: 979 µs per loop
In [38]: letters = set('abcdefghijklmnopqrstuvwxyz'*1000+'%')

In [40]: %timeit -n1000 'a' in letters
1000 loops, best of 3: 190 ns per loop

In [41]: %timeit -n1000 '%' in letters
1000 loops, best of 3: 190 ns per loop
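The point about not converting on every search can be sketched as follows. The queries list is a made-up example; the idea is simply to build the set once and reuse it.

```python
import string

letters = list(string.ascii_lowercase * 1000 + '%')
queries = ['a', '%', '#', 'z']  # hypothetical lookups for this sketch

# Anti-pattern: `q in set(letters)` inside the loop would rebuild the set
# (a full pass over the list) for every single query.

# Better: convert once, outside the loop, and reuse the set.
letter_set = set(letters)
found = [q in letter_set for q in queries]
print(found)  # [True, True, False, True]
```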

String Concatenation

  • Strings in Python are immutable, so every concatenation creates a new string object
  • As always, built-in is faster: building a string with str_a += str_b inside a loop is slower than a single ''.join(list_of_strings) call. Since join takes an iterable as its argument, it effectively does the same work as the loop
  • In this case, a built-in method outperforms a built-in operator
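A minimal sketch of the comparison, assuming a list of 1000 short strings as input (the data and the function names are made up for illustration):

```python
import timeit

words = ["spam"] * 1000  # assumed input for this sketch

def concat_with_plus(parts):
    # Repeated += builds the result piece by piece.
    result = ""
    for part in parts:
        result += part
    return result

def concat_with_join(parts):
    # join receives the whole iterable and builds the result in one call.
    return "".join(parts)

joined = concat_with_join(words)

print("+= loop:", timeit.timeit(lambda: concat_with_plus(words), number=500))
print("join:   ", timeit.timeit(lambda: concat_with_join(words), number=500))
```

How large the gap is depends on the interpreter version and the input size, so it is worth measuring on the actual workload.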

Decorator Caching

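from functools import wraps

def cache(f):
    # Results of earlier calls, keyed by the tuple of positional arguments.
    memo = {}
    @wraps(f)
    def wrap(*args):
        # The arguments must be hashable; compute and store only on a miss.
        if args not in memo:
            memo[args] = f(*args)
        return memo[args]
    return wrap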
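A possible usage example: the decorator is repeated here so that the block runs on its own, and fib together with the stats counter are hypothetical names added for illustration.

```python
from functools import wraps

def cache(f):
    memo = {}  # results keyed by the positional-argument tuple
    @wraps(f)
    def wrap(*args):
        if args not in memo:
            memo[args] = f(*args)
        return memo[args]
    return wrap

stats = {"calls": 0}  # counts how often the undecorated body actually runs

@cache
def fib(n):
    stats["calls"] += 1
    return n if n < 2 else fib(n - 1) + fib(n - 2)

result = fib(30)
print(result)          # 832040
print(stats["calls"])  # 31: each n from 0 to 30 is computed exactly once
```

The standard library offers functools.lru_cache, which covers the same use case and adds an optional size limit.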

Optimizing Loops

  • The dot operator (.), i.e. the attribute reference operator, does a lot of work to support dynamic attributes as well as multiple namespaces
    • So, if possible, save the method reference in a variable, then call that variable inside the loop
  • Use local variables
    • Inside a loop, prefer local variables to global variables, as locals can be accessed much more efficiently in Python
    • Note: optimizing code may result in performance improvements, but it often comes at the cost of readability
  • Use the map function to replace a loop
  • Use a list comprehension
    • A list comprehension is usually faster than an equivalent for loop that calls append, because the interpreter builds the list with specialized bytecode instead of looking up and calling append on every iteration.
    • Besides their syntactic benefit, list comprehensions are often as fast as or faster than the equivalent use of map
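The four variants can be compared side by side. This is a sketch with made-up data and function names, upper-casing a list of words; the relative ordering may vary between Python versions.

```python
import timeit

words = ["python", "performance", "tips"] * 1000  # assumed input

def loop_with_dot(items):
    result = []
    for w in items:
        result.append(w.upper())  # two attribute lookups per iteration
    return result

def loop_with_locals(items):
    result = []
    append = result.append  # look up the bound method once
    upper = str.upper       # local name instead of a per-item lookup
    for w in items:
        append(upper(w))
    return result

def with_map(items):
    return list(map(str.upper, items))

def with_comprehension(items):
    return [w.upper() for w in items]

variants = (loop_with_dot, loop_with_locals, with_map, with_comprehension)
expected = loop_with_dot(words)

for fn in variants:
    print("%-20s %.4f s" % (fn.__name__, timeit.timeit(lambda: fn(words), number=200)))
```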

Import Overhead

  • An import statement can be used anywhere.
  • Sometimes it is useful to place imports inside functions to restrict their visibility and/or reduce initial startup time
  • Although Python’s interpreter is optimized to not import the same module multiple times, repeatedly executing an import statement can seriously affect performance in some circumstances
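# string.lower() was removed in Python 3; string.capwords() is used here as a stand-in.
def func1():
    import string  # executed on every call
    string.capwords('python')

import string
def func2():
    string.capwords('python')

# Note that using string methods avoids the need to import at all, and runs even faster.
def func3():
    'python'.capitalize()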
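The three placements can be timed with the timeit module. In this sketch string.capwords stands in for string.lower, which no longer exists in Python 3; the function names and the number of calls are arbitrary choices for this example.

```python
import string
import timeit

def import_inside():
    import string  # re-executed on every call: cheap thanks to the module cache, but not free
    return string.capwords('python')

def import_outside():
    # Relies on the module-level import at the top of the file.
    return string.capwords('python')

def no_import():
    # A string method needs no import at all.
    return 'python'.capitalize()

results = [import_inside(), import_outside(), no_import()]

for fn in (import_inside, import_outside, no_import):
    print("%-15s %.4f s" % (fn.__name__, timeit.timeit(fn, number=100000)))
```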
