High Performance Python (from Training at EuroPython 2011) by Ian Ozsvald - HTML preview

PLEASE NOTE: This is an HTML preview only and some elements such as links or page numbers may be incorrect.
Download the book in PDF, ePub, Kindle for a complete version.

CHAPTER

SIXTEEN

 

NUMPY VECTORS

Take a fresh copy of pure_python_2.py and copy it into numpy_vector/numpy_vector.py. Import the numpy library and change the calculate_z routine to look like the one below. Run it and test that you get the same output as before.

# ./numpy_vector/numpy_vector.py

import numpy as np # ’np.’ is a shorthand convention so you avoid writing ’numpy.’ all the time

 

def calculate_z_numpy(q, maxiter, z):

   """use vector operations to update all zs and qs to create new output array"""

array"""

   output = np.resize(np.array(0,), q.shape)

   for iteration in range(maxiter):

    z = z*z + q

    done = np.greater(abs(z), 2.0) # could have written it equivalently as ’done = abs(z) > 2.0’

    q = np.where(done, 0+0j, q)

    z = np.where(done, 0+0j, z)

    output = np.where(done, iteration, output)

return output

numpy‘s strength is that it simplifies running the same operation on a vector (or matrix) of numbers rather than on individual items in a list one at a time.

If your problem normally involves using nested for loops to iterate over individual items in a list then consider whether numpy could do the same job for you in a simpler (and probably faster) fashion.

If the above code looks odd to you, read it as:

  • z*z does a pairwise multiplication, think of it as z[0] = z[0] * z[0]; z[1] = z[1] * z[1];…; z[n-1] = z[n-1] * z[n-1].
  • z_result  +  q does a pairwise addition, just like the line above but adding the result
  • z  =  ... assigns the new array back to z
  • np.greater(condition, item_if_True, item_if_False) calculates the condition for each item in abs(z), for the nth value if the result is True it uses the item_if_true value (in this case 0+0j) else it uses the other value (in this case q[nth]) - each item in q either resets to 0+0j or stays at the value it was before
  • The same thing happens for z
  • output‘s items are set to iteration if done[nth]  ==  True else they stay at the value they were at previously.

If this is unclear then I urge you to try it at the command line, stepping through each result. Start with a small array of complex numbers and build it up.

You’ll probably be curious why this code runs slower than the other numpy version that uses Cython. The reason is that the vectorised code can’t stop early on each iteration if output has been set - it has to do the same operations for all items in the array. This is a shortcoming of this example. Don’t be put off by vectors, normally you can’t exit loops early (particuarly in the physics problems I tend to work on).

Behind the scenes numpy is using very fast C optimised math libraries to perform these calculations very quickly. If you consider how much extra work it is having to do (since it can’t exit each calculation loop when output is calculated for a co-ordinate) it is amazing that it is still going so fast!