Empty Pipes



VIM Python Snippets

  • 02 Oct 2014
  • linux
  • python
  • vim

There are many times when a task calls for a simple Python script. It is usually something small that takes an input file as a parameter, does some processing, and then spits out some results. It might even take an option or two. It’s tempting to just throw some lines of code into a file and be done with it. This may work, but it often makes things more difficult later.

Consider the following code (let’s creatively call it do_stuff.py) which simply converts the input to uppercase:

import sys

for line in sys.stdin:
    # strip the trailing newline; print adds its own
    print(line.rstrip("\n").upper())

What happens, however, when it grows a little bit and we add a function?

import random
import sys

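def uppercase_and_scramble(line):
    ll = list(line.upper())
    random.shuffle(ll)
    return "".join(ll)

for line in sys.stdin:
    # strip the trailing newline so it is not shuffled into the line
    print(uppercase_and_scramble(line.rstrip("\n")))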

What happens if we want to use that function in another file? Importing do_stuff.py will cause the for loop to run. A much better solution is to do all of the ‘scripty’ stuff in a main function that only gets called if the file is run as a script (as opposed to being imported as a library):

import random
import sys
from optparse import OptionParser

def uppercase_and_scramble(line):
    '''
    Make the line uppercase, scramble its contents and return the result.
    '''
    ll = list(line.upper())
    random.shuffle(ll)
    return "".join(ll)

def main():
    usage = """
    python do_stuff.py

    Process lines from the input.
    """
    num_args = 0
    parser = OptionParser(usage=usage)

    #parser.add_option('-o', '--options', dest='some_option', default='yo', help="Placeholder for a real option", type='str')
    #parser.add_option('-u', '--useless', dest='useless', default=False, action='store_true', help='Another useless option')

    (options, args) = parser.parse_args()

    if len(args) < num_args:
        parser.print_help()
        sys.exit(1)

    for line in sys.stdin:
        # strip the trailing newline so it is not shuffled into the line
        print(uppercase_and_scramble(line.rstrip("\n")))

if __name__ == '__main__':
    main()
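One payoff of this layout is that the function can be exercised on its own, without feeding anything through stdin. The sketch below repeats the function so that it is self-contained (in practice it would be imported from do_stuff.py). The uppercasing step follows the docstring, and the two checks are illustrative assumptions rather than part of the original script.

```python
import random


def uppercase_and_scramble(line):
    '''
    Make the line uppercase, scramble its contents and return the result.
    '''
    ll = list(line.upper())
    random.shuffle(ll)
    return "".join(ll)


# Seeding the generator makes the shuffle repeatable, which helps when testing.
random.seed(42)
first = uppercase_and_scramble("hello world")
random.seed(42)
second = uppercase_and_scramble("hello world")

# Same seed and same input give the same scrambled output.
assert first == second
# Scrambling only reorders characters, so the sorted characters must match.
assert sorted(first) == sorted("HELLO WORLD")
```

Neither check depends on the exact order the shuffle produces, so they should hold regardless of the Python version.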

That’s a lot of code for a simple task. Fortunately, it doesn’t require much typing thanks to the SnipMate plugin for Vim. By adding the following code to ~/.vim/snippets/python.snippets, we can create almost all of it by typing start and hitting tab at the beginning of the script.

snippet start
    import sys
    from optparse import OptionParser

    def main():
        usage = """
        ${1:usage}
        """
        num_args = 0
        parser = OptionParser(usage=usage)

        #parser.add_option('-o', '--options', dest='some_option', default='yo', help="Placeholder for a real option", type='str')
        #parser.add_option('-u', '--useless', dest='useless', default=False, action='store_true', help='Another useless option')

        (options, args) = parser.parse_args()

        if len(args) < num_args:
            parser.print_help()
            sys.exit(1)

    if __name__ == '__main__':
        main()

This will create the main function and position the cursor within the usage string, making it a snap to write some quick documentation of what the script will do. The num_args variable is there to make sure the user enters at least the required number of arguments; otherwise the script prints the help text and exits with an error. The rest of the processing code should go directly after that if statement, inside main. When scripts are written in this manner, they can be painlessly turned into libraries later.
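The same skeleton can also be written with argparse, the standard library's newer alternative to optparse. This is a sketch of a swapped-in technique rather than the snippet used above: the build_parser helper, the argv parameter, and the catch-all positional named args are assumptions made for illustration, and the two options are the same placeholders as before.

```python
import argparse
import sys


def build_parser():
    """Create the command-line parser (option names are placeholders)."""
    parser = argparse.ArgumentParser(
        description="Process lines from the input.")
    parser.add_argument('-o', '--options', dest='some_option', default='yo',
                        help="Placeholder for a real option")
    parser.add_argument('-u', '--useless', dest='useless', default=False,
                        action='store_true', help='Another useless option')
    # Collect positional arguments into a list, like optparse's `args`.
    parser.add_argument('args', nargs='*', help='Positional arguments')
    return parser


def main(argv=None):
    num_args = 0
    parser = build_parser()
    # argv=None makes argparse read sys.argv[1:], as a script would expect.
    options = parser.parse_args(argv)

    if len(options.args) < num_args:
        parser.print_help()
        sys.exit(1)

    # The processing code goes here.
    return options


if __name__ == '__main__':
    main()
```

Passing an explicit list, as in main(['-u', 'input.txt']), lets the argument handling be tried out from an interpreter or a test without touching the real command line.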


Scaled Colormap in Matplotlib

  • 05 Sep 2014
  • matplotlib
  • python
  • colormap

I often have to create plots where the color of some element reflects an underlying value. This mapping of value -> color is easily accomplished using colormaps in matplotlib. The standard colormaps (such as cm.jet or cm.cool) map values over the interval [0,1]. Often, however, the data will have a different range and will need to be normalized so that it maps onto the full range of available colors.

The example below illustrates what the color range looks like without the normalization (top two plots) and with the normalization (bottom plot).

import numpy as np
import matplotlib as mpl
import matplotlib.cm as cm
import matplotlib.pyplot as plt

# create two input value ranges
gradient1 = np.linspace(0, 1, 256)
gradient2 = np.linspace(-1, 1, 256)

# taken from matplotlib's very own colormaps_reference.py example
# http://matplotlib.org/examples/color/colormaps_reference.html
fig, axes = plt.subplots(nrows=3, figsize=(5,1))
fig.subplots_adjust(top=0.95, bottom=0.01, left=0.2, right=0.99)

# build the normalizing function and the color mapper once, outside the loop;
# together they scale the [-1,1] input values to [0,1] before coloring them
norm = mpl.colors.Normalize(vmin=gradient2.min(), vmax=gradient2.max())
mapper = cm.ScalarMappable(norm=norm, cmap=cm.jet)

# plot the gradients given the colormaps
for g1, g2 in zip(gradient1, gradient2):
    # the first two are plotted without normalization
    # the input values are mapped directly to colors using cm.jet
    axes[0].axvline(g1, color=cm.jet(g1))
    axes[1].axvline(g2, color=cm.jet(g2))

    # the third is colored using the normalized values
    axes[2].axvline(g2, color=mapper.to_rgba(g2))

# add the labels, also from:
#http://matplotlib.org/examples/color/colormaps_reference.html
for name, ax in zip(["[0,1]", "[-1,1]", "Normalized [-1,1]"], axes):
    pos = list(ax.get_position().bounds)
    x_text = pos[0] - 0.01
    y_text = pos[1] + pos[3]/2.
    fig.text(x_text, y_text, name, va='center', ha='right', fontsize=10)
    ax.set_axis_off()

# display the figure (not needed when plotting inline in a notebook)
plt.show()

Mapping different value ranges to colors

Notice how the unnormalized [-1,1] input values have a dark blue color over the left half of the second plot. This is because cm.jet(-1) = cm.jet(-.5) = cm.jet(0.): values below cm.jet’s accepted input range of [0,1] all return the same color as 0, and values above it all return the same color as 1. By using the Normalize class, the input values are scaled to the [0,1] input range and the output colors span the entire range of the colormap (third plot).
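The arithmetic behind this is a plain linear rescaling. The sketch below reproduces it in pure Python so the numbers can be inspected directly. The normalize and clamp_unit helpers are hypothetical names written for illustration, not matplotlib's own code.

```python
def normalize(value, vmin, vmax):
    """Linearly rescale value so that vmin maps to 0 and vmax maps to 1."""
    return (value - vmin) / float(vmax - vmin)


def clamp_unit(value):
    """Mimic how out-of-range inputs collapse onto the ends of [0, 1]."""
    return min(1.0, max(0.0, value))


samples = [-1.0, -0.5, 0.0, 0.5, 1.0]

# Without normalization, every negative input lands on the same end (0.0).
unnormalized = [clamp_unit(v) for v in samples]
# With normalization, the inputs spread evenly over [0, 1].
normalized = [normalize(v, -1.0, 1.0) for v in samples]

print(unnormalized)  # [0.0, 0.0, 0.0, 0.5, 1.0]
print(normalized)    # [0.0, 0.25, 0.5, 0.75, 1.0]
```

The first list shows why the left half of the second plot is a single color, and the second shows why the third plot spans the whole colormap.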


Fast 3D Vector Operations in Cython

  • 26 Feb 2014
  • python
  • cython
  • vector

The numpy library provides a plethora of fast functionality for working with vectors. The question is: can we improve upon it by using Cython and assuming vectors in three-dimensional space? Let’s take the following two functions, which are easily implemented with np.dot. The first simply calculates the magnitude of a vector, while the second calculates the distance between two vectors.

import math as m
import numpy as np

def magnitude(vec):
    '''
    Return the magnitude of a vector (|V|).

    @param vec: The vector in question.
    @return: The magnitude of the vector.
    '''

    return m.sqrt(np.dot(vec, vec))
    
def vec_distance(vec1, vec2):
    v = vec2 - vec1
    return m.sqrt(np.dot(v, v))

Now let’s take the equivalent implementation in Cython. The only major differences from the pure Python code are the declaration of the data types and the handling of each component individually, without any loops.

cimport numpy as np

ctypedef np.double_t DTYPE_t

# use sqrt from the C math library rather than Python's math module
from libc.math cimport sqrt

def vec_distance(np.ndarray[DTYPE_t, ndim=1] vec1, np.ndarray[DTYPE_t, ndim=1] vec2):
    cdef double d0 = vec2[0] - vec1[0]
    cdef double d1 = vec2[1] - vec1[1]
    cdef double d2 = vec2[2] - vec1[2]
    
    return sqrt(d0 * d0 + d1*d1 + d2*d2)
    
def magnitude(np.ndarray[DTYPE_t, ndim=1] vec):
    cdef double x = sqrt(vec[0] * vec[0] + vec[1] * vec[1] + vec[2] * vec[2])

    return x
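To sanity-check the compiled functions, the same unrolled arithmetic can be mirrored in plain Python without numpy or Cython. This is only a reference sketch: the _ref names are hypothetical, and it makes no claim about speed.

```python
import math


def vec_distance_ref(vec1, vec2):
    """Distance between two 3D vectors, one component at a time."""
    d0 = vec2[0] - vec1[0]
    d1 = vec2[1] - vec1[1]
    d2 = vec2[2] - vec1[2]

    return math.sqrt(d0 * d0 + d1 * d1 + d2 * d2)


def magnitude_ref(vec):
    """Magnitude of a 3D vector, written out without a loop."""
    return math.sqrt(vec[0] * vec[0] + vec[1] * vec[1] + vec[2] * vec[2])


print(magnitude_ref((1.0, 2.0, 2.0)))                       # 3.0
print(vec_distance_ref((1.0, 2.0, 3.0), (1.0, 2.0, 3.0)))   # 0.0
```

Comparing these values against the output of the Cython functions for a few inputs is a quick way to confirm that the typed version computes the same thing.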

Finally, the timing:

In [2]: import cytvec as cv

In [3]: a = np.array([1.,2.,3.])

In [4]: %timeit vec_distance(a, a)
1000000 loops, best of 3: 1.44 us per loop

In [5]: %timeit magnitude(a - a)
1000000 loops, best of 3: 1.48 us per loop

In [6]: %timeit cv.vec_distance(a,a)
1000000 loops, best of 3: 554 ns per loop

In [7]: %timeit magnitude(a)
1000000 loops, best of 3: 656 ns per loop

In [8]: %timeit cv.magnitude(a)
1000000 loops, best of 3: 287 ns per loop

Using the Cython implementation cuts the running time of the vec_distance function by a factor of about 2.6 (1.44 us to 554 ns) and that of the magnitude function by a factor of about 2.3 (656 ns to 287 ns). The cytvec implementation can simply be copied into a file (let’s say cytvec.pyx) and compiled into a module by following the instructions in the Cython tutorial.

python setup.py build_ext --inplace

An already compiled implementation can be found in the forgi package under forgi.threedee.utilities.cytvec.