List Comprehensions in Python

When you need to process an existing list or build a new one in Python, there’s a shorthand that does it in a single line between square brackets, called a ‘list comprehension’. Think of it as a for loop inside list brackets, with or without a conditional.

# given a list
casual_names = ['alec','jude','malcolm']

capitalized_names = [name.title() for name in casual_names]
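To see what the shorthand saves, here is a small side-by-side sketch of the same operation written as an ordinary for loop and as a comprehension. The name capitalized_loop is only introduced here for comparison.

```python
casual_names = ['alec', 'jude', 'malcolm']

# the long way: start with an empty list and append inside a for loop
capitalized_loop = []
for name in casual_names:
    capitalized_loop.append(name.title())

# the comprehension: same loop, written inside the list brackets
capitalized_names = [name.title() for name in casual_names]

print(capitalized_names)  # ['Alec', 'Jude', 'Malcolm']
```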

You can add a condition at the end to filter which items are used. For example, square all the even integers in a given range:

squares = [x**2 for x in range(16) if x % 2 == 0]
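For reference, this is what that line produces. As a small aside, the same list can also be built without a condition by giving range() a step of 2; the name squares_step is just for this illustration.

```python
# filter with a trailing "if": keep only the even values of x
squares = [x**2 for x in range(16) if x % 2 == 0]
print(squares)  # [0, 4, 16, 36, 64, 100, 144, 196]

# alternative: let range() skip the odd numbers itself
squares_step = [x**2 for x in range(0, 16, 2)]
```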

You can also use if/else, but it is then a conditional expression that picks the value, so it has to appear before the for. For example, convert positive and negative reviews to integers:

# for a list of reviews
reviews = ['positive','negative','positive']

# encode them as 1's and 0's
encoded_reviews = [1 if r == "positive" else 0 for r in reviews]
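The position of the "if" changes what it does, which a short sketch can make concrete: before the for it chooses a value for every item, while after the for it decides whether an item is kept at all. The name only_positive is made up for this comparison.

```python
reviews = ['positive', 'negative', 'positive']

# if/else BEFORE the for: every item produces a value, so the length is unchanged
encoded_reviews = [1 if r == "positive" else 0 for r in reviews]
print(encoded_reviews)  # [1, 0, 1]

# if AFTER the for: items that fail the test are dropped
only_positive = [r for r in reviews if r == "positive"]
print(only_positive)    # ['positive', 'positive']
```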

Snakemake Basics

Makefiles are a great way to get things done efficiently, if you know what they are and they make sense to you. My entry into Makefiles came when I took C programming in the late ’80s. You would write code in C, then use a Makefile to turn that code into an executable program. Since that takes several intermediate steps and creates different kinds of files along the way, a Makefile manages all these steps using recipes laid out in blocks. Each block is designed to create output files from input files: you specify the names of the output files, the names of the input files, and a rule for how to use one to create the other.

Snakemake is the same idea, implemented in Python. Here’s a simple example.

rule all:
    input:
        "alignment.bam"       # We have to populate this part with the files we want to create

rule align:
    input:
        "input.fastq"         # [FileSystem] Does this file exist?
                              # Is there some rule to create it?
    output:
        "alignment.bam"       # Does this part match any part of "all"?
                              # [FileSystem] Does this file already exist?
    shell:
        # schematic command: real bowtie and samtools calls need more arguments
        "bowtie {input} | samtools > {output}"
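The questions in the comments above can be sketched as a few lines of plain Python. This is only a toy model of the idea, not how Snakemake is actually implemented: the RULES table and the plan() function are hypothetical names, and the sketch ignores file timestamps, which Make-style tools also take into account.

```python
# Toy model of Make-style resolution (hypothetical, not Snakemake's own code).
# Each output file maps to the rule that creates it and the inputs that rule needs.
RULES = {
    "alignment.bam": {"name": "align", "inputs": ["input.fastq"]},
}

def plan(target, existing_files, rules=RULES):
    """Return the rule names to run, in order, to produce `target`."""
    if target in existing_files:
        return []                          # the file already exists: nothing to do
    if target not in rules:
        # the file is missing and no rule knows how to create it
        raise FileNotFoundError(f"{target} is missing and no rule creates it")
    steps = []
    for needed in rules[target]["inputs"]:
        steps += plan(needed, existing_files, rules)   # make the inputs first
    steps.append(rules[target]["name"])
    return steps

print(plan("alignment.bam", {"input.fastq"}))   # ['align']
```

If alignment.bam is already in the set of existing files, the plan comes back empty, which is the whole point of the approach: only the missing pieces get built.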

Generate Random Integers in Python

Here is one way to get random integers in Python, using only the standard library:

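# import the function
from random import randint

# set your parameters
howmany = 10
low = 0       # named low/high so the built-in min() and max() are not shadowed
high = 100

# use a list comprehension to fill a list; randint includes both endpoints
rand_ints = [randint(low, high) for _ in range(howmany)]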

Seems kind of bulky, but you specify how many numbers you would like and what range to draw them from, then call the randint() function over and over again until you have the numbers you need, using a for loop with a throwaway variable “_”.

But there are a few ways to do this! If you have NumPy installed, it has a similar function for generating random integers, and the lines of code above can be reduced to just two: import the library, call the function. One difference to watch for: NumPy’s randint excludes the upper bound, so you pass 101 if you want 100 to be a possible result.

from numpy.random import randint

randint(0, 101, 10)

The result is a NumPy array, for example:

array([42, 99, 30, 94, 60, 90,  7, 31, 91, 11])

Both of these methods would usually be preceded by a call to “seed” the random number generator, which sets a reproducible starting point for random number generation. The function has the same name, seed(), in each library, but calling it for one does not set the seed for the other. But that’s more than you need to know for now.
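For the curious, here is a minimal sketch of seeding with the standard library only. The seed value 42 and the names first_run and second_run are arbitrary choices for the illustration; the NumPy version would call numpy.random.seed() instead.

```python
import random
from random import randint

random.seed(42)                                   # fix the starting point
first_run = [randint(0, 100) for _ in range(10)]

random.seed(42)                                   # reset to the same starting point
second_run = [randint(0, 100) for _ in range(10)]

# same seed, same sequence of "random" numbers
print(first_run == second_run)  # True
```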

Python Dictionaries

A dictionary in Python is an associative array. This means you can use it to hold a collection of things and associate names, or keys, with those things so they can be retrieved. Dictionaries use curly brackets in their declaration:

my_dict = {'pi': 3.14, 'e': 2.71, 'gravityAccel': 9.8}

To access a given element, put its key in square brackets after the dictionary name:

my_dict['pi']
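Square brackets raise a KeyError when the key is not there, so it helps to know the two usual ways around that. A short sketch, where 'tau' is just an example of a key that is missing from the dictionary:

```python
my_dict = {'pi': 3.14, 'e': 2.71, 'gravityAccel': 9.8}

print(my_dict['pi'])              # 3.14

# test for a key before using it
print('tau' in my_dict)           # False

# or use get(), which returns None (or a default you supply) instead of raising
print(my_dict.get('tau'))         # None
print(my_dict.get('tau', 6.28))   # 6.28
```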

You can process a dictionary using a for loop and the “in” keyword:

for key in my_dict:
    print(key, "->", my_dict[key])

You can also loop through the keys and values together, using the items() method on your dictionary:

for key, value in my_dict.items():
    print(key, "->", value)

The keys and the values of a dictionary can also be retrieved separately, using the keys() and values() methods:

dictionary_values = my_dict.values()
dictionary_keys = my_dict.keys()
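One thing worth knowing is that these methods return live views of the dictionary rather than copies. A small sketch, where the added 'tau' entry is only there to show the view updating:

```python
my_dict = {'pi': 3.14, 'e': 2.71, 'gravityAccel': 9.8}

dictionary_keys = my_dict.keys()
print(list(dictionary_keys))   # ['pi', 'e', 'gravityAccel']

# add an entry after the view was created
my_dict['tau'] = 6.28

# the view reflects the change; wrap it in list() if you want a fixed copy
print(list(dictionary_keys))   # ['pi', 'e', 'gravityAccel', 'tau']
```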

To quickly examine just the first 5 items in your dictionary, convert the items to a list and take a slice:

list(my_dict.items())[:5]
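For a very large dictionary, building the full list just to look at five items is wasteful. One alternative, sketched here with a made-up dictionary called big_dict, is itertools.islice, which stops after the first few items:

```python
from itertools import islice

# hypothetical large dictionary: each number mapped to its square
big_dict = {n: n**2 for n in range(1000)}

# take only the first 5 items without listing all 1000
first_five = list(islice(big_dict.items(), 5))
print(first_five)  # [(0, 0), (1, 1), (2, 4), (3, 9), (4, 16)]
```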

Comprehensions are often used for processing dictionaries. For instance, swapping the keys and values (this works cleanly when the values are unique and hashable):

my_reverse_dict = {v: k for k, v in my_dict.items()}

That’s a lot – creating a new dictionary, looping through the old dictionary, and reversing the keys and values all in one line.

And this can be combined with conditional statements such as “if”:

my_filtered_reverse_dict = {v: k for k, v in my_dict.items() if v > 3}
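Here is what those two comprehensions produce, plus a sketch of what happens when values repeat. The grades dictionary is invented for the illustration: since keys must be unique, the last entry seen for a repeated value wins.

```python
my_dict = {'pi': 3.14, 'e': 2.71, 'gravityAccel': 9.8}

my_reverse_dict = {v: k for k, v in my_dict.items()}
print(my_reverse_dict)            # {3.14: 'pi', 2.71: 'e', 9.8: 'gravityAccel'}

my_filtered_reverse_dict = {v: k for k, v in my_dict.items() if v > 3}
print(my_filtered_reverse_dict)   # {3.14: 'pi', 9.8: 'gravityAccel'}

# hypothetical data with a repeated value: 'A' appears twice
grades = {'ann': 'A', 'bob': 'B', 'cy': 'A'}
reversed_grades = {v: k for k, v in grades.items()}
print(reversed_grades)            # {'A': 'cy', 'B': 'bob'}
```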