Why learn programming?
Maybe you see colleagues writing programs to save time and deal with large datasets.
Maybe your supervisor has told you that you need to learn programming for your next project.
Maybe you've been looking at job ads and noticed just how many of them are asking for programming skills.
Table of contents
- In chapter one, you'll learn why Python is a good choice for biologists and beginners alike. You'll also learn how to install Python for your operating system and how to set up your programming environment, complete with links to all the free software you'll need.
- In chapter two, you'll learn how to manipulate text (including DNA and protein sequences) and how to fix errors in your programs. Exercises: calculating AT content, splicing introns.
- In chapter three, you'll learn how to read and write data to and from files. You'll also learn how to deal with file paths and the FASTA file format. Exercises: splitting genomic DNA, writing a FASTA file.
- In chapter four, you'll learn how to process many pieces of data in a single program and how to use more advanced tools for sequence manipulation. Exercises: trimming adapter sequences, concatenating exons.
- In chapter five, you'll learn how to make Python even more useful by creating your own functions, including the best ways to test those functions in order to speed up development. Exercises: analyzing the amino acid composition of protein sequences.
- In chapter six, you'll learn how to write programs that can make smart decisions about how to handle data and how to make your programs follow complex rules. Exercises: filtering genes based on multiple criteria.
- In chapter seven, you'll learn about regular expressions, an incredibly powerful tool for working with patterns in text, and how to use them to search DNA and protein sequences. Exercises: filtering accession names and calculating restriction fragment sizes.
- In chapter eight, you'll learn how to store huge amounts of data in a way that still allows it to be retrieved very efficiently. This simplifies much of the code from previous chapters. Exercises: translating DNA sequences to protein.
- In chapter nine, you'll learn how to make your Python programs work in harmony with existing tools, and how to polish up your programs so that they're ready for other people to use. Exercises: counting k-mers, binning DNA sequences by length.
About the author
Dr. Martin Jones has been teaching biologists to write software for over five years and has taught everyone from postgraduates to PIs. He is currently Lecturer in Bioinformatics at Edinburgh University.