Michael Grünstäudl (Gruenstaeudl), PhD

Researcher at the Freie Universität Berlin

Teaching award 2017

Awarded annually to a lecturer in biology, chemistry and pharmacy

I recently received the teaching award for biology at the FU Berlin for 2017. Many thanks to those students who recommended me. I appreciate your praise!

 

New paper – Conserved plastid genome structure in early-diverging angiosperms

Not so different after all.

Together with two other scientists, I just published a new paper on the plastid genome structure of early-diverging angiosperms. We demonstrate that the plastid genomes of early-diverging angiosperms are much more conserved than previously thought.

 

Teaching in spring

In the lab with teachers-to-be and high-school students

One of the master-level courses I teach at the Freie Universität Berlin during the spring semester, the NatLab Evolution, is part of the comprehensive training of future high-school teachers. During this course, we invite high-school students from across Berlin to spend a day in the lab and work with the Master’s students on biological experiments. Judging by the feedback, both sides seem to enjoy this arrangement a lot!

Michael Gruenstaeudl – Teaching – June 2017

Ordering charsets within NEXUS files

Character sets for the orderly.

Defining character sets (charsets) in NEXUS files can be an efficient way to annotate specific regions of a DNA or protein alignment. However, many software packages that can write NEXUS files (e.g., Biopython) do not save charsets in order of their alignment position when several are present (i.e., a charset spanning alignment positions 100-250 is not necessarily listed before a charset spanning positions 350-550). Instead, a typical NEXUS file written by Biopython looks something like this:

#NEXUS
BEGIN DATA;
DIMENSIONS NTAX=3 NCHAR=12;
FORMAT DATATYPE=DNA GAP=- MISSING=?;
MATRIX
seq1    ATGACAATATAA
seq2    ATGACTGTATAA
seq3    ATGATTGTCTAA;
END;
BEGIN SETS;
CharSet foo = 5-7;
CharSet bar = 3-8;
CharSet baz = 2-6;
END;

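The following lines of bash code order the charsets of the NEXUS file by their start position.

# Copy everything up to and including the line "BEGIN SETS;"
sed '/BEGIN SETS;/q' infile > outfile
# Extract the charset lines and sort them numerically by start position (field 4)
awk '/BEGIN SETS;/,/END;/' infile | tail -n +2 | head -n -1 | sort -t' ' -n -k4 >> outfile
# Append the closing "END;" of the sets block and anything that follows it
sed -n -e '/BEGIN SETS;/,$p' infile | sed -n -e '/END;/,$p' >> outfile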
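
For readers who prefer to stay in Python, the same reordering can be sketched without shell tools. This is only a minimal sketch under a few assumptions: the function name sort_charsets is hypothetical, the sets block is assumed to start with a line reading "BEGIN SETS;" and to end with the first "END;" after it, and every line in between is assumed to be a charset definition of the form shown above.

```python
import re

def sort_charsets(nexus_text):
    """Return the NEXUS text with the lines of the SETS block sorted by start position."""
    lines = nexus_text.splitlines()
    # Locate the SETS block: its opening line and the first "END;" after it
    start = next(i for i, ln in enumerate(lines)
                 if ln.strip().upper() == "BEGIN SETS;")
    end = next(i for i in range(start + 1, len(lines))
               if lines[i].strip().upper() == "END;")

    def start_pos(line):
        # First integer after the "=" sign, e.g. 5 in "CharSet foo = 5-7;"
        match = re.search(r"=\s*(\d+)", line)
        return int(match.group(1)) if match else 0

    block = sorted(lines[start + 1:end], key=start_pos)
    return "\n".join(lines[:start + 1] + block + lines[end:])

# Shortened version of the example file from above (DATA block omitted)
example = """#NEXUS
BEGIN SETS;
CharSet foo = 5-7;
CharSet bar = 3-8;
CharSet baz = 2-6;
END;"""

sorted_text = sort_charsets(example)
print(sorted_text)
```

As in the bash version, only the start position is used as the sort key, so charsets with identical start positions keep their original relative order.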

Teaching the teachers II

Like last year, I held training labs for current high school teachers as part of the NatLab initiative. It was a wonderful experience and great to find out what today’s high school teachers wish to communicate to their students.

NatLab Lehrerfortbildung (teacher training) – Apr 2017

Alignment Nex2Phy few-liner

Alignment file format conversion for the efficient

Today, I needed to convert a series of alignments, which were stored in the common NEXUS format, into PHYLIP format. To do this efficiently, I wrote the following few-liner.

#!/usr/bin/env python3
import sys
from Bio import AlignIO

# The input NEXUS file is given as the first command-line argument
inFn = sys.argv[1]
# Read the NEXUS alignment(s) and write them out in PHYLIP format
with open(inFn) as inp, open(inFn + '.phy', 'w') as outp:
    aln = AlignIO.parse(inp, 'nexus')
    AlignIO.write(aln, outp, 'phylip')

The above code is nothing special, but it is handy to have around.

Teaching freshmen

Currently, I am teaching a class on botany and global biodiversity to freshman-level students. Although the total number of students signed up for my course is large, and thus I have to guide dozens of students through the course materials every day, I love academic teaching and I hope that the students enjoy the course as much as I do.

Michael Gruenstaeudl - Teaching - Jan 2016

Using rPython to call Python in R

A simple example.

Calling Python from within R via the R package rPython is fairly easy. However, very little documentation exists for this package, and some of its commands may appear quirky at first. Also, don’t confuse rPython with RPython (note the capitalization)! The paucity of written documentation seems to scare away many biologists, who, if they are like me, prefer to work with well-documented code. But fear not!

Using rPython is straightforward with a bit of practice. The following tutorial gives a quick and simple example of how to work with rPython.

> library(rPython)
Loading required package: RJSONIO

I use the command python.exec() to execute Python code from within R.

# Generating a dictionary in Python
> python.exec("capcities = {'sk':'Slovakia','de':'Germany',
'hu':'Hungary'}")

I use the command python.get() to assign a value from Python to a new variable in R.

# Pulling the dictionary from Python into R
> abbrev_dict = python.get("capcities")
> abbrev_dict
        sk         de         hu
"Slovakia"  "Germany"  "Hungary"

Note, however, that python.get() is inflexible in what it can retrieve from Python: as the error message below indicates, the requested object must be JSON serializable.

> abbrevs = python.get("capcities.keys()")
Traceback (most recent call last):
...
TypeError: dict_keys(['de', 'sk', 'hu']) is not JSON serializable

Instead, I use:

> abbrevs = python.get("list(capcities.keys())")
> abbrevs
[1] "sk" "de" "hu"
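
The error above can be reproduced in plain Python, which may help to decide beforehand whether an object needs to be converted. The following is only a sketch: it assumes that rPython passes values to R as JSON (as the loading of RJSONIO and the error message suggest), and the helper is_json_serializable is a hypothetical name, not part of rPython.

```python
import json

# Same dictionary as in the R session above
capcities = {'sk': 'Slovakia', 'de': 'Germany', 'hu': 'Hungary'}

def is_json_serializable(obj):
    """Return True if the standard json module can serialize obj."""
    try:
        json.dumps(obj)
        return True
    except TypeError:
        return False

# A dictionary's key view cannot be serialized, but a list of its keys can
keys_view_ok = is_json_serializable(capcities.keys())
keys_list_ok = is_json_serializable(list(capcities.keys()))
dict_ok = is_json_serializable(capcities)
print(keys_view_ok, keys_list_ok, dict_ok)  # prints: False True True
```

Wrapping a view or generator in list() before retrieving it is thus the general remedy, not a quirk specific to dictionary keys.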

Concatenating .csv-files by column

A helper script in R

I recently needed to concatenate multiple .csv-files that share the same row names. The concatenated matrix would thus consist of columns from different files. To that end, I wrote the following R script.

First, let us specify the input directory.

inDir = "/home/user/csv_files/"
outF = paste(inDir, "concatenated.csv", sep="")

Second, let us list all .csv-files in the input directory and load the first file.

lst = list.files(path=inDir, pattern="\\.csv$", full.names=TRUE)
first = read.csv(lst[1], row.names=1, header=FALSE)

Third, let us loop through all other .csv-files and attach them to the growing dataframe.

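hndl = first
for (i in 2:length(lst)) {
    add = read.csv(lst[i], row.names=1, header=F)
    # Merge on the row names (by=0), keeping rows that occur in either table
    hndl = merge(hndl, add, by=0, all=T)
    # merge() stores the row names in a column "Row.names"; turn it back into row names
    sub = subset(hndl, select=-c(Row.names))
    rownames(sub) = hndl[,'Row.names']
    hndl = sub
}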

Fourth, let us order the rows of the final matrix according to the very first .csv-file, and then save the matrix as output.

out = hndl[match(rownames(first), rownames(hndl)),]
write.csv(out, file=outF)
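
For comparison, the same column-wise concatenation can be sketched in Python using only the standard library. This is a minimal sketch, not a translation of the R script: the function names read_rows and concat_by_column and the example contents are hypothetical, the input is assumed to have no header and to carry the row names in the first column, and values missing from a file are filled with the string "NA". Like the R script, the result keeps the rows of the first file in their original order.

```python
import csv
import io

def read_rows(handle):
    """Read a header-less CSV whose first column holds the row names."""
    return {row[0]: row[1:] for row in csv.reader(handle) if row}

def concat_by_column(tables):
    """Join tables on row name, keeping the rows and row order of the first table."""
    merged = []
    for name in tables[0]:
        row = [name]
        for table in tables:
            # Number of value columns in this table, used to pad missing rows
            width = len(next(iter(table.values()), []))
            row += table.get(name, ["NA"] * width)
        merged.append(row)
    return merged

# Hypothetical file contents; real files would be opened with open(path, newline="")
file_a = io.StringIO("gene1,1,2\ngene2,3,4\n")
file_b = io.StringIO("gene2,7\ngene1,5\n")

tables = [read_rows(file_a), read_rows(file_b)]
result = concat_by_column(tables)
print(result)  # prints: [['gene1', '1', '2', '5'], ['gene2', '3', '4', '7']]
```

The merged rows could then be written to disk with csv.writer.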

DCPS Day 2016

Today, different groups of the Dahlem Center of Plant Sciences met for their annual symposium, the DCPS Day 2016. It was a great symposium, and I enjoyed presenting my work to the assembled research groups.

Presentation at DCPS Day 2016
