Michael Grünstäudl (Gruenstaeudl), PhD

Researcher at the Freie Universität Berlin

Botanical field trip in the Carinthian Alps

Alpine plants in the mist

An alpine field trip in late September can mean cold temperatures and lots of foggy weather. Interesting plants abound nonetheless.

Rote Wand at Mt. Dobratsch

Carduus defloratus (Asteraceae)

Michael Gruenstaeudl in September 2018

Talk at evolutionary plant biology conference

Talking about novel bioinformatic tools for DNA sequence submissions

This Thursday, I gave a talk at the 24th International Symposium on Biodiversity and Evolutionary Biology of the German Botanical Society. I introduced the participants to some of my newly developed tools for streamlining and automating the submission of plant DNA barcoding sequences to public sequence repositories. The conference was a wonderful example of how small conferences can both meet high scientific standards and be an enjoyable reprieve for the participants. Lots of interesting talks and a great social programme amid the gorgeous Carinthian scenery!

Talk at DBG Sektionstagung 2018 in Klagenfurt, Kaernten

New paper – Bioinformatic Workflows for Generating Complete Plastid Genome Sequences

In science, standardization and repeatability are a must.

Together with two other scientists, I just published a new paper on bioinformatic workflows for generating complete plastid genome sequences in the context of plastid phylogenomics of the water-lily clade. We demonstrate that standardization and repeatability are essential to modern plant phylogenomics, and we show how both can be achieved efficiently during plastid genome assembly, annotation and alignment.

One-liner: Splitting multi-sequence FASTA into single-sequence FASTA

Quick, split it!
There are one-liners that never get old. Here is another one of them.

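# -z (GNU csplit) skips the empty piece that would otherwise precede the first header,
# so there is no need to search for and delete zero-size files afterwards
$ csplit -z multisequence.fasta '/^>/' '{*}'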
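The csplit approach names its output files xx00, xx01, and so on. If files named after the sequence IDs are preferable, an awk-based variant is one option. The following is only a sketch: the function name split_fasta and the choice of a ".fasta" suffix are assumptions made for illustration, and records that share the same ID would overwrite each other.

```shell
# split_fasta: hypothetical helper, not part of any toolkit.
# Usage: split_fasta multisequence.fasta [output_directory]
split_fasta() {
    local infile=$1 outdir=${2:-.}
    mkdir -p "$outdir"
    awk -v dir="$outdir" '
        /^>/ {
            if (out) close(out)                 # keep the number of open files small
            id = substr($1, 2)                  # header up to the first blank, minus ">"
            gsub(/[^A-Za-z0-9._-]/, "_", id)    # replace characters that are awkward in file names
            out = dir "/" id ".fasta"
        }
        out { print > out }                     # lines before the first header are skipped
    ' "$infile"
}
```

Calling split_fasta multisequence.fasta single_seqs would then write one file per record into the folder single_seqs.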

Teaching in spring 2018 – Part II

Teaching in spring 2018

Teaching in spring 2018 – Part I

In the lab with teachers-to-be

One of the master's-level courses I teach at the Freie Universität Berlin during the spring semester, the NatLab Evolution, is geared towards the comprehensive training of future high-school teachers.

Teaching in spring 2018

One-liner: Interleaved to deinterleaved FASTA

Quick, de-interleave it!
There are one-liners that never get old. This is one of them.

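$ perl -MBio::SeqIO -e '
    # read FASTA records from standard input
    my $seqin = Bio::SeqIO->new(-fh => \*STDIN, -format => "fasta");
    # print each record as one header line (ID only) and one sequence line
    while (my $seq = $seqin->next_seq) {
        print ">", $seq->id, "\n", $seq->seq, "\n";
    }' < interleaved.fasta > deinterleaved.fasta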
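On a machine without BioPerl, roughly the same result can be had with plain awk. This is a sketch rather than a drop-in replacement: the function name deinterleave_fasta is made up for illustration, and unlike the Perl version it keeps the complete header line instead of only the sequence ID.

```shell
# deinterleave_fasta: hypothetical helper; reads the named files or standard input.
deinterleave_fasta() {
    awk '
        /^>/ {
            if (seq_open) printf "\n"       # finish the previous sequence line
            print                           # header line, unchanged
            seq_open = 0
            next
        }
        {
            gsub(/[ \t\r]/, "")             # drop blanks and carriage returns
            if ($0 != "") { printf "%s", $0; seq_open = 1 }
        }
        END { if (seq_open) printf "\n" }
    ' "$@"
}
```

Usage follows the one-liner above: deinterleave_fasta < interleaved.fasta > deinterleaved.fasta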

Few-liner: Batch download of DNA sequences from NCBI

The wonders of Entrez

Today I found myself in need of a script to download dozens of DNA sequences deposited in NCBI Nucleotide. The sequences in question were listed in the file input.txt, one sequence name and accession number per line.

$ cat input.txt
  Liriope_muscari_USACult,JX080424
  Dracaena_adamii_IVORYCOAST,JX080436
  ...

Here is how I did it:

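$ INF=input.txt
$ while IFS=', ' read -r SEQNAME ACCNUM; do
    # IFS=', ' splits each line at the comma and ignores leading blanks
    # esearch reads from /dev/null so that it cannot swallow the rest of the loop input
    # tail -n +2 drops the FASTA header returned by efetch; we write our own below
    SEQ=$(esearch -db nucleotide -query "$ACCNUM" < /dev/null | efetch -format fasta | tail -n +2)
    printf '>%s_%s\n%s\n' "$SEQNAME" "$ACCNUM" "$SEQ" >> out.txt
  done < "$INF"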
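Because a failed lookup leaves a header without any sequence in out.txt, a quick sanity check after the download seems worthwhile. The sketch below assumes the file layout shown above (comma-separated list in, FASTA out); the function name check_download is hypothetical.

```shell
# check_download: hypothetical helper.
# Usage: check_download input.txt out.txt
check_download() {
    local list=$1 fasta=$2 expected found
    expected=$(grep -c ',' "$list")     # one "name,accession" pair per line
    found=$(grep -c '^>' "$fasta")      # one header per downloaded record
    if [ "$expected" -ne "$found" ]; then
        echo "expected $expected records, found $found" >&2
        return 1
    fi
    # report headers that are not followed by at least one non-empty sequence line
    awk '
        function flush() { if (hdr != "" && !n) { print "empty record: " hdr; bad = 1 } }
        /^>/ { flush(); hdr = $0; n = 0; next }
        NF   { n++ }
        END  { flush(); exit bad + 0 }
    ' "$fasta" >&2
}
```

The function returns a non-zero status if the record count is off or if any record is empty, so it can be chained with && before further processing.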

Bioinformatic spring cleaning – Part II

An improved few-liner to keep the data compressed

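If you wish to recursively loop through a folder and its nested subfolders and automatically gzip all files larger than 1 GB, the following few-liner is for you:

# NUL-separated file names keep paths with spaces intact
while IFS= read -r -d '' file; do
    # gzip appends .gz, so skip files that are already compressed
    if [[ $file != *.gz ]]; then
        gzip "$file"
    fi
done < <(LANG=C find . -size +1G -type f -print0)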
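Before compressing anything, it can be reassuring to see which files would be affected. A minimal dry-run sketch follows; the function name list_gzip_candidates and its two optional arguments (start folder and find-style size threshold) are assumptions made for illustration.

```shell
# list_gzip_candidates: hypothetical helper that only lists files, it changes nothing.
# Usage: list_gzip_candidates [folder] [size threshold as understood by find, e.g. +1G]
list_gzip_candidates() {
    local dir=${1:-.} min_size=${2:-+1G}
    LANG=C find "$dir" -type f -size "$min_size" ! -name '*.gz' -print
}
```

Running list_gzip_candidates . +1G prints the files the few-liner above would compress.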

Bioinformatic spring cleaning – Part I

A short one-liner to keep the data compressed

One of the bash one-liners that I use after every successful project, yet never remember when needed, is for the simple task of looping through your folders, tar-zipping them and then removing the original folders.

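# the glob */ matches folders only and, unlike parsing ls, copes with spaces in names
for dir in */; do
    # remove the folder only if tar finished without error
    tar czf "${dir%/}.tar.gz" "$dir" && rm -r "$dir"
done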
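For the cautious, the archive can be read back once before the original folder is removed. The following is a sketch under that assumption; the function name archive_dir is hypothetical, and listing the archive is only a basic readability check, not a byte-for-byte comparison.

```shell
# archive_dir: hypothetical helper.
# Usage: archive_dir folder/
archive_dir() {
    local dir=${1%/}                        # strip a trailing slash, if any
    tar czf "$dir.tar.gz" "$dir" &&         # create the archive
    tar tzf "$dir.tar.gz" > /dev/null &&    # read it back once as a basic check
    rm -r "$dir"                            # remove the folder only if both steps succeeded
}
```

Inside the loop above, the tar and rm pair could then be replaced by a call such as archive_dir "$dir".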