Alumnus of Freie Universität Berlin – Michael Grünstäudl, PhD

Successful habilitation in botany and bioinformatics

Alignment Phy2Nex few-liner

Alignment file format conversion for the efficient – Part II

Today, I needed to convert a series of alignments stored in phylip format into the common nexus format. The output DNA alignments had to be in sequential (i.e., non-interleaved) format.

In February 2017, I had already written a few-liner for the inverse conversion (nexus to phylip), so I was surprised to find that the conversion from phylip to non-interleaved nexus did not work out of the box. Instead, a few more lines (and a little trick using StringIO()) were necessary: the alignment is first written as nexus into an in-memory buffer, then re-read with Bio.Nexus and written to file with interleaving switched off.

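#!/usr/bin/env python2.7
# Convert a phylip alignment into a non-interleaved (sequential) nexus file.
# Usage: python2 phy2nex.py <alignment.phy>

import os
import sys
from Bio import AlignIO
from Bio.Alphabet import IUPAC, Gapped
from Bio.Nexus import Nexus
from StringIO import StringIO

# Derive the output name from the input name (e.g., foo.phy -> foo.nex)
inFn = sys.argv[1]
outFn = os.path.splitext(inFn)[0] + ".nex"

inp = open(inFn, 'rU')
outp = open(outFn, 'w')

# Parse the phylip file as gapped DNA with IUPAC ambiguity codes
alphabet = Gapped(IUPAC.ambiguous_dna)
aln = AlignIO.parse(inp, 'phylip-relaxed', alphabet=alphabet)

# Write the alignment as nexus into an in-memory buffer
out_handle = StringIO()
AlignIO.write(aln, out_handle, 'nexus')

# Re-read the buffered nexus text and write it to file without interleaving
p = Nexus.Nexus()
p.read(out_handle.getvalue())
p.write_nexus_data(outp, interleave=False)

outp.close()
inp.close()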
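
Note that the script above targets Python 2.7 and relies on Bio.Alphabet, which was removed in Biopython 1.78, so it needs an older Biopython release. To illustrate what the target format looks like, here is a minimal, dependency-free Python 3 sketch of the same conversion. It is not the Biopython route used above: the function names (read_phylip_sequential, to_sequential_nexus) are hypothetical, and the sketch assumes relaxed phylip input in which every taxon occupies exactly one line (i.e., sequential, unwrapped).

```python
def read_phylip_sequential(text):
    """Parse relaxed, sequential phylip text into (name, sequence) pairs.

    Assumption: one taxon per line, name and sequence separated by whitespace.
    """
    lines = [ln for ln in text.splitlines() if ln.strip()]
    ntax, nchar = (int(x) for x in lines[0].split())
    records = []
    for ln in lines[1:]:
        name, seq = ln.split(None, 1)
        # Drop any whitespace used to group the sequence into blocks
        records.append((name, "".join(seq.split())))
    if len(records) != ntax or any(len(s) != nchar for _, s in records):
        raise ValueError("phylip header does not match the alignment body")
    return records


def to_sequential_nexus(records):
    """Format (name, sequence) pairs as a non-interleaved nexus data block."""
    ntax = len(records)
    nchar = len(records[0][1])
    width = max(len(name) for name, _ in records)
    out = [
        "#NEXUS",
        "begin data;",
        "    dimensions ntax=%d nchar=%d;" % (ntax, nchar),
        "    format datatype=dna missing=? gap=-;",
        "matrix",
    ]
    # One line per taxon: this is what makes the matrix sequential
    for name, seq in records:
        out.append(name.ljust(width) + " " + seq)
    out += [";", "end;"]
    return "\n".join(out) + "\n"


# Toy alignment, made up for illustration only
phy_text = """\
 3 8
taxon_A  ACGT-ACG
taxon_B  ACGTTACG
taxon_C  ACG--ACG
"""

records = read_phylip_sequential(phy_text)
nexus_text = to_sequential_nexus(records)
print(nexus_text)
```

In the output, each taxon appears exactly once in the matrix with its full sequence on a single line, which is the property requested from Biopython via interleave=False.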

And for those who wish to apply the above Python code (saved as "phy2nex.py") to a collection of directories that each contain one phylip file:

for dir in */; do python2 phy2nex.py "$dir"*.phy; done

This post was published on Friday, 27 October 2017 at 18:21 by Michael Grünstäudl and filed under bioinformatics, one-liners.
