Michael Grünstäudl (Gruenstaeudl), PhD

Postdoctoral Researcher at the Freie Universität Berlin

Phylogenetics and Rmpi – An Example

Better parallel than serial

In one of my past projects, I implemented the usage of multiple CPUs for time-intensive phylogenetic computations. I hereby used the R library Rmpi for the coordination of parallel processes. Rmpi is an R wrapper for the MPI communications protocol. The speed gains that my code execution was experiencing through this process parallelization was considerable, and it may, hence, be of interest to a wider audience. I therefore present here a small example of using Rmpi in phylogenetic investigations.

First, let us set up a set of calculations that warrant process parallelization. In a maximum parsimony tree search, for example, multiple starting trees may wished to be evaluated. A parsimony ratchet can be executed in parallel on these starting trees, as the runs are independent of each another.

data = read.phyDat("alignment.phy") # Alignment of 47 taxa with 1,000,000 bp length
trees = read.tree("trees.tre") # Set of 500 phylogenetic trees
myFunc = function(tree,data) {

Second, let us execute the parsimony ratchet serially (where an MP analysis is executed with the first starting tree, then the second, then the third, etc.). As we can see, MP inference performed serially on my data set takes roughly 10 min.


user system elapsed
584.513 1.070 584.977

Third, let us execute the parsimony ratchet in parallel, using the maximum number of CPUs available on my system (nproc = 8). As we can see, the MP inference performed in parallel on my data set takes roughly 4 min.

# Infer number of processors (i.e. CPUs) on system
nproc = as.numeric(system("nproc",intern=T))
# Specify one master and nproc-1 slaves
# Pass various functions to the Rmpi slaves
# Parallel apply the parsimony ratchet function on all starting trees;
# use load balancing for CPU assignment
# Close slaves

user.self sys.self elapsed user.child sys.child
129.244 93.736 224.178 0.000 0.000

Utilizing all 8 CPUs on my computer (one for the master process, the rest for the slave processes) is not necessarily the most efficient parallelization. Spawning each slave process requires a certain amount of CPU resources, and these resources may be larger than the benefit of the added calculation speed (i.e. law of diminishing returns).


Der Beitrag wurde am Sunday, den 6. March 2016 um 20:30 Uhr von Michael Grünstäudl veröffentlicht und wurde unter bioinformatics abgelegt. Sie können die Kommentare zu diesem Eintrag durch den RSS 2.0 Feed verfolgen. Sie können einen Kommentar schreiben, oder einen Trackback auf Ihrer Seite einrichten.

Leave a Reply

Hinweis / Hint
Das Captcha kann Kleinbuchstaben, Ziffern und die Sonderzeichzeichen »?!#%&« enthalten.
The captcha could contain lower case, numeric characters and special characters as »!#%&«.