A post without spello in the title Today, I attempted to BLAST entire chloroplast genomes against NCBI’s nucleotide database via the BLASTn command-line tool. Since typical plastomes are between 150,000 and 160,000 bp in length, BLASTn searches that are conducted remotely take about 20 minutes on average: time blastn -db nt -query myinputseq.fasta -remote -out […]
Archive of the category 'bioinformatics'
Generating automatic figure labels
Following the German motto “Ordnung ist das halbe Leben!” (“Order is half of life!”) Have you ever been frustrated by having to differentiate between dozens of similar graphs or figures, when the only memorable difference between them was their unique file names? In a recent simulation study, I had exactly that feeling, so I came up with some code to […]
State-matrix to presence-absence-matrix
A quick R example Today, I needed to convert a series of state matrices into presence-absence matrices. In order to automate this conversion, I wrote the following R code. (The initiated will recognize the output as a species-range matrix.) 1.a. Generate example input m = matrix(data=c("A","B","C"), nrow=3, ncol=1) rownames(m) = c("t1", "t2", "t3") m [,1] […]
Filtering out unpaired raw reads from Illumina data
Working at the intersection I have always wondered why an Illumina machine occasionally generates “unpaired” paired-end reads (i.e., an R1 without a corresponding R2, or vice versa). While I wait for a satisfactory answer, I would like to remove any unpaired reads from my data set. Superficially, this […]
Identifying oligonucleotide primers via the command line
Priming for success It has been several years since I last developed a pair of customized oligonucleotide primers for DNA sequencing. At that time, I tended to operate software via GUIs, not knowing that an entire toolkit of command-line tools exists that can get the job done faster and more efficiently. Today I have an […]