@Alexis_RAxML recently gave a talk which ended with an interesting editorial about software quality. In it, he described the results of his investigation into the quality of existing phylogenetics software. I think he’s preparing that for publication, so I’m not going to describe his results, but IIRC he looked at things such as comment density and Valgrind results.
That is an excellent first step, but I’d like to think a little about function, i.e. whether the code actually computes the right answer. There have been many implementations of the core phylogenetic likelihood computation, but to my knowledge none has a comprehensive suite of unit tests for the core calculations. Does anyone know of one? (There are tests built into some packages, such as RevBayes, but these are scripts testing the broad-scale functioning of the package, and so aren’t what I’m talking about here.)
I am very glad that we are seeing the development of libraries such as Bio++ and pll, signaling a possible shift from monolithic codebases to ones in which the core likelihood computation is isolated from the tree exploration part. I think/hope that this will lead to more creative developments on each side.
Having a phylogenetic likelihood computation library with 100% unit test coverage, and a collection of mini-examples with agreed-upon results, would be very helpful for the field. Serious bugs continue to appear, even in major inference packages implementing standard models.
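To make "mini-example with an agreed-upon result" concrete, here is a rough sketch of the kind of test I have in mind. It is a toy written for this post, not code from Bio++, pll, or any other package: the tree encoding and the names `jc69_prob`, `conditional_likelihoods`, and `site_likelihood` are all made up for illustration. It runs Felsenstein pruning under JC69 (equal base frequencies, branch lengths in expected substitutions per site) and checks the answer against the closed form for two taxa, where reversibility means the site likelihood depends only on the total path length `t1 + t2`.

```python
import math

STATES = "ACGT"


def jc69_prob(i, j, t):
    """JC69 transition probability from state i to state j over branch length t."""
    e = math.exp(-4.0 * t / 3.0)
    return 0.25 + 0.75 * e if i == j else 0.25 - 0.25 * e


def conditional_likelihoods(node):
    """Felsenstein pruning on a toy tree encoding (hypothetical, for this post).

    A leaf is a one-character string giving its observed state.
    An internal node is a list of (child, branch_length) pairs.
    Returns the four conditional likelihoods at `node`, ordered as STATES.
    """
    if isinstance(node, str):
        return [1.0 if s == node else 0.0 for s in STATES]
    result = [1.0] * len(STATES)
    for child, t in node:
        child_cl = conditional_likelihoods(child)
        for i, s in enumerate(STATES):
            result[i] *= sum(
                jc69_prob(s, STATES[j], t) * child_cl[j]
                for j in range(len(STATES))
            )
    return result


def site_likelihood(tree):
    """Likelihood of one site: root conditionals weighted by the uniform JC69 frequencies."""
    return sum(0.25 * x for x in conditional_likelihoods(tree))


def test_two_taxa_match_closed_form():
    t1, t2 = 0.1, 0.2
    # Same state at both tips: 1/4 * P_same(t1 + t2).
    same = site_likelihood([("A", t1), ("A", t2)])
    assert math.isclose(same, 0.25 * jc69_prob("A", "A", t1 + t2))
    # Different states at the tips: 1/4 * P_diff(t1 + t2).
    diff = site_likelihood([("A", t1), ("G", t2)])
    assert math.isclose(diff, 0.25 * jc69_prob("A", "G", t1 + t2))


def test_site_patterns_sum_to_one():
    # The likelihoods of all 16 two-taxon site patterns must sum to 1.
    total = sum(
        site_likelihood([(a, 0.3), (b, 0.7)]) for a in STATES for b in STATES
    )
    assert math.isclose(total, 1.0)


if __name__ == "__main__":
    test_two_taxa_match_closed_form()
    test_site_patterns_sum_to_one()
```

Real tests would of course need to cover much more (rate heterogeneity, non-uniform frequencies, numerical scaling on large trees, ambiguity codes), but small cases with hand-derivable answers like this are the sort of thing different implementations could agree on.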
Thoughts?
I note that this is related to, but different from, the topic of test data sets.
(cc @mlandis, @hoehna, @mtholder)