I wrote a function to create a Euclidean distance matrix from some amino acid substitution matrices, and I wanted a built-in method to compute Spearman's rank correlation between two lists so that I could build a distance matrix that way too. It turns out that BioPython already has a function that builds distance matrices using various distance metrics, including Euclidean and Spearman's rank:
import Bio.Cluster
dm = Bio.Cluster.distancematrix(data, dist="s")
If you change dist to "e", it will calculate the Euclidean distance instead.
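To make the "s" option a bit less opaque, here is a rough pure-Python sketch of what a Spearman-based distance looks like. It assumes the distance is 1 minus Spearman's rank correlation, which is how I read the Bio.Cluster documentation; the function names (average_ranks, pearson, spearman_distance, lower_triangle) and the sample rows are made up for illustration and are not part of BioPython. It also mimics the ragged lower-triangular layout, where each row only holds the distances to the rows before it.

```python
import math

def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of tied values starting at position i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def pearson(x, y):
    """Pearson correlation; raises ZeroDivisionError if either input is constant."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def spearman_distance(x, y):
    """Assumed definition: 1 minus Spearman's rank correlation."""
    return 1.0 - pearson(average_ranks(x), average_ranks(y))

def lower_triangle(rows, distance):
    """Row i holds the distances from rows[i] to rows[0..i-1]; row 0 is empty."""
    return [[distance(rows[i], rows[j]) for j in range(i)]
            for i in range(len(rows))]

# Hypothetical data: the second row rises with the first, the third is reversed
rows = [[1.0, 2.0, 3.0, 4.0], [2.0, 4.0, 6.0, 9.0], [4.0, 3.0, 2.0, 1.0]]
dm_sketch = lower_triangle(rows, spearman_distance)
```

Rows that rank their values in the same order end up at a distance of about 0, and rows that rank them in exactly the opposite order end up at about 2. For real work I would still use Bio.Cluster, since it is compiled and handles things like missing data.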
I thought there might be a way to output this in PHYLIP format so I could use QuickTree, but if there is, I wasn't able to find it. So here's my own:
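# names: labels for the sequences or matrices; dm: rows from Bio.Cluster.distancematrix
with open(filename, 'w') as fout:
    # The first line of a PHYLIP distance file is the number of entries
    fout.write('%d\n' % len(names))
    for name, row in zip(names, dm):
        fout.write(name)
        # One tab-separated distance per column in this row
        for value in row:
            fout.write('\t%s' % value)
        fout.write('\n')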
It assumes you have the distance matrix in the format created by the Bio.Cluster distancematrix function, and have a list of names for the sequences or matrices.
An example output would be:
3
A
B 1.2 0.8
C 3.2 1.6 2.0
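If a downstream tool turns out to want a full square matrix rather than the lower triangle, the rows can be mirrored before writing. This is only a sketch: to_square and write_phylip are hypothetical helper names, the values in lower are made up, and I haven't checked which of the two layouts any particular tool requires. It writes to an in-memory buffer so it can be tried without touching the disk.

```python
import io

def to_square(lower):
    """Expand ragged lower-triangular rows into a full symmetric matrix."""
    n = len(lower)
    square = [[0.0] * n for _ in range(n)]
    for i, row in enumerate(lower):
        for j, value in enumerate(row):
            # Mirror each distance across the diagonal; the diagonal stays 0.0
            square[i][j] = value
            square[j][i] = value
    return square

def write_phylip(names, matrix, handle):
    """Write the entry count, then one tab-separated row per name."""
    handle.write('%d\n' % len(names))
    for name, row in zip(names, matrix):
        handle.write(name)
        for value in row:
            handle.write('\t%s' % value)
        handle.write('\n')

# Hypothetical names and distances, in the ragged lower-triangular layout
names = ['A', 'B', 'C']
lower = [[], [1.2], [0.8, 3.2]]

buffer = io.StringIO()
write_phylip(names, to_square(lower), buffer)
print(buffer.getvalue())
```

Passing an open file instead of the StringIO buffer writes the same text to disk, and passing lower directly instead of to_square(lower) gives the triangular layout from the code above.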