But there are ways to approximate the distances. If two species have diverged from a common ancestor, they will have evolved in their own separate ways, and so their characters will differ. We can construct another measure of the distance between species by summing the differences in their characters. That leaves us a lot of choices. Which characters should we choose? How do we measure the difference in a character? Different choices give us different sets of distances between species, and these different sets of distances will lead us to reconstruct different phylogenetic trees.
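As a minimal sketch of "summing the differences in their characters," the following Python counts, for each pair of species, the characters on which they disagree. It assumes the simplest possible scoring, where each differing character contributes 1 to the distance; the species names and character strings are hypothetical and serve only as an illustration.

```python
def character_distance(a, b):
    """Count the characters at which two species differ (a Hamming distance)."""
    if len(a) != len(b):
        raise ValueError("both species must be scored on the same characters")
    return sum(1 for x, y in zip(a, b) if x != y)


def distance_matrix(species):
    """Return the species names and the symmetric matrix of pairwise differences."""
    names = list(species)
    rows = [[character_distance(species[p], species[q]) for q in names]
            for p in names]
    return names, rows


# Hypothetical data: three species scored on six characters,
# each character written as a single digit.
example = {
    "A": "001120",
    "B": "001020",
    "C": "211023",
}
names, matrix = distance_matrix(example)
```

A different choice of characters, or a different way of weighting each difference, would produce a different matrix from the same species, which is exactly the ambiguity described above.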
There are other problems with this approach. Not all species evolve at the same rate. Some species have been stable for millions of years, while others evolve very fast. If a species depends on a character for its continued survival, that character will not change, since any mutations of it will be eliminated. Call such characters essential. Most visible characters are essential for the species. This means that if we choose essential characters, any differences should count as very significant. There are, however, some difficulties with considering essential characters. If one species evolves by changing an essential character, whatever ecological forces supported that change may also apply to other species, and that could lead to parallel evolution. Thus, differences or similarities in essential characters need not reflect large or small distances in the phylogenetic tree.
For an example of an irrelevant mutation, consider this. There are 64 (4 cubed) different codons for 20 amino acids, so most amino acids are coded by more than one codon. Some amino acids are coded by four or more different codons. For many of these multiply coded amino acids, the third nucleotide can take any of the four possible values. In other words, a mutation in this third nucleotide is irrelevant. The DNA can mutate at this site and the resulting protein doesn't change.
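To make the idea concrete, here is a small sketch that checks whether every mutation of a codon's third nucleotide leaves the amino acid unchanged. It uses only a short excerpt of the standard genetic code (glycine, alanine, aspartic acid, and glutamic acid), and the function names are made up for this illustration.

```python
BASES = "TCAG"

# Excerpt of the standard genetic code (DNA alphabet), three codon families.
CODON_EXCERPT = {
    "GGT": "Gly", "GGC": "Gly", "GGA": "Gly", "GGG": "Gly",
    "GCT": "Ala", "GCC": "Ala", "GCA": "Ala", "GCG": "Ala",
    "GAT": "Asp", "GAC": "Asp", "GAA": "Glu", "GAG": "Glu",
}


def third_site_mutants(codon):
    """All codons obtained by changing only the third nucleotide."""
    return [codon[:2] + base for base in BASES if base != codon[2]]


def third_site_is_irrelevant(codon):
    """True if no third-nucleotide mutation changes the coded amino acid."""
    amino_acid = CODON_EXCERPT[codon]
    return all(CODON_EXCERPT[m] == amino_acid for m in third_site_mutants(codon))
```

With this excerpt, the glycine and alanine codons pass the check, while an aspartic acid codon such as GAT does not, because changing its third nucleotide to A or G gives glutamic acid instead.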
Here is a phylogenetic tree with five extant species alongside a matrix. This 5 by 5 matrix results from mutations of 40 irrelevant characters, each with 4 alternative values (like those described in the previous paragraph). The mutation rate is uniform with a value of 100 mutations per 1000 time units, that is, 0.1 mutations per time unit. The (i,j)th entry in the matrix indicates how many of the 40 characters differ between species i and species j. If two species are not very distant in the tree, there hasn't been much time for mutations to occur, so the entry in this matrix should be small. If two species are quite distant (as, for example, when their common ancestor is at the top of the tree), the entry in the matrix should be large.
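The following Python is a rough sketch of how such a matrix could be generated; it is not the code behind the matrix described here. Several things are assumptions made for illustration: the tree shape and branch lengths in TREE are hypothetical, the rate of 0.1 is taken to be the total rate over all 40 characters, mutations along a branch are modeled as arriving at random with exponential waiting times, and each mutation moves one randomly chosen character to a different one of its 4 values.

```python
import random

# Hypothetical rooted tree: each internal node lists (child, branch length).
# Every leaf ends up 100 time units below the root.
TREE = {
    "root": [("x", 40), ("y", 70)],
    "x": [("s1", 60), ("z", 30)],
    "z": [("s2", 30), ("s3", 30)],
    "y": [("s4", 30), ("s5", 30)],
}
LEAVES = ["s1", "s2", "s3", "s4", "s5"]


def evolve(state, length, rate, n_values, rng):
    """Copy a parent's characters and apply random mutations along one branch."""
    state = list(state)
    t = rng.expovariate(rate)          # waiting time until the first mutation
    while t < length:
        site = rng.randrange(len(state))
        others = [v for v in range(n_values) if v != state[site]]
        state[site] = rng.choice(others)
        t += rng.expovariate(rate)     # waiting time until the next mutation
    return state


def simulate(tree, n_chars=40, n_values=4, rate=0.1, seed=None):
    """Assign characters to every node, starting from a random root state."""
    rng = random.Random(seed)
    states = {"root": [rng.randrange(n_values) for _ in range(n_chars)]}
    stack = ["root"]
    while stack:
        node = stack.pop()
        for child, length in tree.get(node, []):
            states[child] = evolve(states[node], length, rate, n_values, rng)
            stack.append(child)
    return states


def difference_matrix(states, leaves):
    """Entry (i, j) counts the characters that differ between leaves i and j."""
    return [[sum(a != b for a, b in zip(states[p], states[q])) for q in leaves]
            for p in leaves]


states = simulate(TREE, seed=1996)
matrix = difference_matrix(states, LEAVES)
```

Under these assumptions, pairs of leaves that share a recent ancestor (such as s2 and s3) will tend to have smaller entries than pairs whose common ancestor is the root, although any single run can deviate because the mutations are random.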
You can play around with the mutation matrix if you like. Press the "mutate" button to request a new set of mutations with the same number of characters, the same number of alternatives per characteristic, and the same mutation rate. You can also change these parameters, and each time you do, you'll get a new set of mutations automatically.
David E. Joyce
Department of Mathematics and Computer Science
Clark University
January, 1996