Cladograms and stratigraphy


Until recently, the only answer to the question 'How complete is the fossil record?' was a qualitative assertion that it was either wonderful, terrible, or all right. In the absence of a time machine, the only way to assess completeness was in a relative way. However, now that new semi-objective tree-making techniques are available to deal with both morphological and molecular data, the shapes of these trees can be compared with stratigraphic data.

Testing cladograms with stratigraphy

Phylogenetic trees, whether based on molecular techniques or on cladistic analysis of morphological characters, are essentially independent of stratigraphy. Therefore, it is reasonable to compare the results of stratigraphy and phylogeny: do they agree or not? If they do not agree (lack congruence), then either the fossil record, or the phylogeny, or both are wrong. If the results are broadly congruent, then they are both probably telling the same story, and it is reasonable to assume that this story is close to the true history of life.

In assessing congruence, a claim is not made for the primacy of tree over stratigraphy, or vice versa. Indeed, there are many uncertainties involved in the construction of any tree, and in the recording of any stratigraphic sequence. It is the question of congruence that is important. So, stratigraphy can be assessed for congruence with trees, and trees can be assessed for congruence with stratigraphy.

One application of these approaches has been to use stratigraphy to test cladograms. In particular, stratocladistics (Fisher, 1992) is a method whereby stratigraphic information is actually incorporated into the tree-finding methods. Stratigraphic data are converted into 'character' data, and then combined with apomorphy coding. We do not pursue this technique since we feel that it obscures the real information content of the characters on the one hand, and the stratigraphy on the other. It is impossible to say what the resulting trees mean: they have lost character-based parsimony, and they have also diluted the stratigraphic signal. The technique has been criticised for these reasons, and others.

Techniques for comparison of cladograms with stratigraphic implications can, however, be informative in examining specific cases. For example, where several most parsimonious trees (MPTs) result from a cladistic analysis, it may be of interest to know which of these best fits current stratigraphic knowledge. The various metrics noted above can readily be calculated for each MPT.
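As a rough illustration of choosing among MPTs, the sketch below scores two candidate trees by the Spearman rank correlation between clade rank and age rank (the SRC metric described under Metrics and software). Everything in it is an assumption made for illustration: the taxon names, the first-appearance ages, the two sets of clade ranks, and the helper names `average_ranks`, `spearman`, and `stratigraphic_fit`. Clade rank is only well defined for pectinate (comb-like) trees, so this is a sketch of the idea rather than a general tool.

```python
from math import sqrt


def average_ranks(values):
    """Rank values from 1 upward; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman's rho: the Pearson correlation of the two rank series."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    vx = sum((a - mx) ** 2 for a in rx)
    vy = sum((b - my) ** 2 for b in ry)
    if vx == 0 or vy == 0:
        raise ValueError("rank correlation is undefined for a constant series")
    return cov / sqrt(vx * vy)


# Hypothetical first-appearance ages in Ma (larger = older); not real data.
first_appearance = {"A": 400, "B": 380, "C": 350, "D": 300, "E": 290}

# Hypothetical clade ranks (1 = branches off first) read from two pectinate MPTs.
candidate_trees = {
    "MPT 1": {"A": 1, "B": 2, "C": 3, "D": 4, "E": 4},
    "MPT 2": {"A": 4, "B": 4, "C": 3, "D": 2, "E": 1},
}


def stratigraphic_fit(clade_rank, fad):
    taxa = sorted(clade_rank)
    # Negate ages so that older fossils get lower ranks, as early branches do.
    return spearman([clade_rank[t] for t in taxa], [-fad[t] for t in taxa])


scores = {name: stratigraphic_fit(ranks, first_appearance)
          for name, ranks in candidate_trees.items()}
best = max(scores, key=scores.get)
print(scores, "best fit:", best)
```

With these made-up data the first tree correlates positively with the order of fossil appearance and the second negatively, so the first would be preferred on stratigraphic grounds. The same loop could be run with any of the other metrics in place of `stratigraphic_fit`.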

Metrics and software

(1) The Spearman Rank Correlation (SRC) test is a well-established non-parametric test that simply compares the rank order of two series of numbers, in this case the order of first fossil appearances and the order of nodes. The technique has associated measures of confidence, but it is a rather poor estimator of how well trees and stratigraphy match. It looks only at rank order, taking no account of the amount of time between specific fossils, and the relative ordering of nodes can be highly interdependent.

(2) The relative completeness index (RCI), stratigraphic consistency index (SCI), and gap excess ratio (GER) are more informative, each assessing a different aspect of the fit of a tree to stratigraphy. All three should be used in parallel. Hitherto, each has been reported as a simple value without any measure of confidence, but a proxy for confidence intervals can be obtained by randomization of the data sets. Methods are outlined more fully in Wills (1999) and Benton et al. (1999, 2000).

(3) Software is now available to assess the significance of RCI, SCI, and GER values by random permutation of the raw data, testing the observed values against the means of the generated random distributions. The program, 'Ghosts', was written by Matthew Wills.
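The metrics and the permutation idea in the list above can be sketched in a few lines. The block below is not the 'Ghosts' program and makes no claim to reproduce it. It assumes the commonly cited definitions RCI = (1 - MIG/SRL) x 100 and GER = 1 - (MIG - Gmin)/(Gmax - Gmin), where MIG is the minimum implied gap (the summed ghost ranges), SRL is the summed observed range lengths, and Gmin and Gmax are the smallest and largest possible summed ghost ranges for the data. The tree, the taxon ranges, the function names, and the choice to permute first appearances among taxa are all assumptions made for illustration.

```python
import random


def node_age(tree, fad):
    """Oldest first appearance (largest Ma) among the tips of a subtree."""
    if isinstance(tree, str):
        return fad[tree]
    return max(node_age(child, fad) for child in tree)


def minimum_implied_gap(tree, fad):
    """Sum of ghost ranges: each branch spans parent node age to child age."""
    if isinstance(tree, str):
        return 0.0
    age = node_age(tree, fad)
    gap = 0.0
    for child in tree:
        gap += age - node_age(child, fad)
        gap += minimum_implied_gap(child, fad)
    return gap


def gap_excess_ratio(tree, fad):
    ages = list(fad.values())
    oldest = max(ages)
    g_min = oldest - min(ages)            # best case: one continuous ghost range
    g_max = sum(oldest - a for a in ages)  # worst case: every taxon hangs from the oldest
    if g_max == g_min:
        raise ValueError("GER is undefined when Gmax equals Gmin")
    return 1 - (minimum_implied_gap(tree, fad) - g_min) / (g_max - g_min)


def relative_completeness_index(tree, fad, lad):
    srl = sum(fad[t] - lad[t] for t in fad)  # summed observed range lengths
    return (1 - minimum_implied_gap(tree, fad) / srl) * 100


def permutation_p_value(tree, fad, n_perm=999, seed=0):
    """Share of random reassignments of ages whose GER is at least the observed one."""
    rng = random.Random(seed)
    observed = gap_excess_ratio(tree, fad)
    taxa, ages = list(fad), list(fad.values())
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(ages)
        if gap_excess_ratio(tree, dict(zip(taxa, ages))) >= observed - 1e-12:
            hits += 1
    return (hits + 1) / (n_perm + 1)  # count the observed arrangement as one draw


# Hypothetical data: nested tuples for the tree, ranges in Ma (larger = older).
tree = ("A", ("B", ("C", ("D", "E"))))
fad = {"A": 400, "B": 380, "C": 350, "D": 300, "E": 290}  # first appearances
lad = {"A": 360, "B": 340, "C": 300, "D": 250, "E": 200}  # last appearances

ger = gap_excess_ratio(tree, fad)
rci = relative_completeness_index(tree, fad, lad)
p_value = permutation_p_value(tree, fad)
print(f"MIG={minimum_implied_gap(tree, fad)}, GER={ger:.3f}, RCI={rci:.1f}%, p={p_value:.3f}")
```

In this made-up case the tree matches the order of first appearances perfectly, so the GER is 1 and few random reassignments do as well. SCI is omitted for brevity; it would need a per-node comparison of sister-group ages on the same tree structure.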