Deep Time Project: A Comprehensive Phylogenetic Tree of Living and Fossil Angiosperms

Comments and questions:

Dr. Doug Soltis

Copyright © FLMNH
This site is maintained by the FLMNH

Last modified: 5/10/2002

Rationale

Deep Time RCN--Objectives

1. Characterize and prioritize fossils

2. Correct time estimates

3. Construction of a morphological matrix

4. Integrating fossils into the angiosperm tree

5. Calibration of branch points in cladogram/molecular evolution

Annual Meetings

Workshops

Student Travel Awards

Student Research Training

Website Development

Management Plan

Coordination of Management Activities

Assessment of Research Coordination Activities

Coordination Plan

Increasing Diversity

1. Under-represented Groups

2. Diversity of Institutional Settings

3. Opportunities for Students, Post-docs, and New Researchers

4. Integration


I. Background

Results of Prior Support: Several of us (P. Soltis, D. Soltis, J. Doyle, M. Sanderson) have been actively involved in the Green Plant Phylogeny Research Coordination Group (GPPRCG or "Deep Green") since its inception in 1994. Deep Green was organized to coordinate the reconstruction of the phylogeny of all green plants, a major branch of the tree of life (over 500,000 species).

Deep Green sponsored meetings and workshops, and fostered large collaborations on various groups of green plants (e.g., algae, bryophytes, ferns, angiosperms). The Deep Green initiative was itself modeled, in part, on the success achieved by angiosperm systematists in their large collaborative efforts to reconstruct phylogeny across all major groups of angiosperms. This was a grassroots effort that was initiated in 1991-1992 by D. Soltis and M. Chase; it received no formal funding, but quickly resulted in a highly successful major collaboration among 42 investigators who provided a topology for angiosperms based on 500 rbcL sequences (Chase et al., 1993).

Accompanying the publication of the first comprehensive phylogenetic tree for angiosperms (Chase et al., 1993) was a series of 13 companion papers focused on major subgroups of angiosperms (organized by D. Soltis, M. Chase, R. Olmstead). Together these 14 papers constituted an entire issue of the Annals of the Missouri Botanical Garden (volume 80, number 3; 1993) and revealed the enormous advances that could be achieved via collaboration on a grand scale.

The collaborations and cooperation among angiosperm systematists have continued. D. Soltis, P. Soltis, and D. Nickrent organized a collaboration of 17 investigators that resulted in a nuclear-based (18S rDNA) topology for angiosperms (D. Soltis et al., 1997). Additional large collaborations involving 16 investigators (organized by D. Soltis, P. Soltis, and M. Chase) resulted in the compilation of a data set for three genes (atpB, rbcL, 18S rDNA; nearly 5000 bp per species) for 560 species of angiosperms.

Phylogenetic analyses of this data set have resulted in a highly resolved and well-supported topology for angiosperms (P. Soltis et al., 1999a; D. Soltis et al., 2000). These collaborations have produced some of the largest phylogenetic analyses ever undertaken on any group of organisms, resulting in a formal reclassification of the angiosperms by The Angiosperm Phylogeny Group (APG, 1998), an international group of 29 systematists.

This is the first time that a major group of organisms has been reclassified based largely on molecular phylogenetic hypotheses. Significantly, it was conducted by a large group of investigators, rather than by a single "expert," as has been the longstanding tradition in all areas of systematics. In addition, numerous other papers have subsequently resulted from the collaborations that we informally initiated nearly a decade ago (e.g., Savolainen et al., 2000a, b; Chase et al., 2000; D. Soltis et al., 1998; Mort et al., 2000).

These collaborations have been extended from angiosperms to involve phylogenetic studies of all land plants (P. Soltis et al., 1999b; Mishler et al., in prep.). These collaborations have not only resulted in a firm understanding of angiosperm relationships (APG, 1998; D. Soltis et al., 2000), but also fundamentally altered the manner in which all systematists approach the phylogenetic analysis of large data sets.

Assessing relationships in many large, problematic groups of organisms (e.g., fungi, bacteria, insects) requires the compilation and phylogenetic analysis of data sets (DNA sequences and/or nonmolecular traits) for numerous taxa. Although the phylogenetic analysis of large data sets involving hundreds of exemplars often is central to understanding relationships within many groups, the feasibility of analysis of these data sets has been debated (reviewed in D. Soltis and Soltis, 1998; Chase and Albert, 1998).

Until recently, some maintained that large data sets were intractable and could not be analyzed phylogenetically (see Graur et al., 1996). Significantly, as a direct result of the efforts of collaborative research among plant systematists, recent empirical and simulation studies suggest that large data sets are much more tractable than thought only a few years ago (Hillis, 1996; Graybeal, 1998).

One solution is the addition of taxa and characters (a total evidence approach), as demonstrated by D. Soltis et al. (1998). The large data sets of angiosperm systematists were also a stimulus for the development of computational advances, such as "fast" or "quick" search techniques including the fast bootstrap (Swofford, 1998) and the parsimony jackknife (Farris et al., 1996), and improved search algorithms such as the RATCHET (Nixon, 1999).

Thus, through our involvement with Deep Green and APG we are very familiar with the benefits of collaborative science and also have considerable experience in organizing and facilitating such undertakings. As a result, we feel that our collective expertise will result in the success of the current undertaking, "Deep Time," an integration of paleontology and phylogenetics.

II. Introduction

The angiosperms, or flowering plants, comprising over 250,000 species and approximately 400 families, are, by far, the largest clade of plants and represent the dominant group of land plants today. Putting things in perspective, the angiosperms are at least five times as speciose as the vertebrates.

In both morphology and chemistry, the angiosperms are highly diverse. In size, they range from plants known as duckweeds (the genus Lemna) that are roughly a millimeter in length to Eucalyptus trees well over 100 meters in height. They also encompass enormous floral diversity, with flowers ranging from less than a millimeter (Lemna, Lepuropetalon) to a meter (Rafflesia) in diameter, and possessing just a few floral organs (Chloranthaceae) to hundreds (Monimiaceae).

Owing to this enormous diversity, the relationships among extant flowering plants have, until recently, been highly contentious. Because of the apparent sudden appearance of a diverse array of early angiosperms in the fossil record, Charles Darwin referred to the origin of the flowering plants as "an abominable mystery."

Paleobotanical studies have shown that Early Cretaceous angiosperms were much less diverse than was thought in Darwin's time; nonetheless, fossil evidence indicates that the angiosperms radiated rapidly. Although there are reports of earlier angiosperm remains, the oldest fossils that are indisputably angiosperms are from the Early Cretaceous, about 130 million years ago (Sun et al., 1998; reviewed in Dilcher, 2000; Crane, 1993; Crane et al., 1995; Magallón-Puebla et al., 1999).

Based on fossil evidence, the angiosperms radiated rapidly after their origin, with extensive diversity already apparent by 115 million years ago. By 90-100 million years ago, the angiosperms had become the dominant floristic element on Earth. By 75 million years ago, many clades corresponding to modern orders and families were already present.

Enormous advances have been realized in both angiosperm phylogenetics and paleobotany during the past decade. We review the recent accomplishments of both groups below. These critical developments set the stage for the proposed Research Coordination Network.

Advances in Angiosperm Phylogenetics

Until recently, the radiation of the angiosperms was believed to have occurred so rapidly that many systematists thought it might not be possible to identify the oldest extant angiosperm lineages (see Chase and Cox, 1998). Furthermore, the circumscription of, and relationships among, the major groups of angiosperms were uncertain, with different modern classifications proposing different patterns of relationship (e.g., Cronquist, 1981; Takhtajan, 1987, 1999; Thorne, 1992).

In large part through the contributions of molecular systematics, our understanding of extant angiosperm relationships and evolution has changed dramatically in the past decade. Early studies using cladistic analysis of morphological data (e.g., Donoghue and Doyle, 1989; Doyle and Donoghue, 1986; Loconte and Stevenson, 1991) challenged long-standing views of angiosperm relationships and evolution and quickly set the stage for molecular phylogenetic investigations.

Massive DNA sequencing efforts have prompted some of the largest phylogenetic analyses ever conducted, ultimately resulting in a highly resolved and well-supported topology for many of the angiosperms. Beginning with individual genes such as rbcL and 18S rDNA, angiosperm systematists constructed large DNA data sets containing hundreds of species (e.g., Chase et al., 1993; Soltis et al., 1997).

Although the analysis of such large data sets was controversial, angiosperm systematists combined data sets for different genes, revealing that one solution to the computational problems large data sets pose is the addition of taxa and genes (D. Soltis et al., 1998; Chase and Cox, 1998). Other efforts have combined molecular and non-molecular data sets (Doyle et al., 1994; Nandi et al., 1998; Doyle and Endress, 2000).

The largest data set to date involves 560 angiosperms (and seven outgroup taxa) sequenced for three genes (~5,000 bp per taxon). The topology provides, for the first time, strong support (as measured by bootstrap or jackknife values) for much of the spine of the tree, and for most major clades (Fig. 1). Other studies have implications for the closest relatives of the angiosperms (Bowe et al., 2000; Chaw et al., 2000).

Figure 1. Overview of angiosperm relationships based on phylogenetic analyses of a data set of 567 taxa each sequenced for three genes (from P. Soltis et al., 1999a). The relationships depicted among basal angiosperms have been modified to reflect the increased resolution and support realized in the analyses of data sets of six genes and over 12,000 bp per taxon (Zanis et al., submitted) and subsequent analyses based on ten genes and ~20,000 bp per taxon (Zanis et al., in prep.). Similarly, relationships among core eudicot lineages reflect new insights based on the recent analysis of a four-gene data set of ~8,000 bp per taxon (Senters et al., 2000). Numbers above branches are bootstrap values.

We review here the topology for living angiosperms; this firm understanding of extant angiosperm relationships is a major stimulus for the proposed RCN (Fig. 1).

Overview of angiosperm phylogeny--A series of recent studies using different genes and different molecular approaches all agree in identifying the same early branches of the angiosperm tree for living taxa (P. Soltis et al., 1999a; D. Soltis et al., 2000; Qiu et al., 1999; Parkinson et al., 1999; Mathews and Donoghue, 1999; Graham and Olmstead, 1999).

The early branches of the angiosperms are Amborella (a shrub endemic to New Caledonia), the Nymphaeales (the water lilies), and the shrubs/lianas Illicium, Schisandra, Trimenia, and Austrobaileya. In addition to these early branches, there are a number of other lineages of "basal angiosperms": monocots, Laurales, Magnoliales, Chloranthaceae, Piperales, and Winterales (APG, 1998). Based on the three-gene topology, each of these lineages is well supported, but relationships among them were unclear (but see below).

Many of these early-diverging angiosperms possess pollen with a single groove, or aperture (line of weakness). Significantly, contrary to longstanding classifications, there is no monocot-dicot split in the flowering plants. In addition, the fact that the first branches of the topology are well supported and not species rich and are followed by a number of speciose clades suggests that the initial explosive radiation of the angiosperms did not coincide directly with the origin of flowering plants, but likely occurred slightly later (cf. Mathews and Donoghue, 1999; P. Soltis et al., 2000).

Following these basal angiosperm lineages, the remaining angiosperms, representing the majority (75%) of flowering plants, form a well-supported clade referred to as the eudicots.

The early branches of the eudicots are well supported and include Ranunculales, Proteales, Trochodendraceae, and Buxaceae. These are followed by the core eudicots, a clade that includes well-supported major groups such as asterids, rosids, Caryophyllales, and Saxifragales. Additional resolution and support of relationships among core eudicots have been achieved by adding entire 26S rDNA sequences to the existing three-gene matrix (Senters et al., 2000).

Relationships among Basal Angiosperms--Recent analyses have clarified those deep-level relationships among basal angiosperm lineages that remained uncertain in the three-gene analyses. Via the analyses of data sets of six genes and over 12,000 bp (Zanis et al., submitted) and subsequent analyses based on ten genes and ~20,000 bp (Zanis et al., in prep.), relationships among these lineages are also well supported (Fig. 1), a result critical to this proposal.

These recent analyses indicate relationships among major clades of basal angiosperms identical to those reported by Qiu et al. (1999), but now all nodes are strongly supported. Following the grade of Amborella, Nymphaeales, and Illicium/Schisandra/Trimenia/Austrobaileya, the clade of Ceratophyllaceae + monocots is sister to all remaining angiosperms. The Ceratophyllaceae/monocot clade is, in turn, followed by Chloranthaceae, which are sister to all remaining angiosperms. Following Chloranthaceae, Magnoliales and Laurales are strongly supported as sister groups, as are Winterales/Piperales; together, these four lineages also form a well-supported clade that is sister to the eudicots (Fig. 1).

Advances in Paleobotany

During the past 20 years there have been great strides in developing techniques for investigating early angiosperm remains, a great increase in the collection and description of fossil material of early angiosperms, and intense analysis of these fossil data, with special reference to floral morphological characters in relation to time of occurrence (Dilcher, 1979; Crane et al., 1995; Friis et al., 1999; Magallón-Puebla et al., 1999; Crepet et al., in press).

The amount of information available from fossil leaves, fruits, flowers, pollen grains, and wood allows us to use character-based comparisons with extant angiosperms across taxonomic lines. These data can be assembled from megafossils, mesofossils, and microfossils, which all yield new information about taxonomic diversity and characters of early angiosperms.

Substantial amounts of data are becoming available each year. For example, Lower Cretaceous sediments from Portugal recently yielded 105 different kinds of flowers with 13 associated pollen types by the study of mesofossils (Friis et al., 1999). Mesofossils are those small floral buds, fruits, seeds, flowers, or plant parts recovered by sieving the sediments. Mesofossils also have been studied from Cretaceous sediments in New Jersey (e.g., Nixon and Crepet, 1993; Herendeen et al., 1993, 1994; Crepet and Nixon, 1988a, b; Gandolfo et al., 1998a, b, c; Crepet et al., in press), and Maryland through Georgia (e.g., Crane et al., 1993, 1995; Herendeen et al., 1995, 1999; Crane and Herendeen, 1996; Keller et al., 1996; Sims et al., 1998, 1999) where numerous new taxa have been described. Reports of early angiosperm flowers in China, which predate any other known flowers (e.g., Sun et al., 1998), come from the megafossil record.

One of these was reported as uppermost Jurassic, 142-145 million years old, but this age was revised to 120 million years and is the subject of some debate at this time. This emphasizes the need for us to include a working group to evaluate the ages reported for the fossil material. Also, we need to evaluate the fossils to be included in character-based analyses of fossil angiosperm remains, such as those used by Magallón-Puebla et al. (1999) to infer the presence of particular groups on the major branching points of angiosperm phylogeny.

 

III. Deep Time Research Coordination Network (Deep Time RCN)

Rationale

As reviewed above, molecular data have provided a robust phylogeny for extant angiosperms. Concomitantly, paleobotanists have greatly improved our understanding of early angiosperm diversity. Integrating fossils into the tree of living taxa remains essential for understanding not only the origin of extant angiosperm groups, but also the origins of their structures (Doyle, 1998a, b). However, such attempts to integrate fossils and extant taxa in phylogeny reconstruction have been rare (e.g., Nixon and Crepet, 1998; Keller et al., 1996; Magallón-Puebla et al., 1996; Eklund, 1999).

Although angiosperm systematists and paleobotanists potentially have much in common and each group has made major strides in the past decade, there has been surprisingly little communication and integration of data between the two areas. Systematists are often unaware of the significance of fossil discoveries and of the characterizations of these fossils; paleobotanists do not always think phylogenetically and hence lack full appreciation of the excellent phylogenetic framework presently available for living angiosperms.

Until the 1970s, for example, fossil taxa were typically placed in relationship to living genera. With few exceptions, paleobotanists have been reluctant to define and name extinct angiosperm families or orders (reviewed in Dilcher, 2000). The paucity of attempts to integrate fossils into a phylogenetic framework can also be attributed to a necessary reliance on morphology. That is, a morphological matrix for living taxa into which to integrate fossils is a necessity, yet attempts to formulate such matrices for angiosperms are relatively recent and still incomplete (Nandi et al., 1998; Doyle and Endress, 2000).

Importantly, attention to the formidable problems of character analysis has tended to wane in the understandable enthusiasm for molecular systematics. Other factors responsible for the lack of interdisciplinary work include the difficulty in characterizing many fossils, and the analytical issues that must be considered when integrating fossils into a phylogenetic framework.

The timing is now appropriate to develop a new synthesis of angiosperm paleobotany and systematics/phylogenetics and a theory for integrating paleontological and neontological perspectives. The required phylogenetic framework for angiosperms is now in place (P. Soltis et al., 1999a; D. Soltis et al., 2000; Qiu et al., 1999; Zanis et al., submitted) to provide the underpinning for such a project.

In addition, considerable progress has been made in developing a morphological matrix for basal lineages of angiosperms (Doyle and Endress, 2000). The paleobotanical and plant systematics communities are each well organized and exhibit a spirit of collaboration and cooperation; these factors enhance the opportunities for interdisciplinary collaboration.

Lastly, formal interactions between researchers in both areas recently have been established and strengthened (e.g., the sharing of unpublished morphological and DNA sequence data by Doyle and Endress, 2000 and P. Soltis et al., 2000). Hence, our proposed collaboration to integrate early fossil angiosperms into a phylogenetic framework seems both timely and feasible.

Through the proposed RCN, we will use our collective expertise on Cretaceous angiosperms and angiosperm phylogeny to develop a paradigm for the integration of paleontology and phylogenetics. For several reasons, we will focus initially on early angiosperm fossils (Cretaceous in age), rather than all angiosperms.

  1. The early diversification of the angiosperms is a critical time period of wide interest.
  2. Angiosperms are so numerous both in terms of extant groups and fossil taxa that it would be difficult to begin with an all-encompassing analysis.

Therefore, this project will focus on the Cretaceous record. This time period corresponds to the origin and explosive radiation of early angiosperms and the early branches of the eudicots (see Fig. 1). Importantly, these are also the branches of the angiosperm topology that are now best understood (Fig. 1); although the groups of core eudicots are clear and well supported, their interrelationship is still uncertain (Fig. 1).

However, the issues that we consider here and the approaches that we develop will ultimately be applicable to all angiosperms, as well as to other groups of green plants and other lineages of organisms in general.

We envision our RCN quickly expanding to include other angiosperm fossils. For this reason, several of the researchers included in this proposal (as well as others to be invited) work on fossils of a more recent age (e.g., Tertiary). In this way we are already anticipating and preparing for future research that will include all angiosperms. These and other researchers can contribute to the development of analytical approaches for integrating fossils and extant taxa and can immediately apply the approaches and tools developed for early angiosperms to fossil angiosperms of a more recent age.

 

Deep Time RCN--Objectives

The primary mission of Deep Time will be to facilitate, coordinate, and stimulate new research at the interface of paleobotany, geology, and systematics/phylogenetics. Our goal is not to co-opt the research of individual investigators, but to promote new research opportunities. If, for example, a new early angiosperm fossil is discovered and described and those investigators wish to explore the possible phylogenetic placement of this fossil, Deep Time will provide a vehicle for promoting that research by facilitating contact/research among appropriate investigators.

Thus, Deep Time opens new avenues of research, but does not compromise the ongoing efforts of individuals. In the example provided, the phylogenetic placement of this new fossil emerges as a separate research endeavor from its initial description, representing a research opportunity that paleobotanists perhaps would not normally consider.

It is our hope that through Deep Time it will become a standard procedure for paleobotanists to seek phylogenetic placement of fossils. Our considerable experience with Deep Green has made us aware not only of the benefits of such large collaborative efforts, but also of potential problems.

One problem with Deep Green is that some working groups are too large to be effective. We will therefore promote smaller working groups because in our experience this is the most efficient and effective fashion to promote research. This does not imply that Deep Time will be exclusionary; to the contrary, we intend to maximize participation by paleobotanists, systematists, geologists, and theoreticians via a number of avenues (see below). For example, a number of investigators other than those listed as Core Participants are interested in participating at some level: B. Mishler, M. Donoghue, J. Davis, N. Arens, G. Brenner, V. Krassilov, and L. Golovneva.

We feel that by developing modest-sized working groups we will increase the speed at which we make progress and enhance our chances of success. We envision five major components to this collaboration:

  1. Prioritization and correct characterization of fossils to be analyzed;
  2. Correct time estimation of fossils;
  3. Construction of a morphological data matrix for clades of extant angiosperms;
  4. Integration of fossils into the angiosperm tree;
  5. Calibration of branch points in the cladogram and studies of molecular evolution.

These five areas form the basis of five "Focus Groups," each of which is discussed below, with initial group leaders listed. Participation is not restricted to a single Focus Group; participants may be involved in one or more of these groups.

 

1. Characterize and prioritize fossils (D. Dilcher, P. Herendeen, S. Magallón-Puebla)

Fossil angiosperm remains are abundant and diverse in sediments of Cretaceous age (e.g., Doyle, 1969; Doyle and Hickey, 1976; Dilcher, 1979; Dilcher and Crane, 1984; Rodríguez-de la Rosa and Cevallos-Ferriz, 1994; Crane et al., 1995; Crane and Herendeen, 1996; Sims et al., 1998, 1999; Friis et al., 1999; Herendeen et al., 1999; Magallón-Puebla et al., 1999; Dilcher, 2000). However, not all fossils are of equal utility or value for phylogenetic studies. They range from single pollen grains that may or may not possess distinctive identifying features, to plants that are known from flowers, fruits, seeds, pollen, and other plant parts.

Fossils will be treated as exemplars, and we will establish criteria by which fossils will be selected for inclusion in phylogenetic analyses. Fossils that are reasonably complete and thus can be scored for sufficient morphological characters (see below) can be included in cladistic analyses to explore phylogenetic relationships and evolutionary significance (e.g., Keller et al., 1996; Magallón-Puebla et al., 1996, 1997; Crepet and Nixon, 1998).

Fossils that do not have sufficient characters to yield a stable result will be resolved in different positions on the cladogram and thus cause the collapse of some clades in the consensus tree. Thus, fossils that are reasonably complete will be targeted over those that are more fragmentary.
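The collapse described above can be illustrated with a minimal sketch. It assumes a toy representation in which each tree is simply the set of its non-trivial clades; the taxa A-D, the fossil F, and the helper names are hypothetical and do not come from any Deep Time data set. The strict consensus keeps only clades found in every tree, so a fossil that nests in different places removes clades that are otherwise stable.

```python
from typing import FrozenSet, Iterable, List, Set

Clade = FrozenSet[str]


def clades(*groups: Iterable[str]) -> Set[Clade]:
    """Build a tree as the set of its non-trivial clades (toy representation)."""
    return {frozenset(group) for group in groups}


def strict_consensus(trees: List[Set[Clade]]) -> Set[Clade]:
    """Keep only the clades that appear in every input tree."""
    return set.intersection(*trees)


def prune(tree: Set[Clade], taxon: str) -> Set[Clade]:
    """Remove one taxon and drop clades left with fewer than two members."""
    pruned = {clade - {taxon} for clade in tree}
    return {clade for clade in pruned if len(clade) >= 2}


# Hypothetical extant taxa A-D and a fragmentary fossil F.
# Tree 1: ((A,(B,F)),(C,D))   Tree 2: ((A,B),(C,(D,F)))
tree1 = clades({"B", "F"}, {"A", "B", "F"}, {"C", "D"})
tree2 = clades({"A", "B"}, {"D", "F"}, {"C", "D", "F"})

# With the unstable fossil included, no clade is shared by both trees.
with_fossil = strict_consensus([tree1, tree2])

# With the fossil pruned, both trees agree on (A,B) and (C,D).
without_fossil = strict_consensus([prune(tree1, "F"), prune(tree2, "F")])

print(sorted(sorted(c) for c in with_fossil))
print(sorted(sorted(c) for c in without_fossil))
```

In this toy case the consensus with the fossil is completely unresolved, whereas the consensus without it recovers both extant clades, which is the behavior that motivates preferring reasonably complete fossils.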

However, defining "reasonably" is not a simple matter, and simply rejecting fragmentary fossils is not appropriate because an incomplete fossil may possess a single unique structure that is a synapomorphy for a single extant group, with the result that the fossil is unequivocally resolved on the cladogram (e.g., Magallón-Puebla et al., 1996).

Fossils that are insufficiently complete to withstand cladistic analysis may be of significance in other ways. For example, fossil triaperturate pollen grains are referable to the "eudicot" clade, and therefore the oldest fossil pollen grains of this form represent the minimum age for the eudicot clade. Thus, fossils that cannot be included in cladistic analyses can be of significance in analyses of evolutionary rates.

However, there is one significant difference between the fossils that are included in cladistic analyses and those that are not. Current interpretations of systematic relationships of fossils that are included in cladistic analyses need not be correct because they can be reassessed using the results of the analysis. In contrast, the identity of fossils that are not included in cladistic analyses (e.g., triaperturate pollen), but will be used to date divergences, must be correct.

Therefore, fossils must be evaluated and selected with care. In selecting fossils, our goal will be to maximize taxonomic diversity by seeking out representatives of as many angiosperm clades as possible.

In addition, representation of clades through time is important for investigations of rates of molecular evolution (see below). D. Dilcher, P. Herendeen, and S. Magallón-Puebla will coordinate the discussions of selection criteria and facilitate the prioritization and the selection of fossils.

 

2. Correct time estimates (P. Herendeen, R. Christopher, R. Lupia)

Fossils can be included in phylogenetic analyses and treated exactly like the extant exemplars. Indeed, that has been the approach that Herendeen and colleagues have taken in evaluating the relationships and implications for floral evolution in Cretaceous angiosperms (e.g., Keller et al., 1996; Magallón-Puebla et al., 1996). In fact, the age of the fossils can be disregarded entirely if one chooses to do so.

However, this discards the one unique aspect that fossils bring to evolutionary studies: time. Fossils represent the minimum age of the taxon to which the fossil can be assigned. When a fossil is included in a phylogenetic analysis and occupies a stable placement on the cladogram, it will represent the minimum age for the node where it is attached. Thus, accurate understanding of the age of fossils is critical to maximizing their utility.
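The minimum-age reasoning can be sketched in a few lines. The example below assumes a toy tree stored as a child-to-parent map and invented fossil ages (the node names and the values in Ma are purely illustrative, not real dates). It also relies on one added assumption that follows from the logic of the text: an ancestral node cannot be younger than any node descended from it, so each fossil's age is also a minimum age for every ancestor of the node where it attaches.

```python
from typing import Dict, Optional

# Hypothetical toy tree: child -> parent (None marks the root).
PARENT: Dict[str, Optional[str]] = {
    "root": None,
    "clade_X": "root",
    "clade_Y": "clade_X",
    "clade_Z": "root",
}

# Hypothetical placements: node -> age in Ma of the oldest fossil attached
# there. These values are invented for illustration only.
FOSSIL_AGE_MA: Dict[str, float] = {
    "clade_Y": 100.0,
    "clade_X": 90.0,
    "clade_Z": 70.0,
}


def minimum_node_ages(
    parent: Dict[str, Optional[str]], fossil_age: Dict[str, float]
) -> Dict[str, float]:
    """Propagate each fossil's age up the tree as a minimum-age constraint."""
    min_age = dict(fossil_age)
    for node, age in fossil_age.items():
        ancestor = parent[node]
        while ancestor is not None:
            # An ancestor is at least as old as its oldest dated descendant.
            min_age[ancestor] = max(min_age.get(ancestor, 0.0), age)
            ancestor = parent[ancestor]
    return min_age


ages = minimum_node_ages(PARENT, FOSSIL_AGE_MA)
for node in sorted(ages):
    print(node, ages[node])
```

Note that the 90 Ma fossil on clade_X is superseded by the older fossil nested within it, which is why an error in the age or identity of a single fossil can propagate to several nodes.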

The ages of diverse localities from which fossils are collected are often open to reinterpretation due to the discovery of new evidence or more accurate dating methods. It is therefore important that age estimates be as accurate as possible. Fossil sites that are amenable to radiometric dating generally present few problems in the estimation of age.

However, age determination for fossil deposits derived from terrestrial sediments in geological settings that lack appropriate rock for radiometric dating can be more difficult. In such cases biostratigraphy using dispersed pollen, spores, and other microfossils (palynology) must be used to establish relative ages (e.g., Christopher, 1978, 1979; Doyle and Robbins, 1977; Doyle, 1992).

Correlations between terrestrial palynological assemblages and assemblages from near-shore marine deposits, which are generally easier to date using radiometric methods, are used to assign an age to the terrestrial deposits. To assist in this work we have included as Core Participants two investigators with expertise in biostratigraphy: R. Christopher and R. Lupia (both are in Departments of Geology). R. Christopher, a palynological biostratigrapher who has worked on Cretaceous age sediments, especially of eastern North America, will work with R. Lupia, P. Herendeen, and others to determine which fossil sites have accurate dates and which require additional study for an accurate assessment of age.

 

3. Construction of a morphological matrix (D. Soltis, J. Doyle, W. Judd)

We will need to establish guidelines for the characters used in construction of a morphological matrix. As noted, the study of morphological characters and problems of character analysis has received less attention as more effort has been focused on molecular systematics.

However, analysis of morphology is required for a synthetic analysis of fossils and extant organisms. Thus, training and expertise in both paleomorphology and neomorphology will be an important contribution of this RCN. An initial goal is to develop a working list of morphological characters that could potentially be used for extant taxa. Several existing data sets can serve as starting points (Doyle and Endress, 2000; Nandi et al., 1998).

In constructing data matrices, it will be very important to take into consideration the limitations that fossils impose. That is, of the many morphological characters that can potentially be used for extant angiosperms, which characters are actually present in fossils? For example, epicuticular wax characters or features of embryology may be appropriate for a morphological data set for living taxa, but of limited utility for integrating fossils because they are not preserved. Many early angiosperm fossils are fragmentary, in some cases known primarily from pollen (see Characterize and prioritize fossils, above).

Although many fossils are incomplete and lack some suites of characters (e.g., epicuticular wax, molecular data), this is not sufficient justification to exclude these characters, which may be important in revealing relationships among extant taxa. The issue of missing data in fossil and extant taxa is addressed in the next section: Integrating fossils into the angiosperm tree.

Once characters have been selected, they will be divided into their component states; coding of these characters (e.g., presence vs. absence, multi-state, continuous) will be another important consideration. Researchers will also need to determine whether the species in the existing DNA data sets will be used as terminals and their morphological characters scored, or whether an entire family will be used as the terminal and the variation encompassed by that family taken into consideration. For example, Asimina and Annona are placeholders for Annonaceae in D. Soltis et al. (2000).

These two genera could be used as terminals and their morphological features alone considered, or the variation across the entire Annonaceae could be taken into account (Rannala et al., 1998; Kron and Judd, 1997; Doyle and Endress, 2000). Annonaceae are a good example of the problems that need to be discussed, because in both morphological and molecular analyses, Asimina and Annona are both deeply nested within the family and Anaxagorea is sister to the remainder of the family (Doyle and Le Thomas, 1996). Hence, to accommodate greater phylogenetic diversity for the family, the latter genus should probably be included as a terminal if placeholders are used.
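The choice between exemplar terminals and a family-level terminal can be illustrated with a small Python sketch. Everything here is hypothetical: the genus and character names, the state codes, and the convention of writing polymorphism as "1/2" and missing data as "?" are assumptions made for illustration, not scores for any real taxon or a coding scheme adopted by this project.

```python
from typing import Dict, List

MISSING = "?"  # assumed symbol for a character that could not be scored

# Hypothetical exemplar scores: genus -> character -> state code.
GENUS_SCORES: Dict[str, Dict[str, str]] = {
    "genus_a": {"char_1": "0", "char_2": "1", "char_3": MISSING},
    "genus_b": {"char_1": "0", "char_2": "2", "char_3": MISSING},
    "genus_c": {"char_1": "1", "char_2": "1", "char_3": "0"},
}


def family_terminal(
    scores: Dict[str, Dict[str, str]], genera: List[str]
) -> Dict[str, str]:
    """Summarize several genera as one terminal.

    For each character the observed states are pooled; more than one
    state is reported as a polymorphism such as "1/2", and a character
    that is missing in every genus stays missing.
    """
    characters = sorted({char for genus in genera for char in scores[genus]})
    terminal: Dict[str, str] = {}
    for char in characters:
        states = {scores[genus].get(char, MISSING) for genus in genera}
        states.discard(MISSING)
        terminal[char] = "/".join(sorted(states)) if states else MISSING
    return terminal


placeholders_only = family_terminal(GENUS_SCORES, ["genus_a", "genus_b"])
whole_family = family_terminal(GENUS_SCORES, ["genus_a", "genus_b", "genus_c"])
print(placeholders_only)
print(whole_family)
```

In this toy case, adding the third genus changes the family-level scoring of two characters, which is the kind of effect that sampling an early-branching genus could have on a placeholder-based terminal.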

Construction of a morphological matrix for living flowering plants will begin with several groups of researchers working on separate groups of extant plants. One group of researchers, for example, may take primary responsibility for Winterales, another group for Magnoliales, another for monocots, and so on. Alternatively, some working groups may want to focus on the careful evaluation of a particular character or suite of characters to clarify homology and coding.

The existing ties and collaborations in place among angiosperm systematists (APG) will be extremely useful at this point in the process, as will the existing Deep Green network. At this stage our research endeavor will approach the interface between research coordination and actual research (gathering/assembling of morphological characters). The RCN will promote the coordination of this effort, but will not fund the actual gathering of data.

Funding for assembling a morphological data matrix could be sought elsewhere. More likely, the process will continue to be conducted by small groups of investigators with expertise in particular groups, but with the effort coordinated via RCN funding. The Deep Time RCN will also play a vital role in coordinating the next research phase, the compilation of morphological data into a single matrix.

The problems of assembling a global data matrix from the many different sets of disparate and overlapping characters for individual groups will be crucial topics of discussion at workshops. Once the data have been compiled for extant groups, researchers will then need to reevaluate characters and refine the matrix for inclusion of fossils; some characters may be considered unsuitable or uninformative, for example, and therefore would be removed. It will be critical to have the ability to bring together researchers to discuss and evaluate options for constructing a global angiosperm matrix.

Ultimately, the Deep Time RCN will facilitate the compilation of a final, comprehensive morphological data matrix for extant angiosperms.

 

4. Integrating fossils into the angiosperm tree (P. Soltis, J. Doyle, W. Judd)

Given that fossils have rarely been integrated in a phylogenetic context for any group, the Deep Time RCN will have several critical features of data analysis to consider and discuss, both methodological and analytical. The concepts and principles that are needed are still not clear, and a major contribution of this RCN will be to stimulate their development. Primary issues are missing data and the combinability of molecular and morphological data sets.

We currently envision three general approaches for placing fossils in the correct phylogenetic position, and other alternatives may arise.

  1. Constrain the taxa in the morphological matrix to conform to the DNA-based topology already available and conduct a phylogenetic analysis of the morphological matrix with fossils included. This approach assumes that the molecular-based tree is correct and that the inclusion of fossil groups would not change our inference of relationships among extant groups.
  2. Alternatively, the morphological matrix, with and without fossils, can be analyzed phylogenetically. This approach does not take advantage of the wealth of information provided by molecular analyses, but it allows relationships among extant taxa to vary with the addition of fossils.
  3. All characters, morphological and DNA, are used together to construct cladograms; this can be done, both with and without fossils. Comparison of the results among these analyses would follow. In some cases, the differences between the analyses will likely be minimal. In other cases, there may be substantial differences that will need to be discussed and explored in more detail. Analyses that include and omit fossils will allow us to assess the topological impact of including fossil taxa. Fossils may play a critical role in determining the final topology (e.g., Donoghue et al., 1989; Doyle, 1998a, b).
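One way to assess the topological impact of fossils, as described in the approaches above, is to prune the fossil taxa from a tree and compare the remaining clades with those of the extant-only tree. The following Python sketch assumes rooted trees written as nested tuples of taxon names; the trees, the taxon names, and the helper functions `prune` and `clades` are hypothetical and stand in for the output of a real phylogenetic analysis.

```python
from typing import FrozenSet, Optional, Set, Tuple, Union

Tree = Union[str, Tuple["Tree", ...]]


def prune(tree: Tree, drop: Set[str]) -> Optional[Tree]:
    """Remove the taxa in `drop`, collapsing nodes left with one child."""
    if isinstance(tree, str):
        return None if tree in drop else tree
    kept = [sub for sub in (prune(child, drop) for child in tree) if sub is not None]
    if not kept:
        return None
    if len(kept) == 1:
        return kept[0]
    return tuple(kept)


def clades(tree: Tree) -> Set[FrozenSet[str]]:
    """Return every clade of a rooted tree as a frozenset of taxon names."""
    found: Set[FrozenSet[str]] = set()

    def visit(node: Tree) -> FrozenSet[str]:
        if isinstance(node, str):
            return frozenset([node])
        members = frozenset().union(*(visit(child) for child in node))
        found.add(members)
        return members

    visit(tree)
    return found


# Hypothetical results of analyses without and with a fossil terminal.
extant_only: Tree = (("A", "B"), ("C", "D"))
with_fossils: Tree = ((("A", "fossil_x"), "C"), ("B", "D"))

pruned = prune(with_fossils, {"fossil_x"})
changed = clades(extant_only) ^ clades(pruned)  # clades found in only one tree
print(pruned)
print(sorted(sorted(clade) for clade in changed))
```

An empty `changed` set would indicate that adding the fossil left the relationships among extant taxa untouched; here the toy trees were chosen so that they differ.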

The analytical and methodological issues involved in integrating extant and fossil taxa will be addressed in a workshop dedicated to these issues in Year 3 of the funding period.

5. Calibration of branch points in cladogram/molecular evolution (P. Soltis, M. Sanderson)

Once fossils are integrated into a phylogenetic framework, they can be used to calibrate branch points in the cladogram, given that improved estimates of the ages of the fossils will be available (see 2, above). These divergence times will open up new research possibilities, such as providing estimates of the ages of particular cladogenic events and analysis of diversification rates (e.g., Sanderson and Donoghue, 1994; Sanderson, 1997, 1998).

Given a good phylogenetic framework, this information can also be used to date nodes for which fossil data are lacking, using molecular clock methods or related methods that allow for variation in rates of molecular evolution (Sanderson, 1998). These estimates of divergence times will also facilitate the study of molecular evolution of the genes used to generate the cladograms (e.g., rbcL, atpB, 18S rDNA), plus other genes that are currently under study in the angiosperms.
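The simplest version of this calculation, a strict molecular clock, can be sketched as follows. This is a deliberately minimal illustration and not one of the rate-variable methods cited above; the distances, the calibration age, and the function names are hypothetical. Because a fossil supplies only a minimum age, the calibrated rate is an upper bound and the inferred age is a lower bound.

```python
def calibrate_rate(pairwise_distance: float, fossil_min_age_ma: float) -> float:
    """Substitutions per site per million years along a single lineage.

    Both lineages descending from the calibrated node have accumulated
    change since the split, so the pairwise distance is divided by twice
    the node age.
    """
    if fossil_min_age_ma <= 0:
        raise ValueError("fossil age must be positive")
    return pairwise_distance / (2.0 * fossil_min_age_ma)


def estimate_node_age(pairwise_distance: float, rate: float) -> float:
    """Age in Ma of an uncalibrated node under the same constant rate."""
    if rate <= 0:
        raise ValueError("rate must be positive")
    return pairwise_distance / (2.0 * rate)


# Hypothetical values: 0.20 substitutions per site across a node whose
# oldest fossil is 100 Ma, and 0.10 across a node with no fossil record.
rate = calibrate_rate(0.20, 100.0)
age = estimate_node_age(0.10, rate)
print(f"rate <= {rate:.4f} substitutions/site/Myr")
print(f"uncalibrated node >= {age:.1f} Ma")
```

Methods that relax the constant-rate assumption replace the single `rate` with rates that vary among branches, but the role of the fossil calibration is the same.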

This objective will link the Deep Time RCN with the proposed RCN uniting plant phylogenetics and genomics ("Deep Gene," B. Mishler, PI). One of the goals of Deep Gene is to provide a framework for studying the evolution of genes and gene families across the angiosperms and ultimately across all green plants. To facilitate successful coordination between the two RCNs, D. Soltis, P. Soltis, and Y.-L. Qiu are Core Participants in Deep Gene, in addition to their roles in Deep Time.

 

Deep Time RCN--Activities

We will implement the following activities to ensure the success of Deep Time: annual meetings, workshops, student travel awards, student research training awards, and website development.

Annual Meetings

An annual meeting will be held in conjunction with the annual meeting of the Botanical Society of America (BSA) on the day or days immediately following the BSA meeting. In this manner, we can keep travel costs down, given that most participants would attend the BSA meetings.

In Year 5 (2005), the annual meeting will be held in conjunction with the XVII International Botanical Congress (IBC) in Vienna. In Year 1, the annual meeting will be two days in length to allow the paleobotanists and systematists to "educate" each other on their most recent accomplishments.

Day 1 will be a presentation of paleobotanical perspectives and goals, and Day 2 will be a presentation of systematics/phylogenetics perspectives and goals. During each subsequent year, the annual meeting will be held for only one day. During each annual meeting the goals and objectives for the following year will be established; progress to date will be discussed.

Participation of postdocs and undergraduate and graduate students will be encouraged (see below).

Workshops

One or two two-day workshops per year will deal with the specific objectives proposed (1-5 above).

In Year 1, we will have workshops that will:

  1. establish the fossil prioritization list, and
  2. provide character guidelines (i.e., develop character lists and character coding for extant taxa and discuss the limitations imposed by fossils).

In Year 2, workshops will be held on:

  1. stratigraphy and correct time estimates for fossils on the priority list, and
  2. construction of the morphological data matrix with discussion of problems in combining data matrices for different taxa.

In Year 3, a single workshop will be held dealing with the integration of fossils into the morphological matrix for extant taxa. This workshop will be of broad conceptual interest outside of the Deep Time RCN and will be organized to draw upon the expertise of biologists and theoreticians outside of Deep Time.

In Year 4, the single workshop will focus on:

  1. calibration of branch points using fossil dates, and
  2. new analyses of molecular evolution.

In Year 5, a single workshop will be held prior to the IBC in Vienna to coordinate presentations for the IBC and to plan for the IBC.

A similar series of planning workshops was sponsored by Deep Green prior to the XVI IBC in St. Louis in 1999 and was in large part responsible for the cohesion of the symposia on green plant phylogeny. The angiosperm workshop, for example, held in May, 1999, at Washington State University, brought together the participants in the three angiosperm phylogeny symposia and allowed them to collaborate and modify their presentations prior to the IBC. We envision similar success with the proposed workshop in Year 5.

Student Travel Awards

Involvement of undergraduates, graduate students, and postdocs is critical for the growth and development of the integrative research we propose. We will make 10 to 15 student travel awards of $500 each per year during the first four years of the funding period so that students can attend and participate in the annual meeting and/or the specialized workshops. Funds are requested to provide 20 awards of $1000 each for travel to the IBC in Year 5.

Student Research Training

Because we are proposing to train a new generation of students with interdisciplinary skills, we envision two categories of research training opportunities. Funds are not requested for research per se but to provide research training opportunities.

  1. "Cross-training" awards will allow students to visit a laboratory representing a different discipline from that of the student's advisor for up to two months. For example, a student in paleobotany could visit a lab in molecular systematics, phylogenetics, or geology to become directly exposed to research in those areas related to his/her own research. Alternatively, a student in molecular systematics or molecular evolution could visit a lab studying angiosperm morphology and character analysis. Four such awards of up to $2000 each will be made in Years 1 and 5, five in Year 2, and ten in Years 3 and 4.
  2. Phylogenetics training awards will allow students to attend one of several courses available in phylogenetic theory and practice (e.g., Woods Hole; Bodega Bay). Four such awards of up to $1000 each will be made in Years 1 and 2, and six will be made in Years 3 and 4.

Website Development

Development of a website for the Deep Time RCN is critical to the success of this network. In addition to providing background information on the Deep Time initiative and its goals, the website will serve as a mechanism to connect the Deep Time participants.

A News and Topics Page will highlight both phylogenetics and fossils in the scientific and popular news.

An interactive Discussion Page will provide a forum for dialogue and exchange of information among participants. This page will also allow us to reach potential participants whom we have not yet identified and to interact with those who have not yet participated in our meetings and workshops. Through this mechanism we hope to expand the network as it develops.

Also on the Deep Time website will be appropriate phylogenetic trees presently available for basal angiosperms and early-branching eudicots and a detailed geologic history of the Cretaceous. Very early in the development of the website we will also begin to provide phylogeny updates to the Tree of Life and TreeBASE; links to both will be established.

We will also use the website as a repository for published data, such as the molecular and morphological data sets available for angiosperms.

Data availability matrices (DAMs) will also be reported on the website. The Morphological DAM will provide the character list, character-state coding, and taxon list (with both species as terminals and a mixture of species and clades as terminals; cf. Doyle and Endress, 2000) that are developed by the Morphological Matrix Focus Group. The DAM will also indicate which data are available and where.

We envision a page that would allow one to click on a specific cell and be linked to a page that shows what data are available for that cell and where. For example, if one clicked on floral characters for Ericales, a page listing the relevant publications by Judd and Kron and others would appear. Also, if appropriate, this page would indicate the name and contact for any participant who wishes to share unpublished information.
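The clickable matrix described above amounts to a lookup keyed by taxon and character suite. A minimal Python sketch of such a structure is given below; the class names, field names, and the placeholder citation and contact strings are assumptions made for illustration and do not describe the actual website design.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class DamCell:
    """What is known to be available for one taxon and character suite."""

    publications: List[str] = field(default_factory=list)
    contacts: List[str] = field(default_factory=list)  # unpublished data

    @property
    def has_data(self) -> bool:
        return bool(self.publications or self.contacts)


class AvailabilityMatrix:
    """Hypothetical data availability matrix: (taxon, characters) -> cell."""

    def __init__(self) -> None:
        self._cells: Dict[Tuple[str, str], DamCell] = {}

    def add_publication(self, taxon: str, characters: str, citation: str) -> None:
        self._cells.setdefault((taxon, characters), DamCell()).publications.append(citation)

    def add_contact(self, taxon: str, characters: str, contact: str) -> None:
        self._cells.setdefault((taxon, characters), DamCell()).contacts.append(contact)

    def lookup(self, taxon: str, characters: str) -> DamCell:
        """Return the cell a user would reach by clicking; empty if unknown."""
        return self._cells.get((taxon, characters), DamCell())


dam = AvailabilityMatrix()
# Placeholder entries only; real citations would come from the focus group.
dam.add_publication("Ericales", "floral", "placeholder citation")
dam.add_contact("Ericales", "floral", "placeholder contact")

cell = dam.lookup("Ericales", "floral")
empty = dam.lookup("Ericales", "pollen")
print(cell.publications, cell.contacts, cell.has_data)
print(empty.has_data)
```

Returning an empty cell for an unknown combination, rather than raising an error, lets a page display "no data yet" for gaps in the matrix.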

The Fossil DAM will provide the names, organs, dates, references, and contacts of relevant fossils. The fossil priority list will also be posted here. We propose to develop a "Virtual Fossil Collection" that will serve as both a research tool and an educational vehicle.

Photographs, drawings, and images from microscopy of Cretaceous fossils, beginning with those on the priority list, will be incorporated into the collection. The rationale for this is that few systematists are keenly aware of the fossils and their morphologies. We plan to design this collection with a magnification feature that will allow viewers to zoom in on a specific structure or region of the image.

Development of this feature may require consultation with a software developer such as Inxight, with whom Deep Green has worked to develop the hyperbolic phylogenetic trees on the Deep Green website. We will include only those images that have been donated to the collection and those published images for which we have obtained copyright permission. Each image will be labelled with the name of the author and the Deep Time RCN label. Relevant references for each image will also be provided.

The Virtual Fossil Collection will also serve as an important educational tool for students at a variety of levels. To promote this educational component, we will have links to the Botanical Image Collection of the award-winning website