This content will become publicly available on December 16, 2022

Title: An image dataset of cleared, x-rayed, and fossil leaves vetted to plant family for human and machine learning
Leaves are the most abundant and visible plant organ, both in the modern world and the fossil record. Identifying foliage to the correct plant family based on leaf architecture is a fundamental botanical skill that is also critical for isolated fossil leaves, which often, especially in the Cenozoic, represent extinct genera and species from extant families. Resources focused on leaf identification are remarkably scarce; however, the situation has improved due to the recent proliferation of digitized herbarium material, live-plant identification applications, and online collections of cleared and fossil leaf images. Nevertheless, the need remains for a specialized image dataset for comparative leaf architecture. We address this gap by assembling an open-access database of 30,252 images of vouchered leaf specimens vetted to family level, primarily of angiosperms, including 26,176 images of cleared and x-rayed leaves representing 354 families and 4,076 of fossil leaves from 48 families. The images maintain original resolution, have user-friendly filenames, and are vetted using APG and modern paleobotanical standards. The cleared and x-rayed leaves include the Jack A. Wolfe and Leo J. Hickey contributions to the National Cleared Leaf Collection and a collection of high-resolution scanned x-ray negatives, housed in the Division of Paleobotany, Department of Paleobiology, Smithsonian National Museum of Natural History, Washington D.C.; and the Daniel I. Axelrod Cleared Leaf Collection, housed at the University of California Museum of Paleontology, Berkeley. The fossil images include a sampling of Late Cretaceous to Eocene paleobotanical sites from the Western Hemisphere held at numerous institutions, especially from Florissant Fossil Beds National Monument (late Eocene, Colorado), as well as several other localities from the Late Cretaceous to Eocene of the Western USA and the early Paleogene of Colombia and southern Argentina.
The dataset facilitates new research and education opportunities in paleobotany, comparative leaf architecture, systematics, and machine learning.
Award ID(s):
1925755 1925552
Publication Date:
Journal Name:
Page Range or eLocation-ID:
93 to 128
Sponsoring Org:
National Science Foundation
More Like this
  1. The fossil record of Marsilea is challenging to assess, due in part to unreliable reports and conflicting opinions regarding the proper application of the names Marsilea and Marsileaceaephyllum to fossil leaves and leaflets similar to those of modern Marsilea. Specimens examined for this study include material assigned to Marsileaceaephyllum johnhallii, purportedly the oldest fossil record of a Marsilea-like sporophyte from the Lower Cretaceous of the Dakota Formation, Kansas, U.S.A.; leaves and leaf whorls of the extinct aquatic angiosperm Fortuna from several Late Cretaceous and Paleocene localities in western North America; and leaves and leaflets resembling Marsilea from the Eocene Green River Formation, Colorado and Utah, U.S.A. Literature on the fossil record of Marsilea was also reviewed. As a result, several taxonomic changes are proposed. Marsileaceaephyllum johnhallii is reinterpreted as an aquatic angiosperm that shares some architectural features with the genus Fortuna, although Marsileaceaephyllum is here maintained as a distinct genus with an emended diagnosis; under this reinterpretation, the name Marsileaceaephyllum can no longer be applied to sporophyte organs with affinities to Marsileaceae. Three valid fossil Marsilea species are recognized on the basis of sporophyte material that includes characteristic quadrifoliolate leaves and reticulate-veined leaflets: Marsilea campanica (J. Kvaček & Herman) Hermsen, comb. nov., from the Upper Cretaceous Grünbach Formation, Austria; Marsilea mascogos Estrada-Ruiz et al., from the Upper Cretaceous Olmos Formation, Mexico; and Marsilea sprungerorum Hermsen, sp. nov., from the Eocene Green River Formation, U.S.A. The species are distinguished from one another based on leaflet dimensions. Leaves from the Eocene Wasatch Formation, U.S.A., are transferred from Marsileaceaephyllum back to Marsilea, although not assigned to a fossil species.
Finally, an occurrence of Marsilea from the Oligocene of Ethiopia is reassigned to Salvinia. A critical evaluation of the fossil record of Marsilea thus indicates that (1) the oldest fossil marsileaceous sporophytes bearing Marsilea-like leaves are from the Campanian; (2) only four credible records of sporophyte material attributable to Marsilea are known; and (3) the oldest dispersed Marsilea spores are known from the Oligocene.
  2. Many plant genera in the tropical West Pacific are survivors from the paleo-rainforests of Gondwana. For example, the oldest fossils of the Malesian and Australasian conifer Agathis (Araucariaceae) come from the early Paleocene and possibly latest Cretaceous of Patagonia, Argentina (West Gondwana). However, it is unknown whether dependent ecological guilds or lineages of associated insects and fungi persisted on Gondwanan host plants like Agathis through time and space. We report insect-feeding and fungal damage on Patagonian Agathis fossils from four latest Cretaceous to middle Eocene floras spanning ca. 18 Myr and compare it with damage on extant Agathis. Very similar damage was found on fossil and modern Agathis, including blotch mines representing the first known Cretaceous–Paleogene boundary crossing leaf-mine association, external foliage feeding, galls, possible armored scale insect (Diaspididae) covers, and a rust fungus (Pucciniales). The similar suite of damage, unique to fossil and extant Agathis, suggests persistence of ecological guilds and possibly the component communities associated with Agathis since the late Mesozoic, implying host tracking of the genus across major plate movements that led to survival at great distances. The living associations, mostly made by still-unknown culprits, point to previously unrecognized biodiversity and evolutionary history in threatened rainforest ecosystems.
  3. While modern forests have their origin in the diversification and expansion of angiosperms in the late Cretaceous and early Cenozoic, it is unclear if the rise of closed-canopy tropical rainforests preceded or followed the end-Cretaceous extinction. The “canopy effect” is a strong vertical gradient in the carbon isotope (δ13C) composition of leaves in modern closed-canopy forests that could serve as a proxy signature for canopy structure in ancient forests. To test this, we report measurements of the carbon isotope composition of nearly 200 fossil angiosperm leaves from two localities in the Paleocene Cerrejón Formation and one locality in the Maastrichtian Guaduas Formation. Leaves from one Cerrejón fossil assemblage deposited in a small fluvial channel exhibited a 6.3‰ range in δ13C, consistent with a closed-canopy forest. Carbon isotope values from lacustrine sediments in the Cerrejón Fm. had a range of 3.3‰, consistent with vegetation along a lake edge. An even narrower range of δ13C values (2.7‰) was observed for a leaf assemblage recovered from the Cretaceous Guaduas Fm., and suggests vegetation with an open canopy structure. Carbon isotope fractionation by late Cretaceous and early Paleogene leaves was in all cases similar to modern relatives, consistent with estimates of low atmospheric CO2 during this time period. This study confirms other lines of evidence suggesting closed-canopy forests in tropical South America existed by the late Paleocene, and fails to find isotopic evidence for a closed-canopy forest in the Cretaceous.
  4. Researchers typically rely on fossils from the Family Bovidae to generate African paleoenvironmental reconstructions due to their strict ecological tendencies. Bovids have dominated the southern African fauna for the past four million years and, therefore, dominate the fossil faunal assemblages, especially isolated teeth. Traditionally, researchers reference modern and fossil comparative collections to identify teeth. However, researchers are limited by the specific type and number of bovids at each institution. B.O.V.I.D. (Bovidae Occlusal Visual IDentification) is a repository of images of the occlusal surface of bovid teeth. The dataset currently includes extant bovids from 7 tribes and 20 species (~3900). B.O.V.I.D. contains two scaled images per specimen: a color and a black and white (binarized) image. The database is a useful reference for identifying bovid teeth. The large sample size also allows one to observe the natural variation that exists in each taxon. The binarized images can be used in statistical shape analyses, such as taxonomic classification. B.O.V.I.D. is a valuable supplement to other methods for taxonomically identifying bovid teeth.
  5. The Malay Archipelago is one of the most biodiverse regions on Earth, but it suffers high extinction risks due to severe anthropogenic pressures. Paleobotanical knowledge provides baselines for the conservation of living analogs and improved understanding of vegetation, biogeography, and paleoenvironments through time. The Malesian bioregion is well studied palynologically, but there have been very few investigations of Cenozoic paleobotany (plant macrofossils) in a century or more. We report the first paleobotanical survey of Brunei Darussalam, a sultanate on the north coast of Borneo that still preserves the majority of its extraordinarily diverse, old-growth tropical rainforests. We discovered abundant compression floras dominated by angiosperm leaves at two sites of probable Pliocene age: Berakas Beach, in the Liang Formation, and Kampong Lugu, in an undescribed stratigraphic unit. Both sites also yielded rich palynofloral assemblages from the macrofossil-bearing beds, indicating lowland fern-dominated swamp (Berakas Beach) and mangrove swamp (Kampong Lugu) depositional environments. Fern spores from at least nine families dominate both palynological assemblages, along with abundant fungal and freshwater algal remains, rare marine microplankton, at least four mangrove genera, and a diverse rainforest tree and liana contribution (at least 19 families) with scarce pollen of Dipterocarpaceae, today’s dominant regional life form. Compressed leaves and rare reproductive material represent influx to the depocenters from the adjacent coastal rainforests. Although only about 40% of specimens preserve informative details, we can distinguish 23 leaf and two reproductive morphotypes among the two sites. Dipterocarps are by far the most abundant group in both compression assemblages, providing rare, localized evidence for dipterocarp-dominated lowland rainforests in the Malay Archipelago before the Pleistocene.
The dipterocarp fossils include winged Shorea fruits, at least two species of plicate Dipterocarpus leaves, and very common Dryobalanops leaves. We attribute additional leaf taxa to Rhamnaceae (Ziziphus), Melastomataceae, and Araceae (Rhaphidophora), all rare or new fossil records for the region. The dipterocarp leaf dominance contrasts sharply with the family’s <1% representation in the palynofloras from the same strata. This result directly demonstrates that dipterocarp pollen is prone to strong taphonomic filtering and underscores the importance of macrofossils for quantifying the timing of the dipterocarps’ rise to dominance in the region. Our work shows that complex coastal rainforests dominated by dipterocarps, adjacent to swamps and mangroves and otherwise similar to modern ecosystems, have existed in Borneo for at least 4–5 million years. Our findings add historical impetus for the conservation of these gravely imperiled and extremely biodiverse ecosystems.