This content will become publicly available on May 7, 2026

Title: Foundation Models for Archaeological Feature Detection: Advances and Prospects
To date, deep learning models for archaeological feature detection have generally been built on off-the-shelf convolutional neural networks (CNNs) and Vision Transformer (ViT) models, which are pretrained on a variety of image types, sources, and subjects that are not specific to the analysis of high-resolution satellite imagery. Recent advances in transformer-based vision models and self-supervised training approaches make it possible for researchers to generate foundation models that are more finely attuned to specific domains without requiring huge amounts of human-annotated training data. We discuss the development of two such models employing Meta's transformer-based DINOv2 framework. The first, DeepAndes, is built from a sample of 3 million image chips drawn from high-resolution multispectral satellite imagery covering a two million square km area of the Andean region. This foundation model has broad utility across the social and earth sciences. The second, DeepAndesArch, is fine-tuned on labeled archaeological training data collected by the GeoPACHA project to create an archaeology-focused version of DeepAndes. We present the processes involved in generating DeepAndes and DeepAndesArch and discuss prospects for foundation models in archaeological research.
Award ID(s):
2419793
PAR ID:
10621637
Publisher / Repository:
Computer Applications in Archaeology Conference 2025
Format(s):
Medium: X
Location:
Athens, Greece
Sponsoring Org:
National Science Foundation
More Like this
  1. Accurate mapping of nearshore bathymetry is essential for coastal management, navigation, and environmental monitoring. Traditional bathymetric mapping methods such as sonar surveys and LiDAR are often time-consuming and costly. This paper introduces BathyFormer, a novel vision transformer- and encoder-based deep learning model designed to estimate nearshore bathymetry from high-resolution multispectral satellite imagery. This methodology involves training the BathyFormer model on a dataset comprising satellite images and corresponding bathymetric data obtained from the Continuously Updated Digital Elevation Model (CUDEM). The model learns to predict water depths by analyzing the spectral signatures and spatial patterns present in the multispectral imagery. Validation of the estimated bathymetry maps using independent hydrographic survey data produces a root mean squared error (RMSE) ranging from 0.55 to 0.73 m at depths of 2 to 5 m across three different locations within the Chesapeake Bay, which were independent of the training set. This approach shows significant promise for large-scale, cost-effective shallow water nearshore bathymetric mapping, providing a valuable tool for coastal scientists, marine planners, and environmental managers. 
  2. Archaeological surveys conducted through the inspection of high-resolution satellite imagery promise to transform how archaeologists conduct large-scale regional and supra-regional research. However, conducting manual surveys of satellite imagery is labour- and time-intensive, and low target prevalence substantially increases the likelihood of miss-errors (false negatives). In this article, the authors compare the results of an imagery survey conducted using artificial intelligence computer vision techniques (Convolutional Neural Networks) to a survey conducted manually by a team of experts through the Geo-PACHA platform (for further details of the project, see Wernke et al. 2023). Results suggest that future surveys may benefit from a hybrid approach, combining manual and automated methods, to conduct an AI-assisted survey and improve data completeness and robustness.
  3. Marine scientists have been leveraging supervised machine learning algorithms to analyze image and video data for nearly two decades. There have been many advances, but the cost of generating expert human annotations to train new models remains extremely high. There is broad recognition both in computer and domain sciences that generating training data remains the major bottleneck when developing ML models for targeted tasks. Increasingly, computer scientists are not attempting to produce highly-optimized models from general annotation frameworks, instead focusing on adaptation strategies to tackle new data challenges. Taking inspiration from large language models, computer vision researchers are now thinking in terms of “foundation models” that can yield reasonable zero- and few-shot detection and segmentation performance with human prompting. Here we consider the utility of this approach for ocean imagery, leveraging Meta’s Segment Anything Model to enrich ocean image annotations based on existing labels. This workflow yields promising results, especially for modernizing existing data repositories. Moreover, it suggests that future human annotation efforts could use foundation models to speed progress toward a sufficient training set to address domain specific problems. 
  4. In geographical image segmentation, performance is often constrained by the limited availability of training data and a lack of generalizability, particularly for segmenting mobility infrastructure such as roads, sidewalks, and crosswalks. Vision foundation models like the Segment Anything Model (SAM), pre-trained on millions of natural images, have demonstrated impressive zero-shot segmentation performance, providing a potential solution. However, SAM struggles with geographical images, such as aerial and satellite imagery, due to its training being confined to natural images and the narrow features and textures of these objects blending into their surroundings. To address these challenges, we propose Geographical SAM (GeoSAM), a SAM-based framework that fine-tunes SAM using automatically generated multi-modal prompts. Specifically, GeoSAM integrates point prompts from a pre-trained task-specific model as primary visual guidance, and text prompts generated by a large language model as secondary semantic guidance, enabling the model to better capture both spatial structure and contextual meaning. GeoSAM outperforms existing approaches for mobility infrastructure segmentation in both familiar and completely unseen regions by at least 5% in mIoU, representing a significant leap in leveraging foundation models to segment mobility infrastructure, including both road and pedestrian infrastructure in geographical images. The source code is publicly available. 
  5. This paper presents the results of a large-scale, drone-based aerial survey in northeastern Jordan. Drones have rapidly become one of the most cost-effective and efficient tools for collecting high-resolution landscape data, fitting between larger-scale, lower-resolution satellite data collection and the significantly more limited traditional terrestrial survey approaches. Drones are particularly effective in areas where anthropogenic features are visible on the surface but are too small to identify with commonly and economically available satellite data. Using imagery from fixed-wing and rotary-wing aircraft, along with photogrammetric processing, we surveyed an extensive archaeological landscape spanning 32 km² at the site of Wadi al-Qattafi in the eastern badia region of Jordan, the largest archaeological drone survey in Jordan to date. The resulting data allowed us to map a wide range of anthropogenic features, including hunting traps, domestic structures, and tombs, as well as modern alterations to the landscape, including road construction and looting pits. We documented thousands of previously unrecorded and largely unknown prehistoric structures, providing an improved understanding of major shifts in the prehistoric use of this landscape.