From Raw EO Data to AI-Ready Datasets: Lowering the Barrier to Geospatial Foundation Model Fine-Tuning

DOI: 10.3390/rs18132152 ISSN: 2072-4292

Mattia Santoro, Enrico Boldrini, Stefano Nativi, Paolo Mazzetti

Geospatial foundation models (GFMs) are a new frontier in artificial intelligence, designed to understand and analyze spatial data at scale. Trained on very large collections of Earth observation (EO) data, these models can support a wide range of applications, from monitoring natural disasters to guiding urban development and tracking climate change. To serve these applications, researchers and practitioners fine-tune a foundation model for a specific task using a relatively small amount of additional data. As a result, GFMs are reshaping how we observe, manage, and protect our planet. Fine-tuning a GFM requires carefully curated training datasets that reflect specific regions, time periods, or tasks, such as detecting deforestation or mapping urban growth. Yet preparing these datasets is often labor-intensive, involving steps like selecting relevant imagery, aligning spatial formats, and generating accurate labels. In practice, the effectiveness of GFMs therefore hinges on the availability of AI-ready data, and this bottleneck limits their accessibility and scalability for scientific and operational applications. In this work, we introduce a software library designed to automate these preparatory steps, streamlining the transformation of geospatial datasets into consistent, high-quality inputs for GFM fine-tuning. By reducing technical overhead and ensuring data readiness, the library enables faster, more reliable, and more inclusive adaptation of foundation models to local environmental challenges and specialized domain needs.
