Imagine a system that could predict how severe a wildfire might get, or estimate flood risk in real time before disasters spiral out of control, powered by AI analyzing terabytes of satellite data. That’s the promise of Earth Observation Foundation Models (EOFMs).
Foundation Models (FMs) are a game changer in Earth Observation (EO) data analysis, offering the potential to produce more accurate results faster and with less effort. While there are some outliers, most current FMs rely on transformer architectures. Transformers split complex data into tokens (smaller chunks of data) and use attention mechanisms (a method for learning the relative importance of each token to every other) to learn contextualized patterns. These patterns are turned into embeddings (essentially, numerical representations of the data) that allow the model to act as a general foundation (hence the term “Foundation Model”) from which it can be fine-tuned for a variety of specific tasks.
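To make the token-attention-embedding pipeline concrete, here is a minimal NumPy sketch of those three steps applied to a toy multispectral image. All sizes (4 bands, 32x32 pixels, 8x8 patches, 64-dimensional embeddings) and the random weight matrices are illustrative assumptions; a real EOFM learns these weights over many stacked layers and adds positional encodings.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy "satellite image": 4 spectral bands, 32x32 pixels (hypothetical sizes).
image = rng.standard_normal((4, 32, 32))

# 1. Tokenize: split the image into non-overlapping 8x8 patches and
#    flatten each patch (4 bands * 8 * 8 = 256 values) into one token.
patch = 8
tokens = (
    image.reshape(4, 32 // patch, patch, 32 // patch, patch)
         .transpose(1, 3, 0, 2, 4)
         .reshape(-1, 4 * patch * patch)
)  # shape: (16 tokens, 256 features)

# 2. Project each token into an embedding space
#    (a learned linear layer in a real model; random here).
d_model = 64
W = rng.standard_normal((tokens.shape[1], d_model)) / np.sqrt(tokens.shape[1])
x = tokens @ W  # (16, d_model)

# 3. Single-head self-attention: each token weighs every other token's
#    relevance via softmax(Q K^T / sqrt(d)) and mixes their values,
#    producing one contextualized embedding per patch.
Wq, Wk, Wv = (rng.standard_normal((d_model, d_model)) / np.sqrt(d_model)
              for _ in range(3))
Q, K, V = x @ Wq, x @ Wk, x @ Wv
scores = Q @ K.T / np.sqrt(d_model)
weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
weights /= weights.sum(axis=-1, keepdims=True)  # rows sum to 1
embeddings = weights @ V

print(tokens.shape, embeddings.shape)  # (16, 256) (16, 64)
```

The embeddings coming out of step 3 are what downstream fine-tuning consumes: a small task-specific head (for flood mapping, burn severity, and so on) is trained on top of them while the foundation stays largely frozen.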
Foundation Models are the engine behind popular Large Language Model (LLM) and Computer Vision (CV) tools in wide use today, like ChatGPT and Stable Diffusion. By tapping into these innovations, EOFMs have the potential to change how we practitioners do geospatial analysis, and how broader society interacts with and benefits from spatial data.
The GeoAI community (practitioners applying AI to spatial data) has eagerly embraced EOFMs, leveraging their power to transform vast satellite datasets into accessible, actionable insights and thereby democratize geospatial intelligence for a broader audience. In the last 2-3 years we have seen an explosion of EOFMs trained on a wide array of earth observation data, covering optical and non-optical sources such as SAR, weather satellites, and derived products. EOFMs are one of the key technologies that we in the GeoAI community want to see widely adopted. So how can we reach a level of adoption and tangible value matching that of text and vision models?