From Task-Specific Models to a General-Purpose Model-Driven Paradigm: A Review and Future Directions of GeoAI Research

何捷; 王超群; 潘伟江; 徐欣

doi:10.13203/j.whugis20260025

From Task-Specific Models to a General-Purpose Model-Driven Paradigm: A Review and Future Directions of GeoAI Research

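Abstract

Abstract: With the explosive growth of geospatial data and the rapid development of AI technologies, geospatial artificial intelligence (GeoAI) has gradually evolved from a traditional paradigm that relies on task-specific models to a new paradigm centered on general-purpose models such as pre-trained models and large models. A timely and systematic review of the research progress and paradigm evolution in this field is important for keeping track of the frontiers of GeoAI. Existing studies have reviewed GeoAI applications in several subfields, but they have not traced, against the course of technological iteration, the full trajectory of development from deep learning models through pre-trained models to large models. A systematic review framework is therefore urgently needed to explain comprehensively the research lineage, technical progress, and future challenges of GeoAI under the iteration of AI technologies. Based on an analysis of 172 representative publications in the field, GeoAI research is first divided into two broad categories, task-specific model-driven and general-purpose model-driven, according to whether a model is trained specifically for a geographic task; each category is then subdivided by modeling objective and technical pathway, and the core progress and limitations of each sub-direction are summarized. The analysis shows that task-specific model-driven research has formed a mature technical system for tasks such as spatiotemporal pattern mining, attribute inference, and configuration optimization, but its task transferability and generalization ability are limited. General-purpose model-driven research is evolving from spatial embedding and multimodal alignment toward the integrated application of large models, and shows stronger potential for multi-task adaptation and knowledge transfer. Using large models as an "intelligent hub" to coordinate task-specific models and human expert knowledge, and moving large models from surface-level feature perception to deep spatial cognition, are important future directions. In sum, by presenting the landscape of GeoAI research from the deep learning era to the large model era, this review helps researchers quickly grasp the overall development of the field and offers reference and insight for subsequent research.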

Abstract: Geospatial artificial intelligence (GeoAI) is an interdisciplinary field formed by the deep integration of geospatial science and artificial intelligence. Leveraging cutting-edge artificial intelligence technologies and integrating geospatial big data with high-performance computing methods, it has transformed traditional spatial analysis approaches that rely heavily on GIS platforms and statistical methods. With the continuous iteration of artificial intelligence technologies, GeoAI is undergoing a profound paradigm transformation, shifting from the conventional paradigm driven by task-specific models to an emerging general-purpose model paradigm centered on pre-trained models and large models. Nevertheless, existing reviews do not cover the complete evolutionary trajectory of the GeoAI domain and lack a unified analytical framework. It is therefore necessary to clarify the evolution of this field, summarize the differences among technical pathways, and systematically consolidate state-of-the-art advances and development trends, so as to provide theoretical and methodological support for future research. Methodologically, this study first adopts the "model-driven paradigm" as the primary classification criterion and divides the existing literature into two major research branches: task-specific model-driven and general-purpose model-driven. A secondary classification is then made based on modeling objectives and model application patterns. A seed literature database is established on the basis of previous reviews, followed by the formulation of retrieval strategies and systematic screening of core literature. In total, 172 representative Chinese and English publications published since 2019 are analyzed.
The proposed classification system is used to summarize systematically the typical modeling frameworks, application scenarios, and comparative advantages of each research branch. The findings reveal the progress of two sub-branches within task-specific model-driven GeoAI research. First, research on spatiotemporal geographic knowledge graphs remains relatively limited, primarily owing to the dominant position of deep learning paradigms in the previous technological revolution. Early studies predominantly focused on the structured representation of geographic knowledge. The emergence of knowledge graph embedding (KGE) techniques and large language models has revitalized this domain. KGE converts spatiotemporal geographic knowledge into machine-interpretable and computable vector representations, which facilitates fine-tuning and knowledge enhancement for pre-trained and large models. It thereby serves as a critical bridge connecting structured geographic knowledge and data-driven models. Second, end-to-end deep learning models have developed into a mature technical system covering five core tasks: spatiotemporal pattern mining, continuous indicator prediction, discrete label inference, spatial configuration generation, and visual semantics-assisted socioeconomic factor estimation. Although such models have achieved high accuracy and stability, they are still constrained by weak task transferability, insufficient cross-scene generalization capability, and the excessive cost of repetitive modeling. General-purpose model-driven GeoAI research follows a distinct evolutionary path alongside the iterative advancement of AI technologies. Prior to the widespread application of large models in GeoAI, spatial embedding emerged as the earliest research direction, and it is currently trending toward the aggregation of multimodal embedding information.
Universal spatiotemporal prediction models share similar paradigms yet achieve integrated spatiotemporal modeling within a unified framework. Supported by cross-city pre-training strategies, these models substantially enhance cross-domain transferability. Unlike spatial embedding models, whose embeddings are static, they focus on dynamic urban spatial problems. Compared with the previous two types, pre-trained models with geospatial-multimodal semantic alignment are applicable to a broader range of geospatial tasks. They fuse geographic, textual, and visual information at a low level, eliminating intermediate spatial embedding procedures and adapting better to complex spatial analysis tasks. Current GeoAI research that integrates large models mainly adopts mainstream strategies for geographic task adaptation, including lightweight fine-tuning, prompt learning, tool invocation, and agent-based architectures. Meanwhile, unified evaluation frameworks are being constructed to quantify the performance of general large models on geospatial tasks. Despite the operational convenience and performance potential demonstrated by large model-enabled GeoAI, prominent challenges persist, such as deficient spatial reasoning capability, inherent geographic bias, and limited capacity for complex geographic computation. In conclusion, GeoAI has undergone a comprehensive paradigm upgrade from task-specific models to general-purpose models, with transferability, generalizability, and interactivity emerging as the core development priorities. Future research should further promote the in-depth application of large models, shifting the research focus from conventional feature extraction and pattern recognition to high-level geographic knowledge comprehension, spatial reasoning, and intelligent decision-making.
It is essential to construct a next-generation GeoAI ecosystem that takes large models as the intelligent hub and achieves deep integration and close collaboration with knowledge graphs, task-specific models, and human expertise. Furthermore, the robust logical reasoning and cognitive capabilities of large models should be fully exploited to empower innovative geographic research.
