1468 / 2024-09-27 15:38:52
CoastalRAG: A Knowledge-Enhanced Retrieval and Generation Framework for Large Language Models in Coastal Zone Applications
large language model, RAG, coastal zone
Session 32 - Digital twins of the ocean (DTO) and its applications
Abstract Accepted
The coastal zone, a transitional area between land and sea, spans interdisciplinary and cross-domain knowledge systems, so building a coastal digital twin requires both multidisciplinary and cross-domain expertise. Traditional knowledge engineering approaches struggle to manage such knowledge services effectively. Large language models (LLMs) offer a novel solution, but they are prone to factual inaccuracies, and their generated text may lack the necessary precision, leading to hallucinations in coastal applications. Moreover, the complexity of the coastal system's spatiotemporal patterns, interactions, and driving mechanisms makes it challenging for LLMs to deliver reliable responses. Conventional retrieval-augmented generation (RAG) methods attempt to improve model performance on knowledge-intensive tasks by retrieving information from external knowledge bases, yet their efficacy depends heavily on the relevance and accuracy of the retrieved information. Similarity-based retrieval techniques, lacking domain-specific constraints, frequently return low-quality text, which compounds LLM errors. To address these challenges, this paper introduces CoastalRAG, a retrieval result evaluator designed specifically for the coastal domain. It re-evaluates and re-ranks retrieval results against a seven-dimensional geographical framework (geographical semantics, spatial location, geometric morphology, attribute characteristics, element relationships, evolutionary processes, and driving mechanisms), thereby improving relevance and accuracy. In addition, a decomposition-reorganization algorithm refines the retrieved information, removing ineffective context and improving data utilization. In coastal benchmark tests, CoastalRAG outperformed traditional RAG and baseline large models by 12.1% and 24.3%, respectively.
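
The re-ranking step described above can be illustrated with a minimal sketch. The abstract does not specify how the seven dimensions are scored or combined, so everything below is an assumption made for illustration: the `Passage` type, the per-dimension scores in [0, 1], the equal default weights, the blending factor `alpha`, and the `threshold` cutoff are all hypothetical and are not taken from CoastalRAG itself. The sketch covers re-ranking only.

```python
from dataclasses import dataclass, field

# The seven dimensions named in the abstract.
DIMENSIONS = (
    "geographical_semantics",
    "spatial_location",
    "geometric_morphology",
    "attribute_characteristics",
    "element_relationships",
    "evolutionary_processes",
    "driving_mechanisms",
)


@dataclass
class Passage:
    """Hypothetical container for one retrieved passage."""
    text: str
    similarity: float  # score from the initial similarity-based retriever
    # Assumed per-dimension relevance scores in [0, 1]; how these are
    # produced is not described in the abstract.
    dim_scores: dict = field(default_factory=dict)


def domain_score(passage, weights=None):
    """Weighted mean of the seven dimension scores (missing ones count as 0)."""
    weights = weights or {d: 1.0 for d in DIMENSIONS}
    total = sum(weights[d] for d in DIMENSIONS)
    weighted = sum(weights[d] * passage.dim_scores.get(d, 0.0) for d in DIMENSIONS)
    return weighted / total


def rerank(passages, alpha=0.5, threshold=0.0):
    """Blend similarity with the domain score, drop low scorers, sort descending.

    alpha and threshold are illustrative parameters, not values from the paper.
    """
    scored = [(alpha * p.similarity + (1 - alpha) * domain_score(p), p) for p in passages]
    kept = [(s, p) for s, p in scored if s >= threshold]
    return [p for _, p in sorted(kept, key=lambda sp: sp[0], reverse=True)]
```

Under these assumptions, a passage with high raw similarity but no support on any of the seven dimensions is ranked below a passage with lower similarity that scores well on them, and a non-zero `threshold` removes it from the context entirely.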