Floods are among the most common and destructive natural disasters, often causing casualties, building collapses, and the spread of disease, and posing serious threats to lives and property. Because runoff is the final outcome of complex interactions within a chaotic system, forecasting it accurately remains a significant challenge. This study therefore explores the key features that affect runoff forecasting results under deep learning frameworks. A major obstacle to studying the relationship between runoff and environmental factors is that measurement data are missing or unavailable in most river basins around the world. To address this, we developed a coupled model that integrates a distributed hydrological model (Soil and Water Assessment Tool, SWAT) with a deep learning model (bidirectional long short-term memory, BiLSTM). We input historical meteorological data into the SWAT model to construct a physically based representation of the hydrological process, thereby extending the missing meteorological data, and then study the relationship between runoff and environmental factors in the deep learning model. We applied this method to an in-depth study of the Yalu River Basin. By incorporating meteorological data from seven observation stations and a calibration system, we constructed a complete distributed basin model (R = 0.95, NSE = 0.62, PBIAS = 13.3). The results show that the SWAT-BiLSTM coupled model outperforms both the distributed hydrological model (SWAT) alone and other machine learning models (LSTM, RNN, SVM) in runoff prediction accuracy. Precipitation is identified as the most critical feature for runoff forecasting. Precipitation data from stations adjacent to the runoff stations carry a significantly higher contribution weight in the deep learning model than data from coastal and inland stations, which is consistent with historical rainfall distribution patterns.
Therefore, when selecting input data for deep learning networks, it is important to choose data whose distribution is consistent with physical laws. Although the application of deep learning to runoff forecasting has made some progress, most models still face the T-1 (non-predictable) problem at abrupt change points in runoff. The causal analysis of input features and forecasting results in this study provides a theoretical basis for more accurate runoff prediction. High-frequency input data drawn from a precisely defined range will help deep learning models achieve better results.
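For readers unfamiliar with the three goodness-of-fit statistics reported for the calibrated basin model (R, NSE, PBIAS), the following is a minimal sketch of how they are commonly computed from paired observed and simulated runoff series. It is an illustration under stated assumptions rather than the evaluation code used in this study: the function names and the sample values are hypothetical, and the PBIAS sign convention (positive values indicate that the simulation underestimates the observations) is the one commonly used in SWAT evaluations and may differ from the convention applied here.

```python
import numpy as np


def pearson_r(obs, sim):
    """Pearson correlation coefficient between observed and simulated series."""
    obs = np.asarray(obs, dtype=float)
    sim = np.asarray(sim, dtype=float)
    return float(np.corrcoef(obs, sim)[0, 1])


def nse(obs, sim):
    """Nash-Sutcliffe efficiency: 1 is a perfect fit, 0 matches the observed mean."""
    obs = np.asarray(obs, dtype=float)
    sim = np.asarray(sim, dtype=float)
    residual = np.sum((obs - sim) ** 2)
    variance = np.sum((obs - obs.mean()) ** 2)
    return float(1.0 - residual / variance)


def pbias(obs, sim):
    """Percent bias, in percent.

    Assumed sign convention: positive means the simulation underestimates
    the observations on aggregate.
    """
    obs = np.asarray(obs, dtype=float)
    sim = np.asarray(sim, dtype=float)
    return float(100.0 * np.sum(obs - sim) / np.sum(obs))


if __name__ == "__main__":
    # Hypothetical runoff values, for illustration only (not data from the study).
    observed = [10.0, 12.0, 30.0, 18.0, 11.0]
    simulated = [9.0, 13.0, 25.0, 17.0, 10.0]
    print("R     =", pearson_r(observed, simulated))
    print("NSE   =", nse(observed, simulated))
    print("PBIAS =", pbias(observed, simulated))
```

Reporting the three statistics together is useful because they capture different aspects of fit: R measures how well the timing and shape of the hydrograph are reproduced, NSE penalizes squared errors and is therefore sensitive to peak flows, and PBIAS summarizes the aggregate volume error.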