Machine learning downscaling methods for GRACE-derived terrestrial water storage anomalies in the Yangtze River Basin

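  • Abstract (translated from the Chinese): The spatial resolution of terrestrial water storage anomalies (TWSA) retrieved from GRACE time-variable gravity fields is limited (300-500 km), which restricts their potential for studying the water cycle and climate change at small and medium regional scales. Machine learning downscaling methods have been applied effectively to improve the spatial resolution of GRACE TWSA, but the reasonable selection of predictors and its effect on model performance, as well as the accurate evaluation of downscaling results, still need further study. Taking the downscaling of TWSA in the Yangtze River Basin as an example, this study compares the traditional hydrological-model downscaling method with three machine learning downscaling methods, namely Random Forest (RF), eXtreme Gradient Boosting (XGBoost), and Artificial Neural Network (ANN), and downscales GRACE-derived TWSA over the basin from a spatial resolution of 1°×1° to 0.25°×0.25° and 0.1°×0.1°. To evaluate the methods, a closed-loop simulation experiment is first carried out with TWSA data from the GLDAS hydrological model; the GRACE TWSA data are then downscaled, and the performance and results of the methods are evaluated comprehensively against water level data from hydrological stations. The results show that the performance of machine learning downscaling is affected by the number of predictors: performance keeps improving as predictors are added, but the relative ranking of the three machine learning models also changes. The six most important predictors identified by partial least squares regression (NDVI, soil moisture, precipitation, air temperature, runoff, and U-wind) already achieve good downscaling results, and adding further predictors brings relatively little improvement. RF and XGBoost perform well and very similarly, whereas ANN and hydrological-model downscaling perform slightly worse. In addition, the results of traditional hydrological-model downscaling depend on the correlation between the hydrological model and the TWSA data, whereas machine learning downscaling can better integrate the variability of multiple hydrological, meteorological, and vegetation auxiliary data sets (especially the more important predictors) and can therefore better recover fine TWSA signals at small and medium scales.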

     

    Abstract: Objectives: The spatial resolution of Terrestrial Water Storage Anomaly (TWSA) estimates retrieved from the time-variable gravity field models of the GRACE (Gravity Recovery and Climate Experiment) satellite mission is limited (300-500 km), which restricts their application to regional water cycle and climate change studies. Machine learning downscaling methods have proved effective in improving the spatial resolution of GRACE TWSA data, but the reasonable selection of predictors and its impact on model performance, as well as the accurate evaluation of downscaling results, require further exploration. Methods: A hydrological model downscaling method and three machine learning downscaling methods, namely Random Forest (RF), eXtreme Gradient Boosting (XGBoost), and Artificial Neural Network (ANN), were used to downscale GRACE-derived TWSA data for the Yangtze River Basin from a spatial resolution of 1° × 1° to 0.25° × 0.25° and 0.1° × 0.1°. A closed-loop simulation experiment was first conducted with TWSA data from the GLDAS (Global Land Data Assimilation System) hydrological model to evaluate the performance of the different downscaling methods. The GRACE TWSA data were then downscaled, and the performance and results of the different methods were comprehensively evaluated against measured water level data. Results: (1) The performance of machine learning downscaling is influenced by the number of predictors. As the number of predictors increases, downscaling performance continues to improve, but the relative ranking of the three machine learning models also changes. Partial least squares regression (PLSR) analysis shows that the six most important predictors (NDVI, soil moisture, precipitation, temperature, runoff, and U-wind) already achieve good results in machine learning downscaling, while adding more predictors has a relatively small impact on downscaling performance. (2) The closed-loop simulation experiment showed that, compared with the RF and XGBoost models, the ANN model has relatively inferior downscaling performance metrics yet yields optimal downscaling results. When the downscaled results are compared with water level observations in the Yangtze River Basin, the TWSA is consistent with the observed water levels both before and after downscaling. Notably, the correlation obtained with the RF method improved significantly, with all correlation coefficients exceeding 0.7. (3) Taking Poyang Lake as an example, the long-term trends of the machine learning downscaling results were compared with those of the three predictors with the highest VIP scores in the region, namely NDVI, soil moisture, and precipitation. All three sets of machine learning downscaling results exhibited long-term trend characteristics similar to those of NDVI (highest VIP score) and soil moisture (2nd-highest VIP score), and the spatial extent of these changes closely matched the water-covered area of Poyang Lake. In contrast, precipitation (3rd-highest VIP score) did not show similar trend patterns. Conclusions: RF and XGBoost are the best-performing downscaling methods, while ANN and hydrological model downscaling perform poorly. Additionally, the results of hydrological model downscaling depend on the correlation between the hydrological model and the GRACE data. Machine learning downscaling methods, however, can better integrate the variability of different auxiliary data such as hydrological, meteorological, and vegetation data (especially the important predictors), and thus recover detailed TWSA signals in the watershed more effectively.
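
    The closed-loop evaluation described in the abstract can be illustrated with a minimal sketch. The abstract does not say how the GLDAS field is coarsened or which error metrics are used, so the block averaging, the RMSE and Pearson correlation measures, the synthetic input field, and all function names below are assumptions made for illustration only. The `replicate` function is a trivial placeholder baseline, not the RF, XGBoost, or ANN models of the study.

```python
import math

def block_mean(fine, factor):
    """Coarsen a 2-D grid by averaging non-overlapping factor x factor blocks."""
    rows, cols = len(fine), len(fine[0])
    if rows % factor or cols % factor:
        raise ValueError("grid shape must be divisible by factor")
    coarse = []
    for i in range(0, rows, factor):
        row = []
        for j in range(0, cols, factor):
            block = [fine[i + di][j + dj]
                     for di in range(factor) for dj in range(factor)]
            row.append(sum(block) / len(block))
        coarse.append(row)
    return coarse

def replicate(coarse, factor):
    """Placeholder downscaler (assumption): copy each coarse value to its fine cells."""
    n_rows, n_cols = len(coarse) * factor, len(coarse[0]) * factor
    return [[coarse[i // factor][j // factor] for j in range(n_cols)]
            for i in range(n_rows)]

def rmse(a, b):
    """Root-mean-square error between two grids of equal shape."""
    flat_a = [v for row in a for v in row]
    flat_b = [v for row in b for v in row]
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(flat_a, flat_b)) / len(flat_a))

def pearson_r(x, y):
    """Pearson correlation between two equally long series."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((p - mx) * (q - my) for p, q in zip(x, y))
    sx = math.sqrt(sum((p - mx) ** 2 for p in x))
    sy = math.sqrt(sum((q - my) ** 2 for q in y))
    return cov / (sx * sy)

# Synthetic stand-in for a fine (0.25 degree) "truth" field: 8 x 8 cells,
# i.e. 2 x 2 one-degree cells when coarsened by a factor of 4.
FACTOR = 4
truth = [[math.sin(0.5 * i) + 0.3 * j for j in range(8)] for i in range(8)]

coarse = block_mean(truth, FACTOR)      # simulated 1 degree input
recovered = replicate(coarse, FACTOR)   # downscaled back to 0.25 degree
closure_rmse = rmse(truth, recovered)   # how much fine detail was lost

# Correlation of recovered vs. true values along the first grid row.
row_r = pearson_r(truth[0], recovered[0])
print(f"closure RMSE = {closure_rmse:.3f}, first-row correlation = {row_r:.3f}")
```

    In such a setup any downscaling model can be substituted for `replicate`, and a lower closure RMSE or higher correlation against the withheld fine field indicates better recovery of sub-grid signal. The same `pearson_r` helper could be applied to a downscaled TWSA time series and a station water level series, which is the kind of comparison the abstract reports.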

     
