Machine learning downscaling methods for inversion of terrestrial water storage anomalies in the Yangtze River Basin by GRACE
-
Abstract
Objectives: The time-variable gravity field models provided by the GRACE (Gravity Recovery and Climate Experiment) satellite mission yield Terrestrial Water Storage Anomaly (TWSA) estimates with limited spatial resolution (300-500 km), which restricts their use in studies of regional water cycles and climate change. Existing machine learning downscaling methods effectively improve the spatial resolution of GRACE TWSA data, but the reasonable selection of predictive factors, their impact on machine learning model performance, and the accurate evaluation of downscaling results require further exploration. Methods: A hydrological model downscaling method and three machine learning downscaling methods, namely Random Forest (RF), eXtreme Gradient Boosting (XGBoost), and Artificial Neural Network (ANN), were used to downscale GRACE-derived TWSA data for the Yangtze River Basin from a spatial resolution of 1° × 1° to 0.25° × 0.25° and 0.1° × 0.1°, respectively. A closed-loop simulation experiment using TWSA data from the GLDAS (Global Land Data Assimilation System) hydrological model was first conducted to evaluate the performance of the different downscaling methods. Subsequently, the GRACE-derived TWSA data were downscaled, and the performance and results of the different methods were comprehensively evaluated against measured water level data. Results: (1) Machine learning downscaling performance is influenced by the number of predictive factors. As the number of predictive factors increases, downscaling performance continues to improve, although the performance of the three machine learning models differs.
Partial least squares regression (PLSR) analysis shows that the six most important predictive factors (NDVI, soil moisture, precipitation, temperature, runoff, and U-wind) already achieve good machine learning downscaling results, while adding further predictive factors has relatively little impact on downscaling performance. (2) The closed-loop simulation experiment demonstrated that, compared with the RF and XGBoost models, the ANN model exhibits relatively inferior downscaling performance metrics, yet yields optimal downscaling results. When the downscaled results are compared with water level observations in the Yangtze River Basin, the TWSA is consistent with the observed water levels both before and after downscaling. Notably, the correlation for the RF method improved significantly, with all correlation coefficients exceeding 0.7. (3) Taking Poyang Lake as an example, the long-term trends of the machine learning downscaling results were compared with those of the three predictive factors with the highest VIP (variable importance in projection) scores in the region, namely NDVI, soil moisture, and precipitation. All three sets of machine learning downscaling results exhibited long-term trend characteristics similar to those of NDVI (highest VIP score) and soil moisture (second-highest VIP score), with the spatial extent of these changes closely matching the water-covered area of Poyang Lake. In contrast, precipitation (third-highest VIP score) did not show similar trend patterns. Conclusions: RF and XGBoost are the best downscaling methods, while the ANN and hydrological model downscaling methods perform poorly. Additionally, the results of hydrological model downscaling depend on the correlation between the hydrological model and GRACE data.
In contrast, machine learning downscaling methods can better integrate the variation characteristics of auxiliary hydrological, meteorological, and vegetation data (especially the important predictive factors), enabling better recovery of detailed TWSA signals in the basin.
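The closed-loop evaluation described in the Methods can be illustrated with a minimal sketch. It assumes, which the abstract does not state, that the coarse field is obtained by block-averaging a fine reference grid and that a downscaled estimate is scored with the Pearson correlation coefficient and RMSE. The function names, the synthetic 8 × 8 grid, and the nearest-neighbour baseline are hypothetical; they stand in for the GLDAS TWSA data and for the RF, XGBoost, and ANN models, which are not reproduced here.

```python
import math


def block_mean(fine, factor):
    """Aggregate a 2-D grid by averaging non-overlapping factor x factor blocks."""
    rows, cols = len(fine), len(fine[0])
    if rows % factor or cols % factor:
        raise ValueError("grid dimensions must be divisible by factor")
    coarse = []
    for i in range(0, rows, factor):
        row = []
        for j in range(0, cols, factor):
            block = [fine[a][b]
                     for a in range(i, i + factor)
                     for b in range(j, j + factor)]
            row.append(sum(block) / len(block))
        coarse.append(row)
    return coarse


def upsample_nearest(coarse, factor):
    """Naive baseline: copy each coarse cell into a factor x factor block."""
    fine = []
    for row in coarse:
        expanded = [v for v in row for _ in range(factor)]
        fine.extend([list(expanded) for _ in range(factor)])
    return fine


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def rmse(x, y):
    """Root-mean-square error between two equal-length sequences."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)) / len(x))


def flatten(grid):
    """Flatten a 2-D grid into a 1-D list, row by row."""
    return [v for row in grid for v in row]


# Synthetic fine-resolution "truth" (a stand-in for model TWSA at 0.25 degrees).
truth = [[float(i + j) for j in range(8)] for i in range(8)]

# Step 1: degrade the truth to the coarse grid (factor 4, i.e. 0.25 -> 1 degree).
coarse = block_mean(truth, 4)

# Step 2: downscale back to the fine grid (here only the naive baseline).
naive = upsample_nearest(coarse, 4)

# Step 3: score the downscaled field against the known fine-resolution truth.
r = pearson_r(flatten(truth), flatten(naive))
err = rmse(flatten(truth), flatten(naive))
print(f"r = {r:.3f}, RMSE = {err:.3f}")
```

Because the fine-resolution truth is known in a closed-loop setup, any downscaling model can be substituted in step 2 and scored the same way, which is what makes the comparison between methods possible before moving to real GRACE data.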