
Machine learning downscaling of GRACE/GRACE-FO data to capture spatial-temporal drought effects on groundwater storage at a local scale under data-scarcity

Abstract

The continued threat from climate change and human impacts on water resources demands high-resolution and continuous hydrological data accessibility for predicting trends and availability. This study proposes a novel threefold downscaling method based on machine learning (ML) which integrates: data normalization; interaction of hydrometeorological variables; and the application of a time series split for cross-validation. The method produces a high spatial resolution groundwater storage anomaly (GWSA) dataset from the Gravity Recovery and Climate Experiment (GRACE) and its successor mission, GRACE Follow-On (GRACE-FO). In the study, the relationship between the terrestrial water storage anomaly (TWSA) from GRACE and other land surface and hydrometeorological variables (e.g., vegetation coverage, land surface temperature, precipitation, and in situ groundwater level data) is leveraged to downscale the GWSA. The predicted downscaled GWSA datasets were tested using monthly in situ groundwater level observations, and the results showed that the model satisfactorily reproduced the spatial and temporal variations in the GWSA in the study area, with Nash-Sutcliffe efficiency (NSE) values of 0.8674 (random forest) and 0.7909 (XGBoost). Evapotranspiration was the most influential predictor variable in the random forest model, whereas rainfall was the most influential in the XGBoost model. In particular, the random forest model excelled in aligning closely with the observed groundwater storage patterns, as evidenced by its high positive correlations and lower error metrics (Mean Absolute Error (MAE) of 54.78 mm; R-squared (R²) of 0.8674). The downscaled 5 km GWSA data (based on random forest) showed a decreasing trend in storage associated with variability in the rainfall pattern. An increase in drought severity during El Niño lengthened the full recovery time of groundwater based on historical storage trends. Furthermore, the time lag between the occurrence of precipitation and recharge was likely controlled by the drought intensity and the spatial recharge characteristics of the aquifer. Projected increases in drought severity could further increase groundwater recovery times in response to droughts in a changing climate, resetting storage to a new tipping condition. Therefore, climate change adaptation strategies must recognise that less groundwater will be available to supplement the surface water supply during droughts.

Introduction

Climate change poses a significant challenge to water resources in Sub-Saharan Africa, where forecasts predict not only a warming trend but also increased aridity and altered rainfall patterns, particularly a reduction in southern Africa (Mupangwa et al. 2023). These environmental shifts have direct implications for both surface and groundwater storage, which are crucial components of terrestrial water storage (TWS) (Serdeczny et al. 2017). The TWS encompasses all the water stored within the land environment, including lakes, groundwater, soil and rivers (Rodell et al. 2009). Traditionally, TWS measurements involve resource-intensive methods that directly observe hydrological variables in the water budget equation. However, the advent of the Gravity Recovery and Climate Experiment (GRACE) satellite mission in 2002 and its successor, GRACE Follow-On (GRACE-FO), revolutionised this process by providing a remote, geodetic approach to monitor water storage globally (Ferreira et al. 2023; Humphrey et al. 2023).

Groundwater storage (GWS) is a critical constituent of TWS because it is indispensable for human utilisation. Several studies concerning water availability and usage have been conducted in the Barotse catchment where groundwater was identified as the principal source of the rural water supply (Banda et al. 2021; Chongo et al. 2011; Milupi et al., 2022). GRACE/GRACE-FO has made GWS quantifiable at global scales; nevertheless, the coarse spatial resolution of the dataset limits its application at local and regional scales, necessitating the use of downscaling methods to improve its spatial resolution and utility. Downscaling can be accomplished using dynamic approaches, which use physically based global models that are computationally expensive, or statistical methods that leverage relationships between large-scale and small-scale data to improve local estimates (Fatolazadeh et al. 2022; Zuo et al. 2021).

Downscaling of groundwater storage estimates from GRACE/GRACE-FO data has increasingly depended on high-resolution hydrometeorological variables to improve spatial resolution. Early methods focused on simple correlations, such as the study by Yin et al. (2018), which primarily utilised evapotranspiration (ET) data and leveraged the relationship between ET and groundwater storage through a correlation regression approach. However, this approach can be limiting when applied in different geological settings. More thorough techniques (Kalu et al. 2024; Khorrami et al. 2023; Ning et al. 2014; Zhong et al. 2021) integrate a wider variety of variables, such as vegetative indices, soil moisture and precipitation, to enable a deeper examination through preliminary correlation tests. This improvement highlights the importance of integrating different datasets to obtain precise and accurate downscaling outcomes (Yin et al. 2022).

Several past studies have incorporated various machine learning (ML) algorithms, such as artificial neural networks (ANN), random forest (RF), boosted regression trees, extreme gradient boosting (XGBoost) and deep learning, to downscale GRACE satellite data and produce GWS variations at high resolution (Agarwal et al. 2023; Ali et al. 2024; Chen et al. 2019, 2020; Milewski et al. 2019; Miro and Famiglietti 2018; Rahaman et al. 2019). Chen et al. (2019) employed the RF algorithm to downscale terrestrial water storage anomaly (TWSA) and groundwater storage anomaly (GWSA) data by integrating six hydrological variables. The results indicated a maximum Nash-Sutcliffe efficiency (NSE) of 0.68 and a correlation coefficient (R) of 0.83. Rahaman et al. (2019) employed the RF model to downscale GRACE-derived GWSA data. The authors observed a notable increase in the NSE values, ranging from 0.58 to 0.84, specifically inside the Northern High Plains aquifer. The researchers successfully generated analytics depicting changes in the GWSA from 2009 to 2016, achieving a greater level of resolution. Ali et al. (2021) utilised RF and ANN techniques to downscale GRACE TWSA and GWSA data in the Irrigated Indus Basin Irrigation System, and the R values obtained ranged from 0.67 to 0.99. Previous studies have identified various limitations in the machine learning-based downscaling of GRACE/GRACE-FO data. These include issues with data handling and variable selection, which can affect the accuracy and reliability of the downscaled groundwater storage estimates. Furthermore, the accuracy of machine learning output depends not only on the choice of input variables but also on the selection of ML algorithms and the way these models are built (Foroumandi et al. 2023; Satizabal-Alarc et al. 2023; Seyoum et al. 2019; Tao et al. 2023; Yazdian et al. 2023). Additionally, there is a need to explore better approaches for setting up machine learning models.

The Upper Zambezi catchment in southern Africa is a climate hotspot area, and projections predict a temperature increase more than twice as high as the global rate (Engelbrecht et al. 2015). This will have serious implications for the hydrological dynamics leading to increased evaporation and low soil moisture content. This study therefore aimed to use machine learning techniques to downscale the GWS derived from the GLDAS (Global Land Data Assimilation System) dataset, which incorporates GRACE/GRACE-FO estimates. The GLDAS GWS dataset was directly used for the downscaling process due to its detailed integration of terrestrial water and energy fluxes, including the high-precision harmonic solutions of GRACE/GRACE-FO data. This dataset separates soil moisture from TWS, thereby providing a distinct representation of GWS (Liu et al. 2015; Rodell et al. 2004; Save et al. 2016; Scanlon et al. 2012; Zaitchik et al. 2008). The capabilities of the XGBoost and RF algorithms were harnessed to perform downscaling, leveraging their strengths in handling complex, nonlinear data relationships. RF and XGBoost were chosen for their superior performance in the context of downscaling GRACE/GRACE-FO data. RF’s ability to handle large, multi-dimensional datasets and its resistance to overfitting, along with XGBoost’s high accuracy, efficiency and scalability, make them ideal. Under certain conditions, these algorithms outperform others such as: ANN, which requires extensive tuning and computing resources; k-nearest neighbours (kNN), which is inefficient with large datasets; and support vector machine (SVM), which struggles with large-scale data and multiclass problems. These strengths ensure accurate downscaling of satellite-derived data (Breiman 2001; Jyolsna et al. 2021; Rodriguez-Galiano et al. 2012). 
The unique contribution of this study was a threefold approach which integrated: data normalization; interaction of hydrometeorological variables; and application of a time series split for cross-validation (Bhanja and Das 2019; Izonin et al. 2022). This combination resulted in an improved distributed estimation of groundwater storage and depletion variations, leading to the development of a novel locally relevant remote sensing-assisted spatial water balance approach for identifying climatic effects (droughts) on groundwater storage. The results of this research are relevant for the development of water resource management interventions.

Materials and methods

The study area

The Barotse catchment is a major sub-basin in Zambia’s western and southern provinces that is situated in the Upper Zambezi River Basin. This vast catchment spans 402 km from north to south and 530 km from east to west, with an approximate total area of 106,486 km² (Chomba et al. 2022).

With an average slope of only 0.015%, the catchment is characterised by Kalahari sands. It consists of a trellis drainage system maintained by the Luanginga, Lungwebungu and Kabompo Rivers, which drain into the Zambezi River, and the landscape’s elevation varies from approximately 1,187 m above sea level in the north east to 993 m in the south (Banda et al. 2019). The Zambezi River flows through this region, with many channels moving from upstream at Lukulu, to midstream at Mongu, and downstream at Senanga. The main factors influencing the hydrological dynamics of floodplains are yearly flooding sequences and rainfall (Money 1972). The catchment is an important hydrological and ecological zone, with an annual rate of evaporation reaching 1,578 mm. As noted by Banda et al. (2023), the region is characterised by an unconfined Kalahari aquifer system with specific yield values that vary from 0.04 to 0.28, with a median of 0.16. The Kalahari aquifer is composed of silts, sands and weakly cemented sandstones. The alternating clay layers in the region can sometimes lead to the formation of perched water habitats. The recharge of the Kalahari aquifer is primarily driven by seasonal rainfall, and water is supplied to the wetland area by this recharge. The transmissivity of the aquifer varies greatly, ranging from 0.44 m²/day to 63 m²/day. Its yield is likewise variable, ranging from 3 to 10 L per second (Beilfuss 2012; Makungu and Hughes 2021).

The wet season in the Barotse spans from October to May of the following year (Pasqualino et al. 2015). Some academics contend that there is a knowledge gap, whereby what indigenous people currently understand about droughts and floods does not always align with what is confirmed by quantitative assessments (Mapedza et al. 2022). The Barotse Catchment is also a renowned tourist attraction because it hosts the Kuomboka Ceremony, which celebrates the evacuation of the monarch of the Lozi people to higher land before the commencement of the flood season (Cai et al. 2017).

The study utilised observation well data from three wells situated in the Lukulu, Mongu and Senanga districts, which correspond to upstream, midstream and downstream locations along the Zambezi River. These wells were selected because they are the sole sources of time series water level data in the region, offering daily measurements from December 2019 to October 2020. The wells vary in depth, but all exhibit a near-surface water table at a depth of less than approximately 10 m. The study region is depicted in Fig. 1.

Fig. 1

Location map of the Barotse catchment in western and southern Zambia showing the (a) location of towns and wetlands, (b) location of forest reserves and (c) location of the Barotse catchment in southern Africa. The map includes a hillshade base map to provide a representation of the underlying terrain for better visual interpretation

Study design

The study applied a statistical downscaling method using machine learning algorithms, with the GLDAS-GRACE GWS dataset as the target feature and hydrometeorological variables as input features. The goal was to enhance the spatial detail of GLDAS-GRACE GWS data to a fine 5 km scale, as highlighted by the flow chart in Fig. 2.

a) Pre-processing

To ensure that every input variable had the same spatial and temporal resolution before being passed to the machine learning model, the spatial resolution and temporal structure of the input data were modified. The dependent variable in this study was the GWS derived from the GLDAS dataset, which incorporates GRACE/GRACE-FO data. This approach allowed the utilisation of the groundwater storage dataset provided by GLDAS, which separates the groundwater component from the terrestrial water storage.

The independent variables included various hydrological variables that are essential components of the water balance, such as precipitation, soil moisture and evapotranspiration. These variables were selected to capture the complex interactions within the hydrological cycle that influence groundwater storage.

The target spatial resolution of 5 km was achieved through two processes: resampling high-resolution variables; and interpolating lower-resolution datasets. High-resolution datasets were directly resampled to the 5 km resolution. For lower-resolution datasets, bilinear interpolation was applied to increase their spatial resolution to 5 km. Bilinear interpolation was chosen because it provides a smoother transition between pixels than simpler methods such as nearest-neighbour interpolation, thereby preserving the spatial patterns of the data.
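As an illustration of the interpolation step, the following is a minimal sketch of bilinear upsampling in plain Python. It is not the study's implementation: the function name `bilinear_resample` is hypothetical, and an integer upsampling factor is assumed for simplicity, whereas moving from a 0.25° grid to 5 km involves a non-integer ratio that a geospatial library would normally handle.

```python
def bilinear_resample(grid, factor):
    """Upsample a 2D grid (list of rows) by an integer factor.

    Each output value is a distance-weighted average of the four
    surrounding coarse cells. The grid must be at least 2 x 2.
    """
    rows, cols = len(grid), len(grid[0])
    out_rows = (rows - 1) * factor + 1
    out_cols = (cols - 1) * factor + 1
    out = []
    for i in range(out_rows):
        y = i / factor                     # position in coarse-grid units
        y0 = min(int(y), rows - 2)         # upper neighbour row
        ty = y - y0                        # vertical weight in [0, 1]
        row = []
        for j in range(out_cols):
            x = j / factor
            x0 = min(int(x), cols - 2)     # left neighbour column
            tx = x - x0                    # horizontal weight in [0, 1]
            top = grid[y0][x0] * (1 - tx) + grid[y0][x0 + 1] * tx
            bottom = grid[y0 + 1][x0] * (1 - tx) + grid[y0 + 1][x0 + 1] * tx
            row.append(top * (1 - ty) + bottom * ty)
        out.append(row)
    return out


# Hypothetical 2 x 2 coarse field (values are illustrative only)
coarse = [[0.0, 10.0], [20.0, 30.0]]
fine = bilinear_resample(coarse, 2)   # 3 x 3 grid; centre value is 15.0
```

The centre cell of the refined grid is the mean of the four corners, which is the smooth transition between pixels that nearest-neighbour interpolation would not provide.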

Additionally, all input data with daily or weekly temporal resolutions were aggregated into monthly means to align with the temporal structure of the GWS dataset.
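The temporal aggregation can be sketched as follows. This is an assumed, simplified form using only the Python standard library: the function name `monthly_means` and the sample values are hypothetical, and missing observations are assumed to be encoded as `None`.

```python
from collections import defaultdict
from datetime import date, timedelta


def monthly_means(daily):
    """Aggregate (date, value) pairs into {(year, month): mean value}.

    Entries whose value is None are skipped (assumed missing-data flag).
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for day, value in daily:
        if value is None:
            continue
        key = (day.year, day.month)
        sums[key] += value
        counts[key] += 1
    return {key: sums[key] / counts[key] for key in sorted(sums)}


# Hypothetical daily series: constant 1.0 in January 2020, two values in February
series = [(date(2020, 1, 1) + timedelta(days=i), 1.0) for i in range(31)]
series += [(date(2020, 2, 1), 2.0), (date(2020, 2, 2), 4.0)]
print(monthly_means(series))
```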

b) Feature engineering and data normalization

The datasets were normalized using the Z score approach to standardise values and lessen the influence of outliers to optimise the machine learning process. The dataset’s prediction potential was subsequently increased by feature engineering, which included variable multiplication, to ensure that the intricate relationships observed in the hydrometeorological data were appropriately captured.

c) Downscaling

Downscaling through the use of RF and XGBoost-based machine learning models was the main focus of the research.

d) Validation

Groundwater level measurements from on-site observation wells were used to validate the downscaled model outputs.

e) Hypothesis testing

Thorough hypothesis testing was required in the final stage to validate the model outputs’ statistical significance in relation to the predictions that were made.

The programming of the downscaling model was carried out using Python 3.11.4, and the cartographic work was performed in ArcGIS Pro.

Fig. 2

Flowchart of the processes involved in the study, where: GRACEGWS(0) is the GLDAS-GRACE GWS at a spatial resolution of 27 km; GRACEGWS(N) is the normalized GLDAS-GRACE GWS at 5 km; GRACEGWS(A) is the normalized downscaled GLDAS-GRACE GWS at 5 km that was not validated; GRACEGWS(D) is the denormalized downscaled GLDAS-GRACE GWS at 5 km that was not validated; and GRACEGWS(Z) is the denormalized downscaled GLDAS-GRACE GWS at 5 km that was validated

Datasets and processing

The study period, spanning from January 2009 to December 2020, was carefully selected to encompass a significant climatic event, as the study aimed to capture the impact of the 2015–2016 El Niño event as reported by Kolusu et al. (2019). This period provided a comprehensive dataset that includes significant climatic variability, which was crucial for the analysis. The datasets utilised in the study are detailed below and summarised in Table 1. Each dataset was carefully selected and processed to ensure compatibility with the GLDAS-GRACE GWS data and the specific needs of the downscaling approach.

GRACE/GRACE-FO

The measurement of Earth’s gravitational field by the GRACE/GRACE-FO mission has been an essential component of Earth observation efforts, providing insights into changes in ice sheets, water reservoirs and crustal movements. The German Research Centre for Geosciences (GFZ), Jet Propulsion Laboratory (JPL) and the University of Texas’ Centre for Space Research (CSR) are the three main analysis centres supporting this project (Byron et al. 2019). For the purpose of researching changes in terrestrial water storage, each centre processes GRACE data in a distinct manner, resulting in a variety of gravity field solutions.

This study integrates high-resolution hydrometeorological data with the CSR dataset to obtain a localised understanding of groundwater dynamics and to improve the management of water resources (Landerer and Swenson 2012).

GLDAS

In the effort to offer estimates of land surface states and fluxes, GLDAS combines land surface modelling with data from satellites and ground-based observations. At a spatial resolution of 0.25°, the GLDAS Catchment Land Surface Model (CLSM) Version 2.2 provides groundwater storage data. Detailed elements such as soil moisture, snow water equivalent and canopy water storage are provided by land surface models such as NOAH (Zaitchik et al. 2008). The residual, which is obtained by deducting these elements from the total TWS as determined by GRACE, shows the variations in groundwater storage. To effectively assess global groundwater variability, groundwater storage estimations from GLDAS use data assimilation techniques using GRACE observations from the CSR solutions (Rodell et al. 2004).

In this work, downscaling GRACE groundwater storage using high-resolution hydrometeorological variables critically depends on GLDAS data. Through the utilisation of GRACE’s gravimetric solutions processed by GLDAS, the study endeavoured to enhance the forecasts of groundwater availability at local scales (Li and Rodell 2015). The study refers to the GLDAS derived GWS dataset as the GLDAS-GRACE GWS, reflecting the incorporation of the GRACE/GRACE-FO CSR solutions in the GLDAS dataset to generate its groundwater storage component.

FLDAS

Optimised fields of land surface states and fluxes are produced globally by the Famine Early Warning Systems Network (FEWS NET) Land Data Assimilation System (FLDAS) Model L4 Global Monthly (McNally et al. 2022). Given that soil moisture fluctuations and water availability are important factors in hydrogeology, this dataset helps to pinpoint regions that are critical for groundwater recharge. This study uses the soil moisture parameter at a depth of 100 to 200 cm underground. This deeper soil moisture data provides a more stable and reliable input for downscaling, reflecting long-term trends and subsurface hydrogeological processes critical for predicting groundwater storage (van der Schalie et al. 2017).

MODIS

The National Aeronautics and Space Administration (NASA) Terra satellite provides daily land surface temperature (LST) data at a resolution of one kilometre, as well as the normalized difference vegetative index (NDVI) and enhanced vegetative index (EVI), packaged as the Moderate Resolution Imaging Spectroradiometer (MODIS) MOD11A1 version 6 and MOD13A2 version 6 datasets, respectively (Didan 2021; Wan et al. 2021). The study of the surface energy balance, ecosystem health and effects of climate change requires the use of LST data. The vegetation indices, which are compiled every 16 days at a resolution of one kilometre, are essential for determining the presence and amount of green vegetation as well as biomass, and have the potential to identify regions with near-surface groundwater. While the EVI accounts for atmospheric conditions and canopy background signals, the NDVI is primarily associated with biomass productivity (Gstaiger et al. 2012).

WaPOR

Provided by the Food and Agriculture Organisation of the United Nations (FAO), the Water Productivity Open-access Portal (WaPOR) offers comprehensive ET datasets for the Near East and Africa (FAO 2020). WaPOR version 2.2 provides information on water consumption and stress in various landscapes through the integration of remote sensing technology (Zimba et al. 2024).

CHIRPS

The Climate Hazards Centre at the University of California Santa Barbara provides a moderate resolution precipitation dataset named the Climate Hazards Group InfraRed Precipitation with Station data (CHIRPS) (Funk et al. 2015). For trend analysis and drought monitoring, CHIRPS generates a rainfall time series dataset using in situ station data and satellite imagery at a spatial resolution of 5 km and availability from 1981 to the present.

Well data

Validating remotely sensed and modelled groundwater data requires observation well data, which are measured as water levels or head in metres (Gleeson et al. 2011). This dataset contains direct measurements, taken either manually or automatically, from wells drilled into aquifers. Head measurements provide accurate information about groundwater conditions, seasonal variations and long-term trends by indicating the height of the water column above a reference point (Fan et al. 2013). The well data for this study was collected from three water loggers installed in observation wells located in: Lukulu; Mongu; and Senanga.

Table 1 Datasets used in this research with detailed spatial and temporal resolutions

Groundwater storage from terrestrial water storage

The water budget equation can be modified to account for satellite data, such as GRACE/GRACE-FO data, which provide TWS changes in this estimation. Therefore, distinguishing between the main elements of TWS is crucial for isolating changes in GWS, as represented in the equation below (Rodell et al. 2009):

$$\Delta TWS = \Delta GWS + \Delta SW + \Delta SM + \Delta VC$$
(1)

where:

\(\Delta TWS\) is the change in total terrestrial water storage as measured by GRACE/GRACE-FO;

\(\Delta GWS\) is the change in groundwater storage;

\(\Delta SW\) is the change in surface water storage, including all surface reservoirs, lakes and rivers;

\(\Delta SM\) is the change in the soil moisture content; and

\(\Delta VC\) is the change in the vegetative cover.

Rearranging to solve for \(\Delta GWS\):

$$\Delta GWS = \Delta TWS - \Delta SW - \Delta SM - \Delta VC$$
(2)
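The rearranged balance can be checked with a short worked example. The function name `gws_change` and the input values below are hypothetical and serve only to show the sign handling; all terms are assumed to share the same units (e.g., mm of equivalent water height).

```python
def gws_change(d_tws, d_sw, d_sm, d_vc):
    """Groundwater storage change as the residual of the TWS balance (Eq. 2)."""
    return d_tws - d_sw - d_sm - d_vc


# Illustrative values in mm: total storage falls by 50 mm, of which 5 mm is
# surface water, 20 mm is soil moisture and 1 mm is the vegetation term.
print(gws_change(-50.0, -5.0, -20.0, -1.0))   # -24.0 mm attributed to groundwater
```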

Correlation of input variables

To ensure that the datasets used in the ML models had significant predictive potential, a correlation test between groundwater storage (GLDAS-GRACE GWS) and the hydrometeorological variables was conducted for the period January 2009 to December 2020 before passing them to the machine learning models. To better understand the relationships, we used seasonal trend decomposition with loess (STL) to isolate the trend component from seasonal and residual noise, and analysed only the trend component, which reflects long-term changes (Ouyang et al. 2021). Thereafter, the trend component was used to determine the Pearson, Kendall tau and Spearman correlations between the GLDAS-GRACE GWS and the input variables (Puth et al. 2015; Teng and Chen 2024).

Spearman’s rank correlation coefficient (ρ) measures the strength and direction of the association between two ranked variables, which can be calculated as:

$$\rho = 1 - \frac{{6\,\Sigma \,{d_i}^2}}{{n\left( {{n^2} - 1} \right)}}$$
(3)

where:

  • \(\:{d}_{i}\) is the difference between the ranks of corresponding values; and

  • n is the number of observations.

Kendall tau (τ) measures the association between two variables by considering the concordance and discordance of pairs of observations. It is calculated using the following formula:

$$\tau = \frac{{\left( {C - D} \right)}}{{\frac{1}{2}n\left( {n - 1} \right)}}$$
(4)

where:

  • C is the number of concordant pairs;

  • D is the number of discordant pairs; and

  • n is the number of observations.

Pearson’s correlation coefficient (r) measures the linear relationship between two variables, as represented in the equation:

$$r = \frac{{\Sigma \,({x_i} - \bar x)({y_i} - \bar y)}}{{\sqrt {\Sigma \,{{({x_i} - \bar x)}^2}\,\Sigma \,{{({y_i} - \bar y)}^2}} }}$$
(5)

where:

\(x_i\) and \(y_i\) are the individual sample points; and

\(\bar x\) and \(\bar y\) are the means of the x and y variables, respectively.
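The three coefficients can be computed directly from their definitions, as in the sketch below. It is a plain-Python illustration with hypothetical function names and toy data, and it assumes no tied values, which is the condition under which the rank-difference form of Spearman's ρ and the simple form of Kendall's τ given here apply. The standard Pearson form is used, with the two sums of squares multiplied under the square root.

```python
import math


def ranks(values):
    """Ranks 1..n of the values; assumes there are no ties."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, idx in enumerate(order, start=1):
        result[idx] = rank
    return result


def spearman(x, y):
    """Spearman's rho from squared rank differences (Eq. 3)."""
    n = len(x)
    d2 = sum((a - b) ** 2 for a, b in zip(ranks(x), ranks(y)))
    return 1 - 6 * d2 / (n * (n ** 2 - 1))


def kendall_tau(x, y):
    """Kendall's tau from concordant and discordant pairs (Eq. 4)."""
    n = len(x)
    concordant = discordant = 0
    for i in range(n):
        for j in range(i + 1, n):
            sign = (x[i] - x[j]) * (y[i] - y[j])
            if sign > 0:
                concordant += 1
            elif sign < 0:
                discordant += 1
    return (concordant - discordant) / (0.5 * n * (n - 1))


def pearson(x, y):
    """Pearson's r: covariance scaled by both standard deviations (Eq. 5)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    num = sum((a - mx) * (b - my) for a, b in zip(x, y))
    den = math.sqrt(sum((a - mx) ** 2 for a in x) * sum((b - my) ** 2 for b in y))
    return num / den


# Toy trend components (illustrative only)
x = [1, 2, 3, 4, 5]
y = [2, 1, 4, 3, 5]
print(spearman(x, y), kendall_tau(x, y), pearson(x, y))
```

For this toy series, two of the ten pairs are discordant, so τ is lower than ρ and r even though all three agree on the direction of the association.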

Feature engineering and data normalization

In machine learning frameworks, feature engineering and data normalization are essential for dataset optimisation. First, we applied Z score normalization, which recalibrates every variable to guarantee consistency and lessen the impact of outliers.

Next, we constructed interaction terms between pairs of normalized hydrometeorological variables. The dataset’s predictive power was increased by these interacted terms, which represent interactions between important hydrological and climatic elements.
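A minimal sketch of these two steps is given below, assuming the population standard deviation for the Z score and simple pairwise products for the interaction terms. The function names, variable labels and values are hypothetical. Note that all pairs of seven variables would give 21 products, whereas the study reports 18 interacted terms, so a subset must have been used; which pairs were retained is not stated here.

```python
import math
from itertools import combinations


def zscore(values):
    """Standardise values to zero mean and unit (population) standard deviation."""
    n = len(values)
    mean = sum(values) / n
    std = math.sqrt(sum((v - mean) ** 2 for v in values) / n)
    return [(v - mean) / std for v in values]


def add_interactions(features):
    """Return the features plus the product of every pair of feature columns."""
    out = dict(features)
    for a, b in combinations(features, 2):
        out[f"{a}_x_{b}"] = [p * q for p, q in zip(features[a], features[b])]
    return out


# Hypothetical monthly values for three predictors
raw = {
    "precip": [120.0, 80.0, 10.0, 0.0],
    "et": [95.0, 90.0, 60.0, 40.0],
    "soil_moisture": [0.30, 0.28, 0.20, 0.15],
}
normalized = {name: zscore(col) for name, col in raw.items()}
engineered = add_interactions(normalized)
print(sorted(engineered))
```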

Machine learning-based downscaling

We downscaled GLDAS-GRACE GWS data to a finer spatial resolution using RF and XGBoost algorithms. The data included a range of hydrometeorological variables and their interacted terms, which were normalized for consistency.

Random forest

Breiman (2001) developed the random forest algorithm, which reduces overfitting by building numerous decision trees during training and averaging their outputs. Each tree is built from a bootstrap sample of the data, with a random subset of features considered at each split, which introduces randomness and diversity among the trees. This ensemble method results in a model that is both accurate and generalisable. In the context of groundwater storage, RF models capture complex interactions and nonlinear relationships between the dependent and independent variables, enhancing the spatial resolution of GRACE-derived estimates (Rahaman et al. 2019). The algorithm also provides feature importance estimates, which help to identify significant predictors, and handles missing data effectively.

Extreme gradient boosting (XGBoost)

XGBoost, introduced by Chen and Guestrin (2016), is a gradient boosting algorithm known for its performance and speed. Gradient boosting builds an ensemble of decision trees sequentially, where each tree corrects the errors of its predecessors, enhancing overall model accuracy. Unlike random forests, which build trees independently, XGBoost’s sequential approach and optimisation lead to significantly faster training times. XGBoost incorporates several strategies to prevent overfitting, including L1 (Lasso) and L2 (Ridge) regularisation, which control model complexity. It also prunes trees to remove branches that add little predictive power, reducing the risk of overfitting. Like RF, XGBoost supports parallel processing, further speeding up the training process. The algorithm has been used to downscale GRACE/GRACE-FO derived GWS estimates, obtaining promising results that model groundwater dynamics after validation with in situ data (Sahour 2020).

Model training and evaluation

The input datasets consisted of 25 features (7 hydrometeorological variables in their original form and 18 interacted terms generated from the hydrometeorological variables) which served as independent variables, while the GLDAS-GRACE GWS data was used as the dependent variable (Verdonck et al. 2024). The datasets were flattened into 2D arrays to fit the machine learning models. To account for the temporal structure of the data during cross-validation, scikit-learn’s `TimeSeriesSplit` was utilised. `TimeSeriesSplit` ensures that the training set always precedes the testing set, maintaining the temporal order and preventing data leakage, which is crucial for time series data (Peixeiro 2022).
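The expanding-window logic behind a time series split can be sketched in plain Python as follows. This is an illustration rather than the library code: the function name is hypothetical, the fold-size rule is an assumption modelled on the default behaviour of `TimeSeriesSplit`, and the split is assumed to run along a monthly index of 144 time steps (January 2009 to December 2020) with seven folds chosen for illustration.

```python
def expanding_window_splits(n_samples, n_splits):
    """Yield (train_indices, test_indices) with training always preceding testing.

    Each successive fold extends the training window to include the previous
    test block, so no future observation leaks into training.
    """
    test_size = n_samples // (n_splits + 1)
    if test_size == 0:
        raise ValueError("too many splits for the number of samples")
    first_test_start = n_samples - n_splits * test_size
    for k in range(n_splits):
        start = first_test_start + k * test_size
        yield list(range(start)), list(range(start, start + test_size))


for train, test in expanding_window_splits(144, 7):
    print(f"train months 0-{train[-1]}, test months {test[0]}-{test[-1]}")
```

In contrast to a shuffled k-fold split, every test block lies strictly after its training window, which is the property that prevents temporal leakage.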

For both models, a randomised cross-validation search was employed to optimise the hyperparameters. A randomised cross-validation search randomly samples from a predefined set of hyperparameters, making the process more efficient than an exhaustive grid search (King et al. 2021). Sevenfold cross-validation and 700 fits were used in the tuning procedure for the XGBoost model, and the best parameters from the cross-validation search included a subsample rate of 0.7, 1,000 estimators, a maximum tree depth of 11, a learning rate of 0.005 and regularisation (Rodriguez et al. 2010). The RF model, on the other hand, utilised 350 fits. The best parameters from the cross-validation search for the RF model included 100 trees, configured minimum samples per split and per leaf, and bootstrap sampling enabled.
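The sampling idea behind a randomised search can be sketched as below. This is a simplified stand-in, not the study's tuning code: the search space is hypothetical apart from the reported best values (0.7, 1,000, 11 and 0.005), candidates are drawn with replacement, and no model is actually fitted. The fit count is simply the number of sampled candidates multiplied by the number of folds, so 700 fits with sevenfold cross-validation would be consistent with 100 sampled candidates; this is an inference from the reported figures.

```python
import random

# Hypothetical search space; only one value per list is taken from the text
SEARCH_SPACE = {
    "subsample": [0.5, 0.7, 0.9],
    "n_estimators": [100, 500, 1000],
    "max_depth": [3, 7, 11],
    "learning_rate": [0.005, 0.01, 0.1],
}


def sample_candidates(space, n_iter, seed=0):
    """Draw n_iter random hyperparameter combinations from the search space."""
    rng = random.Random(seed)
    return [
        {name: rng.choice(options) for name, options in space.items()}
        for _ in range(n_iter)
    ]


def total_fits(n_candidates, n_folds):
    """Number of model fits needed to cross-validate every candidate."""
    return n_candidates * n_folds


candidates = sample_candidates(SEARCH_SPACE, n_iter=100)
print(len(candidates), total_fits(len(candidates), 7))
```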

The accuracy of the two models was evaluated using: R-squared (R²); Mean Absolute Error (MAE); NSE; and Root Mean Square Error (RMSE) to obtain insights into their accuracy, consistency and predictive capabilities.

R² measures the proportion of the variance in the dependent variable that is predictable from the independent variables. MAE measures the average magnitude of the errors in a set of predictions, without considering their direction. NSE is used to assess the predictive power of hydrological models. RMSE quantifies the average magnitude of the prediction errors in a model, giving an indication of the accuracy of predictions. These can be calculated using the equations below.

$${R^2} = 1 - \frac{{\Sigma \,{{({y_i} - {{\hat y}_i})}^2}}}{{\Sigma \,{{({y_i} - \bar y)}^2}}}$$
(6)
$$MAE = \frac{1}{n}\Sigma \left| {{y_i} - {{\hat y}_i}} \right|$$
(7)
$$NSE = 1 - \frac{{\Sigma {{\left( {{y_i} - {{\hat y}_i}} \right)}^2}}}{{\Sigma {{\left( {{y_i} - {{\bar y}}} \right)}^2}}}$$
(8)
$$RMSE = \sqrt {\frac{1}{n}\Sigma {{\left( {{y_i} - {{\hat y}_i}} \right)}^2}} $$
(9)

where:

\({y}_{i}\) is the observed value;

\(\hat y_{i}\) is the predicted value;

\(\bar y\) is the mean of the observed values; and

n is the number of observations.
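The four metrics can be computed together, as in the following sketch with a hypothetical function name and toy values. As Eqs. 6 and 8 are written, R² and NSE share the same expression, so the two values coincide when both are evaluated against the same observations.

```python
import math


def evaluate(observed, predicted):
    """Return R2, MAE, NSE and RMSE for paired observed and predicted values."""
    n = len(observed)
    mean_obs = sum(observed) / n
    sse = sum((o - p) ** 2 for o, p in zip(observed, predicted))  # residual sum of squares
    sst = sum((o - mean_obs) ** 2 for o in observed)              # total sum of squares
    skill = 1 - sse / sst
    return {
        "R2": skill,
        "MAE": sum(abs(o - p) for o, p in zip(observed, predicted)) / n,
        "NSE": skill,
        "RMSE": math.sqrt(sse / n),
    }


# Toy values (illustrative only)
print(evaluate([1.0, 2.0, 3.0, 4.0], [1.5, 2.0, 2.5, 4.0]))
```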

Validation of downscaled GWS estimates

The need to validate groundwater storage estimates derived from GRACE/GRACE-FO data is underscored by the discrepancies observed between ground measurements and those obtained from remote sensing techniques. To effectively perform this validation process, it was essential to transform data from ground-based observation wells, specifically water level readings, into water storage (WS) anomalies. This transformation hinges on detailed knowledge of the aquifer’s characteristics. In the context of this study, data for the transformation was sourced from observation wells situated in the Kalahari aquifer, located within the Barotse Flood Plain. Consequently, to adjust the water level data for the purposes of validation, the following equation was utilised.

$$\Delta GWS_{unconfined} = Sy \times \Delta h$$
(10)

(Nenweli et al. 2024)

where:

\(\Delta GWS\) is the groundwater storage change derived from the specific yield and the change in water level;

\(Sy\) is the specific yield of the aquifer; and

\(\Delta h\) is the change in water level in the observation well.
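The conversion in Eq. 10 can be sketched as follows. Several things here are assumptions for illustration: the water level change is taken relative to the mean of the record, water levels are expressed as heads in metres (higher means more water), the result is reported in millimetres, and the specific yield of 0.1 is a placeholder rather than the value used for the Kalahari aquifer.

```python
def storage_anomaly_mm(water_levels_m, specific_yield):
    """Convert well water levels (m) to storage anomalies (mm) via Sy * dh.

    dh is measured relative to the mean water level of the record
    (an assumption of this sketch).
    """
    mean_level = sum(water_levels_m) / len(water_levels_m)
    return [specific_yield * (h - mean_level) * 1000.0 for h in water_levels_m]


# Hypothetical monthly heads (m) and a placeholder specific yield.
levels = [10.0, 10.5, 9.5, 10.0]
anomalies = storage_anomaly_mm(levels, specific_yield=0.1)
print(anomalies)
```

If the wells record depth to water instead of head, the sign of the level change would need to be flipped before applying the conversion.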

Results

Correlations of the input hydrometeorological variables

The results from the Pearson, Kendall tau and Spearman correlation analyses in Table 2 present the relationships between the GLDAS-GRACE GWS and the hydrometeorological variables.

Table 2 Correlations between the GLDAS-GRACE GWS dataset and hydrometeorological variables

Precipitation’s moderate correlations with the GLDAS-GRACE GWS indicated a monotonic relationship, while its higher Spearman correlation pointed to a potentially nonlinear relationship. ET exhibited considerable linear and nonlinear components, with the stronger Spearman correlation indicating a nonlinear relationship. LST - Day had substantial negative correlations with the GLDAS-GRACE GWS, indicating an inverse relationship; its Pearson and Spearman scores were similar, indicating a strong linear correlation with probable nonlinear elements. The NDVI exhibited substantial positive correlations, revealing a strong linear relationship with groundwater storage. The EVI showed lower correlations, implying a weak linear relationship with potential nonlinear features, as evidenced by the greater Spearman correlation. LST - Night exhibited broad negative correlations, indicating a strong inverse association with a strong linear component.

This analysis served as the foundation for selecting inputs for the machine learning models. We ensured that only datasets that met the 18% correlation threshold were selected for the downscaling process. This approach was critical for the models’ performance, as it guided the inclusion of variables that have a considerable influence on groundwater storage.
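A screening step of this kind can be sketched in plain Python. The sketch assumes that the 18% threshold is applied to the absolute value of each coefficient (so that inversely related variables such as LST can pass) and that a variable must pass on all three measures; neither detail is stated explicitly in the text. Kendall's tau is computed in its simple tau-a form, and the variable names and values are made up.

```python
import math


def pearson(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def ranks(values):
    """Average 1-based ranks, so tied values share the same rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            result[order[k]] = (i + j) / 2.0 + 1.0
        i = j + 1
    return result


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


def kendall_tau_a(x, y):
    """Tau-a: (concordant - discordant) / number of pairs; ties count as neither."""
    n = len(x)
    score = 0
    for i in range(n):
        for j in range(i + 1, n):
            s = (x[i] - x[j]) * (y[i] - y[j])
            score += (s > 0) - (s < 0)
    return score / (n * (n - 1) / 2)


def select_predictors(target, candidates, threshold=0.18):
    """Keep candidates whose |correlation| meets the threshold on all three measures."""
    kept = {}
    for name, series in candidates.items():
        scores = {
            "pearson": pearson(series, target),
            "spearman": spearman(series, target),
            "kendall": kendall_tau_a(series, target),
        }
        if all(abs(v) >= threshold for v in scores.values()):
            kept[name] = scores
    return kept


# Made-up series for illustration only.
gws = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
candidates = {
    "rainfall": [2.0, 1.0, 4.0, 3.0, 6.0, 5.0],
    "lst_day": [6.0, 5.0, 4.0, 3.0, 2.0, 1.0],
    "unrelated": [3.0, 1.0, 2.0, 2.0, 1.0, 3.0],
}
kept = select_predictors(gws, candidates)
print(sorted(kept))
```

Comparing the Pearson and Spearman scores returned for each kept variable mirrors the linear versus monotonic reading of Table 2.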

The permutation feature importance technique (PFIT) was applied to identify the most influential predictor variables in the machine learning models. According to the PFIT results for the downscaling of GLDAS-GRACE GWS data in Table 3, the XGBoost model relied heavily on a single variable: rainfall emerged as the predominant predictor, indicating a strong model fit to precipitation data. This focus suggests that XGBoost effectively leveraged the high predictive power of rainfall, along with soil moisture and NDVI. Conversely, the RF model demonstrated a more distributed reliance across a broader range of features, including ET, EVI, soil moisture, NDVI and both daytime and night-time LST. This broader distribution indicates that RF integrated a diverse set of variables to capture the multifaceted nature of hydrological processes, providing a more balanced approach than the more narrowly focused XGBoost model.
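The idea behind permutation feature importance can be sketched without any machine learning library: shuffle one predictor at a time and record how much the error grows. In this sketch the scoring metric (MAE), the number of repeats and the toy `predict` function are assumptions standing in for the fitted RF or XGBoost model, and the data are made up.

```python
import random


def mean_abs_error(obs, pred):
    return sum(abs(y - p) for y, p in zip(obs, pred)) / len(obs)


def permutation_importance(predict, rows, target, n_repeats=20, seed=0):
    """Mean increase in MAE when a single feature column is shuffled."""
    rng = random.Random(seed)
    baseline = mean_abs_error(target, [predict(r) for r in rows])
    importances = {}
    for name in rows[0]:
        increases = []
        for _ in range(n_repeats):
            shuffled = [r[name] for r in rows]
            rng.shuffle(shuffled)
            # Copy each row with only this feature replaced by a shuffled value.
            permuted = [dict(r, **{name: v}) for r, v in zip(rows, shuffled)]
            score = mean_abs_error(target, [predict(r) for r in permuted])
            increases.append(score - baseline)
        importances[name] = sum(increases) / n_repeats
    return importances


def predict(row):
    """Toy stand-in for a fitted model: depends on rainfall only."""
    return 2.0 * row["rainfall"]


# Made-up predictors and target for illustration only.
rows = [{"rainfall": float(i), "evi": float((3 * i) % 4)} for i in range(1, 9)]
target = [2.0 * r["rainfall"] for r in rows]

importance = permutation_importance(predict, rows, target)
print(importance)
```

A feature the model ignores scores zero, while a feature the model depends on scores well above zero, which is the pattern used to rank predictors in Table 3.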

Table 3 PFIT of variables in downscaling process

Model accuracy

The two models differed in how they handled the input hydrometeorological data. Hyperparameter tuning was used in the optimisation phase to tailor the learning process of each algorithm to the particular qualities of the GLDAS-GRACE GWS data. The resulting fits are shown in the scatter plots in Fig. 3 and the error metrics in Table 4.

Table 4 Error metrics from the two downscaling models showing their variance in predictive capabilities

When scatter plots comparing the predictions of the two models were visually analysed, it was evident that the RF model generated predictions that were less random and more evenly distributed. This indicated that the RF method of averaging several decision trees reduces overfitting and yields more consistent and reliable results.

Fig. 3

Scatter plots depicting the GLDAS-GRACE GWS and the outputs from the two downscaling models

Validation of downscaled GWS with in situ well data

We validated the downscaled GLDAS-GRACE GWS estimates using observed well data (WS anomaly) as the ground-truth proxy. The validity of the downscaling procedure was assessed by analysing the correlations between the residuals (detrended and deseasonalised data) of the WS anomaly and of the downscaled GWS anomalies from the XGBoost and RF models.
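The residual comparison can be sketched as below. Note the substitution: this sketch removes a least-squares linear trend and then the mean of each calendar month, which is a simpler stand-in for a full seasonal-trend decomposition and is not claimed to be the procedure used in the study. The two series are synthetic.

```python
import math


def pearson(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def detrend(series):
    """Subtract the least-squares straight line fitted against the time index."""
    n = len(series)
    t_mean = (n - 1) / 2.0
    y_mean = sum(series) / n
    num = sum((t - t_mean) * (y - y_mean) for t, y in enumerate(series))
    den = sum((t - t_mean) ** 2 for t in range(n))
    slope = num / den
    return [y - (y_mean + slope * (t - t_mean)) for t, y in enumerate(series)]


def deseasonalise(series, period=12):
    """Subtract the mean of each position in the seasonal cycle (each month)."""
    means = []
    for k in range(period):
        values = series[k::period]
        means.append(sum(values) / len(values))
    return [y - means[t % period] for t, y in enumerate(series)]


def residuals(series, period=12):
    return deseasonalise(detrend(series), period)


# Synthetic monthly series (36 months); the "model" series is a scaled and
# shifted copy of the "well" series, so their residuals correlate perfectly.
well = [((7 * t) % 5) + 0.3 * t for t in range(36)]
model = [3.0 * w + 5.0 for w in well]

r = pearson(residuals(well), residuals(model))
print(round(r, 6))
```

Correlating residuals instead of raw series avoids crediting a model merely for sharing the seasonal cycle or a long-term drift with the well record.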

Fig. 4

Monthly time series of the RF GWS anomaly and XGBoost GWS anomaly (primary axis) and the water storage anomaly determined from observation well data (secondary axis) for three locations in the study area: (a) the Lukulu district hospital in Lukulu; (b) the Department of Water Resource Development (DWRD) in Mongu; and (c) the Lukanda primary school in Senanga

Our analysis of GWS predictions using XGBoost and RF models across three locations, DWRD in Mongu, Lukanda primary school in Senanga and Lukulu hospital in Lukulu, revealed clear differences in the models’ performance, as summarised in Fig. 4 and Table 5.

Table 5 Correlations of downscaled GLDAS-GRACE GWS with observation well data expressed as water storage anomalies

The XGBoost model results at the DWRD in Mongu exhibited strong negative correlations, suggesting that accurately capturing the groundwater dynamics was difficult. On the other hand, the RF model showed a strong positive correlation, indicating its better accuracy in representing the groundwater storage that was observed. This finding implies that at this midstream site, the random forest model is more reliable.

Similarly, high positive associations were observed at Lukulu hospital, where the RF model outperformed the XGBoost model. The ability of the RF model to capture groundwater fluctuations upstream in the study region was confirmed by the consistency of the correlation metrics.

The outcomes at Senanga’s Lukanda primary school differed. The XGBoost model revealed positive associations that were modest but not statistically significant. In contrast, the RF model showed significant negative monotonic correlations but only moderate negative linear relationships. This negative correlation is likely due to the lag between surface processes and groundwater response. The inability of the models to accurately capture the groundwater dynamics at this location could be attributed to the presence of artesian wells, which are characteristic of confined aquifer systems. In such systems, groundwater is stored under pressure, leading to delayed responses to surface recharge events.

Rainfall-induced groundwater recharge

After the validation procedure, we used data spanning the entire period from January 2009 to December 2020 to examine the relationship between rainfall during the wet season and groundwater recharge. Our analysis concentrated on three distinct groundwater storage datasets: RF; XGBoost; and GLDAS-GRACE. The period from October to May of the following year was designated the wet season. The objective was to identify wet season rainfall thresholds that, if not reached, resulted in considerable decreases in groundwater recharge, as depicted in Fig. 5. Generally, a significant reduction in rainfall across the study region was accompanied by a reduction in recharge in the same months, as seen in the El Niño years (2015, 2016, 2018 and 2019).
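Accumulating rainfall over the October to May wet season can be sketched as follows. The input layout (a mapping from year and month to rainfall in mm), the choice to label each season by its starting year and the rule of dropping incomplete seasons are assumptions of this sketch; the rainfall values are made up.

```python
def wet_season_totals(monthly_rain):
    """Sum rainfall for each October-May season, keyed by the season's start year.

    monthly_rain maps (year, month) -> rainfall in mm. Only seasons with
    all eight months present are returned.
    """
    totals = {}
    counts = {}
    for (year, month), rain in monthly_rain.items():
        if month >= 10:
            season = year        # October-December belong to the season starting this year
        elif month <= 5:
            season = year - 1    # January-May belong to the season that started last year
        else:
            continue             # June-September: outside the wet season
        totals[season] = totals.get(season, 0.0) + rain
        counts[season] = counts.get(season, 0) + 1
    return {s: t for s, t in totals.items() if counts[s] == 8}


# Made-up record: 50 mm in every wet-season month, 0 mm otherwise.
monthly = {
    (y, m): (50.0 if (m >= 10 or m <= 5) else 0.0)
    for y in (2009, 2010)
    for m in range(1, 13)
}
totals = wet_season_totals(monthly)
print(totals)
```

With a two-year record only the 2009/2010 season is complete, so the partial seasons at either end of the record are excluded from the totals.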

Fig. 5

(a) GLDAS-GRACE GWS volume and predicted GWS volume from the XGBoost and RF models on the primary vertical axis with total monthly CHIRPS rainfall on the secondary vertical axis (01-2009 to 12-2020) and (b) GLDAS-GRACE GWS anomaly and predicted GWS anomalies from the XGBoost and RF models on the primary vertical axis with total monthly CHIRPS rainfall on the secondary vertical axis (01-2009 to 12-2020)

Spatiotemporal improvement in GWS estimates

Figure 6 below illustrates the visual differences between the downscaled datasets and the GLDAS-GRACE GWS data. This visual comparison highlighted the enhanced resolution and detail captured by the XGBoost and RF downscaling models. The GLDAS-GRACE GWS data provided a broad overview of groundwater storage changes, while the downscaled datasets offered a more refined view capturing localised variations and finer spatial patterns.

Fig. 6

The three GWS datasets from the study for January 2015 depicting the improvement in spatial resolution between the GLDAS-GRACE GWS and the two downscaled datasets: (a) GLDAS-GRACE GWS at 27 km spatial resolution, (b) downscaled GWS from the XGBoost model at 5 km spatial resolution and (c) downscaled GWS from the RF model at 5 km spatial resolution

Discussion

Downscaling of GRACE/GRACE-FO GWS estimates using machine learning

This study used hydrometeorological variables derived from the water budget equation to downscale GRACE/GRACE-FO GWS data from GLDAS to a 5 km spatial resolution. The XGBoost and RF machine learning algorithms were employed in a monthly time series spanning from January 2009 to December 2020. The most significant inputs for the downscaling procedure were chosen by conducting an initial correlation analysis on the input variables using the GLDAS-GRACE GWS as the target. Spearman, Kendall and Pearson correlations were examined to ensure that the correlations were greater than 18% (Gemitzi et al. 2021).

The XGBoost model showed promising results, capturing a significant portion of the variance in GLDAS-GRACE GWS data, which is similar to what was reported by Sahour (2020). Its dependability varied with local conditions: at some places it correlated highly with observed well data, while at others it was unable to adequately depict true groundwater dynamics and underestimated the anticipated GWS data. In comparison, the RF model proved to be more accurate and dependable in general, producing accuracy metrics that were similar to the findings of Ali et al. (2021). The validation results showed that this model produced substantial positive correlations with the observed well data at the upstream (Lukulu) and midstream (Mongu) locations, highlighting the effectiveness of the model under different hydrogeological settings.

Both models had weakly significant correlations at places exhibiting artesian well features, which are characteristic of deep confined aquifer systems. Model predictions of confined aquifers are difficult due to their pressurised conditions, isolated recharge zones and delayed responses to surface conditions. This finding suggested that neither model works well under these conditions, indicating that unconfined aquifer systems with simpler groundwater dynamics are better suited for this machine learning-based downscaling technique (Fajar et al. 2021; Wang et al. 2015; Zhang et al. 2022).

Despite the promising results, there are several limitations to the downscaled GWS estimates. Firstly, the accuracy of the downscaling process is highly dependent on the quality and spatial resolution of the input hydrometeorological variables. Any discrepancies or errors within these input datasets can propagate through the model, leading to significant inaccuracies in the downscaled outputs (Li et al. 2011). Secondly, the performance of the models is suboptimal in regions characterised by complex hydrogeological conditions, such as deep confined aquifers. The unique pressurised conditions and isolated recharge zones inherent to these areas complicate the accurate estimation of GWS (Fan et al. 2013). Furthermore, the temporal resolution of the input data is a critical factor, as it can restrict the model’s ability to accurately capture short-term fluctuations in groundwater storage. Additionally, the validation of downscaled GWS estimates was constrained by the short-term time series data from a few observation wells, leading to unusually high correlations. This highlights the need for more extensive observational data to fully validate the models. These limitations highlight the need for ongoing refinement and validation of downscaling techniques to enhance their reliability and applicability.

Drought effects on groundwater

We determined the rainfall thresholds associated with a drastic reduction in groundwater storage using the 90th and 10th percentiles of the dataset as upper and lower values, respectively, with methodologies inspired by Huang et al. (2015), Shilengwe et al. (2023) and Deng et al. (2022); thereafter, we identified periods that fell within or outside these threshold values (Table 6). According to the data, there was a considerable decrease in groundwater recharge during El Niño years (2015, 2016, 2018 and 2019) because the cumulative rainfall fell below the 10th percentile threshold. These periods are marked by low groundwater storage and low rainfall, suggesting that groundwater recharge is highly sensitive to lower rainfall during these anomalous years (Leasor et al. 2020).
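The percentile thresholding described above can be sketched in a few lines. The linear-interpolation percentile rule, the labels and the seasonal totals below are assumptions for illustration; the values are made up and are not the thresholds in Table 6.

```python
def percentile(values, q):
    """Percentile with linear interpolation between order statistics (q in 0..100)."""
    ordered = sorted(values)
    pos = (len(ordered) - 1) * q / 100.0
    lower = int(pos)
    upper = min(lower + 1, len(ordered) - 1)
    frac = pos - lower
    return ordered[lower] + frac * (ordered[upper] - ordered[lower])


def classify_periods(value_by_period, low_q=10, high_q=90):
    """Label each period as below, within or above the percentile thresholds."""
    values = list(value_by_period.values())
    lower = percentile(values, low_q)
    upper = percentile(values, high_q)
    labels = {}
    for period, value in value_by_period.items():
        if value < lower:
            labels[period] = "below"
        elif value > upper:
            labels[period] = "above"
        else:
            labels[period] = "within"
    return lower, upper, labels


# Made-up wet-season rainfall totals (mm) for eleven seasons.
rain = {2009 + i: 100.0 * (i + 1) for i in range(11)}
low, high, labels = classify_periods(rain)
print(low, high)
print(labels)
```

The same function could be applied to a GWS series to obtain the storage thresholds that accompany the rainfall thresholds.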

Table 6 Groundwater storage thresholds inferred by rainfall availability where the upper threshold is determined by the 90th percentile and the lower threshold by the 10th percentile

For example, the XGBoost dataset confirmed that rainfall during El Niño years was below 330.56 mm, which corresponded to GWS values below 62.76 billion m³. These observations are similar to those of studies on reservoir conditions during anomalous years reported by Mathivha et al. (2024). The majority of the remaining intervals were within the threshold range, indicating typical circumstances for recharge. Rainfall totals surpassing 438.78 mm (XGBoost) were recorded in high-rainfall years such as 2010 and 2012, which resulted in GWS levels reaching 77.16 billion m³ (XGBoost). The RF dataset also revealed that rainfall and GWS were below the 10th percentile threshold during El Niño years. In 2015 and 2016, for example, rainfall was frequently less than 331.23 mm, and the associated GWS values were less than 63.20 billion m³ (RF). These results are in agreement with the findings of Kolusu et al. (2019).

Additionally, the analysis revealed several outlier periods where the GWS and rainfall were significantly outside the normal thresholds. In 2010 and 2012, there were exceptionally high rainfall and GWS values across all the datasets. The GLDAS-GRACE GWS exceeded 78.49 billion m³, the XGBoost GWS exceeded 77.16 billion m³, and the RF GWS exceeded its threshold of 77.78 billion m³, which is consistent with the findings of Xulu et al. (2020). The corresponding rainfall values also surpassed the upper thresholds of 441.87 mm (GLDAS-GRACE GWS), 438.78 mm (XGBoost) and 440.45 mm (RF). These storage variations indicate that storage carried over from the previous hydrological season influences how much storage increases in the following season, depending on the rainfall received.

Spatiotemporal characterisation of GWS

GWS fluctuations from 2009 to 2020 were analysed spatiotemporally, and the results revealed important patterns and trends throughout the study region. The largest reductions in GWS were observed in the western portion of the catchment, as depicted in Fig. 7. This trend was also visible upstream in Lukulu, where significant decreases in GWS were observed. Similar declines in GWS were observed in the most populous regions of the study area, Mongu and Senanga. There were some increases in groundwater storage in the middle portion of the basin, which is sparsely populated, suggesting localised groundwater recharge (Oiro et al. 2020). The south-western part of the catchment also experienced reductions in groundwater, although these reductions were not as severe as those on the western side.

A significant decrease in the quantity of groundwater was observed during the run-up to and following the El Niño events of 2015–2016 and 2018–2019, mainly on the western side of the catchment. These decreases may have been caused by a combination of groundwater depletion caused by climate change and interactions between surface water and groundwater (Ndehedehe et al. 2023). This indicates that these climatic events combined with human activity have a substantial negative effect on groundwater levels, as was noted in the research by Bierkens and Wada (2019). With values ranging from −400 mm to +36 mm, the resultant map shown in Fig. 7 represents areas of significant groundwater depletion and marginal gains in storage.

Fig. 7

The multiyear mean groundwater storage change detection map from 2009 to 2020 of the random forest GWS, where (a) is the Lukulu district hospital in Lukulu, (b) is the DWRD in Mongu and (c) is the Lukanda primary school in Senanga

Long-term trend of GWS

Distinct trends in groundwater storage from January 2009 to December 2020 were identified from the trend analysis of the RF, XGBoost and GLDAS-GRACE datasets (Figs. 8 and 9). For all three datasets, the analysis showed a period of significant groundwater recharge beginning in mid-2009 and peaking around mid-2011, indicating an early increase in groundwater storage. However, after this peak, a clear downward trend started in 2012 and persisted until the end of 2016. Consistent with the research of Hellwig and Stahl (2018) and Gong et al. (2015), this period coincides with decreasing rainfall and may have been influenced by El Niño episodes, resulting in drier conditions and decreased groundwater recharge.

Groundwater storage showed noticeable stabilisation and some recovery starting in 2017; however, it did not reach the levels noted in the first half of the study period. Throughout the course of the study, the general trend showed a net decrease in groundwater storage throughout the Barotse catchment, as highlighted in Fig. 9. While the downscaled XGBoost and RF GWS datasets revealed more localised patterns, with RF demonstrating greater consistency and reliability in accurately capturing the trends, the GLDAS-GRACE GWS data exhibited the most variability, reflecting its broader geographical scope. Understanding these trends is essential for comprehending long-term shifts in groundwater storage.
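A long-term trend of the kind discussed here can be approximated with a centred 2 × 12 moving average, the smoother used in classical seasonal decomposition. This is a deliberately simpler substitute for the STL decomposition shown in Fig. 8, offered only as a sketch; the series is synthetic and the function name is an assumption.

```python
def centred_trend(series, period=12):
    """Centred 2 x period moving average for an even period.

    Positions where the window does not fit are returned as None.
    """
    half = period // 2
    n = len(series)
    trend = [None] * n
    for t in range(half, n - half):
        window = series[t - half : t + half + 1]
        # The end points get half weight so that every month counts equally.
        total = 0.5 * window[0] + sum(window[1:-1]) + 0.5 * window[-1]
        trend[t] = total / period
    return trend


# Synthetic monthly series: a linear rise plus a repeating seasonal cycle.
pattern = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 5.0, 4.0, 3.0, 2.0, 1.0]
series = [0.5 * t + pattern[t % 12] for t in range(36)]

trend = centred_trend(series)
print(trend[6], trend[29])
```

Because the window spans exactly one seasonal cycle, the seasonal component averages out and only the slowly varying part remains, at the cost of losing six months at each end of the record.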

Fig. 8

Trend component from STL decomposition for the GLDAS-GRACE GWS, XGBoost GWS and RF GWS models on the primary vertical axis (01-2009 to 12-2020)

Fig. 9

The spatiotemporal variation in the RF GWS dataset, as represented by the yearly means of the GWS from 2009 to 2020, depicting the long-term changes

Conclusions

In this study, using a novel machine learning approach, we have shown that the GLDAS-GRACE GWSA can be downscaled from 27 km to a finer 5 km resolution. Hydrometeorological data, primarily obtained from remote sensing, were used in the downscaling. The RF estimates performed better than those of XGBoost. Based on the downscaled GWSA, this study identified a climatic threshold of cumulative rainfall below 330 mm over the rainy season, under which groundwater recharge decreases considerably. Rainfall frequently dropped below this lower threshold during El Niño years (2015, 2016, 2018 and 2019), resulting in drastically reduced groundwater recharge. Spatially, the changes in groundwater storage varied, indicating that they are potentially controlled by rainfall and aquifer recharge properties. A change detection analysis conducted from 2009 to 2020 revealed changes in GWS anomalies ranging from −400 mm to +36 mm, indicating a net reduction in groundwater in the Barotse region. In conclusion, this study highlights the usefulness of machine learning models for downscaling GRACE/GRACE-FO GWS data, the significance of choosing suitable input variables and the crucial role that the determined climatic thresholds play in groundwater recharge. Expected increases in drought severity will likely increase aquifer vulnerability to droughts, as aquifers’ recovery time may increase. Climate change adaptation strategies must therefore recognise that less groundwater will be available to supplement the surface water supply during drought conditions.

Data availability

Datasets used and analysed during the study are described in the dataset section, where links for accessing these datasets are also provided. For any additional data produced in the study, interested parties may contact the corresponding author for access.

Funding

This work was supported by the SASSCAL 2.0 project Tipping Points Explained by Climate Change (TIPPECC), which is supported by the German Federal Ministry of Education and Research.

Author information

Contributions

C. S. conceived the ideas, designed the methodology, collected, analysed and visualised the data, and led the writing of the manuscript. K. B. co-developed the methodology, designed and interpreted the study, supervised the work and sourced the funding. I. N. supervised the study and sourced the funding. All authors contributed critically to the drafts and gave final approval for publication.

Corresponding author

Correspondence to Christopher Shilengwe.

Ethics declarations

Conflict of interest

The authors declare no conflicts of interest.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Electronic supplementary material

Below is the link to the electronic supplementary material.

Supplementary Material 1

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.

About this article

Cite this article

Shilengwe, C., Banda, K. & Nyambe, I. Machine learning downscaling of GRACE/GRACE-FO data to capture spatial-temporal drought effects on groundwater storage at a local scale under data-scarcity. Environ Syst Res 13, 38 (2024). https://doi.org/10.1186/s40068-024-00368-1

  • DOI: https://doi.org/10.1186/s40068-024-00368-1
