Evaluation of reanalysis and global meteorological products in Beas river basin of North-Western Himalaya

Obtaining reliable gridded meteorological data is a great challenge in data-scarce and topographically complex territories such as the Himalayan region. Sparse raingauge observations are unable to represent rainfall variability in the Beas river basin of North-Western Himalaya. In this study four reanalyses (MERRA, ERA-Interim, JRA-55 and CFSR) and one global meteorological forcing dataset (WFDEI) are evaluated for their potential to represent the orographic rainfall pattern of the Beas river basin using a hydrology model. The modeled climate data have been compared with observed climate data on a long-term basis. A comparison of various rainfall and temperature products helps to determine uniformity and disparity between the estimates. Results show that all temperature data are in good agreement with gridded observed data. ERA-Interim temperature data perform better in terms of bias, RMSE (Root Mean Square Error) and correlation than the other data. On the other hand, MERRA, ERA-Interim and JRA-55 overestimate rainfall, whereas CFSR and WFDEI underestimate rainfall relative to the measured values. Variable Infiltration Capacity (VIC), a macroscale distributed hydrology model, has been successfully applied to indirectly assess how well the five gridded meteorological datasets represent the rainfall pattern of the Beas river basin. The simulation results of the VIC hydrology model forced by these data reveal that the discharge simulated with ERA-Interim is in good agreement with observed streamflow. In contrast, streamflow is overestimated with the MERRA reanalysis, while JRA-55, WFDEI and CFSR data underestimate the streamflow. The reanalysis products are also poor in capturing the seasonal hydrograph pattern. The ERA-Interim product better represents orographic rainfall for the Beas river basin. The reason may be that ERA-Interim uses a four-dimensional variational analysis scheme during assimilation. 
The major drawback of MERRA is the non-inclusion of observed precipitation data during assimilation and modeling error. The poor performance of JRA-55, CFSR and WFDEI is due to the gauge rainfall data assimilation error. This research finding will help for broader research on hydrology and meteorology of the Himalayan region.


Background
Rainfall and temperature data are considered as a significant input for water resource management and hydrological processes of the Himalayan river basin. The high altitude precipitation is mainly dependent on orography. The other factors that control the variation of precipitation are space, time and altitude. The association of orography with broad atmospheric circulation system, zonal climate process and rate of local evapotranspiration control the pattern of distribution and variability of mountain precipitation (Nesbitt and Anders, 2009). Therefore it is necessary to evaluate the precipitation estimates to understand the spatio-temporal distribution of mountain precipitation. Several studies (Bhattacharya et al. 2019;Tiwari et al. 2018) have reported the advantage of using reanalysis temperature products for snowmelt modeling and simulation of streamflow in high altitude rugged terrain where observation networks are inaccessible. A comparison of various reanalysis temperature estimates with observation is needed to understand the variability of temperature with altitude and to estimate suitable gridded temperature data as a proxy of observation stations for data-limited mountain regions. Ledesma and Futter (2017) have reported that the observed air temperature from a station is more realistic than rainfall. The spatial variation and error in the station air temperature are less as compared to precipitation. For the Himalayan river basin the major challenges are less spatial coverage of raingauge data, difficulty in data collection and missing data. This will reduce the capability of raingauge stations to accurately capture the spatiotemporal variability of rainfall (Liu and Zipser 2014;Palazzi et al. 2013). Due to data scarcity the management and assessment of water resources are much needed for remote regions (Buytaert et al. 2012). 
In regions where orography is complex and human settlement is sparse, regular grids created from reanalysis and satellite retrievals are needed to fill the lack of observations in ungauged basins (Bai and Liu 2018). Many studies have suggested that higher-frequency events are better captured by climate data of high spatial resolution (Ward et al. 2011; Fuka et al. 2014). The performance of satellite precipitation products in mountain regions depends on complex topography, change of elevation, snow cover and seasonality. Errors in the quantification of satellite precipitation events may arise from sampling, algorithms and instruments. Satellite rainfall data are also limited by their short length of record (Derin and Yilmaz 2014). To address these challenges in data-scarce basins, high-resolution global reanalysis data have been widely used for hydrology models around the world (Zhao et al. 2010). Global forcing data developed by bias-correcting reanalysis data against observations are also preferred nowadays for hydrological studies in mountain regions. Reanalysis products are gridded data at different spatial and temporal scales that represent the state of the atmosphere using the output of numerical atmospheric models, different data assimilation techniques and multiple observed datasets for multiple variables (humidity, temperature, solar radiation etc.) (Dee et al. 2011; Chen and Liu 2016). The climate reanalysis procedure combines the model results with observations on regular grids. Reanalysis data are available for almost every region of the earth and on a long-term basis (Caroletti et al. 2019). Additionally, reanalysis products are not limited by topography and provide high-resolution precipitation at a quasi-global scale. The grid point distance of the reanalysis data is quasi-uniform. 
Therefore, the reanalysis estimates can be used to investigate the effect of rainfall spatial variability on streamflow in mountain areas and provide long-term records (Lobligeois et al. 2014; Zhao et al. 2013). The individual performance of different reanalysis products depends on the assimilation of different portions of input observations, model physics, observing techniques, data assimilation schemes, available observations and resolutions (Lin et al. 2014; Haylock et al. 2008; Shea et al. 1994; Bao and Zhang 2013). As a result, the applicability of the reanalysis products differs by region and evaluation plan (Essou et al. 2016a, b). The performance of different reanalysis data at regional and global scales has been assessed in many studies. These studies reveal that the data are useful at large scale but show considerable variability at the regional scale. For example, Janowiak et al. (1998) found good agreement between the National Centers for Environmental Prediction (NCEP)-National Center for Atmospheric Research (NCAR) reanalysis and the Global Precipitation Climatology Project (GPCP) raingauge-satellite combined data when compared at the global scale. However, these reanalysis data perform poorly on a regional scale. Lin et al. (2014) concluded that the seasonality of global Monsoon precipitation is correctly reproduced by MERRA (Modern-Era Retrospective Analysis for Research and Applications) and European Center for Medium-Range Weather Forecasts (ECMWF) ERA-Interim reanalysis data. Essou et al. (2016a, b) compared the output of a hydrology model driven by global and regional reanalysis data in the United States. The reanalysis data show potential to reproduce the interannual variability of rainfall except for subtropical and humid continental regions. According to Hodges et al. (2011), Climate Forecast System Reanalysis (CFSR), MERRA and ERA-Interim perform better in the Southern Hemisphere. 
It is therefore necessary to review the efficiency of different reanalysis estimates in a particular region, especially in mountain regions. Furthermore, the bias between reanalysis and observed rainfall in mountain regions is due to changing observation systems, low-elevation stations and gauge undercatch problems (Fujiwara et al. 2017; Rasmussen et al. 2012; Li 1995). Research reveals that reanalysis products are improving with the development of data assimilation methods, numerical modeling and increased computing power. An assessment of variability, trend and uncertainty is therefore needed before using reanalysis products in climate studies (Parker 2016). Precipitation is the most important driver variable for the representation of spatio-temporal processes by a distributed hydrology model (Thiemig et al. 2013). Inferior quality of temperature and rainfall data, caused by observation and data processing errors, can be responsible for poor model efficiency in generating streamflow. Nowadays hydrology models are used to evaluate the precipitation properties of a catchment by calibrating them to observed discharge. Because precipitation is highly variable and depends on the station network, discharge observations are also utilized to correct orographic precipitation in the elevation zones. Several studies have been conducted to evaluate precipitation estimates based on streamflow simulation within a hydrology modeling framework (Bai and Liu 2018; Sun et al. 2018; Li et al. 2015; Tong et al. 2014a, b; Mei et al. 2016). These studies assume that the error of rainfall products propagates into the simulated discharge. Many studies have even suggested that the best accessible evidence for catchment precipitation in a data-scarce basin is discharge, which is superior to meteorological observations (Duethmann et al. 2013; Henn et al. 2015; Sevruk and Mieglit 2002).
Structural error (incorrect description of processes), error generated by model parameters and input data error are the main reasons for modeling uncertainty. The uncertainties from various sources are a crucial challenge for hydrological simulation. Therefore the improvement of hydrology models is needed to improve model efficiency and reduce uncertainty (Beven 2006; Clark et al. 2011). An energy-balance based, distributed, high-resolution hydrology model analyzes the sensitivity of the hydrological cycle in snow- and glacier-fed river basins more precisely by following process-based physical rules. The grid-based configuration of such a model allows it to be coupled directly to land-surface schemes and high-resolution climate models. The advantages of using such a model over the widely used temperature-index and degree-day models are: (i) simulation of complex events like rain on snow, (ii) simulation of snowpack melting where temperature alone has no direct correlation with energy, (iii) representation of the different physical aspects of generating runoff and snow/glacier melt runoff, and (iv) description of the glacio-hydrological physical processes to reduce parameter uncertainty (Walter et al. 2005; Shrestha et al. 2015). Distributed models usually describe the spatial variability of subcatchment elements using a node-link structure instead of the spatial averaging (Zoppou 2000) used by lumped models to describe catchment behavior. One of the process-based distributed hydrology models is the Variable Infiltration Capacity (VIC) hydrology model. However, very few studies have used the VIC hydrology model to compare gridded rainfall datasets in mountainous regions (Yanto and Rajagopalan 2017; Tong et al. 2014a, b; Islam and Dery 2017). These studies show that the quality of simulated streamflow depends on input forcing, model set-up and model capability.
The Beas river basin is a topographically complex, mountainous, high-altitude and data-scarce Himalayan river basin. The elevation of the basin varies from 361 to 6188 m. The gauge and discharge stations are located at elevations ranging from 436 m at Pong dam to 904 m at Pandoh dam and 2050 m at Manali. About 21% of the Beas river basin area lies above 4800 m above sea level. At this elevation few or no weather stations exist. For this reason, reliable snowfall measurements from raingauges are scarce at this elevation. According to Kumar et al. (2007) no observation stations are located in the Eastern part of the basin. Therefore reliable gridded reanalysis meteorological data can be used as a proxy for observation stations in the hydro-climatic assessment of the Beas river basin. The reanalysis temperature data can also be used as an input for snowmelt modeling of the upper Beas, where 65% of the area is covered with snow during Winter (Singh and Jain 2002) and no observations exist. Very few studies have been conducted to evaluate the performance of observed, satellite and model-generated precipitation for the Beas river basin of North-Western Himalaya (Li et al. 2013; Li et al. 2017). They applied conceptual and temperature-index hydrology models using gridded precipitation data on a short-term basis. Most of the studies estimated streamflow for a single location, and they found that the simulated streamflow was underestimated compared to observed data. However, no studies have examined various high-resolution gridded reanalysis data with process-based distributed hydrology models to investigate their capability to represent precipitation patterns of the Beas river basin. The variability of reanalysis temperature and precipitation relative to observations has also not been evaluated in previous research. This study addresses these research gaps by assessing various gridded meteorological datasets. 
In this research a thorough assessment of five widely used reanalysis and global meteorological products [Modern-Era Retrospective Analysis for Research and Applications (MERRA), European Center for Medium-Range Weather Forecasts (ECMWF) ERA-Interim reanalysis, Japanese 55-year Reanalysis (JRA-55), Climate Forecast System Reanalysis (CFSR) and WATCH forcing data methodology applied to ERA-Interim (WFDEI)] is undertaken by directly comparing these products with observations and by evaluating these estimates with a hydrology model in the data-scarce Beas river basin. The modeling approach has also been used in this study to understand the variability of orographic precipitation in the Beas basin. Moreover, streamflow using reanalysis products has been estimated for several locations at different elevations to consider the effect of topography on discharge. The purpose of the study is to evaluate the quality of the reanalysis products for hydrology research using the process-based distributed Variable Infiltration Capacity (VIC) hydrology model, and to assess how well each dataset reproduces spatial rainfall patterns for the Beas river basin. One of the limitations of the study is the non-availability of point temperature data from weather stations. Therefore, the global monthly observed gridded Climate Research Unit (CRU) temperature data have been used for comparison purposes.

Study area
The area selected for the study is the Beas basin (up to Pong dam), which lies in the North-Western part of the Indian Himalayan Region (Fig. 1). It has an elevation of 4361 m (14,308 ft) and is situated at geographical co-ordinates 32° 21′ 59″-31° 16′ 09″ N and 77° 05′ 08″ E-74° 58′ 31″ E. The catchment area of the basin is 12,417 km². The snow-covered and glaciated portion of the basin in the upper reaches contributes meltwater to streamflow. The Winter season of the Beas river basin has an average maximum temperature of 14.1 °C and an average minimum of 0.22 °C. The average rainfall during April-June has been estimated to be 106.12 mm. During Summer, temperature varies from a maximum of 24.6 °C to a minimum of 8.9 °C, and average rainfall during this season is 86.83 mm. The Monsoon months (June-September) receive 70% of the annual rainfall. Severe snowfall occurs in the basin during Winter, whereas the basin gets small amounts of rain from October to November (Ahluwalia et al. 2015).

Data used
The hydro-meteorological data play a crucial role in computing streamflow, rainfall-runoff and the snow component of the Beas river basin. The hydro-meteorological data used as input for the hydrology model are daily maximum and minimum temperature, rainfall, wind speed and streamflow. Daily observed point rainfall and streamflow from 1990 to 2009 for raingauge stations are obtained from the Bhakra Beas Management Board, Himachal Pradesh. The raingauge stations are Banjar, Bhuntar, Janjehal, Larji, Manali, Pandoh, Pong and Sainj. The streamflow data for 1990-2009 are obtained for the Pong dam, Pandoh dam, Thalout and Manali. The spatial distribution of precipitation and temperature data over the Beas river basin during different seasons is shown in Figs. 2 and 3. Instead of using the large number of available data and surface flux layers, only 4 parameters at daily scale have been used. The algorithm developed by Maurer et al. (2002) has been used to calculate meteorological data such as vapor pressure, incoming shortwave radiation and net longwave radiation for the Variable Infiltration Capacity hydrology model. The reanalysis and global meteorological data used in this study are MERRA, ERA-Interim, JRA-55, CFSR and WFDEI. CFSR has a horizontal resolution of 38 km and spans the period from 1st January 1979 to the present day (Saha et al. 2014). CFSR has a 3D-variational analysis scheme of the upper-air atmospheric state with 64 vertical levels. The WFDEI forcing data (Weedon et al. 2014) are produced from WATCH forcing data and ERA-Interim reanalysis data. The procedure follows sequential interpolation to a 0.5° resolution. MERRA covers the satellite era (from 1979 to the present). MERRA is generated from the Goddard Earth Observing System Model, version 5.2.0 (GEOS-5.2.0) and a data assimilation system based on a three-dimensional variational approach (3DVAR). 
The Japan Meteorological Agency (JMA) conducted JRA-55 (Japanese 55-year reanalysis), the second Japanese global atmospheric reanalysis project. It covers 55 years, extending back to 1958. Compared to its predecessor, JRA-55 is based on a new Data Assimilation And Prediction System (DA) that improves many deficiencies found in the first Japanese reanalysis (Kobayashi et al. 2015). ERA-Interim is the latest global atmospheric reanalysis produced by the European Centre for Medium-Range Weather Forecasts (ECMWF) and covers the period from 1st January 1979 to the present day (Dee et al. 2011). MERRA and ERA-Interim have spatial resolutions of 0.5 × 0.67° (Rienecker et al. 2011) and 0.75 × 0.75° (Dee et al. 2011), respectively. JRA-55 data have a spatial resolution of 1.25 × 1.25°. CFSR and WFDEI are used at a grid spacing of 0.5 × 0.5°. All the temperature and rainfall data are interpolated to 0.5° by bilinear interpolation to ensure consistency among the datasets. Table 1 gives information on the various reanalysis and global meteorological data sources.
The spatial data used for the Variable Infiltration Capacity hydrology model are the Digital Elevation Model (DEM), LULC (Land use and land cover) and soil data. Elevation, basin and slope are derived from the ASTER DEM at 30 m resolution. Land use and land cover data (100 m) are obtained from the Oak Ridge National Laboratory (ORNL) Distributed Active Archive Center (DAAC) for the year 2005 (Roy et al. 2015). The LULC product comprises water body, evergreen broadleaf forest, deciduous broadleaf forest, mixed forest, wasteland, grassland, shrubland, plantation, cropland, built-up and snow-ice classes. The other vegetation properties are taken from the Global Land Data Assimilation System (GLDAS) vegetation parameter database. The soil map and soil information have been obtained from the National Bureau of Soil Survey and Land Use Planning (NBSS&LUP) at 1:250 000 scale. The elevation band parameter for the Variable Infiltration Capacity hydrology model is obtained from the DEM. The elevation bands

Methodology used
It is a great challenge for interpolated coarse-resolution reanalysis data to reproduce the spatial pattern of rainfall in mountain regions. Because observed raingauge data are unreliable, basin-scale validation of precipitation is less studied. In this research an attempt has been made to understand the ability of reanalysis data to act as a proxy for observations and to reproduce the spatio-temporal rainfall patterns of the Himalayan Beas river basin. Figure 6 presents the methodology flowchart of this research. The methods have been carried out in the following steps:
1. All the gridded reanalysis data, in NetCDF (Network Common Data Form) format, have been processed on a Linux platform, and the rainfall data for each station have been extracted by latitude and longitude into plain-text files for a specific time period.
2. The temperature and rainfall data from the ERA-Interim, MERRA and JRA-55 reanalysis estimates have been converted to 0.5-degree resolution by bilinear interpolation to compare reanalysis, global meteorological and raingauge data. The validation of temperature data has been done gridwise, whereas a point-to-pixel comparison has been done for modeled and observed rainfall data. The evaluation of temperature and precipitation data has been carried out at monthly and annual scales using various statistical indices.
3. The VIC hydrology model has been run using reanalysis and global meteorological data.
4. Simulated streamflow has been compared with observed discharge data through calibration and validation for five different raw and bias-corrected reanalysis products, similar to the research methodology of Bai and Liu (2018).
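The point-to-pixel step above relies on bilinear interpolation from a regular latitude-longitude grid to a station location. The following is a minimal sketch of that operation, assuming a strictly increasing regular grid held in NumPy arrays; the function name `bilinear_at_point`, the grid and the synthetic field are illustrative assumptions and not the processing code used in this study.

```python
import numpy as np


def bilinear_at_point(lats, lons, field, lat, lon):
    """Bilinearly interpolate a 2-D field (lat x lon) to a single point.

    lats and lons must be 1-D, strictly increasing, and bracket the point.
    """
    lats = np.asarray(lats, dtype=float)
    lons = np.asarray(lons, dtype=float)
    field = np.asarray(field, dtype=float)
    if not (lats[0] <= lat <= lats[-1] and lons[0] <= lon <= lons[-1]):
        raise ValueError("point lies outside the grid")
    # Index of the south-west corner of the grid cell enclosing the point.
    i = min(int(np.searchsorted(lats, lat, side="right")) - 1, len(lats) - 2)
    j = min(int(np.searchsorted(lons, lon, side="right")) - 1, len(lons) - 2)
    # Fractional position of the point inside the cell (0..1 in each direction).
    ty = (lat - lats[i]) / (lats[i + 1] - lats[i])
    tx = (lon - lons[j]) / (lons[j + 1] - lons[j])
    # Interpolate along longitude on both latitude rows, then along latitude.
    south = (1 - tx) * field[i, j] + tx * field[i, j + 1]
    north = (1 - tx) * field[i + 1, j] + tx * field[i + 1, j + 1]
    return float((1 - ty) * south + ty * north)


# Made-up 0.75-degree grid and a synthetic linear field (not data from this study).
grid_lats = np.arange(30.0, 33.01, 0.75)
grid_lons = np.arange(75.0, 78.01, 0.75)
synthetic = 2.0 * grid_lats[:, None] + 3.0 * grid_lons[None, :]
# Illustrative station coordinates inside the grid.
station_value = bilinear_at_point(grid_lats, grid_lons, synthetic, 32.24, 77.19)
```

Because the synthetic field is linear in latitude and longitude, the interpolated value matches the field evaluated directly at the point, which gives a quick sanity check before the routine is applied to real gridded rainfall.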

Data comparison: temperature and precipitation
Owing to their coarse spatial resolution and the assimilation of limited observations, the quality of reanalysis temperature and precipitation needs to be compared with observed climate data before they are applied in a hydrology model (Essou et al. 2016a, b). In the present study the mean annual cycles and monthly climate data are calculated and compared for individual stations of the Beas river basin. For each climatic region, bias, correlation and Root Mean Square Error (RMSE) are calculated between reanalysis, global meteorological data and observed data for the post-Monsoon (September-November) period. Bias is the difference, over a given period of time, between the modeled temperature or precipitation data and the observations. It quantifies the overestimation or underestimation of rainfall and temperature relative to observation: a positive bias indicates overestimation, a negative bias indicates underestimation, and a null bias indicates a perfect fit. The formula for bias is given in Eq. 1. RMSE is a measure of the deviation between the modeled and observed climate data (Eq. 2). The correlation coefficient (r) measures the strength of the relationship between the relative movements of observed and modeled forcing data. A correlation of -1.0 indicates a perfect negative correlation while a correlation of 1.0 indicates a perfect positive correlation (Eq. 3).
Here $P_i$ is the modeled data and $O_i$ is the observed data. In this study a grid-to-point comparison has been made to compare reanalysis and global meteorological precipitation data with raingauge data. Bilinear interpolation has been used for comparing gridded rainfall with ground-observed rainfall data at monthly scale for eight stations: Banjar, Bhuntar, Janjehal, Larji, Manali, Pong dam, Pandoh dam and Sainj.
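Equations 1-3 are not reproduced in this text, so the sketch below assumes the commonly used forms: mean bias as the average of P_i − O_i, RMSE as the square root of the mean squared difference, and Pearson's r. The sign convention (positive bias means overestimation) follows the description above; the function names and sample values are illustrative assumptions only.

```python
import numpy as np


def mean_bias(modeled, observed):
    """Mean of (P_i - O_i); positive values indicate overestimation."""
    p = np.asarray(modeled, dtype=float)
    o = np.asarray(observed, dtype=float)
    return float(np.mean(p - o))


def rmse(modeled, observed):
    """Root mean square error between modeled and observed series."""
    p = np.asarray(modeled, dtype=float)
    o = np.asarray(observed, dtype=float)
    return float(np.sqrt(np.mean((p - o) ** 2)))


def pearson_r(modeled, observed):
    """Pearson correlation coefficient between the two series."""
    p = np.asarray(modeled, dtype=float)
    o = np.asarray(observed, dtype=float)
    return float(np.corrcoef(p, o)[0, 1])


# Made-up monthly rainfall values in mm (not data from this study).
observed_mm = [10.0, 20.0, 30.0, 40.0]
modeled_mm = [12.0, 22.0, 32.0, 42.0]

example_bias = mean_bias(modeled_mm, observed_mm)
example_rmse = rmse(modeled_mm, observed_mm)
example_r = pearson_r(modeled_mm, observed_mm)
```

In this made-up case the modeled series is the observed series shifted upward by a constant, so the correlation is perfect while bias and RMSE are both non-zero. This is why the three indices are reported together rather than individually.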

Variable infiltration capacity hydrology model
The Variable Infiltration Capacity (VIC) hydrology model is a semi-distributed, grid-based macroscale hydrology model (Nijssen et al. 1997; Liang et al. 1994). The VIC hydrology model uses gridwise inputs of vegetation, snow and soil parameters, elevation and daily meteorological forcing. It considers the effect of vegetation, topography and soil at daily or sub-daily time steps. As a process-based hydrology model, VIC simulates surface runoff, evapotranspiration, baseflow, snowpack and other hydrologic processes. A large number of parameters are required to run the VIC hydrology model, i.e. vegetation, soil, elevation and meteorological forcing at each grid cell. Runoff in the Beas river basin is not only dependent on precipitation; topography, soil and land cover also have a significant impact on runoff. Finer-resolution inputs (topographic, land cover and forcing data) in distributed hydrology models can reduce simulation uncertainty (Haddeland et al. 2002). Therefore the VIC hydrology model has been implemented over the entire Beas river basin at a spatial resolution of 0.01 × 0.01°. The meteorological forcing (temperature and rainfall) data are converted from 0.5° to 0.01° resolution by interpolation. The soil, LULC and DEM data are also converted to 0.01° resolution by resampling. The mosaic scheme is one of … (Cherkauer and Lettenmaier 1999). The VIC hydrology model is also integrated with a glacier scheme. The glacier scheme can simulate glacier runoff (mm) from the glaciated area, including liquid precipitation and snow/glacier melt water. In this study the VIC model simulation has been done in energy balance mode to estimate snowmelt/glacier melt runoff.
Fig. 6 Flowchart of methodology used in this study
The selection of calibration parameters plays an important role in controlling the infiltration and baseflow factors that regulate the streamflow hydrograph. According to Nijssen et al. (1997) the parameters to be adjusted during calibration of the VIC hydrology model are the infiltration parameter (b_inf), the depths of the first and second soil layers (d1, d2), and three baseflow parameters (Ds, Ws, Dsmax). The parameter b_inf defines the shape of the variable infiltration capacity curve, and its range is taken as 0-0.4. An increase in b_inf enhances runoff production, whereas a decrease in b_inf reduces runoff. The soil thickness controls the soil moisture storage capacity. A thicker soil has a higher moisture storage capacity, less runoff, higher evapotranspiration and higher baseflow. The amount of water available for transpiration and baseflow is controlled by the thickness of the bottom soil layer (d2). The thicknesses of the first and bottom soil layers range from 0.05 to 0.25 m and from 0.3 to 1.50 m, respectively. The maximum baseflow from the lowest soil layer (Dsmax) ranges from 0 to 30 depending on the hydraulic conductivity of the soil. Ds is the fraction of Dsmax at which the rapidly increasing nonlinear baseflow starts. The value of Ds ranges from 0 to 1, and a higher value of Ds produces higher baseflow. Ws is the fraction of the maximum soil moisture of the lowest soil layer. A higher value of Ws tends to delay the peak runoff. Calibration and validation have been conducted for each reanalysis and global meteorological dataset. The magnitudes of the calibration parameters for the Pandoh dam and Thalout are shown in Figs. 7 and 8. The ranges of the parameters vary for different rainfall data.
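The effect of b_inf on runoff can be illustrated with a small numerical sketch. It assumes the variable infiltration capacity curve in the form commonly attributed to Liang et al. (1994), i = i_m [1 − (1 − A)^(1/b_inf)] with i_m = (1 + b_inf) W_max; these equations are not given in this text, so the formulation, the function name `vic_direct_runoff` and all input values are assumptions made for illustration and do not reproduce the VIC source code.

```python
def vic_direct_runoff(precip, w0, w_max, b_inf):
    """Direct runoff from the upper soil storage for one time step.

    precip : precipitation reaching the soil (mm)
    w0     : current soil moisture storage (mm)
    w_max  : maximum soil moisture storage (mm)
    b_inf  : shape parameter of the infiltration capacity curve
    """
    # Maximum point infiltration capacity implied by the curve.
    i_max = (1.0 + b_inf) * w_max
    # Point infiltration capacity corresponding to the current storage w0.
    i0 = i_max * (1.0 - (1.0 - w0 / w_max) ** (1.0 / (1.0 + b_inf)))
    if i0 + precip >= i_max:
        # The whole cell saturates: everything beyond the free storage runs off.
        return precip - w_max + w0
    # Partially saturated cell: part of the precipitation infiltrates.
    return (precip - w_max + w0
            + w_max * (1.0 - (i0 + precip) / i_max) ** (1.0 + b_inf))


# Hypothetical inputs: 10 mm of rain on a half-full 100 mm soil storage.
runoff_low_b = vic_direct_runoff(10.0, 50.0, 100.0, 0.1)
runoff_high_b = vic_direct_runoff(10.0, 50.0, 100.0, 0.4)
# A fully saturated cell sheds all of the incoming precipitation.
runoff_saturated = vic_direct_runoff(10.0, 100.0, 100.0, 0.2)
```

Under these assumptions the larger b_inf yields more direct runoff for the same rainfall and storage, which agrees with the behaviour described in the calibration paragraph above.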
The agreement of simulated and observed streamflow during calibration and validation is judged by statistical indices: NSE (Nash Sutcliffe Efficiency), the Coefficient of Determination (R²), Root Mean Square Error (RMSE) and PBIAS (Percentage Bias). R² is the squared ratio of the covariance to the product of the standard deviations of the observed and modeled data. R² ranges between 0 and 1 and indicates how much of the observed dispersion is explained by the prediction. The Nash Sutcliffe efficiency varies between -∞ and 1 (perfect fit) (Moriasi et al. 2007). PBIAS indicates the tendency of the simulated data to overestimate or underestimate the observed values (Gupta et al. 1999). The RMSE indicates the match between observed and modeled data, with a perfect value of 0; the lower the RMSE, the better the model performance (Moriasi et al. 2007).
Here $\bar{Q}_{obs}$ and $\bar{Q}_{sim}$ are the average observed and simulated discharge, $Q_{obs}(t)$ and $Q_{sim}(t)$ are the observed and simulated discharge at time $t$, and $N$ is the number of observations.
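The equations for these indices are not reproduced in this text. The sketch below assumes the standard definitions of NSE, R² and RMSE, and assumes PBIAS is computed as 100 × Σ(Q_sim − Q_obs) / Σ Q_obs, a sign convention chosen because negative PBIAS values accompany underestimated streamflow in the results reported here. Function names and discharge values are illustrative assumptions.

```python
import numpy as np


def nse(sim, obs):
    """Nash Sutcliffe Efficiency: 1 is a perfect fit, values can be negative."""
    sim = np.asarray(sim, dtype=float)
    obs = np.asarray(obs, dtype=float)
    return float(1.0 - np.sum((obs - sim) ** 2) / np.sum((obs - obs.mean()) ** 2))


def pbias(sim, obs):
    """Percentage bias; negative means underestimation (assumed convention)."""
    sim = np.asarray(sim, dtype=float)
    obs = np.asarray(obs, dtype=float)
    return float(100.0 * np.sum(sim - obs) / np.sum(obs))


def r_squared(sim, obs):
    """Squared Pearson correlation between simulated and observed series."""
    return float(np.corrcoef(np.asarray(sim, float), np.asarray(obs, float))[0, 1] ** 2)


def rmse(sim, obs):
    """Root mean square error between simulated and observed series."""
    sim = np.asarray(sim, dtype=float)
    obs = np.asarray(obs, dtype=float)
    return float(np.sqrt(np.mean((sim - obs) ** 2)))


# Made-up discharge series: the simulation is exactly half of the observation.
q_obs = np.array([10.0, 20.0, 30.0, 40.0])
q_sim = 0.5 * q_obs

scores = {
    "NSE": nse(q_sim, q_obs),
    "PBIAS": pbias(q_sim, q_obs),
    "R2": r_squared(q_sim, q_obs),
    "RMSE": rmse(q_sim, q_obs),
}
```

For this made-up series R² equals 1 even though the simulation systematically underpredicts by half, while NSE is negative and PBIAS is −50%. The example shows why R² is best read alongside NSE and PBIAS rather than on its own.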

Comparison of modeled and observed temperature
The 2-m average maximum, minimum and mean temperatures for the Beas river basin are compared for ERA-Interim, JRA-55, MERRA, CFSR and WFDEI. All the rainfall products are interpolated to 0.5-degree resolution. Figure 9 shows the monthly maximum, minimum and mean temperature of each reanalysis and global meteorological dataset. Figure 10 shows the seasonal mean bias of all climatic data relative to monthly average CRU temperature data for all grids. All temperature data show a small bias relative to observation. ERA-Interim and CFSR have a very small bias compared to the other data.
The seasonal and annual spatial distributions of the mean temperature biases are presented in Figs. 11 and 12. For MERRA the bias ranges from −5.08 to 2.73 °C for all seasons. MERRA has a warm bias in the Western and North-Western parts of the Beas river basin (> 2 °C during Winter) and a cool bias in other regions. ERA-Interim has a cool bias both seasonally and annually. The bias of ERA-Interim temperature for all seasons is between −6.71 and 0.16 °C. Except for the Summer season, the Western and North-Western part of the river basin has a warm bias. The bias distribution for CFSR for all seasons is −9.39 to 2.45 °C. WFDEI agrees well with observed temperature (−0.5 to 0.5 °C) for all seasons. During the Winter season a warm bias of WFDEI exists for the whole river basin. During the Summer and post-Monsoon seasons the whole basin experiences a cool bias for WFDEI (−0.54 to −0.05 °C). Figure 11 presents the spatial distribution of monthly temperature bias, and Fig. 12 shows the annual temperature bias of the Beas river basin. For JRA-55 the middle portion of the Beas basin has a warm bias (0.50-1.40 °C) annually. The Eastern and North-Eastern portion of the basin has a cool bias for MERRA, JRA-55 and CFSR. For ERA-Interim a cool bias is observed for the whole basin (−5.33 to −0.9 °C). WFDEI has a warm bias for the entire basin. The biases vary between seasons and between reanalysis datasets. Figure 12 also shows the correlations between the annual temperature of the various datasets for the period 1990-2009 and observations. The correlation of MERRA with observation is > 0.50 for the whole basin. The spatial pattern of correlation in the Western, North-Western and middle portions of the basin is similar for both ERA-Interim and JRA-55 reanalysis data (> 0.70). ERA-Interim has a higher correlation (0.50-0.80) for the whole basin except a small portion of the Southern basin (−0.44). 
JRA-55 and CFSR have a low correlation in the North-Eastern part of the basin (< 0.30). However, JRA-55, CFSR and WFDEI have correlations of 0.50-0.77, 0.50-0.67 and 0.52-0.77 respectively for the whole basin.
The RMSE of WFDEI (Fig. 12) is lower than that of the other reanalysis products (< 0.50 °C). For CFSR the RMSE is higher in the Eastern and North-Eastern parts of the Beas river basin (4.34-8.96 °C). The RMSE of JRA-55 is higher in the North-Western part of the basin (5.79-6.90 °C). In other parts of the basin the RMSE of JRA-55 is between 0.48 and 3.09 °C. ERA-Interim has an RMSE of 3.61-5.07 °C in the North-East and Mid-East parts of the Beas river basin. A lower ERA-Interim RMSE of 0.98-2.67 °C is observed in other portions of the basin. MERRA has RMSE < 5 °C for the entire basin.

Comparison of modeled and observed rainfall products
MERRA, ERA-Interim, JRA-55, CFSR and WFDEI rainfall data have been compared with observed rainfall data for the raingauge stations Banjar, Bhuntar, Janjehal, Larji, Pandoh, Pong and Sainj. Figure 13 presents the average monthly rainfall variation for these five climatic products for the Beas river basin. MERRA, ERA-Interim, JRA-55, CFSR and WFDEI rainfall varies from 112.05-1307.19, … A comparison of mean monthly rainfall is presented in Fig. 14. All the rainfall products show a seasonal variation at all stations. However, MERRA, followed by ERA-Interim and JRA-55, strongly overestimates observed rainfall at all stations. According to Fig. 16 the three rainfall products MERRA, ERA-Interim and JRA-55 overestimate rainfall in the dry season, because there is a high positive bias for these three rainfall estimates during December-February (Winter) and March-May (Summer). The poor ability of reanalysis data to capture Summer convective precipitation, owing to its spatial complexity, is likely the main reason for the overestimated Summer precipitation. The overestimated Winter precipitation, in contrast, is due to mismeasurement of snowfall by raingauges compared to liquid precipitation (Rasmussen et al. 2012; Goodison et al. 1998), or possibly to non-raining clouds associated with warm tropical convective systems (Ashouri et al. 2015). Hence the Monsoon season generates a substantial bias. According to Bosilovich et al. (2008) the Monsoon season precipitation bias is likely due to overestimated moisture content and precipitable water generated by the observation system. The other seasons have a relatively low bias, except for MERRA at Banjar, Bhuntar, Larji and Sainj. However, the MERRA rainfall bias is abnormally high at all stations compared to the other rainfall products in all seasons.
A mean annual comparison of modeled data with observed point rainfall data has been carried out using statistical measures to quantify their performance. Pearson's correlation coefficient is used to evaluate how well the estimates correspond to the observed data. Table 2 shows the monthly statistical indicators used to assess the performance of the reanalysis products for eight raingauge stations. CFSR and WFDEI have a very good correlation coefficient, ranging from 0.80 to 0.86 (except WFDEI at Banjar and CFSR at Pandoh (0.65)), compared with MERRA, ERA-Interim and JRA-55. Next to CFSR and WFDEI, ERA-Interim and JRA-55 have a moderate R² value of 0.65-0.73, except for ERA-Interim at Sainj (0.60). MERRA shows a lower R² value of 0.60-0.65. All the reanalysis products correlate with the observed point rainfall data.
A higher bias and RMSE are observed for MERRA rainfall at the monthly scale than for the other rainfall estimates at all stations. Next to MERRA, ERA-Interim and JRA-55 also show a higher RMSE at all stations. The CFSR and WFDEI data have a low RMSE (15-50 mm month⁻¹) compared with the other products. ERA-Interim precipitation shows the largest overestimation after MERRA, followed by JRA-55. In contrast, CFSR and WFDEI precipitation underestimate the observations (negative bias).
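The three indicators used in this comparison (bias, RMSE and Pearson's correlation coefficient) can be sketched as follows. This is a minimal illustration in plain Python, not the procedure used to produce Table 2. The product names `wet_product` and `dry_product` and all rainfall values are hypothetical.

```python
import math


def mean_bias(sim, obs):
    """Mean of (sim - obs); positive values indicate overestimation."""
    return sum(s - o for s, o in zip(sim, obs)) / len(obs)


def rmse(sim, obs):
    """Root mean square error, in the units of the input series."""
    return math.sqrt(sum((s - o) ** 2 for s, o in zip(sim, obs)) / len(obs))


def pearson_r(sim, obs):
    """Pearson's correlation coefficient between the two series."""
    n = len(obs)
    mean_s, mean_o = sum(sim) / n, sum(obs) / n
    cov = sum((s - mean_s) * (o - mean_o) for s, o in zip(sim, obs))
    var_s = sum((s - mean_s) ** 2 for s in sim)
    var_o = sum((o - mean_o) ** 2 for o in obs)
    return cov / math.sqrt(var_s * var_o)


# Hypothetical monthly rainfall at one station (mm/month).
observed = [55, 65, 85, 45, 70, 130, 310, 270, 140, 35, 15, 25]
products = {
    "wet_product": [95, 110, 130, 85, 100, 170, 380, 350, 190, 60, 35, 55],
    "dry_product": [40, 50, 55, 30, 45, 90, 220, 200, 100, 30, 10, 20],
}

for name, series in products.items():
    print(
        f"{name}: bias={mean_bias(series, observed):+.1f} mm/month, "
        f"RMSE={rmse(series, observed):.1f} mm/month, "
        f"r={pearson_r(series, observed):.2f}"
    )
```

A product can correlate well with the observations and still carry a large bias and RMSE, which is why the three indicators are reported together.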

Simulation of monthly streamflow
In this study the Variable Infiltration Capacity (VIC) hydrology model has been used to compare streamflow simulated from observed, reanalysis and global meteorological data, after the overestimation or underestimation of the reanalysis products relative to observations had been established. The ERA-Interim temperature data has been used as meteorological input, along with rainfall data from the different reanalysis products, for the VIC hydrology model. The calibration periods for the four streamflow stations are 1994-1999, 1993-1997, 1992-1997 and 1994-1999, whereas the validation periods for the same stations are 2003-2009, 1999-2005, 2000-2009 and 2003-2009, respectively. An NSE of 0.6 is taken as the acceptable value (Essou et al. 2016a, b). According to Moriasi et al. (2007), model performance is very good when |PBIAS| < 10%, good when 10% ≤ |PBIAS| < 15%, satisfactory when 15% ≤ |PBIAS| < 25% and unsatisfactory when |PBIAS| ≥ 25%. If R² alone is used as the model evaluation criterion, the major drawback is that only the dispersion is quantified. Even if the model systematically underpredicts or overpredicts, the R² value can still be very good and close to 1 (Krause et al. 2005). … underestimates observed streamflow for the entire simulation period. The reanalysis is also unable to properly follow the observed hydrograph pattern. The reason may be that the dataset fails to reproduce the spatial pattern of precipitation for the Beas river basin. However, it tends to follow the low-flow pattern of the observed hydrograph with slight underestimation. For Pandoh dam, a slight overestimation of low flow is observed for JRA-55. The performance of JRA-55 is also not acceptable in terms of NSE (0.10-0.44) and PBIAS (−50.00 to −11.00%). CFSR and WFDEI heavily underestimate the streamflow (peak and low flow) over the simulation. The model performance using these products is not good because of the poor quality of the rainfall estimates. The NSE and PBIAS values are inferior for these two products when compared with observed streamflow. The NSE and PBIAS for CFSR range from −0.65 to 0.13 and from −80.87 to −56.74%, whereas for WFDEI they vary from −1.16 to −0.17 and from −93.75 to −82.00%. JRA-55, CFSR and WFDEI also have less ability to produce peak flow during the Monsoon. Figure 15 also indicates that the JRA-55, CFSR and WFDEI data have poor seasonal cycles and miss most of the peaks during the Monsoon season. A good performance of the ERA-Interim dataset is observed for the study basin over the whole simulation period. The NSE (0.73-0.77) and PBIAS (−13.68 to 18.00%) values indicate a good match between observed and simulated flow using the ERA-Interim data. Moreover, the RMSE of ERA-Interim is lower than that of the other reanalysis products. Additionally, the ERA-Interim data follow the hydrograph pattern properly (high flow and low flow). ERA-Interim is also able to simulate the high peaks of streamflow at all stations better than the other rainfall products.
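The evaluation criteria used in this section can be sketched in a few lines. The sketch below computes NSE, PBIAS and R² for a streamflow series and rates PBIAS by magnitude using the Moriasi et al. (2007) classes. It assumes a sign convention in which negative PBIAS means underestimation, since that matches the values reported here; Moriasi et al. (2007) define PBIAS with the opposite sign, but the rating depends only on the magnitude. The function names and flow values are hypothetical.

```python
import math


def nse(sim, obs):
    """Nash-Sutcliffe efficiency: 1 is a perfect fit, <= 0 is no better than the observed mean."""
    mean_obs = sum(obs) / len(obs)
    num = sum((o - s) ** 2 for s, o in zip(sim, obs))
    den = sum((o - mean_obs) ** 2 for o in obs)
    return 1.0 - num / den


def pbias(sim, obs):
    """Percent bias; negative when simulated flow underestimates observed flow
    (assumed sign convention, chosen to match the reported results)."""
    return 100.0 * sum(s - o for s, o in zip(sim, obs)) / sum(obs)


def r_squared(sim, obs):
    """Squared Pearson correlation; quantifies dispersion only."""
    n = len(obs)
    mean_s, mean_o = sum(sim) / n, sum(obs) / n
    cov = sum((s - mean_s) * (o - mean_o) for s, o in zip(sim, obs))
    var_s = sum((s - mean_s) ** 2 for s in sim)
    var_o = sum((o - mean_o) ** 2 for o in obs)
    return (cov / math.sqrt(var_s * var_o)) ** 2


def rate_pbias(value):
    """Performance class by |PBIAS| (percent), after Moriasi et al. (2007)."""
    magnitude = abs(value)
    if magnitude < 10:
        return "very good"
    if magnitude < 15:
        return "good"
    if magnitude < 25:
        return "satisfactory"
    return "unsatisfactory"


# Hypothetical monthly streamflow (m3/s) and a systematically low simulation.
observed_flow = [20, 25, 40, 80, 150, 300, 420, 380, 200, 90, 40, 25]
sim_low = [0.3 * q for q in observed_flow]

print(f"R2    = {r_squared(sim_low, observed_flow):.2f}")
print(f"NSE   = {nse(sim_low, observed_flow):.2f}")
print(f"PBIAS = {pbias(sim_low, observed_flow):.1f}% ({rate_pbias(pbias(sim_low, observed_flow))})")
```

For the scaled series, R² stays at 1 although the simulation captures only 30% of the observed flow, while NSE and PBIAS both flag the mismatch. This is the drawback of using R² alone that Krause et al. (2005) describe.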

Discussions
Temperature and precipitation from four reanalyses and one global meteorological dataset are evaluated to examine their potential as a substitute for observations. The results show good agreement between the reanalysis temperature products and observed temperature, and little variation in temperature among the reanalysis datasets. Radiosonde and satellite-derived atmospheric temperature products are regularly assimilated into the reanalysis systems, which is the main reason for their good agreement with observed temperature (Essou et al. 2016a, b). However, the seasonal and annual temperature bias is high in the Western and North-Western portions of the Beas river basin for all reanalysis products. In contrast, the Eastern and North-Eastern parts of the basin have a cooler bias. The differences between the various reanalysis temperatures arise from the variability of land-atmosphere interaction and of the land surface schemes. The different SST (sea surface temperature) datasets used in the reanalyses can also be responsible for their discrepancy to some extent (Shah and Mishra 2014). The observed temperature pattern is best represented by ERA-Interim in terms of bias, RMSE and correlation, which agrees with the findings of Shah and Mishra (2014). Given the scarcity of direct snow measurements in the snow/glacier-covered Eastern Beas, the reanalysis temperature can also be a useful data source for estimating snowmelt runoff with the energy-balance-based VIC hydrology model. In a data-scarce basin like Beas, obtaining high-quality observed rainfall data is uncertain (Rolland 2003). The reasons are the irregular distribution of weather stations, cold-weather terrain, wind and massive snowfall. According to Barros et al. (2004), in a Himalayan region like Beas precipitation varies between valleys and ridges, so variability at the scale of kilometers cannot be determined by a single raingauge station.
Fig. 11 The mean seasonal temperature bias (°C) between modeled (reanalysis and global meteorological) and CRU observed gridded data for time period 1990-2009 for the Beas river basin. djf (December-February), mam (March-May), jja (June-August) and son (September-November)
There is a considerable difference when station raingauge data are compared with the associated pixel value of the modeled gridded rainfall data. The absence of raingauge stations at high elevation can also cause precipitation to be underestimated in the highlands. For this reason there is a strong need to investigate grid-based high-resolution reanalysis rainfall data as a substitute for point raingauge data in the Beas river basin. In this study MERRA, ERA-Interim, JRA-55, CFSR and WFDEI rainfall data have been compared with point raingauge data over the long term. All the rainfall data correlate with observed rainfall in terms of R² and NSE. CFSR and WFDEI underestimate observed rainfall owing to the uncertainty of these data, whereas MERRA, ERA-Interim and JRA-55 overestimate it. All the reanalysis and global meteorological estimates have a high spatial resolution. The inconsistency of spatial scale between the grid-cell average value and the observed data of raingauge stations could cause some degree of overestimation or underestimation by these coarse-resolution reanalysis products (Maraun 2013). The interpolation of MERRA, ERA-Interim and JRA-55 rainfall data to 0.5-degree spatial resolution can also introduce error, which affects the validation result. The higher RMSE of the MERRA, ERA-Interim and JRA-55 reanalysis products may be due to the sensitivity of RMSE to heavy convective and local precipitation events in the high-altitude Beas river basin. Whole-pixel estimation of rainfall during localized precipitation events can also cause error in gridded reanalysis products.
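The scale mismatch between a point raingauge and its associated grid pixel can be illustrated with a minimal sketch. It assumes a regular 0.5-degree latitude-longitude grid defined by its south-west corner. The grid origin, the station coordinates and the function names are hypothetical and do not describe the grids or stations used in this study.

```python
import math


def cell_index(lat, lon, lat0, lon0, res=0.5):
    """Row and column of the grid cell containing (lat, lon).

    lat0 and lon0 are the south-west corner of the grid, res is the
    cell size in degrees (0.5 by default).
    """
    row = math.floor((lat - lat0) / res)
    col = math.floor((lon - lon0) / res)
    return row, col


def cell_center(row, col, lat0, lon0, res=0.5):
    """Latitude and longitude of the center of a grid cell."""
    return lat0 + (row + 0.5) * res, lon0 + (col + 0.5) * res


# Hypothetical grid origin and two hypothetical stations (degrees N, degrees E).
GRID_LAT0, GRID_LON0 = 31.0, 76.0
station_a = (31.60, 77.10)
station_b = (31.90, 77.40)

cell_a = cell_index(*station_a, GRID_LAT0, GRID_LON0)
cell_b = cell_index(*station_b, GRID_LAT0, GRID_LON0)
print("station A cell:", cell_a, "center:", cell_center(*cell_a, GRID_LAT0, GRID_LON0))
print("station B cell:", cell_b, "center:", cell_center(*cell_b, GRID_LAT0, GRID_LON0))
print("same pixel:", cell_a == cell_b)
```

Both hypothetical stations fall in the same pixel, so a gridded product assigns them one cell-average value even if one station sits in a valley and the other on a ridge.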
Owing to the complex topography and high spatio-temporal variability of rainfall, a direct comparison of the different reanalysis and global meteorological products with raingauge data is not feasible for the Beas river basin. As a result, the simulated discharge of the VIC hydrology model has been evaluated in this research to indirectly assess how well these reanalysis estimates reflect the topographical complexity of rainfall. The capability of these reanalysis data to capture the magnitude of mountain rainfall patterns has also been evaluated through the modeling approach. Rainfall data without bias correction have been used as hydrology input because of the poor quality of raingauge data. According to Essou et al. (2016a, b), bias correction of reanalysis data with observations could introduce additional error in data-scarce mountain regions. The results of the study bring out the difference in skill between the reanalysis datasets in reproducing orographic rainfall.
1. MERRA overestimates the observed raingauge data (Figs. 14, 16). This coarse-resolution reanalysis also overestimates discharge when applied to the hydrology model (Figs. 17, 18, 19, 20). The reason is that the reanalysis depends on the mechanics of the weather forecast model to simulate precipitation and does not assimilate surface precipitation data (Essou et al. 2016a, b). With limited ground observations, the assimilation system of MERRA has difficulty with tropical continental precipitation. Cloudy conditions are another major factor affecting its performance over land. The reanalysis is also unable to properly parameterize land-atmosphere interactions (Blacutt et al. 2015). Therefore, a higher uncertainty exists between observed and MERRA precipitation.
2. Next to MERRA, ERA-Interim rainfall shows a higher bias against raingauge data (Figs. 14, 16). For coarse-resolution reanalysis products like MERRA, ERA-Interim and JRA-55, the high precipitation likely comes from parameterized convection. Other reasons may be the assimilation of a limited set of observations and the limitations of parameterization in the precipitation generation process (Beck et al. 2018). ERA-Interim uses fewer surface observations than JRA-55, CFSR and WFDEI. Shah and Mishra (2014) and Ghodichore et al. (2018) also found that ERA-Interim overestimates precipitation in North India compared with observations. Nevertheless, a satisfactory result is obtained when the model is forced by ERA-Interim, as it follows the hydrograph pattern of observed streamflow better than the other products (Figs. 17, 18, 19, 20). The performance of ERA-Interim is also acceptable in terms of the statistical parameters NSE, PBIAS and RMSE during both calibration and validation periods. The reanalysis performs well in streamflow simulation for the complex terrain of Manali, where reliable snowfall measurement is difficult for gridded rainfall products. This indicates that ERA-Interim better represents the orographic precipitation of the Beas river basin. Bhattacharya et al. (2019) also found that the rainfall gradient of ERA-Interim is linearly correlated with altitude. The reason may be that ERA-Interim uses a four-dimensional variational (4D-var) analysis model. The 4D-var scheme automatically adjusts for the bias of satellite radiance observations, modifies convective and boundary layer cloud schemes, increases the stability of the atmosphere and produces a small amount of rainfall (Dee et al. 2011). Additionally, observations are used more effectively by 4D-var owing to the extraction of details of mass field trends (Rabier et al. 1998, 2000). ERA-Interim also formulates a background error problem, improves the physical model and performs better in simulating various land surface schemes (Simmons et al. 2010; Olauson 2018).
3. Rather than from the use of merged station precipitation during assimilation, the overestimation of JRA-55 precipitation relative to raingauge stations (Figs. 14, 16) arises from excess rainfall after the beginning of forecasts due to the spin-down problem, a dry bias in tropical Beas (Kobayashi et al. 2015), the convective scheme (Arakawa and Schubert 1974) adopted by JRA-55 and the implemented convection-triggering mechanism (DCAPE), which generates higher rainfall. Ghodichore et al. (2018) found that JRA-55 overestimates observations in North India. However, underestimation of streamflow is observed for JRA-55 during the simulation period (Figs. 17, 18, 19, 20). The reanalysis is also weak in following the observed hydrograph pattern. JRA-55 incorporates advanced features such as 4D-var during assimilation, an improved bias-correction method for satellite data, a high-resolution model and the integration of several observed datasets (Kobayashi 2020). Still, the poor ability of the reanalysis to reproduce the interannual variability of Beas rainfall may be due to error induced during bias correction with observed data. Sparse and poor-quality observed data used in the bias-correction algorithm are the main cause of such errors (Essou et al. 2016a, b; Kobayashi 2020). Other reasons for the inaccuracy of the reanalysis products are model uncertainty and alteration of the observing system. Therefore, caution is needed when applying JRA-55 rainfall (Bosilovich et al. 2011; Trenberth et al. 2011) for hydrology modeling.
4. CFSR rainfall underestimates the raingauge data (Figs. 14, 16). The reanalysis product also heavily underestimates the observed hydrograph (Figs. 17, 18, 19, 20). Shah and Mishra (2014) report a similar finding of underestimated CFSR rainfall in the North-Western region of India. Further, the rainfall estimate fails to reproduce the Summer and Winter hydrograph pattern owing to the variability of rainfall. CFSR uses a three-dimensional variational (3D-var) data assimilation scheme, assimilates satellite radiances, applies an automated variational scheme for bias correction of satellite radiances and generates its precipitation field from observed raingauge data (Saha et al. 2010; Wang et al. 2011; Xie et al. 2007). The weak spatial distribution of CFSR rainfall and the hydrology model uncertainty may be caused by the assimilation of poor-quality raingauge data (Kobayashi et al. 2015). Other reasons for the inferior data may be error in the algorithm for combining several observed rainfall datasets, bias-correction error and error due to the specific algorithm of the observation operator during the estimation of precipitation (Janjic et al. 2017; Shen et al. 2010).
5. Like CFSR, WFDEI underestimates both raingauge and discharge data (Figs. 14, 16, 17, 18, 19, 20). The dataset is also unable to accurately simulate the temporal streamflow pattern. WFDEI uses CRU/GPCC observed data to correct ERA-Interim reanalysis for precipitation bias (Weedon et al. 2014). The bias correction of WFDEI with CRU observed data introduces error into the data. The CRU data use only a portion of all raingauges (old observation data located on the valley floor), which cannot represent the orographic rainfall pattern and leads to an improper precipitation phase in WFDEI (Beck et al. 2017a, b; Weedon et al. 2014; Li et al. 2013).
The results of the study reveal that the performance of MERRA, JRA-55, CFSR and WFDEI is not good in Beas owing to their dependency on altitude. Essou et al. (2016a, b) have reported that the inferior ability of reanalysis products to produce an adequate simulation of streamflow in the subtropical Beas river basin may be due to the non-uniform distribution of precipitation, sensitivity to daily precipitation for seasonality, a weak mean annual cycle and poor simulation of local events, i.e. convective storms during Summer. Sun et al. (2018) found overestimated precipitation of MERRA and ERA-Interim at high elevation compared with observations, higher precipitation of JRA-55 in tropical regions, poor ability of reanalysis products to estimate orographic precipitation and higher interannual variability of Monsoon season precipitation. Additionally, Ghodichore et al. (2018) compared the NCEP/NCAR, CFSR, ERA-Interim, MERRA and JRA-55 reanalysis products for India and found that notable seasonal and regional differences exist between reanalysis and observed rainfall data in the complex, data-scarce mountain region. In this study ERA-Interim is found to be the best among all reanalysis products. Other researchers (Essou et al. 2016b, 2017; Beck et al. 2017a; Sun et al. 2018) also found a superior performance of ERA-Interim in data-scarce regions. The model calibration parameters in this study have been adjusted so that the model results consistently come close to the observed data. However, the parameters adjusted during the calibration period to increase or decrease discharge cannot substantially improve streamflow for poorly performing reanalysis products. Oudin et al. (2006) reported that modified calibration parameters have little ability to compensate for misrepresentation of streamflow caused by precipitation. According to various studies (Fu et al. 2011; Nkiaka et al. 2017), for catchment areas larger than 1000 km² the rainfall data from a large area smooth the effect of spatial resolution on streamflow. So there is an insignificant impact of rainfall spatial resolution on streamflow for the Beas river basin (catchment area 12,417 km²). The improvement of Beas streamflow depends on the ability of reanalysis products to produce accurate precipitation. Many studies have also shown that the uncertainty in simulated streamflow may be due to input precipitation error (Hong et al. 2006; Moulin et al. 2009). In this research the VIC hydrology model shows its potential to indirectly differentiate between various reanalysis rainfall products through simulated streamflow.

Conclusions
The inferior quality of observed rainfall data is the main reason for poorly simulated discharge in the data-scarce and topographically complex Beas river basin. Observed rainfall inherently carries some uncertainty due to measurement error. In this study observed station rainfall and temperature data are compared with different reanalysis and global meteorological data. The spatio-temporal variability of the various modeled climate data is also compared through the accuracy of streamflow simulated by the VIC hydrology model. The comparison of reanalysis, global meteorological and station data in Beas has been conducted to identify reliable climate data as a proxy for observations and to establish the similarities and inconsistencies between the datasets. The performance of the precipitation and temperature products has been evaluated on a monthly and annual basis using statistical metrics. The study revealed a good correlation between reanalysis and observed temperature data. The gridded reanalysis temperature can better represent snowmelt runoff in the data-scarce, snow/glacier-covered Eastern Beas. Compared with temperature, the reanalysis rainfall data perform weakly as rainfall-runoff input. All modeled rainfall data show a considerable difference when compared with observed data. JRA-55, CFSR and WFDEI are also unable to reproduce the observed hydrograph pattern accurately. MERRA overestimates station rainfall and observed discharge data because of error in model operation and because observed precipitation data are not included during assimilation. The poor performance of JRA-55, CFSR and WFDEI against raingauge and observed streamflow data might be due to error introduced during the assimilation of observed rainfall data. This indicates the need to modify the rainfall retrieval algorithm for these datasets, given the complex topography and raingauge limitations of the Beas river basin.
However, after comparing all global reanalysis and meteorological data, ERA-Interim is found to give the best performance as a meteorological input to the hydrology model. ERA-Interim provides a good match of temperature with observed station data for the whole basin. Moreover, the ERA-Interim temperature has no warmer bias during the simulation period. The ERA-Interim rainfall overestimates the observed data seasonally and annually. Nevertheless, after hydrologic simulation it shows its potential as a good rainfall-runoff input over observed rainfall data. The reason may be that the topographic influence of the high-altitude Beas is smaller for ERA-Interim rainfall than for the other rainfall data. The better performance of ERA-Interim is probably due to the assimilation of climate data from observed stations and the advantage of using a four-dimensional variational analysis model. The reanalysis data are also available in near real time and updated on a daily basis, which is beneficial for the proper management of water resources. So the high-resolution ERA-Interim reanalysis can be used as a reliable climate dataset in place of observations for the data-sparse Beas river basin. The study concludes that the accuracy of rainfall products determines the quality of hydrology modeling results. This will also help researchers to find ways of improving the quality of rainfall data for hydrology modeling.
Fig. 17 Hydrograph of observed and simulated flow of Manali for calibration period (1994-1999) and validation period (2003-2009)