Assessment of soil quality for agricultural purposes around the Barapukuria coal mining industrial area, Bangladesh: insights from chemical and multivariate statistical analysis

The Barapukuria coal mine, which has been operating since 2002, is situated in an area that depends on agriculture. Pollution from coal mining poses a serious risk to the ecosystem and the surrounding soil, so the mine may be affecting the soil around it. The main aim of this study is to assess the soil quality of the study area and to determine how far the soil deviates from its standard reference values. The results show that soil pH is relatively low near the coal mine because of acid mine drainage. Moreover, Total Nitrogen of the soil samples is below the standard value. On the other hand, Iron (Fe), Potassium (K), Organic matter (OM), and Copper (Cu) are higher than the standard reference values for agriculture. This suggests that iron pyrite (FeS2) and chalcopyrite (CuFeS2) released during mining operations are the principal cause of the degradation of the surrounding soil. Several statistical tools were used to analyze the data. The correlation matrix and factor analysis show that the soil degradation stems from anthropogenic causes, i.e., the coal mine. Cluster analysis identifies which soil samples deviate from their standard condition and classifies the samples according to their homogeneity. One-way ANOVA identifies the factors controlling soil quality and the spatial variability of the soil samples. From an agricultural point of view, this study reveals that significant factors are involved in the degradation of the soil and that this degradation is occurring rapidly. Plant nutrients in the soil such as Nitrogen (N), Potassium (K), Iron (Fe), Copper (Cu), and Organic matter (OM) deviate from the standard reference values because of mining activities, which ultimately affects the annual paddy production of the study area.


Introduction
Agriculture is the driving force of the economy of a developing country like Bangladesh, and soil quality is the most important factor in this regard. Industrialization has adverse effects on agriculture throughout the world. Mining brings new potential hazards and risks to the environment (Lazareva and Pichler 2007; Othman and Al-Masri 2007; Li et al. 2014). In addition, mining wasteland is an inevitable byproduct, and large masses of soil are spoiled as a result (Liu et al. 2003; Li et al. 2014). At the same time, coal is the most abundant fossil fuel on Earth (Rashid et al. 2014), comprising about 75% of the total fuel resources (Elliott 1981; Rashid et al. 2014). Coal mining is therefore necessary to meet the energy demand of a country, and coal mines are excavated frequently around the world. But coal mining has severe environmental, ecological, and human-health consequences. If not done properly, it has the potential to damage the landscape, soils, surface water, groundwater, and air during all phases of exploration (Martha 2001).
The Barapukuria coal mine is the only coal mine in Bangladesh. It has been running since 2002 using the underground multi-slice longwall mining method, and it had extracted 9,240,718.745 MT of coal by the year 2016-2017 (Petrobangla 2017). In this mining operation, underground sump water is released to the surface without proper treatment. Moreover, there is no proper infrastructure for coal storage, so coal dust easily mixes with and pollutes the surrounding soil, which ultimately affects the vast paddy land around the Barapukuria coal mine area. However, the impact of the coal mine on soil quality and paddy production has not been considered previously. Therefore, the main objective of this study is to determine the quality of surface soil samples and to find out how far the soil is degraded from an agricultural point of view. This study also describes the chemical reactions involved in the soil degradation.

Location and geology of the study area
The Barapukuria coal mine is situated in the north-western part of Bangladesh, 299.4 km from the capital of the country, Dhaka. Figure 1 shows the index map of the samples collected in the study area. The Barapukuria coal deposit has a proved area of approximately 5.25 sq. km, with a total reserve of 390 million tons. The multi-slice longwall mining method has been used in this mine (Imam 2013). Recently, the newly adopted Longwall Top Coal Caving (LTCC) technique has been used in the mine. Based on its geological framework, Bangladesh is divided into two main divisions (Imam 2013): (1) the Precambrian Indian platform and (2) the basin or geosyncline. A narrow zone between these two divisions, aligned from north-east to south-west, is called "the Hinge zone". The Precambrian Indian platform is further subdivided into (1) the Rangpur Saddle and (2) the Bogra shelf, whereas the basin or geosyncline is subdivided into (1) the Bengal Foredeep and (2) the Folded Belt (Imam 2013). The Barapukuria coal basin is located in the Rangpur Saddle tectonic unit of the stable platform in northwest Bangladesh. The coal-bearing sedimentary rocks of the basin lie unconformably on the Precambrian crystalline basement. The basin is elongated in a north-south direction and has a length of about 4.5 km and a width of about 1.5 km. It is a half-graben-type asymmetric basin bounded in the east by a major fault known as the Eastern Boundary fault. The surface geology over the entire study area comprises the Tertiary Dupi Tila formation, which unconformably overlies the Gondwana (Permian) coal-bearing sediments. These are folded into an asymmetric syncline or basin, whose axis strikes approximately north-south. A major fault forms the eastern limit of the deposit, beyond which Archaean basement rocks (Precambrian) are present immediately below the Dupi Tila. Several lesser faults were identified within the coal basin by geophysical seismic survey.
The Gondwana sequence comprises predominantly sandstones, with subordinate siltstones and mudstones, and contains up to six coal seams in the centre of the basin. The lowest of these, Seam VI, is the principal target seam of the Barapukuria coal mine and has an average thickness of 36 m (Imam 2013).

Soil sampling
Eighteen surface soil samples were collected at random from the Barapukuria coal mine area. Random sampling involves the arbitrary collection of samples within a defined area and was used here to avoid biasing the samples. Samples were collected during the summer of 2015. Most of the soil samples were collected from the agricultural lands near the coal mine, and some were collected a little away from the mine area in order to find the natural condition of the

Chemical analysis
Each soil sample was reduced to 200 g by quartering. The samples were dried naturally away from direct sunlight and then passed through a 2-mm sieve. Each sample was then analyzed for 11 chemical properties: pH, Organic Matter (OM), Potassium (K), Sulfur (S), Nitrogen (N), Phosphorus (P), Iron (Fe), Arsenic (As), Copper (Cu), Nickel (Ni), and Total alkalinity (as CaCO3). The pH of each soil sample was determined with a digital pH meter. Organic matter (OM) was determined following the potassium chromate method for determination of soil organic matter (NY/T 1121.6-2006). Total Nitrogen was analyzed using the semi-micro Kjeldahl method (Fu et al. 1999; Li et al. 2014). Potassium (K) was determined by the potassium strip test, Sulfur (S) by ion chromatography, and Iron (Fe), Copper (Cu), and Nickel (Ni) by spectrophotometer. Arsenic (As) was determined with a Hach EZ Arsenic test kit, and total alkalinity (as CaCO3) by titration.

Correlation matrix and principal component analysis (PCA)/factor analysis
Because of its simplicity, PCA is one of the most widely used multivariate methods. It is widely applied to analyze interrelationships among different sets of groundwater hydro-chemical and soil sample data, to extract the most significant factors, and to reduce the data with minimum loss of information (Mustapha and Aris 2012; Schaefer and Einax 2010; Howladar and Rahman 2016).
In this research, PCA is applied to the soil sample data collected from the Barapukuria coal mine area to determine the principal factors corresponding to the different sources of variation in the data.
In PCA, each manifest variable is a linear function of the principal components, with no separate representation of unique variance:

$$Y = Z_c K_c^{T}$$

where $Y$ is the N × p matrix of standardized manifest variables, $Z_c$ is an N × p matrix of standardized component scores, and $K_c$ is a p × p matrix of component loadings. PCA is a linear orthogonal transformation of the variables in $Y$ to principal components such that the first component has the largest possible variance, the second component has the largest possible variance of the remaining data, and so on, with the total p components explaining 100% of the variance.
When $S$ is the p × p sample correlation matrix, it can be decomposed as follows:

$$S = W M W^{T}$$

where $W$ is a p × p matrix of eigenvectors and $M$ is a p × p diagonal matrix of eigenvalues of $S$. The component scores $Z_c$ are:

$$Z_c = Y W M^{-1/2}$$

The p × p matrix of component loadings (i.e., the correlation coefficients between $Z_c$ and $Y$) is calculated as follows (De Winter and Dodou 2016; Howladar and Rahman 2016):

$$K_c = W M^{1/2}$$
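As a brief check on these definitions, the following sketch shows how each component's share of the total variance follows from its eigenvalue. It assumes the standard eigendecomposition of a correlation matrix with orthonormal eigenvectors, and it writes $\lambda_j$ for the j-th diagonal entry of $M$, a symbol that is not used in the original text.

```latex
\begin{aligned}
S &= W M W^{T}, \qquad W^{T} W = I,\\
\operatorname{tr}(S) &= \operatorname{tr}(M) = \sum_{j=1}^{p} \lambda_j = p
  \quad \text{(each diagonal entry of a correlation matrix equals 1)},\\
\text{share of total variance of component } j &= \frac{\lambda_j}{p}.
\end{aligned}
```

Under these assumptions, a component with an eigenvalue above 1 explains more variance than any single standardized variable does on its own.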

Cluster
Cluster analysis is a group of multivariate techniques whose primary aim is to assemble objects based on the characteristics they possess (Shrestha and Kazama 2007; Oketola et al. 2013; Howladar et al. 2017). It is the unsupervised classification of patterns (observations, data items, or feature vectors) into groups (clusters). Hierarchical clustering joins the most similar observations, and the levels of similarity at which observations are merged are used to construct a dendrogram (Oketola et al. 2013; Howladar et al. 2017). The Euclidean distance is the geometric distance between two objects, and its square ($E^2$) can be calculated by the following formula (Howladar et al. 2017):

$$E^{2}_{ij} = \sum_{k=1}^{p} \left(x_{ik} - x_{jk}\right)^{2}$$

where $x_{ik}$ and $x_{jk}$ are the values of variable k for objects i and j.

ANOVA
One-way between-subjects analysis of variance (one-way ANOVA) compares the variance (variability in scores) between the different groups with the variability within each of the groups (Rogerson 2010; Howladar and Rahman 2016). The analysis of variance (ANOVA), which is based on the F-test, is a method for testing whether the means of several groups of data or variables differ to a statistically significant degree. When only two means are compared, it is similar to the t test for independent samples. The single-factor ANOVA test assumes a null hypothesis, $K_0$, which states that there is no difference between the groups within the population:

$$K_0: a_1 = a_2 = \cdots = a_q = 0$$

If the result is statistically significant, the null hypothesis is rejected in favor of the alternative hypothesis, which states that the means of the groups in the population are different. For this research, a P-value of ≤ 0.05 was used to determine statistical significance. One-way ANOVA was performed on the 18 soil samples using IBM SPSS Statistics 22.
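The F-ratio behind the one-way ANOVA described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the SPSS procedure used in the study: the function name `one_way_anova_f` and the group values are hypothetical, and the p-value (which requires the F distribution) is deliberately left out.

```python
from statistics import mean


def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of numeric groups."""
    all_values = [x for g in groups for x in g]
    grand_mean = mean(all_values)

    # Between-group sum of squares: spread of the group means around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Within-group sum of squares: spread of each value around its own group mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)

    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)

    ms_between = ss_between / df_between  # mean square between groups
    ms_within = ss_within / df_within     # mean square within groups
    return ms_between / ms_within, df_between, df_within


# Hypothetical values for three groups (NOT data from the study).
hypothetical_groups = [[1, 2, 3], [2, 3, 4], [5, 6, 7]]
f_ratio, df_b, df_w = one_way_anova_f(hypothetical_groups)
print(f"F = {f_ratio:.2f} with ({df_b}, {df_w}) degrees of freedom")
```

A large F-ratio means the group means differ by more than the scatter within groups would suggest. Whether it is significant at P ≤ 0.05 is then decided by comparing it with the F distribution for the two degrees of freedom.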

Chemical analysis results in the context of soil quality
The results for the analyzed variables, i.e., pH, Total Nitrogen, Organic Matter (OM), Potassium (K), Phosphorus (P), Sulfur (S), Iron (Fe), Arsenic (As), Copper (Cu), Nickel (Ni), and Total alkalinity, are summarized in Table 1. The table lists the maximum, minimum, average, and standard deviation of each chemical parameter. Sulfur (S), Phosphorus (P), and Total alkalinity have relatively high standard deviations. A high standard deviation indicates that the data points are spread over a wider range of values, so the soil samples vary considerably in Sulfur (S), Phosphorus (P), and Total alkalinity. By contrast, the standard deviations of Arsenic (As) and Total Nitrogen are very low, which means that most of the soil samples contain similar amounts of Arsenic (As) and Total Nitrogen.
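The summary statistics described above (maximum, minimum, average, and standard deviation) can be computed as in the following minimal sketch. The values are hypothetical and are not data from Table 1, and the sample standard deviation (n - 1 denominator) is an assumption, since the text does not say which form was used.

```python
from statistics import mean, stdev


def summarize(values):
    """Return min, max, mean and sample standard deviation of a parameter."""
    return {
        "min": min(values),
        "max": max(values),
        "mean": mean(values),
        "sd": stdev(values),  # sample standard deviation (n - 1), an assumption
    }


# Hypothetical pH readings (NOT data from the study).
hypothetical_ph = [5.0, 5.5, 6.0, 6.5, 7.0]
print(summarize(hypothetical_ph))
```

Because the parameters are measured in different units, standard deviations are best compared within a parameter or relative to its mean.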
The experimental values of the chemical parameters are shown in Table 1. The same table also gives the standard reference value for each parameter, based on the Regional Soil Resource Development Institute of Bangladesh (RSRDIB 2015). The table shows that the mean values of pH and Total N are lower than the standard reference values. On the other hand, Organic matter (OM), Potassium (K), Iron (Fe), and Copper (Cu) are higher than the standard reference values (Fig. 3). Higher values of these chemical parameters are not favorable for paddy production.
Only Sulfur (S) and Phosphorus (P) have mean values within the optimum limits for soil. This clearly indicates that the soil quality of the study area has deviated substantially from the standard reference values. The high Iron (Fe) values indicate that iron pyrite (FeS2) and chalcopyrite (CuFeS2) released during coal mining operations are responsible for the high Iron (Fe) and Copper (Cu) in the soil samples. The high Fe content in turn lowers the pH of the soil, which is why the pH of the soil samples in the area surrounding the mine is low. The chemical reactions that lower the pH and generate Copper (Cu) and Iron (Fe) are shown below:
Iron pyrites + Oxygen + Water = Iron(III) hydroxide + Sulfuric acid (1)
Cu formation:
Sulfuric acid formation from SO2:
Equation (1) indicates the formation of H2SO4 from iron pyrites. During coal mining, iron pyrite is released and mixes with the readily available oxygen and water to finally form H2SO4. Equation (2) indicates why Cu and Fe are much higher in the study area. Chalcopyrite (CuFeS2) is also released during coal extraction (Imam 2013). The soil surrounding the coal mine is therefore also affected by S through the deposition of coal dust on the agricultural land.
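For reference, the word equation for pyrite oxidation given above corresponds to the following balanced overall reaction. This is the standard textbook form for acid mine drainage and is offered here as an assumption, not as the authors' own Equation (1).

```latex
4\,\mathrm{FeS_2} + 15\,\mathrm{O_2} + 14\,\mathrm{H_2O}
  \longrightarrow 4\,\mathrm{Fe(OH)_3} + 8\,\mathrm{H_2SO_4}
```

In this form, each mole of pyrite yields two moles of sulfuric acid, which is the acidity that lowers the soil pH.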

Application of multivariate statistics to elucidate the spatial differences of soil quality of the study area
Correlation matrix and factor analysis
A correlation matrix is used to determine the dependence between multiple variables at the same time. A positive correlation between two variables indicates that they increase or decrease together, while a negative correlation indicates that one variable increases as the other decreases. Component loadings are classified as "strong", "moderate", and "weak", corresponding to absolute loading values of > 0.75, 0.75 to 0.50, and 0.50 to 0.30, respectively (Liu et al. 2003). The correlation matrix is given in Table 2. Table 3 shows that Factor 1 has an eigenvalue of 2.8, Factor 2 an eigenvalue of 2.5, and Factor 3 an eigenvalue of 1.35. Together these three factors explain 61% of the total variance, that is, most of the variability in the data, so they can be used in place of the other factors. Table 4 shows the loadings of the three factors obtained with the unrotated factor matrix. Factor 1 has both high positive and high negative loadings, Factor 2 has positive loadings for most of the variables, and Factor 3 has negative loadings for most of the variables.
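The link between the eigenvalues and the variance percentages quoted above can be checked with a short sketch. It assumes the factors come from a correlation matrix of the 11 measured parameters, so that each eigenvalue divided by 11 gives the share of variance. The function names, the "negligible" label, and the handling of loadings that fall exactly on 0.75, 0.50, or 0.30 are assumptions for illustration.

```python
def variance_explained(eigenvalues, n_variables):
    """Percentage of total variance per factor for a correlation-matrix PCA."""
    return [100.0 * ev / n_variables for ev in eigenvalues]


def classify_loading(loading):
    """Classify a loading by absolute value, following the thresholds in the text."""
    magnitude = abs(loading)
    if magnitude > 0.75:
        return "strong"
    if magnitude >= 0.50:   # boundary handling is an assumption
        return "moderate"
    if magnitude >= 0.30:
        return "weak"
    return "negligible"     # label not named in the source; an assumption


# Eigenvalues as reported in the text (rounded), for 11 chemical parameters.
shares = variance_explained([2.8, 2.5, 1.35], n_variables=11)
for factor, share in enumerate(shares, start=1):
    print(f"Factor {factor}: {share:.1f}% of total variance")
print(f"Cumulative: {sum(shares):.1f}%")
```

Since the eigenvalues in the text are rounded, the computed shares land close to, but not exactly on, the reported 25.4%, 22.9%, and cumulative 61%.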
The plot of factor 1 vs. factor 2 (Fig. 4) shows that the pairs Total Nitrogen and OM, and pH and Alkalinity, each derive from a common source. Moreover, factor 2 dominates factor 1, which indicates that the contamination arises from anthropogenic causes.
The first factor explains 25.4% of the total variance and shows strong positive loadings for Total N and OM (Table 4). This factor provides evidence of natural influences in the study area (Belkhiri and Narany 2015). The second factor explains 22.9% of the total variance and has positive loadings for pH, K, P, S, Fe, and Total alkalinity. The results of the factor analysis thus agree with the measured values from the chemical analysis of the soil samples and clearly indicate which parameters are degraded. The factor loadings show clear evidence of anthropogenic influence in the study area, consistent with similar results reported by Belkhiri and Narany (2015). Factor 1 also has some influence, which means that some natural causes are also at work in the study area. Nevertheless, the major reason for soil degradation in the study area is anthropogenic, i.e., the coal mining activities.

Cluster
Cluster analysis was performed using Statistica software (Version 08). The dendrogram of the 18 soil samples is shown in Fig. 5. There are three cluster groups in the dendrogram: cluster 1 contains 50% of the total samples, cluster 2 contains 38.9%, and cluster 3 contains 11.1%. Table 5 lists the cluster groups, the sample numbers in each group, and the percentage of samples belonging to each group. To show the similarity of the grouped soil samples, the minimum, maximum, and mean values of the chemical parameters for each group are given in Table 6. Cluster 1 has higher Fe and OM values and a lower pH than the Standard Reference values of RSRDIB. Cluster 2 has a higher K value than the Standard Reference value. In addition, all three cluster groups have lower Total Nitrogen and higher Copper (Cu) than the standard reference values. Figure 6 shows the locations of the grouped soil samples with respect to the Barapukuria coal mine area. Cluster 1 is relatively degraded, as Fe, K, and OM are very high and pH is low. The high Fe value is the main reason for the reduced pH in the study area, as described previously with the chemical reactions. Fig. 6 shows that the cluster 1 samples lie near the coal mine. The coal mine therefore has a direct impact on the soil of the surrounding area: with increasing distance from the mine, Fe and OM decrease and pH increases, which is a clear indication of an anthropogenic effect. Moreover, Total Nitrogen is low for samples of all cluster groups, which indicates that this is the usual condition of this parameter in the study area. Cu, however, is higher than the standard reference value in the samples of all cluster groups, and Cluster 1 has higher Cu values than the other clusters. This higher Cu in Cluster 1 supports the release of chalcopyrite (CuFeS2) by the coal mining.
Fig. 4 The bi-plot analysis for factor 1 and factor 2
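The grouping step described in this section can be sketched with a naive agglomerative procedure. This is an illustration under stated assumptions, not the Statistica routine used in the study: the single-linkage rule, the function names, and the two-variable sample points are all hypothetical. The cluster sizes 9, 7, and 2 follow from the reported shares of the 18 samples.

```python
from itertools import combinations


def squared_euclidean(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def agglomerate(points, n_clusters):
    """Merge the two closest clusters (single linkage) until n_clusters remain."""
    clusters = [[i] for i in range(len(points))]
    while len(clusters) > n_clusters:
        best = None
        for (i, ci), (j, cj) in combinations(enumerate(clusters), 2):
            # Single linkage: distance between the closest pair of members.
            d = min(squared_euclidean(points[a], points[b]) for a in ci for b in cj)
            if best is None or d < best[0]:
                best = (d, i, j)
        _, i, j = best
        clusters[i] = sorted(clusters[i] + clusters[j])
        del clusters[j]  # j > i, so index i stays valid
    return clusters


def cluster_shares(clusters):
    """Percentage of all samples that falls in each cluster."""
    total = sum(len(c) for c in clusters)
    return [round(100.0 * len(c) / total, 1) for c in clusters]


# Hypothetical (pH, Fe) pairs for five samples (NOT data from the study).
hypothetical_points = [(5.0, 9.0), (5.1, 9.2), (6.5, 3.0), (6.6, 3.1), (7.5, 1.0)]
print(agglomerate(hypothetical_points, n_clusters=3))

# Cluster sizes implied by the reported shares of the 18 samples.
print(cluster_shares([list(range(9)), list(range(7)), list(range(2))]))
```

In practice the variables would usually be standardized before distances are computed, so that parameters measured on larger scales do not dominate the grouping.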

One way ANOVA
The main aim of one-way ANOVA is to compare the differences between the different sampling points for each parameter. In one-way ANOVA, one independent variable is selected, and with respect to this variable the Sum of Squares (SS), Mean Square (MS), Degrees of Freedom (DF), F-ratio, and p-level are determined (Howladar and Rahman 2016). The one-way ANOVA test revealed no significant mean differences for Total N, K, P, S, As, and Ni in the study area at P < 0.05 (Table 7). The mean differences between clusters one, two, and three are significant for some soil quality parameters, namely pH, OM, Fe, Cu, and Alkalinity (Table 7). These parameters therefore vary strongly in their spatial distribution across the study area, in agreement with similar findings by Belkhiri and Narany (2015). These parameters also have higher factor loadings and are grouped in the second factor (Table 4) of the rotated components, which contributes about 23% of the total variance of soil quality. Taken together, the results of the factor analysis and ANOVA clearly show that the degradation of the soil is due to anthropogenic effects.

Table 5 Cluster groups, sample numbers, and percentage of samples in each group
Cluster group | Sample numbers | Percentage of samples (%)
Cluster 1 | 1, 5, 7, 8, 12, 14, 16, 17, 18 | 50
Cluster 2 | 2, 3, 6, 9, 11, 13, 15 | 38.9
Cluster 3 | 4, 10 | 11.1

Soil quality with respect to agricultural purposes
Pyrite (FeS2) is the main sulfide mineral found in coal. When pyrite comes into contact with oxygenated water, sulphates, iron, and acidity are released to the environment. Estimating soil quality around mining areas and the impacts of natural or anthropogenic activities on soil chemistry is comparatively complicated because of the disturbance of the physical properties of soil horizons and the very high heterogeneity of contaminant concentrations in mine soils (Hudson et al. 1997). The regulatory standards in different countries, including Bangladesh, vary for particular classes of soils depending on the type of area management and anthropogenic influence, such as residential soils, agricultural soils, technogenic soils, etc. (Galuszka et al. 2015). The Barapukuria coal mine is surrounded by paddy fields and other agricultural land, and upland crops are produced in the study area. A detailed description of soil quality with respect to the optimum RSRDIB standard for upland crops is given in Table 1. In addition, the degraded condition of each parameter in every sample is presented in Table 8 and Fig. 7. Table 8 shows the percentages of highly polluted, slightly polluted, and unpolluted soil samples, and Fig. 7 shows the same result graphically. The criteria for ranking the pollution level are taken from the RSRDIB standard. Parameters that meet the RSRDIB standard reference value are ranked unpolluted, those that deviate strongly from it are ranked highly polluted, and those that deviate only a little are ranked slightly polluted.
Fig. 6 The status of the quality of soil around the mining area
It is clear that almost 50% of the soil samples have a lower pH than the standard reference value. pH is a good measure of the acidity or alkalinity of a soil-water suspension and a good indicator of the chemical nature of the soil; the desired pH for good vegetation ranges from 5.5 to 6.8 (Sharma 2008). In addition, noticeable degradation is seen for Fe, which might be the reason for the low pH. Arsenic (As), S, and P are not much degraded; only a few samples show slight contamination for these parameters. The main degraded parameters of the soil samples are Total Nitrogen, Iron (Fe), Copper (Cu), Organic matter (OM), and Potassium (K). Total Nitrogen is one of the key parameters for soil quality, as paddy growth largely depends on its availability. In particular, nitrogen is vital for chlorophyll, which allows plants to carry out photosynthesis (Feeco 2017). Potassium (K) is another of the quality elements that contribute to the growth and development of paddy. It contributes substantially to plant characteristics such as size, shape, and color (Feeco 2017). Deviation of K from the standard value can therefore decrease paddy production. High Fe, Cu, and OM values can also decrease paddy production. Agriculture in the study area is thus already facing problems, and the situation may worsen in the near future.
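The three-level ranking described in this section (unpolluted, slightly polluted, highly polluted) can be sketched as a simple rule. The source does not state how large a deviation counts as "slight", so the 10% tolerance below is purely an assumption, as are the function name and the sample values. The pH range 5.5 to 6.8 is the desired range quoted from Sharma (2008), used here only as an example and not as the RSRDIB standard.

```python
def rank_parameter(value, low, high, slight_tolerance=0.10):
    """Rank one measurement against a reference range [low, high].

    slight_tolerance is a HYPOTHETICAL cut-off: a relative deviation from the
    nearer bound of up to this fraction is ranked "slightly polluted".
    """
    if low <= value <= high:
        return "unpolluted"
    bound = low if value < low else high
    relative_deviation = abs(value - bound) / bound
    if relative_deviation <= slight_tolerance:
        return "slightly polluted"
    return "highly polluted"


def class_percentages(ranks):
    """Percentage of samples that falls in each pollution class."""
    classes = ("unpolluted", "slightly polluted", "highly polluted")
    return {c: round(100.0 * ranks.count(c) / len(ranks), 1) for c in classes}


# Hypothetical pH readings (NOT data from the study).
hypothetical_ph = [6.0, 5.2, 4.0, 6.5]
ranks = [rank_parameter(v, low=5.5, high=6.8) for v in hypothetical_ph]
print(ranks)
print(class_percentages(ranks))
```

Applying such a rule to every parameter and sample would give per-class percentages of the kind a pollution-status table reports, although the actual cut-offs would have to come from the RSRDIB criteria.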

Conclusion
The Barapukuria coal mine is an asset for Bangladesh and has played a vital role in meeting the country's increasing energy demand since 2002. But the lack of proper infrastructure for coal storage and the inadequate treatment of underground mine water are polluting the surrounding soil to a large extent. The analysis of soil samples shows that several chemical parameters that are important for agriculture deviate from the Standard Reference values. It is also clear that the soil is degraded because of the poor infrastructure of the coal stockpile and the improper disposal of mine water. Iron pyrite (FeS2) and chalcopyrite (CuFeS2) released during mining activities mix with the soil and consequently degrade its quality. Principal Component Analysis (PCA) also suggests that the soil samples are degraded mostly because of anthropogenic causes, i.e., the coal mine. Cluster analysis shows that the soil samples located near the coal mine are the most degraded. One-way ANOVA confirms the spatial variability among the three cluster groups and the sample locations. From an agricultural point of view, a few chemical parameters of the soil deviate from the standard reference values and are not suitable for agriculture. Without proper initiatives, the degradation may severely damage soil quality and bring disaster to the agriculture of the study area, ultimately affecting the vast paddy land around the Barapukuria coal mine.