Public warning systems for forecasting ambient ozone pollution in Kuwait
Environmental Systems Research volume 2, Article number: 2 (2013)
Abstract
Background
In this paper, the performances of different forecasting systems are compared using the daily maximum ozone levels across three locations in Kuwait. The two analytical tools used in this study to forecast daily maximum ozone levels are time series modeling and fuzzy modeling. The structure of the two proposed forecasting models is derived from basic principles and combines persistence with daily maximum air temperature as input variables.
Results
The two proposed forecasting models showed significant improvement compared to the pure persistence forecast, which is the model currently used to forecast ambient air pollution in Kuwait. The performance of the two models suggests that daily maximum temperature explains a large proportion of the variation in daily maximum ozone levels.
Conclusions
This study concludes that fuzzy modeling is the most reliable forecasting system, with the lowest number of false positives among the different models.
Introduction
Industrialization, technical growth, and overpopulation in urban areas of Kuwait have resulted in increased air pollution (Abdul-Wahab et al. 2000), and toxic air pollutants in close proximity to populated areas can have adverse health effects. Ambient ozone is the primary constituent of smog and is a pollutant of concern in industrialized countries, as it can lead to chronic respiratory infection and lung inflammation, aggravate asthma, impair lung defense mechanisms, and reduce the immunity of the human body (Tilton 1989). In the environment, ozone may result in acute foliar injuries, reduce agricultural yield and biomass production, and shift the competitive advantages of plant species in mixed populations (Lefohn et al. 1994). In this respect, surface ozone has become a serious problem in the urban areas of Kuwait, and if it occurs in sufficient concentration, it could threaten both human health and the environment (Jallad and Jallad, 2010).
Ground-level ozone is formed by chemical reactions between nitrogen oxides (NO_{x}) and volatile organic compounds (VOC) in the presence of heat and sunlight. The formation and destruction mechanism of ozone is difficult to define exactly, because ozone is an extremely reactive pollutant and can be scavenged by its precursors (Dimitriades 1989). As a result, air pollution forecasting through empirical methods has gained importance as sufficient data have become available. Earlier forecasting models were based on simple empirical data correlations, but the availability of large amounts of information has led to the development of complex air pollution simulations for forecasting (Telenta et al. 1995).
Management of public warning strategies for ozone levels in densely populated areas requires accurate forecasts of ambient levels. Although ozone prediction models exist or have been proposed for several cities (Robeson and Steyn, 1990; Elsom 1996; Noordijk 1994; Yi and Prybutok 1996), they have not been assessed in realistic conditions. The purpose of this work is to explore the possibility of forecasting daily maximum ozone impacts in urban areas of Kuwait, where most measurements are taken at surface stations. The final goal is to produce a quantitative tool to help authorities monitor ozone pollution, which has become a public health issue in many cities worldwide.
Background
Kuwait is a principality state in the northeastern corner of the Arabian Peninsula with an estimated population of 3.5 million. It is a low-lying country, with the highest point being 306 meters above sea level. The annual rainfall across Kuwait varies from 75 to 150 millimeters, and the country has a hot, dry desert climate (Federal Research Division 1993).
Kuwait is a major exporter of crude oil (ranked 4th in OPEC’s list), with plans to increase production to 3.2 million barrels per day in 2013 (Arab Times 2012). Such characteristics make Kuwait a country with air pollution associated with petroleum, petrochemical, and other industrial pollutants (Arab Times 2012; Organization of the Petroleum Exporting Countries 2009).
Recently, there has been an increase in air pollution in urban areas of Kuwait. This is the result of rapid industrialization, technical growth, and overpopulation. In the past, it has been observed that toxic air pollutants in close proximity to populated areas can have adverse health effects. Ozone is one such pollutant, and if it occurs in sufficient concentration, it can become a serious problem in urban areas of Kuwait. Jallad and Jallad (2010) observed that ozone pollution levels in the Salmiya residential district exceeded ambient air quality standards during specific times of the year. Therefore, accurate forecasting of surface ozone is required, as it can help with successful implementation of public warning strategies during episodic days in Kuwait.
Most of the air pollution studies carried out in Kuwait during the past two decades were aimed at air pollutant patterns, dispersion, and photochemical mechanisms. Similarly, studies that analyzed ozone levels were focused on comparing ozone levels with international standard limits, assessing health effects of ozone pollution, understanding diurnal behavior of ozone, and studying the seasonal trends in ozone levels (Ettouney et al. 2009a; Ettouney et al. 2009b).
There have been very few studies focused on developing a robust forecasting system which can be used to develop a public warning system. Most of the forecasting systems that have been developed to predict the ambient ozone concentration in Kuwait use meteorological data and precursor concentrations (Ettouney et al. 2009a; Ettouney et al. 2009b; Ettouney et al. 2009c).
Abdul-Wahab et al. (1996) used stepwise multiple regression modeling to predict ozone levels from precursor concentrations and meteorological conditions during daylight hours in the Shuaiba Industrial Area (SIA) of Kuwait. Al-Alawi et al. (2008) applied principal component regression and artificial neural networks to predict ozone concentration in Kuwait’s lower atmosphere using data on seven environmental pollutant concentrations (CH_{4}, NMHC, CO, CO_{2}, NO, NO_{2}, and SO_{2}) and five meteorological variables (wind speed, wind direction, air temperature, relative humidity, and solar radiation). Elkamel et al. (2001) used an artificial neural network model to predict ozone concentrations as a function of meteorological conditions and precursor concentrations in the SIA.
Similarly, Abdul-Wahab and Al-Alawi (2002) used neural networks for ozone modeling in the lower atmosphere as a function of meteorological conditions and various air quality parameters in the Khaldiya residential district of Kuwait.
These statistical models are based on semiempirical statistical relations among available data and measurements. They do not necessarily reveal any relation between cause and effect. They simply attempt to determine the underlying relationship between sets of input data (predictors) and targets (predictands).
However, the complex and sometimes nonlinear relationships among multiple variables can make these statistical models awkward and complicated. They are therefore expected to underperform when used to model the extremely nonlinear relationship between ozone and the other variables. Over the last decade, artificial intelligence (AI) based techniques have been proposed as alternatives to traditional statistical ones for forecasting urban air pollution. Two of the most reliable and feasible AI techniques that have been used for urban air pollution forecasting are neural networks and fuzzy logic. Neural networks are simple mathematical models representing a brain-like system. Although considered one of the most popular AI methods, neural networks have inherent drawbacks that impede their practical application. Scalability, testing, verification, and integration of neural network models into forecasting systems are some of the major concerns today, and neural network systems sometimes become unstable when they are applied to bigger problems. Compared to neural networks, fuzzy logic offers better insight into the forecasting model, and it can form a multivalued logic to deal with reasoning that is approximate rather than precise (Karlaftis and Vlahogianni 2011).
Prediction of ozone levels using a theoretical method (i.e. a detailed atmospheric diffusion model) is difficult, and empirical analysis is required to develop a forecasting system. A well-evaluated ozone forecast model can improve the chances of a successful ozone control strategy. In addition, forecasting the daily maximum ozone concentrations can help avoid and reduce ozone-related injuries and damage. This research is significant, as it is the first study comparing ozone forecasting systems across different locations in Kuwait.
Research methodology
Dataset
The study used hourly air pollutant data from January 2007 through September 2011 gathered by the Environmental Monitoring Information System of Kuwait (eMISK, working under the Environment Public Authority of Kuwait). The data were collected from three different locations in Kuwait (see Figure 1): Al-Jahra (Tower A), Al-Mansouriya (Tower B), and Al-Rigga (Tower C). The initial three years of data were used to train the forecasting systems, and data from the remaining years were divided into subgroups to validate the forecasting models. The validation data were divided into subsets based on the availability of the data and seasonal variation.
Geographical location and a history of high ozone episodes were considered in selecting target sites. Tower A is located in Al-Jahra, the capital of the Al-Jahra Governorate of Kuwait and the surrounding agriculturally based Al-Jahra District. Tower B is located in one of the suburbs of the Al-Asimah Governorate, which houses most of Kuwait’s financial and business center. Tower C is located in the Al-Ahmadi Governorate, which forms an important part of the Kuwaiti economy, as several of Kuwait’s oil refineries are located there.
Model structure
Seinfeld (1986) proposed the atmospheric diffusion equation, from which the theoretical model (Pryor and Steyn 1995; Jorquera et al. 1998) used for this study was developed:
where C(x, y, z, t) is the concentration, V = (u, v, w) is the wind vector, K = diag(K_{x}, K_{y}, K_{z}) is the eddy diffusivity, Q is the emission rate, R is the net generation term (balance of chemical production and destruction) and L stands for physical removal processes such as wet and dry deposition. The mathematical expression in equation (1) represents the mass conservation principle and is the most general equation obeyed by any atmospheric pollutant.
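Written out with the symbols defined above, the atmospheric diffusion equation takes the following standard form. It is given here as a sketch for orientation; in particular, the sign attached to the removal term L is an assumption.

```latex
\begin{equation}
\frac{\partial C}{\partial t} + \mathbf{V}\cdot\nabla C
  = \nabla\cdot\left(\mathbf{K}\,\nabla C\right) + Q + R - L
\tag{1}
\end{equation}
```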
In the case of ozone, the mass conservation expression can be simplified, because ozone has no emission term and the deposition terms can be ignored. The ozone concentration can be assumed to be uniform in the mixed layer, as long as we follow a moving air mass under conditions of strong convection.
Under such conditions, the column of air satisfies the balance:
where {C_{k}} stands for a detailed photochemical mechanism, which includes all the relevant species participating in ozone generation and destruction. These concentrations depend upon the spatial coordinates and time. Assessing the importance of VOC and NO_{x} in ozone photochemistry is not easy, as there are no previously measured VOC mixing ratios for Kuwait. Thus, to take a simpler approach, a stochastic model is constructed from equation (2). Integrating equation (2) from early morning rush hour up to the time when the ozone concentration is highest leads to the following equation:
Temperature is steadily increasing during this time interval (Figure 2), so using q = dT/dt, the above expression can be written in temperature terms as:
Taking the difference between two consecutive days results in:
Using the mean value theorem for integrals on the right-hand side of equation (5) and dropping the “max” superscript for daily maximum ozone and temperature levels, the theoretical model reduces to:
Here, parameter α is positive and of order one, β and γ are of about the same magnitude (but differ in sign), and δ has a small value. The planetary boundary layer is well mixed, and the surface measurements are representative of the whole convective mixed layer, when the daily maximum ozone concentration is reached. Furthermore, the temperature recorded at the time when ozone is highest is strongly correlated with the highest temperature recorded that day, so in passing from equations (5) to (6) the interpretation of T_{max} is accordingly modified. Robeson and Steyn (1990) developed an equation similar to equation (6) and used variance analysis to find the best model fit among several slight variations of the equation.
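The chain of steps described above can be sketched as follows. The notation is an assumption introduced for this sketch rather than the authors' own typesetting: $O_3^{0}$ and $T^{0}$ denote the early-morning ozone level and temperature, $t_0$ and $t_{max}$ bound the integration interval, and the subscripts $t$ and $t+1$ label consecutive days.

```latex
\begin{align}
\frac{dO_3}{dt} &= R\left(\{C_k\}\right) \tag{2}\\
O_3^{max} &= O_3^{0} + \int_{t_0}^{t_{max}} R\left(\{C_k\}\right)\,dt \tag{3}\\
O_3^{max} &= O_3^{0} + \int_{T^{0}}^{T^{max}} \frac{R\left(\{C_k\}\right)}{q}\,dT,
  \qquad q = \frac{dT}{dt} \tag{4}\\
O_{3,t+1}^{max} - O_{3,t}^{max} &= \left(O_{3,t+1}^{0} - O_{3,t}^{0}\right)
  + \int_{T_{t+1}^{0}}^{T_{t+1}^{max}} \frac{R}{q}\,dT
  - \int_{T_{t}^{0}}^{T_{t}^{max}} \frac{R}{q}\,dT \tag{5}\\
O_{3,t+1} &= \alpha\,O_{3,t} + \beta\,T_{t+1} + \gamma\,T_{t} + \delta \tag{6}
\end{align}
```

Read this way, β and γ arise from the two integrals in (5), which is consistent with their similar magnitude and opposite sign, while δ collects the early-morning terms and α is left free to be fitted near one.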
Time series forecasting modeling
Presently, three types of stochastic models to forecast air pollutants have been developed using the time-series approach (McCollister and Wilson 1975):

(a) Models based on time series of a single pollutant: AR, ARMA, ARIMA models.
(b) Multivariable time series, where meteorological explanatory variables have been added on: ARX, ARMAX models.
(c) Forecasting models based on a combination of classical and nonparametric methods.
The above citations make it clear that, when the aim is air pollution forecasting, any forecast model ought to be better than the pure persistence forecast (i.e. forecasting for tomorrow what occurred today).
Thus, in order to have an accurate forecast, some persistence and some exogenous, meteorological variables need to be included. Of all the meteorological variables, air temperature has the strongest correlation with ozone concentrations for two reasons: (1) high air temperatures are an excellent indication of environmental conditions conducive to O_{3} production and accumulation (i.e. anticyclonic conditions with associated clear skies and light winds), and (2) the rate ‘constants’ of photochemical reactions are highly temperature dependent (Pryor and Steyn 1995). As a result, air temperature is a reasonable surrogate for the combined effects of wind speed, wind direction, inversion height, and photochemical reaction rate. As discussed earlier, the simple linear equation (6), which considers only surface air temperature as the exogenous variable, will be used. In all cases the actual maximum temperature (T_{t+1}) was used in place of the forecast value (that is, an ex post parameter fit was performed). The errors associated with forecasting T_{t+1} will be assessed ex ante by simulation. Equation (6) has an ARX (1 2 0) structure (Ljung 1987), and its parameters were estimated using the ARX procedure available in MATLAB’s System Identification Toolbox (Ljung 1991).
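The persistence baseline and the ARX fit described above can be illustrated with a short Python sketch. It assumes that equation (6) has the linear form O3[t+1] = α·O3[t] + β·T[t+1] + γ·T[t] + δ, and it uses ordinary least squares as a stand-in for the MATLAB ARX procedure; it is not the authors' implementation. All function names and the synthetic data are hypothetical.

```python
import random


def solve_linear(a, b):
    """Solve the square system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def fit_arx(ozone, temp):
    """Least-squares estimate of (alpha, beta, gamma, delta) in the assumed
    model O3[t+1] = alpha*O3[t] + beta*T[t+1] + gamma*T[t] + delta."""
    rows = [[ozone[t], temp[t + 1], temp[t], 1.0] for t in range(len(ozone) - 1)]
    y = [ozone[t + 1] for t in range(len(ozone) - 1)]
    k = 4
    # Normal equations (X'X) theta = X'y
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    return solve_linear(xtx, xty)


def forecast_arx(params, o3_today, t_tomorrow, t_today):
    """One-day-ahead forecast of the daily ozone maximum."""
    alpha, beta, gamma, delta = params
    return alpha * o3_today + beta * t_tomorrow + gamma * t_today + delta


def forecast_persistence(o3_today):
    """Pure persistence: tomorrow's maximum equals today's."""
    return o3_today


# Synthetic daily maxima (hypothetical values, not the eMISK data).
rng = random.Random(0)
temp = [rng.uniform(30.0, 48.0) for _ in range(200)]
assumed = (0.6, 2.0, -1.8, 3.0)
ozone = [60.0]
for t in range(len(temp) - 1):
    noise = rng.gauss(0.0, 2.0)
    ozone.append(forecast_arx(assumed, ozone[t], temp[t + 1], temp[t]) + noise)

print("fitted parameters:", fit_arx(ozone, temp))
```

As in the text, the fit uses the observed next-day temperature; an operational forecast would substitute a temperature forecast for `t_tomorrow`.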
Fuzzy modeling
A fuzzy model is a collection of rules, derived from the original data, which are combined to produce a single output for a given input. A multiple input, single output (MISO) model structure is given by the following set of rules:
where X = (X_{1}, X_{2}, …, X_{p}) is a given observation and $A^{i}=\left(A_{1}^{i},A_{2}^{i},\dots,A_{p}^{i}\right)$ is a premise of the fuzzy model; the points A^{i} are a basis for the space of the input variables. The consequences of the model are denoted by B^{i}; they represent an intensity level associated with the dependent variable y and are the basis of the output space. Equation (7) states that, if a given observation X (for instance, meteorological and air quality measurements) can be associated with a known pattern A^{i} (for instance, some class of meteorological condition), then a rule specific to A^{i} will give an estimate of the associated output y (for example, ground-level ozone concentration). By using a fuzzy clustering classification process, the set of points A^{i} can be discriminated and then a fuzzy model constructed. A simple choice was proposed by Takagi and Sugeno (1985); they suggested linear rules of the form:
Then the parameters {c^{i}} can be estimated using ordinary least squares, by minimizing the difference between the observed output and the output of the fuzzy model, given by:
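For orientation, the rule base and the model output referred to as equations (7) to (9) can be written in the usual Takagi–Sugeno form. This is a sketch under assumptions: M is the number of rules, $w^{i}(X)$ is the degree to which observation X matches premise $A^{i}$, and $c_{0}^{i},\dots,c_{p}^{i}$ are the coefficients of rule i. The symbols $w^{i}$, $y^{i}$, and $\hat{y}$ are introduced here and may differ from the authors' notation.

```latex
\begin{align}
R^{i}&:\ \text{IF } X \text{ is } A^{i} \text{ THEN } y \text{ is } B^{i},
  \qquad i = 1,\dots,M \tag{7}\\
R^{i}&:\ \text{IF } X \text{ is } A^{i} \text{ THEN }
  y^{i} = c_{0}^{i} + c_{1}^{i}X_{1} + \dots + c_{p}^{i}X_{p} \tag{8}\\
\hat{y} &= \frac{\sum_{i=1}^{M} w^{i}(X)\,y^{i}}{\sum_{i=1}^{M} w^{i}(X)} \tag{9}
\end{align}
```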
To develop fuzzy models, using fuzzy clustering techniques for parameter identification, the model structure given by equation (6) was used. In fuzzy modeling language, the sets (O_{3, t}; T_{t+1}; T_{t}) and (O_{3, t+1}) correspond to the premises and consequences, respectively. The fuzzy C-means algorithm (Rousseeuw et al. 1996; Johanyak and Kovacs 2011; Sugeno and Yasukawa 1993) was applied to partition the original data set into M fuzzy clusters; the parameters were estimated using MATLAB’s Fuzzy Logic Toolbox.
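How such a rule base turns an observation into a forecast can be illustrated with a minimal Python sketch. It assumes that cluster centers and linear rule coefficients are already available (in the study they come from fuzzy C-means clustering and least squares), uses the standard fuzzy C-means membership formula with fuzzifier m, and combines the rules as a membership-weighted average. The centers, coefficients, and function names below are hypothetical, not values from the paper.

```python
def memberships(x, centers, m=2.0):
    """Fuzzy C-means style membership of point x in each cluster.

    u_i = (1/d_i^2)^(1/(m-1)) / sum_j (1/d_j^2)^(1/(m-1)),
    where d_i is the Euclidean distance from x to center i.
    """
    d2 = [sum((xi - ci) ** 2 for xi, ci in zip(x, c)) for c in centers]
    # A point sitting exactly on a center belongs fully to that cluster.
    for i, d in enumerate(d2):
        if d == 0.0:
            return [1.0 if j == i else 0.0 for j in range(len(centers))]
    power = 1.0 / (m - 1.0)
    inv = [(1.0 / d) ** power for d in d2]
    total = sum(inv)
    return [v / total for v in inv]


def ts_forecast(x, centers, rules, m=2.0):
    """Takagi-Sugeno output: membership-weighted average of linear rules.

    Each rule is (c0, c1, ..., cp) and yields c0 + c1*x1 + ... + cp*xp.
    Memberships sum to one, so the weighted sum is already an average.
    """
    weights = memberships(x, centers, m)
    outputs = [r[0] + sum(c * xi for c, xi in zip(r[1:], x)) for r in rules]
    return sum(w * y for w, y in zip(weights, outputs))


# Hypothetical two-rule model on inputs (O3[t], T[t+1], T[t]).
centers = [(40.0, 30.0, 29.0), (90.0, 44.0, 43.0)]
rules = [(2.0, 0.7, 1.0, -0.9), (5.0, 0.9, 2.0, -1.8)]

today = (65.0, 37.0, 36.0)  # equidistant from both centers
print(memberships(today, centers))
print(ts_forecast(today, centers, rules))
```

The inputs are left unscaled here for brevity; in practice the variables would usually be normalized before distances are computed, since ozone and temperature are on different scales.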
Model performance
The performance of the two forecasting models was evaluated using two statistical indices: the root-mean-square error (RMSE) and the index of agreement (IA), defined as:
Here o_{i} and p_{i} are the observed and forecasted ozone maximum values on day i, N is the number of days in the test set, $p_{i}^{\prime}=p_{i}-o_{m}$ and $o_{i}^{\prime}=o_{i}-o_{m}$, with o_{m} the average observed ozone maximum. The index of agreement is a dimensionless index bounded between 0 (no agreement at all) and 1 (perfect agreement of the time series).
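The two indices can be computed with a few lines of Python. The sketch assumes the standard definitions, RMSE = sqrt((1/N) Σ (p_i − o_i)²) and, for the index of agreement, Willmott's form IA = 1 − Σ (p_i − o_i)² / Σ (|p_i − o_m| + |o_i − o_m|)², which matches the primed quantities described above. The function names and sample values are hypothetical.

```python
import math


def rmse(obs, pred):
    """Root-mean-square error between observed and forecasted maxima."""
    n = len(obs)
    return math.sqrt(sum((p - o) ** 2 for o, p in zip(obs, pred)) / n)


def index_of_agreement(obs, pred):
    """Index of agreement (assumed Willmott form), between 0 and 1."""
    o_mean = sum(obs) / len(obs)
    num = sum((p - o) ** 2 for o, p in zip(obs, pred))
    den = sum((abs(p - o_mean) + abs(o - o_mean)) ** 2
              for o, p in zip(obs, pred))
    return 1.0 - num / den


observed = [62.0, 75.0, 88.0, 70.0, 95.0]    # hypothetical daily maxima
forecast = [60.0, 78.0, 84.0, 73.0, 90.0]
print(rmse(observed, forecast), index_of_agreement(observed, forecast))
```

The denominator of the index is zero only when every observation and every forecast equals the observed mean, a case this sketch does not guard against.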
Results and discussions
Evolution of photochemical pollution
The evolution of typical photochemical pollution at different locations in Kuwait is summarized in Figure 2. It was observed that the ozone levels at different locations started rising in the morning (around 7:00 AM), reached maximum levels during the afternoon (11:00 AM – 4:00 PM), and then started falling again in the evening. Similarly, the monthly ozone levels at different locations in Kuwait are summarized in Figure 3 through Figure 4. It was observed that during summers (June to August), the average ozone levels are higher compared to the average ozone levels during the rest of the year. Thus, it can be concluded that there is a strong association between the ambient temperature and the surface ozone levels.
Empirical findings (time series modeling)
Table 1 shows the parameter estimates for the time series model, presented in equation (6), developed for the period 2007 through 2009. The daily maximum temperature fluctuates greatly during the year; therefore, the data subsets (based on seasonal duration and missing data points) were used individually in the model development process. Four data subsets were used for each of the locations covered in this study, along with an additional data subset for model validation. The parameter estimates in Table 2 show that the fitted values support the model structure derived in equation (6). Additionally, the fitted parameters associated with each location vary considerably between time periods.
Empirical findings (fuzzy modeling)
The parameters estimated for Towers A, B, and C using the fuzzy model are summarized in Table 2. For the fuzzy model, a different number of rules was found for each of the data subsets. Each of these rules corresponds to a linear model valid for a specific cluster obtained from the original data by means of a fuzzy classification. The parameters of these rules are similar to the ones estimated for the time-series models. The predicted output results from a nonlinear combination of these rules.
Comparison of models
Table 3 compares the forecasts for the three towers (Al-Jahra, Al-Mansouriya, and Al-Rigga) from the pure persistence model, the linear time series model, and the fuzzy model. The root-mean-square error shows that the proposed fuzzy and time series forecasting models are a significant improvement over the pure persistence forecast (O_{3,t+1} = O_{3,t}) for all data subsets. For the observed episodes, the three models achieve a high percentage of correct forecasts, between 66 and 95%. As shown in Figures 3 and 5, the overall quality of the forecasts supports the model structure proposed in equation (6) and indicates that the most relevant input variables have been identified. The most significant performance difference among the three forecasts is in the number of false positives: the fuzzy models consistently produce fewer and therefore afford a more reliable forecast. It is worth noting that the fuzzy forecast performs better in this respect, which suggests that it is less sensitive to changes in the parameter values. Nonetheless, all models ought to be recalibrated once a year to take into account trends in ground-level ozone and to keep the forecast accurate.
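The episode statistics discussed above (correct forecasts and false positives) can be counted as in the following Python sketch. It assumes that an episode is a day whose observed maximum reaches a warning threshold and that a false positive is a day forecast at or above the threshold whose observed maximum stays below it. The threshold value, function name, and data are hypothetical; the paper's own episode criterion is not restated here.

```python
def episode_scores(obs, pred, threshold):
    """Count hits, misses, and false positives for threshold exceedances."""
    hits = misses = false_positives = 0
    for o, p in zip(obs, pred):
        if o >= threshold and p >= threshold:
            hits += 1                # episode occurred and was forecast
        elif o >= threshold:
            misses += 1              # episode occurred but was not forecast
        elif p >= threshold:
            false_positives += 1     # warning issued, no episode occurred
    episodes = hits + misses
    return {
        "episodes": episodes,
        "hits": hits,
        "misses": misses,
        "false_positives": false_positives,
        "hit_rate": hits / episodes if episodes else None,
    }


observed = [50.0, 90.0, 85.0, 60.0, 95.0]   # hypothetical daily maxima
forecast = [85.0, 92.0, 70.0, 55.0, 99.0]
print(episode_scores(observed, forecast, threshold=80.0))
```

Running the same function on the persistence, time series, and fuzzy forecasts for a validation subset would give the kind of side-by-side comparison the text describes.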
Conclusion
The structure proposed in equation (6) was tested and validated by comparing the outcomes of two different modeling schemes, namely linear time series modeling and fuzzy modeling. These two forecasting systems showed a significant improvement over the pure persistence forecast. The impact of temperature on daily maximum ozone levels was confirmed by deducing the model from basic principles.
Furthermore, the accuracy of the results and the performance of the two models suggest that temperature explains a large proportion of variance in the daily maximum ozone levels. However, the accuracy of the models in forecasting very high ozone concentrations is very low. Ad hoc corrections may improve local forecasts, but such models will likely be site specific. Among the different forecasting systems, the fuzzy system showed better performance in terms of lower numbers of false positives for all validation datasets. Thus, fuzzy modeling can be used to develop an effective environmental warning system (EWS) in Kuwait.
References
Abdul-Wahab S, Bouhamra W, Ettouney H, Sowerby B, Crittenden BD: Analysis of ozone pollution in the Shuaiba industrial area in Kuwait. Int J Env Stud 2000, 57(2): 207–224. 10.1080/00207230008711267
Jallad KN, Jallad CE: Analysis of ambient ozone and precursor monitoring data in a densely populated residential area of Kuwait. J Saudi Chem Soc 2010, 14: 363–372. 10.1016/j.jscs.2010.04.003
Tilton BE: Health effects of tropospheric ozone. Environ Sci Technol 1989, 23: 257–263. 10.1021/es00180a002
Lefohn AS, Edwards PJ, Adams MB: The characterization of ozone exposures in rural west Virginia and Virginia. J Air Waste Manag Assoc 1994, 44: 1276–1283.
Dimitriades B: Photochemical oxidant formation: overview of current knowledge and emerging issues. In Atmospheric ozone research and its policy implications. Studies in Environmental Science, 35. Amsterdam: Elsevier Science Publishers; 1989:35–43.
Telenta B, Alfksic N, Dacic M: Application of the operational synoptic model for pollution forecasting in accidental situations. Atmos Env 1995, 28: 2885–2891.
Robeson SM, Steyn DG: Evaluation and comparison of statistical forecast models for daily maximum ozone concentrations. Atmos Env 1990, 24B: 303–312.
Elsom D: Smog alert: managing urban Air quality. London: Earthscan Publications Limited; 1996.
Noordijk H: The national smog warning system in the Netherlands; a combination of measuring and modeling. In Air pollution, 2. Pollution Control and Monitoring. Southampton: WIT Press; 1994:169–176.
Yi J, Prybutok VR: A neural network model forecasting for prediction of daily maximum ozone concentration in an industrialized urban area. Environ Pollut 1996, 92: 349–357. 10.1016/02697491(95)00078X
Federal Research Division: Kuwait: A country study. Whitefish, Montana: Kessinger Publishing; 1993.
Arab Times: Kuwait plans Oil production increase to 3.2 Million BPD. 2012. Retrieved from http://www.gulfbase.com/news/kuwaitplansoilproductionincreaseto3–2millionbpd/218917
Organization of the Petroleum Exporting Countries: World Oil outlook. Vienna, Austria: OPEC Secretariat; 2009.
Ettouney RS, Abdul-Wahab S, Elkilani AS: Emissions inventory, ISCST, and neural network: modeling of air pollution in Kuwait. Int J Env Stud 2009, 66: 181–194.
Ettouney RS, Mjalli FS, Ettouney H, Zaki JG, El-Rifai MA, Ettouney H: Forecasting ozone pollution using artificial neural networks. Mgmt Env Quality 2009, 20: 668–683. 10.1108/14777830910990843
Ettouney RS, Mjalli FS, Zaki JG, El-Rifai MA, Ettouney HM: Forecasting of ozone pollution using artificial neural networks. Mgmt Environ Quality An Int J 2009, 20(6): 668–683. 10.1108/14777830910990843
Abdul-Wahab S, Bouhamra W, Ettouney H, Sowerby B, Crittenden BD: Predicting ozone levels: a statistical model for predicting ozone levels. Environ Sci Pollut Res 1996, 3: 195–204. 10.1007/BF02986958
Al-Alawi SM, Abdul-Wahab SA, Bakheit CS: Combining principal component regression and artificial neural networks for more accurate predictions of ground-level ozone. Env Model Soft 2008, 23(4): 396–403. 10.1016/j.envsoft.2006.08.007
Elkamel A, Abdul-Wahab S, Bouhamra W, Alper E: Measurement and prediction of ozone levels around a heavily industrialized area: a neural network approach. Adv Environ Res 2001, 5(1): 47–59. 10.1016/S10930191(00)000423
Abdul-Wahab SA, Al-Alawi SM: Assessment and prediction of tropospheric ozone concentration levels using artificial neural networks. Env Model Soft 2002, 17(3): 219–228. 10.1016/S13648152(01)000779
Seinfeld JH: Atmospheric chemistry and physics of Air pollution. New York: John Wiley & Sons; 1986.
Pryor SC, Steyn DG: Hebdomadal and diurnal cycles in ozone time series from the lower Fraser valley, B. C. Atmospheric Environ 1995, 29: 1007–1019. 10.1016/13522310(94)00365R
Jorquera H, Perez R, Acuna G: Forecasting ozone daily maximum levels at Santiago, Chile. Atmos Env 1998, 32: 3415–3424. 10.1016/S13522310(98)000351
McCollister GM, Wilson KR: Linear stochastic models for forecasting daily maxima and hourly concentrations of Air pollutants. Atmos Env 1975, 9: 417–423. 10.1016/00046981(75)901274
Ljung L: System identification: theory for the user. Englewood Cliffs, New Jersey: Prentice-Hall; 1987.
Ljung L: System identification toolbox for use with MATLAB. Natick, MA: The Math Works Inc; 1991.
Takagi T, Sugeno M: Fuzzy identification of systems and its applications to modeling and control. IEEE Transact Sys Man Cybernetics 1985, 1: 116–132.
Rousseeuw PJ, Kaufman L, Trauwaert E: Fuzzy clustering using scatter matrices. Comput Stat Data Analysis 1996, 23: 135–151. 10.1016/S01679473(96)000266
Johanyak ZC, Kovacs J: Fuzzy model based prediction of ground-level ozone concentration. Acta Technica Jaurinensis 2011, 4: 113–125.
Sugeno M, Yasukawa T: A fuzzy-logic based approach to qualitative modeling. IEEE Trans Fuzzy Syst 1993, 1: 7–31.
Karlaftis MG, Vlahogianni EI: Statistical methods versus neural networks in transportation research: differences, similarities and some insights. Transp Res 2011, 19: 387–399. 10.1016/j.trc.2010.10.004
Additional information
Competing interests
The authors declare that they have no competing interests.
Keywords
 Ground-level ozone
 Fuzzy analysis
 Ozone forecast
 Time series ozone forecast
 Time series environmental warning system
 Kuwait