A Novel Nonlinear Combination Model Based on Support Vector Machine for Rainfall Prediction
Kesheng Lu, Department of Mathematics and Computer Sciences, Guangxi Normal University for Nationalities, Chongzuo, Guangxi, China. Email: [email protected]
Lingzhi Wang, Department of Mathematics and Computer Science, Liuzhou Teachers College, Liuzhou, Guangxi, China. Email: [email protected]
Abstract—In this study, a novel modular-type Support Vector Machine (SVM) is presented to simulate rainfall prediction. First, a bagging sampling technique is used to generate different training sets. Second, SVMs with different kernel functions and different parameters, i.e., base models, are trained on the different training sets to formulate different regressors. Third, Partial Least Squares (PLS) technology is used to choose the appropriate number of SVR combination members. Finally, an SVM is produced by learning from all base models. The technique is applied to forecast monthly rainfall in Guangxi, China. Empirical results show that the predictions of the SVM combination model are generally better than those obtained using the other models presented in this study under the same evaluation measurements. Our findings reveal that the nonlinear ensemble model proposed here can be used as an alternative forecasting tool for meteorological applications, achieving greater forecasting accuracy and further improving prediction quality.

Keywords—support vector machine; kernel function; partial least squares; rainfall prediction
I. INTRODUCTION

Rainfall forecasting has been a difficult subject in hydrology due to the complexity of the physical processes involved and the variability of rainfall in space and time [1], [2]. With the development of science and technology, and in particular of intelligent computing over the past few decades, many emerging techniques, such as the artificial neural network (ANN), have been widely used in rainfall forecasting and have obtained good results [3], [4], [5]. ANNs are computerized intelligence systems that simulate the inductive power and behavior of the human brain. They have the ability to generalize and to see through noise and distortion, to abstract essential characteristics in the presence of irrelevant data, and to provide a high degree of robustness and fault tolerance [6], [7]. Many experimental results in the research literature demonstrate that ANN rainfall forecasting models outperform multiple regression, moving averages and exponential smoothing. However, ANN approaches lack strict theoretical support, and the effectiveness of their applications depends strongly on the operator's experience. In practical applications, ANNs often exhibit inconsistent and unpredictable performance on noisy data [8].

978-0-7695-4335-2/11 $26.00 © 2011 IEEE. DOI 10.1109/CSO.2011.50

Recently, the support vector machine (SVM), a novel machine learning algorithm, was developed by Vapnik and his colleagues [9]. It is a learning machine based on statistical learning theory which adheres to the principle of structural risk minimization, seeking to minimize an upper bound of the generalization error rather than the training error (the principle followed by ANNs) [10], [11]. When using SVM, two main problems are confronted: how to choose the kernel function and how to set the best kernel parameters. Proper parameter settings can improve SVM regression accuracy, and different kernel functions and parameter settings can cause significant differences in performance. Unfortunately, there are no analytical methods or strong heuristics that can guide the user in selecting an appropriate kernel function and good parameter values.

In order to overcome these drawbacks, a novel technique is introduced. The generic idea consists of three phases. First, an initial data set is transformed into several different training sets. Based on the different training sets, SVMs with different kernel functions and different parameter settings are trained to formulate different regression forecasters. Finally, an SVM is produced by learning from all base models. The rainfall data of Guangxi are predicted as a case study for the development of the rainfall forecasting model.

The rest of this study is organized as follows. Section 2 describes the triple-phase SVM process in detail. For further illustration, Section 3 employs the method to set up a prediction model for rainfall forecasting. Finally, some concluding remarks are drawn in Section 4.

II. THE BUILDING PROCESS OF THE NONLINEAR ENSEMBLE MODEL

Originally, SVM was presented to solve pattern recognition problems.
However, with the introduction of Vapnik's ε-insensitive loss function, SVM has been extended to solve nonlinear regression estimation problems via the technique known as support vector regression (SVR) [12], which has been shown to exhibit excellent performance. At present, SVR is emerging as an alternative and powerful technique for solving nonlinear regression problems. It has achieved great success on both academic and industrial platforms due to its many attractive features and promising generalization performance.

A. Support Vector Regression

The SVR model maps the data nonlinearly into a higher-dimensional feature space, in which it undertakes linear regression. Rather than minimizing the empirical error, SVR aims to minimize the upper limit of the generalization error. Suppose we are given training data {(x_i, y_i)}_{i=1}^{l}, where x_i ∈ R^n is the input vector, y_i ∈ R is the output value and l is the total number of data points. The modelling aim is to identify a regression function, y = f(x), that accurately predicts the outputs corresponding to a new set of input-output examples, (x, y). The linear regression function (in the feature space) is described as follows:

    f(x) = w · φ(x) + b,   φ: R^n → F,          (1)

where w and b are coefficients, and φ(x) denotes the high-dimensional feature space which is nonlinearly mapped from the input space x.
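To make Eq. (1) concrete, the sketch below fits ε-SVR models with the three kernels named later in this section on a synthetic one-dimensional series. The use of scikit-learn, the data and the parameter values (C, ε, γ) are this sketch's own assumptions, not the authors' setup.

```python
import numpy as np
from sklearn.svm import SVR

# Synthetic 1-D regression task standing in for a rainfall series.
rng = np.random.RandomState(0)
X = np.sort(rng.uniform(0, 10, 200)).reshape(-1, 1)
y = np.sin(X).ravel() + 0.1 * rng.randn(200)

# epsilon-SVR with the three popular kernels; C, epsilon and gamma
# are illustrative values, not tuned settings.
models = {
    "linear": SVR(kernel="linear", C=10.0, epsilon=0.05),
    "polynomial": SVR(kernel="poly", degree=3, C=10.0, epsilon=0.05),
    "gaussian": SVR(kernel="rbf", gamma=0.5, C=10.0, epsilon=0.05),
}
for name, model in models.items():
    model.fit(X, y)
    print("%-10s R^2 = %.3f" % (name, model.score(X, y)))
```

On this deliberately nonlinear series the Gaussian kernel fits far better than the linear one, which is the motivation for treating the kernel choice itself as a source of ensemble diversity below.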
This primal optimization problem is a linearly constrained quadratic programming problem [13], which can be solved by introducing Lagrangian multipliers and applying the Karush-Kuhn-Tucker (KKT) conditions to obtain its dual problem:

    max W(α, α*) = Σ_{i=1}^{l} y_i (α_i − α_i*) − ε Σ_{i=1}^{l} (α_i + α_i*)
                   − (1/2) Σ_{i=1}^{l} Σ_{j=1}^{l} (α_i − α_i*)(α_j − α_j*) K(x_i, x_j)
    s.t.  Σ_{i=1}^{l} (α_i − α_i*) = 0,
          0 ≤ α_i, α_i* ≤ C,  i = 1, 2, ..., l,          (2)

where α_i and α_i* are the Lagrangian multipliers associated with the constraints, and K(x_i, x_j) is the kernel function, whose value equals the inner product of the two vectors x_i and x_j in the feature space, i.e., K(x_i, x_j) = φ(x_i) · φ(x_j). In machine learning theory, the popular kernel functions are the linear kernel, the polynomial kernel and the Gaussian kernel.

B. Generating Individual SVR Predictors

Following Breiman's work on the bias-variance trade-off [14], an ensemble regression model consisting of diverse models with much disagreement is more likely to generalize well [15]. Therefore, how to generate diverse models is a crucial factor. For the SVR model, several methods have been investigated for generating ensemble members that make different errors. Such methods basically depend on varying the kernel function, varying the parameters of the SVR, or utilizing different training sets. In this paper, three methods are used to generate diverse models: (1) using different types of SVR kernel function, such as the linear kernel and the polynomial kernel; (2) using different SVR parameters: by varying, for example, the cluster centers and cluster radii of the SVR, different SVRs can be produced; (3) using different training data: by re-sampling and preprocessing the data, different training sets can be obtained.

C. Selecting Appropriate Ensemble Members

After training, each individual predictor has generated its own result. However, if there are a great number of individual members, we need to select a subset of representatives in order to improve ensemble efficiency. In this paper, the Partial Least Squares (PLS) regression technique [16] is adopted to select appropriate ensemble members. Interested readers are referred to [16] for more details.

D. ν-Support Vector Regression

If the proper hyper-parameters are picked, SVR gains good generalization performance, and vice versa, so it is important to select the right model. Instead of selecting an appropriate ε, Schölkopf et al. proposed a variant, called ν-support vector regression (ν-SVR), which introduces a new parameter ν that controls the number of support vectors and training errors without defining ε a priori. More precisely, they proved that ν is an upper bound on the fraction of margin errors and a lower bound on the fraction of support vectors [12]. The ν-SVR problem on the data set can be described as follows:

    min Φ(w, ξ, ξ*, ε) = (1/2) ||w||² + C (νε + (1/l) Σ_{i=1}^{l} (ξ_i + ξ_i*))
    s.t.  (w · φ(x_i) + b) − y_i ≤ ε + ξ_i,
          y_i − (w · φ(x_i) + b) ≤ ε + ξ_i*,
          ξ_i, ξ_i* ≥ 0,  i = 1, 2, ..., l,  ε ≥ 0,          (3)

where 0 ≤ ν ≤ 1, C is the regularization parameter, and the training data are mapped into a high- (even infinite-) dimensional feature space by the mapping function φ(x).

E. The Establishment of the Combination Forecasting Model

To summarize, the proposed nonlinear combination forecasting model consists of four main stages. In the first stage, the initial data set is divided into different training sets by using bagging and boosting technology. In the second stage, these training sets are input to the different individual SVM regression models, and various single SVM regression predictors are produced
based on the diversity principle. In the third stage, the PLS model is used to choose the appropriate number of SVR ensemble members. In the fourth stage, ν-SVM regression is used to aggregate the selected ensemble members. In this way, the final combination forecasting results are obtained. The basic flow diagram is shown in Fig. 1.
Figure 1. Flow diagram of the combination model: the training set TR is resampled into M training sets; each individual predictor SVR 1, SVR 2, ..., SVR M produces its own output; PLS selection chooses the appropriate members; and ν-SVR aggregates them into the final forecast.

III. EXPERIMENTAL RESULTS AND DISCUSSION

A. Empirical Data

This study investigates ν-SVM regression modelling to predict average monthly precipitation from January 1965 to December 2009 in Guangxi. The data set thus contains 540 data points in a time series, of which 500 were used as training samples for ν-SVM regression learning and the remaining 40 were used as testing samples for assessing the generalization ability of ν-SVM regression. The modelling method is one-step-ahead prediction; that is, only one sample is forecast each time, and one training sample is added each time on the basis of the previous training.

B. Performance Evaluation of the Model

In order to measure the effectiveness of the proposed method, three types of error measure are used in this paper: the Normalized Mean Squared Error (NMSE), the Mean Absolute Percentage Error (MAPE) and the Pearson Relative Coefficient (PRC), definitions of which can be found in many papers [6]. In order to investigate the effect of the proposed model, a simple averaging ensemble, a mean squared error (MSE) based regression ensemble and a variance-based weighted ensemble are also established. These models are fitted to the 500 training samples and used to forecast the 40 testing samples, and the comparison results are used to test the effectiveness of the predictive models.

C. Analysis of the Results

Table I illustrates the fitting accuracy and efficiency of the models in terms of various evaluation indices for the 500 training samples. From the table, we can generally see that the learning ability of the ν-SVM regression ensemble outperforms the other three models under the same network input. A more important measure of a method's performance is its forecasting ability on the testing samples, which is what matters for actual rainfall applications.

Table I. A comparison of the fitting results of four different models on the 500 training samples (models listed in the same order as Table II):

    Ensemble Model       NMSE     MAPE     PRC
    simple averaging     0.0976   0.7360   0.7654
    MSE ensemble         0.1029   0.3456   0.7892
    variance weighted    0.0452   0.3211   0.9350
    ν-SVM regression     0.0374   0.2486   0.9766
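The three evaluation indices are only named in the text; the definitions below are the common forms, assumed here to match the paper's usage (MAPE is left as a fraction rather than a percentage, since the reported values are all below 1).

```python
import numpy as np

def nmse(y_true, y_pred):
    # Mean squared error normalized by the variance of the observations.
    y_true, y_pred = np.asarray(y_true, float), np.asarray(y_pred, float)
    return np.mean((y_true - y_pred) ** 2) / np.var(y_true)

def mape(y_true, y_pred):
    # Mean absolute percentage error as a fraction (no factor of 100).
    y_true, y_pred = np.asarray(y_true, float), np.asarray(y_pred, float)
    return np.mean(np.abs((y_true - y_pred) / y_true))

def prc(y_true, y_pred):
    # Pearson correlation coefficient between observations and forecasts.
    return np.corrcoef(y_true, y_pred)[0, 1]

# Tiny worked example with hypothetical rainfall values (mm).
y_obs = np.array([120.0, 80.0, 200.0, 150.0])
y_hat = np.array([110.0, 90.0, 190.0, 160.0])
print("NMSE=%.4f MAPE=%.4f PRC=%.4f"
      % (nmse(y_obs, y_hat), mape(y_obs, y_hat), prc(y_obs, y_hat)))
```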
Figure 2 shows the forecasting results of the four different models for the 40 testing samples; the forecasting results of the ν-SVR ensemble model are the best among all the models. Table II shows the forecasting performance of the four different models from different perspectives in terms of various evaluation indices. From the figure and the table, we can generally see that the forecasting results are very promising for rainfall forecasting, both in terms of goodness of fit and in terms of forecasting effectiveness.
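The one-step-ahead scheme of Section III-A (forecast one sample, then add it to the training set and refit) can be sketched as a rolling-origin loop. The synthetic seasonal series, the 12-month lag structure and the ν-SVR settings below are illustrative assumptions, not the paper's configuration.

```python
import numpy as np
from sklearn.svm import NuSVR

# Synthetic monthly series standing in for the 540-point Guangxi record.
rng = np.random.RandomState(2)
t = np.arange(540)
series = 200 + 150 * np.sin(2 * np.pi * t / 12) + 20 * rng.randn(540)

def lagged(series, p):
    # Build (X, y) pairs from p lagged values.
    X = np.array([series[i:i + p] for i in range(len(series) - p)])
    return X, series[p:]

p = 12                      # use the previous 12 months as inputs
X, y = lagged(series, p)
split = 500 - p             # first 500 raw points train, last 40 test
preds = []
for i in range(40):
    # Refit on all data seen so far, forecast exactly one step ahead,
    # then roll the origin forward by one sample.
    model = NuSVR(nu=0.5, C=100.0).fit(X[:split + i], y[:split + i])
    preds.append(model.predict(X[split + i:split + i + 1])[0])
preds = np.array(preds)
print("forecast 40 test months, MAE: %.1f mm" % np.abs(preds - y[split:]).mean())
```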
Figure 2. Forecasting results of the four models for the 40 testing samples: actual monthly rainfall (mm) compared with the simple averaging, MSE regression, variance-based weighting and ν-SVR combination forecasts.
As shown in Table II for the rainfall forecasts of the four different models, the differences among the models are very significant. For example, the NMSE of the simple averaging ensemble model is 0.1285; similarly, the NMSE of the MSE ensemble model is 0.0955 and that of the variance weighted ensemble model is 0.0653, whereas the NMSE of the ν-SVM regression model reaches 0.0221. The NMSE result of the ν-SVM regression model thus has obvious advantages over the three other models. Subsequently,
for the MAPE efficiency index, the value of the proposed ν-SVM regression model is also the smallest.

Table II. A comparison of the forecasting results of four different models on the 40 testing samples:

    Ensemble Model       NMSE     MAPE     PRC
    simple averaging     0.1285   0.8710   0.6726
    MSE ensemble         0.0955   0.4381   0.7965
    variance weighted    0.0653   0.4109   0.8820
    ν-SVM regression     0.0221   0.2053   0.9341

IV. CONCLUSION

Accurate rainfall forecasting is crucial for regions prone to frequent, unanticipated flash floods, in order to avoid loss of life and economic losses. This paper proposes a novel nonlinear combination forecasting method based on the ν-SVR principle, and the model was applied to forecasting monthly rainfall in Guangxi. Across the different forecasting models, empirical results show that the developed model performs best for monthly rainfall on the basis of different criteria. Our experimental results demonstrate the successful application of the proposed new model, ν-SVM regression, to this complex forecasting problem: it increased rainfall forecasting accuracy more than any other model employed in this study under the same measurements. The ν-SVM regression ensemble forecasting model can therefore be used as an alternative tool for monthly rainfall forecasting to obtain greater forecasting accuracy.

ACKNOWLEDGMENT

The authors would like to express their sincere thanks to the editor and the anonymous reviewers for their comments and suggestions for the improvement of this paper. This work was supported in part by the Guangxi Natural Science Foundation under Grant No. 0832092, and in part by the Department of Guangxi Education under Grant No. 200707MS061.

REFERENCES

[1] Lihua Xiong and K. M. O'Connor, "An empirical method to improve the prediction limits of the GLUE methodology in rainfall-runoff modeling," Journal of Hydrology, Vol. 349, pp. 115-124, 2008.
[2] G. H. Schmitz and J. Cullmann, "PAI-OFF: A new proposal for online flood forecasting in flash flood prone catchments," Journal of Hydrology, Vol. 360, pp. 1-14, 2008.
[3] Jiansheng Wu, Liangyong Huang and Xiongming Pan, "A novel Bayesian additive regression trees ensemble model based on linear regression and nonlinear regression for torrential rain forecasting," Proceedings of the Third International Joint Conference on Computational Sciences and Optimization, eds. K. K. Lai, Yingwen Song and Lean Yu, IEEE Computer Society Press, Vol. 2, pp. 484-487, 2010.
[4] Jiansheng Wu, "A novel nonparametric regression ensemble for rainfall forecasting using particle swarm optimization technique coupled with artificial neural network," Lecture Notes in Computer Science, Vol. 5553, pp. 49-58, Springer-Verlag Berlin Heidelberg, 2009.
[5] G. F. Lin and L. H. Chen, "Application of an artificial neural network to typhoon rainfall forecasting," Hydrological Processes, Vol. 19, pp. 1825-1837, 2005.
[6] Jiansheng Wu and Long Jin, "Study on the meteorological prediction model using the learning algorithm of neural network based on PSO algorithms," Journal of Tropical Meteorology, Vol. 16, No. 1, pp. 83-88, 2009.
[7] R. S. Govindaraju, "Artificial neural networks in hydrology, I: Preliminary concepts," Journal of Hydrologic Engineering, Vol. 5, No. 2, pp. 115-123, 2000.
[8] W. C. Hong, "Rainfall forecasting by technological machine learning models," Applied Mathematics and Computation, Vol. 200, pp. 41-57, 2008.
[9] V. Vapnik, The Nature of Statistical Learning Theory. New York: Springer Press, 1995.
[10] F. E. H. Tay and L. Cao, "Modified support vector machines in financial time series forecasting," Neurocomputing, Vol. 48(1-4), pp. 847-861, 2002.
[11] V. Vapnik, S. Golowich and A. Smola, "Support vector method for function approximation, regression estimation and signal processing," in M. Mozer, M. Jordan and T. Petsche (eds.), Advances in Neural Information Processing Systems, Vol. 9, pp. 281-287. Cambridge, MA: MIT Press, 1997.
[12] B. Schölkopf, A. Smola, R. C. Williamson and P. L. Bartlett, "New support vector algorithms," Neural Computation, Vol. 12, No. 5, pp. 1207-1245, 2000.
[13] V. Vapnik, S. Golowich and A. Smola, "Support vector method for function approximation, regression estimation and signal processing," in M. Mozer, M. Jordan and T. Petsche (eds.), Advances in Neural Information Processing Systems, Vol. 9, pp. 281-287. Cambridge, MA: MIT Press, 1997.
[14] L. Breiman, "Combining predictors," in Combining Artificial Neural Nets: Ensemble and Modular Multi-Net Systems, Springer Press, Berlin, Vol. 1, pp. 31-50, 1999.
[15] J. A. Benediktsson, J. R. Sveinsson, O. K. Ersoy and P. H. Swain, "Parallel consensual neural networks," IEEE Transactions on Neural Networks, Vol. 8, pp. 54-64, 1997.
[16] D. M. Pirouz, "An overview of partial least squares," Technical report, The Paul Merage School of Business, University of California, Irvine, 2006.