Figure 21 shows the hourly load from May 2, 1991, to August 6, 1991. Loads on weekends and on holidays (Christmas Day, Labor Day and Freedom Day) are marked with green and red rectangles, respectively. The figure makes it evident that the load exhibits several seasonal patterns, including daily and weekly periodicity. Moreover, load levels on weekends and holidays are much lower than on working days.

This suggests that load demand is strongly influenced by calendar effects. Forecasting the load on holidays is nevertheless challenging, because holidays are rare and their load patterns differ both from normal workdays and from one another. For simplicity, this study treats weekends and holidays alike.

Figure 22: Mean hourly load for each day of the week (x-axis: time in hours).

Figure 22 portrays the mean hourly load for each day of the week. The load varies from hour to hour because of consumer behavior. The curves for all working days except Friday have similar shapes and magnitudes, and working days have a higher load demand than weekends. Friday, however, has a load level between those of weekends and working days. The relationship between temperature and load demand is shown in Figure 23, where a nonlinear relationship can be observed. This confirms that temperature is worth including among the input variables.

Figure 23: Relationship between temperature and load (x-axis: temperature in °F).

Considering the short-run behavior and the daily periodicity of hourly loads, this study selects the hourly loads of the preceding 12 hours and of the same hours in the previous week as the primary inputs of the prediction model. Temperature variables for the same lagged periods were added, together with the predicted temperature for the forecasted hour. The average temperature measured at the same hour and date over the previous five years was used as the predicted temperature. Many participants in GEFCom2014, for instance [119, 120], used the average temperature as the prediction in their approaches. In addition, hourly and daily calendar indicators were used to encode the hour of the day and the day of the week. The candidate set of input variables used to predict the load L(t) is summarized in equation (4.2).
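As an illustration, the candidate pool described in this paragraph can be enumerated in a few lines of Python. This is only a sketch under stated assumptions: the lags are taken to be hours 1 to 12 plus the same hour on each of the previous seven days (24, 48, ..., 168), the temperature terms are assumed to use the same lags plus lag 0 for the predicted temperature, and the function name `candidate_inputs` and the string labels are hypothetical.

```python
def candidate_inputs(recent_hours=12, days_back=7):
    """Enumerate candidate input labels for predicting L(t).

    Assumption (not confirmed by the source): the lags are the previous
    `recent_hours` hours plus the same hour on each of the previous
    `days_back` days, i.e. 24, 48, ..., 24 * days_back.
    """
    recent = list(range(1, recent_hours + 1))
    daily = [24 * d for d in range(1, days_back + 1)]
    load_lags = recent + daily
    temp_lags = [0] + recent + daily  # lag 0 stands for the predicted temperature

    names = [f"L(t-{i})" for i in load_lags]
    names += ["T(t)" if i == 0 else f"T(t-{i})" for i in temp_lags]
    names += ["HI(t)", "DI(t)"]  # hourly and daily calendar indicators
    return names


inputs = candidate_inputs()
print(len(inputs))  # 19 load lags + 20 temperature terms + 2 indicators = 41
```

Under these assumptions the pool contains 41 candidates, which matches the size of the search space reported for the wrapper.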
In equation (4.2), L(t - i) denotes the load lagged by i hours; T(t - i) denotes the temperature lagged by i hours, with T(t), the lag-0 case, being the predicted temperature for the forecasted hour; and HI(t) and DI(t) denote the hourly and daily calendar indicators, respectively. DI(t) is set to -2 for weekends and public holidays, 0 for Friday, and 1 for all other working days. HI(t) takes the values 1, 2, ..., 24 for the corresponding hours of the day. In total, there are (12+7) + (13+7) + 2 = 41 candidate inputs in the set. On the same basis, equation (4.2) also gives the pool of input variables for the load series in the second dataset, except for Region 9. Region 9 of the second dataset shows very different demand patterns that do not appear to correspond to temperature. Thus, the temperature terms of equation (4.2) were excluded when modeling the load series of Region 9.

4.4.2 Evaluation Metrics

This study evaluates the performance of the forecasting models using the MAPE in equation (4.3) and the MASE in equation (4.4), defined as:

$$\mathrm{MAPE} = \frac{1}{N}\sum_{i=1}^{N}\left|\frac{y_{t+i}-\hat{y}_{t+i}}{y_{t+i}}\right| \times 100\% \tag{4.3}$$

$$\mathrm{MASE} = \frac{1}{N}\sum_{i=1}^{N}\frac{\left|y_{t+i}-\hat{y}_{t+i}\right|}{\mathrm{MAE}_{\mathrm{naive}}} \tag{4.4}$$

where N is the forecasting horizon, $y_{t+i}$ is the actual value at period t + i, $\hat{y}_{t+i}$ is the predicted value at period t + i, and $\mathrm{MAE}_{\mathrm{naive}}$ is the in-sample mean absolute error of the naive forecast. Because the next-day (24-hour) load is forecast repeatedly in this study, the horizon N equals 24. The MAPE is one of the most widely used measures; it expresses the difference between the actual and predicted values as a percentage.
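The two metrics can be computed as in the following sketch. It uses the common definition of the MASE, in which the forecast MAE is divided by the in-sample MAE of a naive forecast that repeats the value observed `season` steps earlier; whether the study scaled by exactly this naive model is an assumption, and the function names and the `history` and `season` parameters are illustrative.

```python
def mape(actual, predicted):
    """Mean absolute percentage error, in percent (actual values must be nonzero)."""
    if len(actual) != len(predicted) or len(actual) == 0:
        raise ValueError("actual and predicted must be non-empty and of equal length")
    total = sum(abs((a - p) / a) for a, p in zip(actual, predicted))
    return 100.0 * total / len(actual)


def mase(actual, predicted, history, season=1):
    """Forecast MAE scaled by the in-sample MAE of a naive forecast.

    Assumption: the naive forecast repeats the value observed `season`
    steps earlier in the training series `history`.
    """
    if len(actual) != len(predicted) or len(actual) == 0:
        raise ValueError("actual and predicted must be non-empty and of equal length")
    if len(history) <= season:
        raise ValueError("history must be longer than the naive lag")
    naive_errors = [abs(history[j] - history[j - season])
                    for j in range(season, len(history))]
    scale = sum(naive_errors) / len(naive_errors)
    mae = sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)
    return mae / scale


actual = [100.0, 200.0]
predicted = [110.0, 190.0]
history = [90.0, 100.0, 120.0, 110.0]
print(mape(actual, predicted))
print(mase(actual, predicted, history))
```

A MASE below one means the forecast has a smaller average error than the naive forecast had on the training series.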
Smaller MAPE values indicate predictions closer to the actual values. The MASE is a scaled error, in which the forecast error is scaled by that of a naive forecasting model. Its value is less than one if the forecast is better than the naive method, and the smaller the MASE, the better the forecast relative to the naive method. The MASE is often the recommended measure because it is less sensitive to outliers and easy to interpret.

4.5 Identified Counterparts and their Implementations

Five other feature selection techniques were selected as reference points for comparison, to demonstrate the benefits of the proposed wrapper technique for STLF using SVR [123]. These counterparts and the proposed approach are abbreviated as follows:

(1) Full: the SVR forecasting model with all candidate inputs.
(2) MI: the SVR approach with MI-based feature selection [124].
(3) Two Stage: the SVR approach with two-stage feature selection.
(4) PMI: the SVR approach with PMI-based feature selection.
(5) FA: the SVR approach with FA-based feature selection.
(6) H-FW: the SVR approach with the proposed hybrid filter-wrapper feature selection.

‘Full’ uses no feature selection. ‘MI’ and ‘Two Stage’ are existing techniques from previous studies [122]. ‘MI’ removes only irrelevant features, based on MI. ‘Two Stage’ first removes irrelevant features based on MI and then removes redundant features based on the MI between pairs of candidate inputs. PMI and FA are the two techniques described in the previous sections. ‘MI’, ‘PMI’ and ‘Two Stage’ are filter techniques.
FA is a wrapper method, and H-FW is the proposed hybrid filter-wrapper approach. SVR is used as the forecasting model in all of the methods above. An essential step in training SVR is the choice of the kernel function and the hyperparameters. The kernel function was chosen after preliminary experiments. Because the data series is not long, the computational cost of SVR is not a major concern, so the hyperparameters, including C, were tuned by grid search. The SVR models were evaluated on one-day-ahead forecasting, and the multi-step-ahead forecasts were obtained recursively. Note that the input variables are selected for predicting the load of the next hour. Although several studies have investigated strategies for multi-step-ahead prediction, that question is outside the scope of this work, and only the recursive strategy is used here.

All experiments were carried out in MATLAB 2014 on a Compaq computer with a Core 4 Duo T6850 CPU at 6.00 GHz and 4 GB of RAM. The six SVR-based methods differ only in the feature selection method used. The parameters of ‘MI’ and ‘Two Stage’ were set according to their original references. The bootstrap size in PMI was set to twenty. For the FA parameters, the population size was set to 31, the attractiveness to 001, and the coefficient to 121; the search stops when the number of iterations reaches 151 or when the fitness does not improve for twenty consecutive steps.
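The recursive strategy mentioned in this section can be sketched as follows. A one-step-ahead model is applied repeatedly, and each prediction is appended to the series so that it can serve as a lagged input for the next step. The sketch is deliberately simplified: `predict_one` is a hypothetical stand-in for a trained one-step model such as the SVR, and only lagged loads are used as inputs, without the temperature and calendar indicators.

```python
from typing import Callable, List, Sequence


def recursive_forecast(history: Sequence[float],
                       predict_one: Callable[[List[float]], float],
                       lags: Sequence[int],
                       horizon: int = 24) -> List[float]:
    """Forecast `horizon` steps ahead by feeding predictions back as inputs.

    `predict_one` maps the lagged values [y(t-lag) for lag in lags] to a
    one-step-ahead prediction; it is a placeholder for any trained model.
    """
    series = list(history)
    if len(series) < max(lags):
        raise ValueError("history is shorter than the largest lag")
    forecasts = []
    for _ in range(horizon):
        features = [series[-lag] for lag in lags]
        y_hat = predict_one(features)
        forecasts.append(y_hat)
        series.append(y_hat)  # the prediction becomes an input for later steps
    return forecasts


# Usage with a toy model that averages the two most recent values.
toy_model = lambda features: sum(features) / len(features)
print(recursive_forecast([1.0, 2.0, 3.0], toy_model, lags=[1, 2], horizon=3))
```

Because later steps depend on earlier predictions, errors can accumulate over the 24-hour horizon, which is the usual trade-off of the recursive strategy compared with training one model per step.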
Finally, each model was run 12 times, and the mean results are reported.

4.6 Results - The North American Electrical Utility

Table 6 summarizes the feature selection results of the six techniques for forecasting the hourly loads in January 1991; only this month is shown because of space constraints. The table reports the number of selected inputs (‘No.’), the list of selected inputs (‘Input (lags)’), the time spent on feature selection (‘Tfs (min)’), the time spent on training and forecasting with the selected inputs (‘Ttf (min)’), the time spent on the whole procedure (‘Tall (min)’), and the corresponding MAPE (‘MAPE’). The smallest value is shown in bold. As seen from the table, although all candidate inputs were used to build its forecasting model, ‘Full’ performs worst, with a higher MAPE than the others, which shows the need for feature selection in STLF. This is because irrelevant features cause the model to overfit. The other five feature selection techniques reduce the dimensionality, so more compact models are obtained without reducing the prediction accuracy.

Among the three filter techniques, ‘MI’ and ‘Two Stage’ select relatively more inputs than ‘PMI’. This is because the MI-based approach removes irrelevant features by computing the MI between each candidate variable and the output, so the inputs selected by ‘MI’ contain many redundant features. ‘Two Stage’ removes redundant features in its second step, but it does not take the already selected inputs into account. ‘PMI’, on the other hand, selects essential features (those that are relevant and not redundant) step by step, based on partial mutual information.
‘FA’ selects relatively more inputs than ‘PMI’ and the hybrid ‘H-FW’, mainly because of the difficulty of searching the 41-dimensional search space. Its feature selection time (‘Tfs (min)’) in Table 6 is also longer, because it involves training the SVR and searching a high-dimensional space. ‘H-FW’ can identify a good feature subset efficiently because its search space is small. Selecting more inputs increases the training and forecasting time (‘Ttf (min)’). The overall time of the H-FW based model is often higher than that of the other counterparts, but this is acceptable for daily decision making. Figure 24 compares the monthly prediction accuracy obtained with the input subsets selected by the different feature selection methods.

Figure 24: Comparison of the MAPE and MASE for different models based on Case 1.

The reduced input sets in Table 6 can also be interpreted intuitively. For instance, among the previous 12 hours, the most recent hourly loads (lags 1-6) are closely related to the current load and are essential for prediction. Beyond the most recent hours, loads at similar hours on earlier days are also important for forecasting, including the loads lagged by 36 and 168 hours. In addition, the temperature at the forecasted hour (T:1) is selected as an input by all five feature selection methods.

Figure 25 presents the average forecasting accuracy for each day of the week during the testing period. The results confirm the superiority of the proposed hybrid filter-wrapper forecasting method over the other techniques across the week. Moreover, forecast errors are higher on weekends than on workdays. This is because the load patterns of Friday and weekends are distinct from those of normal days, as Figure 22 also indicates.
Figure 26 illustrates the forecasting performance of the proposed model with an example of the actual and the predicted hourly load. It shows the actual load, the prediction, and the errors in March 1992. The predicted values are quite close to the actual values, and the errors are small.

Figure 25: Average testing MAPE for each day of the week.

Figure 26: Electricity load versus time in hours.

4.7 Statistical Test Analysis

A two-stage statistical analysis is used to test whether the results obtained by the six forecasting models differ significantly across the four test months of Case 1 and the 20 zonal loads of Case 2. The analysis uses the Friedman test followed by a step-wise post hoc test, both of which provide strong statistical evidence, and it combines the data from the two cases. The Friedman test computes the mean rank obtained by each model over all the data, to determine whether there are significant differences between the forecasting approaches based on their mean ranks. Table 7 compares the MASE of the models for Case 2. Table 8 shows the Friedman ranks and the p-value of the Iman and Davenport test based on the MAPE. The best technique, with the lowest rank, is highlighted in bold. The p-value of the Iman and Davenport test is lower than the significance level of 0.05, which indicates that the results of the six models differ significantly. To determine the statistical significance of the differences between the proposed approach and each of the others, a step-wise test, the Holm test, was applied at the significance levels of 0.05 and 0.1. Table 9 shows that the first four hypotheses are rejected at the 0.05 significance level because their p-values are less than the adjusted values (alpha/i).
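The Holm step can be sketched as follows, using the p-values listed in Table 9. The hypotheses are ordered from the smallest p-value to the largest, the k-th smallest is compared with alpha divided by the number of hypotheses still remaining, and testing stops at the first hypothesis that cannot be rejected. The function name `holm` and the strict "less than" comparison (taken from the wording of this section) are choices made for this illustration.

```python
def holm(p_values, alpha=0.05):
    """Holm step-down procedure.

    p_values: dict mapping a hypothesis label to its unadjusted p-value.
    Returns (label, p, threshold, rejected) tuples ordered by increasing p.
    """
    ordered = sorted(p_values.items(), key=lambda item: item[1])
    m = len(ordered)
    results = []
    still_rejecting = True
    for rank, (label, p) in enumerate(ordered):
        threshold = alpha / (m - rank)  # alpha/m, alpha/(m-1), ..., alpha/1
        rejected = still_rejecting and p < threshold
        if not rejected:
            still_rejecting = False  # once one test fails, all later ones fail
        results.append((label, p, threshold, rejected))
    return results


# p-values of the comparisons against the proposed method, as listed in Table 9.
table9 = {"Full": 1.24e-19, "MI": 7.17e-14, "Two Stage": 4.54e-07,
          "PMI": 1.32e-05, "FA": 6.49e-02}

for alpha in (0.05, 0.1):
    rejected = [label for label, _, _, r in holm(table9, alpha) if r]
    print(alpha, rejected)
```

With these p-values, the comparisons with Full, MI, Two Stage and PMI are rejected at the 0.05 level, while the comparison with FA is rejected only at the 0.1 level.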
The p-value of the last hypothesis, for FA, is small enough for it to be rejected at the 0.1 level.

Table 6: Results of the six methods for forecasting the hourly loads in March 1992.

Table 7: Comparison of the MASE of the different models for Case 2.

Zone   Full   MI     Two Stage   PMI    FA     FW
1      0.86   0.93   0.74        0.58   0.72   0.41
2      0.88   0.86   0.84        0.88   0.68   0.86
3      0.84   0.68   0.69        0.65   0.68   0.66
4      0.75   0.68   0.62        0.68   0.62   0.52
5      0.86   0.87   0.82        0.69   0.61   0.65
6      0.72   0.86   0.88        0.61   0.59   0.58
7      0.94   0.83   0.86        0.68   0.68   0.59
8      0.71   0.79   0.79        0.74   0.61   0.66
9      0.91   0.84   0.95        0.87   0.87   0.84
10     0.67   0.67   0.60        0.70   0.68   0.60
11     0.76   0.89   0.80        0.59   0.68   0.50
12     0.81   0.83   0.85        0.79   0.62   0.63
13     0.77   0.61   0.63        0.66   0.62   0.69
14     0.81   0.78   0.78        0.80   0.67   0.60
15     0.61   0.69   0.62        0.67   0.68   0.60
16     0.64   0.65   0.57        0.55   0.60   0.51
17     0.63   0.64   0.69        0.61   0.60   0.56
18     0.72   0.64   0.54        0.55   0.58   0.53
19     0.69   0.65   0.66        0.63   0.50   0.51
20     0.73   0.64   0.68        0.56   0.50   0.53
21     0.73   0.73   0.78        0.67   0.60   0.59

Table 8: Friedman test results for the different forecasting models.

Algorithm   Friedman rank   Iman and Davenport p-value   Hypothesis
Full        6.79
MI          6.10
Two Stage   4.83            2.61E-49                     Rejected
PMI         4.48
FA          3.68
FW          2.42

Table 9: Holm test results for the different forecasting models.

i   Algorithm   z-value   p-value      alpha(0.05)/i   alpha(0.1)/i
5   Full        9.92      1.24E-19**   1.00E-02        2.00E-02
4   MI          7.59      7.17E-14**   1.25E-02        2.50E-02
3   Two Stage   5.14      4.54E-07**   1.67E-02        3.33E-02
2   PMI         4.47      1.32E-05**   2.50E-02        5.00E-02
1   FA          1.95      6.49E-02*    5.00E-02        1.00E-01

Note: **: significant at the 0.05 level; *: significant at the 0.1 level.

4.8 Conclusion

Feature selection is an important step in STLF because it simplifies both the interpretation of the dataset and the learning of the forecasting model. This study proposes a hybrid filter-wrapper feature selection technique to address the feature selection problem. The proposed hybrid technique comprises a PMI-based filter and an FA-based wrapper.
First, the filter procedure removes irrelevant and redundant features to produce a reduced set of inputs. A wrapper technique is then applied to the reduced subset to find a small set of features with high prediction accuracy. Experimental results indicate that the proposed approach selects fewer input variables than the other methods and is a more efficient wrapper method. Thus, the hybrid filter-wrapper technique is a good alternative for feature selection in STLF. In this study, however, only the most common input variables were considered in the selection procedure. Future work should examine other relevant factors and more lagged variables to improve the prediction accuracy. Since the load patterns of special days are distinct from those of normal weekdays, building a dedicated prediction model for special days is worth investigating. Other directions for future research include a more comprehensive comparison of the proposed approach with other established state-of-the-art models, and applying the proposed technique to long-term load forecasting and electricity price forecasting.