Optimized Three Deep Learning Models Based-PSO Hyperparameters for Beijing PM2.5 Prediction

Andri Pranolo, Yingchi Mao, Aji Prasetya Wibawa, Agung Bella Putra Utama, Felix Andika Dwiyanto

Abstract


Deep learning is a machine learning approach that achieves excellent performance in a wide range of applications, including natural language processing, image recognition, and forecasting. The performance of a deep learning network depends heavily on its hyperparameter settings. This research optimizes the architectures of Long Short-Term Memory (LSTM), Convolutional Neural Network (CNN), and Multilayer Perceptron (MLP) networks for forecasting tasks using Particle Swarm Optimization (PSO), a swarm-intelligence-based metaheuristic: the proposed models are M-1 (PSO-LSTM), M-2 (PSO-CNN), and M-3 (PSO-MLP). The Beijing PM2.5 dataset was analyzed to measure the performance of the proposed models. PM2.5, the target variable, is affected by dew point, pressure, temperature, cumulated wind speed, hours of snow, and hours of rain. The network inputs cover three scenarios: daily, weekly, and monthly. The results show that the proposed M-1 with three hidden layers achieves the best RMSE and MAPE compared with M-2, M-3, and all baselines. These optimized models could support recommendations for air pollution management.
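The PSO-based hyperparameter search described in the abstract can be sketched in a minimal, framework-free form. This is an illustrative assumption, not the authors' implementation: the search space bounds, the PSO coefficients, and the stand-in objective `toy_val_rmse` are all hypothetical; in the paper, the objective would be the validation error (e.g., RMSE) of an actual LSTM, CNN, or MLP trained with the candidate hyperparameters.

```python
import random

def pso_search(objective, bounds, n_particles=10, n_iters=30,
               w=0.7, c1=1.5, c2=1.5, seed=42):
    """Minimize `objective` over a continuous hyperparameter box with PSO.

    bounds: list of (low, high) per dimension. Returns (best position, best value).
    """
    rng = random.Random(seed)
    dim = len(bounds)
    # Random initial positions inside the box; zero initial velocities.
    pos = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]
    pbest_val = [objective(p) for p in pos]
    g = min(range(n_particles), key=lambda i: pbest_val[i])
    gbest, gbest_val = pbest[g][:], pbest_val[g]
    for _ in range(n_iters):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                # Standard velocity update: inertia + cognitive + social terms.
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                lo, hi = bounds[d]
                # Move the particle and clamp it back into the search box.
                pos[i][d] = min(max(pos[i][d] + vel[i][d], lo), hi)
            val = objective(pos[i])
            if val < pbest_val[i]:
                pbest[i], pbest_val[i] = pos[i][:], val
                if val < gbest_val:
                    gbest, gbest_val = pos[i][:], val
    return gbest, gbest_val

# Hypothetical stand-in for the validation RMSE of a network given
# (number of hidden units, log10 learning rate); its minimum sits at (64, -3).
def toy_val_rmse(hp):
    units, log_lr = hp
    return (units - 64) ** 2 / 1000.0 + (log_lr + 3) ** 2

best, best_val = pso_search(toy_val_rmse, bounds=[(8, 128), (-5, -1)])
```

In the real setting, each call to `objective` would train and validate one candidate network, so the total cost is roughly `n_particles * n_iters` training runs; keeping the swarm small is the usual trade-off.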






DOI: http://dx.doi.org/10.17977/um018v5i12022p53-66



Copyright (c) 2022 Knowledge Engineering and Data Science

This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.
