A Comparative Study of Machine Learning-based Approach for Network Traffic Classification

Kien Trang, An Hoang Nguyen

Abstract


Internet usage has grown rapidly and become an essential part of daily life, in step with the rapid development of network infrastructure in recent years. Protecting users’ confidential information on the global network has therefore become one of the most significant concerns. Although multiple encryption algorithms and techniques have been adopted by different parties, including internet providers and web hosting services, encryption also allows attackers to target network systems anonymously. As a result, classifying network data streams to improve the quality and security of network systems is attracting increasing research interest. This work presents a comparative study of machine learning approaches to identify the most suitable model for network traffic classification. Data pre-processing is first applied to normalize each feature type in the dataset. In the classification phase, three machine learning techniques, k-Nearest Neighbors (KNN), Artificial Neural Network (ANN), and Random Forest (RF), are trained on the normalized features. The open-access ISCXVPN2016 dataset is used, which includes two types of encryption (VPN and Non-VPN) and seven traffic categories. Experimental results on this dataset show that the models reach a high classification rate, over 85% in some cases, with the RF model obtaining the best results among the three techniques.
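
The normalize-then-classify pipeline described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: min-max scaling is assumed as the normalization (the abstract does not name a scheme), only the KNN step is shown, and the two flow features, their values, and the helper names fit_min_max, apply_min_max, and knn_predict are invented for illustration rather than taken from ISCXVPN2016.

```python
import math
from collections import Counter


def fit_min_max(rows):
    """Return per-feature minima and maxima learned from the training rows."""
    columns = list(zip(*rows))
    return [min(c) for c in columns], [max(c) for c in columns]


def apply_min_max(row, lows, highs):
    """Scale one feature vector to [0, 1]; constant features map to 0.0."""
    return [
        (v - lo) / (hi - lo) if hi > lo else 0.0
        for v, lo, hi in zip(row, lows, highs)
    ]


def knn_predict(train_x, train_y, query, k=3):
    """Label `query` by majority vote among its k nearest training rows."""
    nearest = sorted(
        zip(train_x, train_y),
        key=lambda pair: math.dist(pair[0], query),  # Euclidean distance
    )[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Hypothetical toy flows: [flow duration (s), mean inter-arrival time (ms)].
raw_x = [
    [1.0, 200.0], [2.0, 220.0], [1.5, 210.0],    # labelled Non-VPN
    [8.0, 900.0], [9.0, 950.0], [10.0, 1000.0],  # labelled VPN
]
train_y = ["Non-VPN"] * 3 + ["VPN"] * 3

# Fit the scaler on the training rows only, then reuse it for new flows.
lows, highs = fit_min_max(raw_x)
train_x = [apply_min_max(row, lows, highs) for row in raw_x]

new_flow = apply_min_max([9.5, 980.0], lows, highs)
predicted = knn_predict(train_x, train_y, new_flow, k=3)
print(predicted)
```

Without the scaling step, the millisecond-valued feature would dominate the Euclidean distance, which is one reason distance-based models such as KNN are usually paired with feature normalization. ANN and RF models would be trained on the same normalized features and compared on held-out data.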







DOI: http://dx.doi.org/10.17977/um018v4i22021p128-137



Copyright (c) 2021 Knowledge Engineering and Data Science

Creative Commons License
This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.
