Peer-Reviewed

Exploiting Machine Learning Algorithms for Predicting Crash Injury Severity in Yemen: Hospital Case Study

Received: 27 August 2020     Accepted: 14 September 2020     Published: 28 September 2020
Abstract

This study applies machine learning algorithms to classify and predict the injury severity of vehicle crashes in Yemen. The primary objective is to assess the contribution of the leading causes of injury severity. The selected machine learning algorithms were compared with traditional statistical methods. Filtered secondary data, collected over two months (August-October 2015) from the two main hospitals, covered 156 patients injured in vehicle crashes reported from 128 locations. The data were classified into three categories of injury severity (Severe, Serious, and Minor) and balanced using the synthetic minority oversampling technique (SMOTE). A multinomial logit model (MNL) was compared with five machine learning classifiers: Naïve Bayes (NB), J48 Decision Tree, Random Forest (RF), Support Vector Machine (SVM), and Multilayer Perceptron (MLP). The results showed that most of the machine learning algorithms performed well in predicting and classifying traffic injury severity. Of the five classifiers, RF performed best, with an accuracy of 94.84%. Road type, total number of injured persons, crash type, road user, mode of transport to the emergency department (ED), and accident action were the most critical factors in traffic injury severity. Enhancing strategies and regulations for the use of roadway facilities may improve the safety of road users.
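The balancing step named in the abstract can be illustrated with a minimal sketch of the SMOTE idea: each synthetic sample is placed at a random point on the line segment between a minority-class sample and one of its nearest same-class neighbours. This is not the authors' implementation (the keywords indicate the study was run in WEKA). The function names `smote` and `balance`, the two numeric features, and the toy values are all hypothetical, and the sketch assumes numeric features, whereas crash attributes such as road type are categorical and would need encoding or a categorical SMOTE variant.

```python
import math
import random


def smote(minority, n_new, k=5, seed=0):
    """Return n_new synthetic samples interpolated between minority samples.

    minority: list of numeric feature vectors from one under-represented class.
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Rank the other samples of the same class by Euclidean distance.
        others = [p for j, p in enumerate(minority) if j != i]
        others.sort(key=lambda p: math.dist(base, p))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


def balance(samples_by_class, k=5, seed=0):
    """Oversample every class up to the size of the largest class."""
    target = max(len(rows) for rows in samples_by_class.values())
    balanced = {}
    for label, rows in samples_by_class.items():
        missing = target - len(rows)
        extra = smote(rows, missing, k=k, seed=seed) if missing > 0 else []
        balanced[label] = rows + extra
    return balanced


# Hypothetical toy data: made-up numeric features, not the study's records.
toy = {
    "Minor": [[0.1, 0.2], [0.2, 0.1], [0.3, 0.3],
              [0.2, 0.4], [0.4, 0.2], [0.3, 0.1]],
    "Serious": [[1.0, 1.1], [1.2, 0.9], [1.1, 1.3]],
    "Severe": [[2.0, 2.2], [2.3, 2.1]],
}
balanced = balance(toy, k=2)
counts = {label: len(rows) for label, rows in balanced.items()}
print(counts)
```

After balancing, every class holds as many samples as the largest one, and each synthetic sample lies between two real samples of its own class. Oversampling of this kind is normally applied to training data only, so that evaluation is not run on synthetic records.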

Published in Applied and Computational Mathematics (Volume 9, Issue 5)
DOI 10.11648/j.acm.20200905.12
Page(s) 155-164
Creative Commons

This is an Open Access article, distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution and reproduction in any medium or format, provided the original work is properly cited.

Copyright

Copyright © The Author(s), 2020. Published by Science Publishing Group

Keywords

Crash Injury Severity, Machine Learning, Traditional Statistical Methods, SMOTE, WEKA

Cite This Article
  • APA Style

    Tariq Al-Moqri, Xiao Haijun, Jean Pierre Namahoro, Eshrak Naji Alfalahi, Ibrahim Alwesabi. (2020). Exploiting Machine Learning Algorithms for Predicting Crash Injury Severity in Yemen: Hospital Case Study. Applied and Computational Mathematics, 9(5), 155-164. https://doi.org/10.11648/j.acm.20200905.12

    ACS Style

    Tariq Al-Moqri; Xiao Haijun; Jean Pierre Namahoro; Eshrak Naji Alfalahi; Ibrahim Alwesabi. Exploiting Machine Learning Algorithms for Predicting Crash Injury Severity in Yemen: Hospital Case Study. Appl. Comput. Math. 2020, 9(5), 155-164. doi: 10.11648/j.acm.20200905.12

    AMA Style

    Tariq Al-Moqri, Xiao Haijun, Jean Pierre Namahoro, Eshrak Naji Alfalahi, Ibrahim Alwesabi. Exploiting Machine Learning Algorithms for Predicting Crash Injury Severity in Yemen: Hospital Case Study. Appl Comput Math. 2020;9(5):155-164. doi: 10.11648/j.acm.20200905.12

  • @article{10.11648/j.acm.20200905.12,
      author = {Tariq Al-Moqri and Xiao Haijun and Jean Pierre Namahoro and Eshrak Naji Alfalahi and Ibrahim Alwesabi},
      title = {Exploiting Machine Learning Algorithms for Predicting Crash Injury Severity in Yemen: Hospital Case Study},
      journal = {Applied and Computational Mathematics},
      volume = {9},
      number = {5},
      pages = {155-164},
      doi = {10.11648/j.acm.20200905.12},
      url = {https://doi.org/10.11648/j.acm.20200905.12},
      eprint = {https://article.sciencepublishinggroup.com/pdf/10.11648.j.acm.20200905.12},
     year = {2020}
    }
    

  • TY  - JOUR
    T1  - Exploiting Machine Learning Algorithms for Predicting Crash Injury Severity in Yemen: Hospital Case Study
    AU  - Tariq Al-Moqri
    AU  - Xiao Haijun
    AU  - Jean Pierre Namahoro
    AU  - Eshrak Naji Alfalahi
    AU  - Ibrahim Alwesabi
    Y1  - 2020/09/28
    PY  - 2020
    N1  - https://doi.org/10.11648/j.acm.20200905.12
    DO  - 10.11648/j.acm.20200905.12
    T2  - Applied and Computational Mathematics
    JF  - Applied and Computational Mathematics
    JO  - Applied and Computational Mathematics
    SP  - 155
    EP  - 164
    PB  - Science Publishing Group
    SN  - 2328-5613
    UR  - https://doi.org/10.11648/j.acm.20200905.12
    VL  - 9
    IS  - 5
    ER  - 

Author Information
  • School of Mathematics and Physics, China University of Geosciences, Wuhan, China

  • School of Mathematics and Physics, China University of Geosciences, Wuhan, China

  • School of Mathematics and Physics, China University of Geosciences, Wuhan, China

  • Ministry of Public Health and Population, Yemen Field Epidemiology Training Program, Almaqaleh St, Sana’a, Yemen

  • School of Automation, China University of Geosciences, Wuhan, China
