Talent introduction is an important driver of academic development in universities. Predicting the academic capacity of talent lies at the core of talent introduction and is a valuable research topic. However, traditional statistical methods are hard to apply when extracting knowledge from massive, multi-dimensional talent information. Data mining approaches are well suited to this task: they analyze information, extract patterns or rules from a large dataset, and then make predictions based on the relationships among the extracted information. In this study, a series of data mining approaches is employed to evaluate the academic capacity of talent and to analyze the correlation between features. Principal Component Analysis and Random Forest are used for feature extraction to improve prediction accuracy. A classical classification model, the Gradient Boosting Decision Tree, is used as the primary analytic model for prediction. To validate the effectiveness of the model, five other classification models are used in a comparative experiment based on prediction accuracy and the F-measure. Further, to investigate the contribution of individual features, we conduct a marginal utility analysis of the important features that are highly correlated with academic talent capacity. The experimental results reveal the important features for academic capacity and the factors that positively affect the academic output of talents.
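The comparative experiment described above ranks classifiers by prediction accuracy and by the F-measure. The sketch below illustrates why both metrics are reported, under assumptions that are not taken from the paper: binary labels, the balanced F1 variant (beta = 1), and two hypothetical prediction vectors named `model_a` and `model_b`. It is a minimal illustration, not the authors' evaluation code.

```python
from typing import Sequence


def accuracy(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Fraction of predictions that match the true labels."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


def f_measure(y_true: Sequence[int], y_pred: Sequence[int],
              positive: int = 1, beta: float = 1.0) -> float:
    """F-beta score for the `positive` class (beta = 1 gives F1)."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        # No true positives: precision or recall is zero, so F is zero.
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)


# Hypothetical labels and predictions (illustrative only, not paper data).
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
model_a = [1, 1, 0, 0, 0, 1, 1, 0]  # one false positive, one false negative
model_b = [1, 0, 0, 0, 0, 0, 1, 0]  # no false positives, two false negatives

for name, y_pred in (("model_a", model_a), ("model_b", model_b)):
    print(f"{name}: accuracy={accuracy(y_true, y_pred):.3f} "
          f"F1={f_measure(y_true, y_pred):.3f}")
```

Both hypothetical models reach the same accuracy (0.750), yet their F1 scores differ (0.750 versus 0.667) because `model_b` misses more positive cases. This is the kind of difference that accuracy alone would hide in a model comparison.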
Published in: Applied and Computational Mathematics (Volume 8, Issue 4)
DOI: 10.11648/j.acm.20190804.12
Page(s): 75-81
Creative Commons: This is an Open Access article, distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution and reproduction in any medium or format, provided the original work is properly cited.
Copyright: Copyright © The Author(s), 2019. Published by Science Publishing Group
Keywords: Data Mining, Classification Models, Prediction, Talent Introduction, Academic Talent Capacity
APA Style
Shunshun Shi, Mingzhou Chen, Rui Feng, Hua Zhang, Shuai Zhang. (2019). Prediction of Academic Talent Capacity Based on Gradient Boosting Decision Tree. Applied and Computational Mathematics, 8(4), 75-81. https://doi.org/10.11648/j.acm.20190804.12
ACS Style
Shunshun Shi; Mingzhou Chen; Rui Feng; Hua Zhang; Shuai Zhang. Prediction of Academic Talent Capacity Based on Gradient Boosting Decision Tree. Appl. Comput. Math. 2019, 8(4), 75-81. doi: 10.11648/j.acm.20190804.12
AMA Style
Shunshun Shi, Mingzhou Chen, Rui Feng, Hua Zhang, Shuai Zhang. Prediction of Academic Talent Capacity Based on Gradient Boosting Decision Tree. Appl Comput Math. 2019;8(4):75-81. doi: 10.11648/j.acm.20190804.12
@article{10.11648/j.acm.20190804.12,
  author = {Shunshun Shi and Mingzhou Chen and Rui Feng and Hua Zhang and Shuai Zhang},
  title = {Prediction of Academic Talent Capacity Based on Gradient Boosting Decision Tree},
  journal = {Applied and Computational Mathematics},
  volume = {8},
  number = {4},
  pages = {75-81},
  doi = {10.11648/j.acm.20190804.12},
  url = {https://doi.org/10.11648/j.acm.20190804.12},
  eprint = {https://article.sciencepublishinggroup.com/pdf/10.11648.j.acm.20190804.12},
  year = {2019}
}
TY  - JOUR
T1  - Prediction of Academic Talent Capacity Based on Gradient Boosting Decision Tree
AU  - Shunshun Shi
AU  - Mingzhou Chen
AU  - Rui Feng
AU  - Hua Zhang
AU  - Shuai Zhang
Y1  - 2019/09/27
PY  - 2019
N1  - https://doi.org/10.11648/j.acm.20190804.12
DO  - 10.11648/j.acm.20190804.12
T2  - Applied and Computational Mathematics
JF  - Applied and Computational Mathematics
JO  - Applied and Computational Mathematics
SP  - 75
EP  - 81
PB  - Science Publishing Group
SN  - 2328-5613
UR  - https://doi.org/10.11648/j.acm.20190804.12
VL  - 8
IS  - 4
ER  -