Shrinkage methods for linear regression were developed over the past decades to mitigate the weakness of ordinary least squares (OLS) regression with respect to prediction accuracy. High-dimensional data are growing rapidly in many fields, as technological advances make it easy to collect data with a large number of variables. In this paper, shrinkage methods were used to estimate the regression coefficients of a high-dimensional multiple regression model in which there are fewer samples than predictors. Regularization approaches have become the methods of choice for analyzing such high-dimensional data, and we used three regularization methods based on penalized regression to select an appropriate model. Ridge, Lasso, and Elastic Net have desirable features: they can simultaneously perform regularization and selection of appropriate predictor variables and estimate their effects. We compared the performance of these three penalized linear regression methods, using cross-validation to choose the optimal tuning parameter, and evaluated prediction accuracy with the mean squared error (MSE). In both a simulation study and an analysis of real data, we found that all three methods are capable of producing appropriate models, with the Elastic Net achieving the best prediction accuracy; in the simulation study in particular, it outperformed the other two methods with the lowest MSE.
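The comparison described in the abstract can be sketched in a few lines of scikit-learn. This is not the authors' code; it is a minimal illustration, under assumed simulation settings (n = 100 samples, p = 200 predictors, 10 truly active coefficients), of fitting Ridge, Lasso, and Elastic Net with cross-validated tuning parameters and scoring each by test-set MSE:

```python
# Minimal sketch (not the paper's code): compare penalized regressions on
# simulated high-dimensional data with fewer samples than predictors (n < p).
import numpy as np
from sklearn.linear_model import RidgeCV, LassoCV, ElasticNetCV
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n, p, k = 100, 200, 10                # n < p; only k predictors are active
X = rng.standard_normal((n, p))
beta = np.zeros(p)
beta[:k] = rng.uniform(1.0, 3.0, k)   # sparse true coefficient vector
y = X @ beta + rng.standard_normal(n) # linear model with Gaussian noise

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)

# Each *CV estimator picks its penalty strength by cross-validation.
models = {
    "Ridge": RidgeCV(alphas=np.logspace(-3, 3, 50)),
    "Lasso": LassoCV(cv=5, random_state=0),
    "ElasticNet": ElasticNetCV(l1_ratio=0.5, cv=5, random_state=0),
}
mse = {}
for name, model in models.items():
    model.fit(X_tr, y_tr)
    mse[name] = mean_squared_error(y_te, model.predict(X_te))
    print(f"{name:10s} test MSE = {mse[name]:.3f}")
```

With a sparse truth such as this, the L1-based methods typically shrink most of the irrelevant coefficients to exactly zero, while Ridge only shrinks them toward zero; the relative MSE ranking will depend on the simulation design.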
Published in: American Journal of Theoretical and Applied Statistics, Volume 8, Issue 5
DOI: 10.11648/j.ajtas.20190805.14
Page(s): 185-192
License: This is an Open Access article, distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution and reproduction in any medium or format, provided the original work is properly cited.
Copyright: © The Author(s), 2019. Published by Science Publishing Group.
Shrinkage Estimator, High Dimension, Cross-Validation, Ridge Regression, Elastic Net
APA Style
Zari Farhadi, Reza Arabi Belaghi, Ozlem Gurunlu Alma. (2019). Analysis of Penalized Regression Methods in a Simple Linear Model on the High-Dimensional Data. American Journal of Theoretical and Applied Statistics, 8(5), 185-192. https://doi.org/10.11648/j.ajtas.20190805.14
ACS Style
Zari Farhadi; Reza Arabi Belaghi; Ozlem Gurunlu Alma. Analysis of Penalized Regression Methods in a Simple Linear Model on the High-Dimensional Data. Am. J. Theor. Appl. Stat. 2019, 8(5), 185-192. doi: 10.11648/j.ajtas.20190805.14
AMA Style
Zari Farhadi, Reza Arabi Belaghi, Ozlem Gurunlu Alma. Analysis of Penalized Regression Methods in a Simple Linear Model on the High-Dimensional Data. Am J Theor Appl Stat. 2019;8(5):185-192. doi: 10.11648/j.ajtas.20190805.14
@article{10.11648/j.ajtas.20190805.14, author = {Zari Farhadi and Reza Arabi Belaghi and Ozlem Gurunlu Alma}, title = {Analysis of Penalized Regression Methods in a Simple Linear Model on the High-Dimensional Data}, journal = {American Journal of Theoretical and Applied Statistics}, volume = {8}, number = {5}, pages = {185-192}, doi = {10.11648/j.ajtas.20190805.14}, url = {https://doi.org/10.11648/j.ajtas.20190805.14}, eprint = {https://article.sciencepublishinggroup.com/pdf/10.11648.j.ajtas.20190805.14}, year = {2019} }
TY - JOUR T1 - Analysis of Penalized Regression Methods in a Simple Linear Model on the High-Dimensional Data AU - Zari Farhadi AU - Reza Arabi Belaghi AU - Ozlem Gurunlu Alma Y1 - 2019/10/16 PY - 2019 N1 - https://doi.org/10.11648/j.ajtas.20190805.14 DO - 10.11648/j.ajtas.20190805.14 T2 - American Journal of Theoretical and Applied Statistics JF - American Journal of Theoretical and Applied Statistics JO - American Journal of Theoretical and Applied Statistics SP - 185 EP - 192 PB - Science Publishing Group SN - 2326-9006 UR - https://doi.org/10.11648/j.ajtas.20190805.14 VL - 8 IS - 5 ER -