15 Appendix 5: Exemplar Datasets

15.1 UCI Machine Learning Repository

https://archive.ics.uci.edu/ml/index.php

15.2 mlbench: Machine Learning Benchmark Problems

An R package that collects artificial and real-world machine learning benchmark problems, including several data sets from the UCI repository.

https://cran.r-project.org/web/packages/mlbench/index.html
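As a minimal sketch of how one of these benchmark data sets might be loaded, the snippet below assumes that mlbench has already been installed from CRAN (for example with `install.packages("mlbench")`). The choice of `BostonHousing` is only an illustration; any other data set shipped with the package could be substituted.

```r
# Load one benchmark data set from mlbench without attaching the whole package.
# Assumes mlbench is installed; BostonHousing is an illustrative choice.
data("BostonHousing", package = "mlbench")

# Inspect the structure: variable names, types, and the first few values.
str(BostonHousing)

# Summarize the outcome variable (median home value).
summary(BostonHousing$medv)
```

Calling `data(package = "mlbench")` lists every data set the package provides, which can help when choosing a benchmark problem.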

15.3 Kaggle

https://www.kaggle.com/datasets
