Reading and Video Assignments

All readings are due by the START of the class for which they are assigned. Reading quizzes will be administered at the start of many classes to encourage completion of all reading assignments in a timely manner.

Required Textbooks

All required and reference textbooks are freely available online (though hard copies can also be purchased if desired). The six textbooks that we will focus on in this course are:

  • James, G., Witten, D., Hastie, T., & Tibshirani, R. (2013). An Introduction to Statistical Learning: With Applications in R (7th ed.). (pdf)

  • Kuhn, M., & Johnson, K. (2018). Applied Predictive Modeling. New York, NY: Springer Science. (pdf)

  • Grolemund, G., & Wickham, H. (2016). R for Data Science: Import, Tidy, Transform, Visualize, and Model Data (1st ed.). Sebastopol, CA: O’Reilly Media, Inc. (bookdown)

  • Silge, J., & Robinson, D. (2017). Text Mining with R: A Tidy Approach (1st ed.). Beijing; Boston: O’Reilly Media. (bookdown)

  • Kuhn, M., & Johnson, K. (2019). Feature Engineering and Selection: A Practical Approach for Predictive Models (1st ed.). Boca Raton, FL: Chapman and Hall/CRC. (bookdown)

  • Wickham, H. (2019). The Tidy Style Guide. (bookdown)

Unit 1: Overview of machine learning concepts and uses

  • January 21st
    • Yarkoni and Westfall (2017) (PDF)
  • January 23rd
    • James et al. (2013) Chapter 1: Introduction

    • James et al. (2013) Chapter 2: Statistical Learning (pp 15 - 42)

Unit 2: Introduction to regression models

  • January 30th
    • James et al. (2013) Chapter 3: Linear Regression (pp 59 - 109)

Unit 3: Introduction to classification models

  • February 6th
    • James et al. (2013) Chapter 4: Classification (pp 127 - 154)
  • February 18th
    • Kuhn and Johnson (2019) Chapter 4: Exploratory Visualizations

Unit 4: Cross-validation methods

  • February 25th
    • James et al. (2013) Chapter 5: Resampling Methods (pp 175 - 186)
  • February 27th
    • Kuhn and Johnson (2018) Chapter 4: Resampling Techniques (pp 67 - 78)

Unit 5: Subsetting and filtering

Unit 6: Regularization and penalized models

Unit 7: Bootstrapping and permutation tests