IAML Unit 9: Discussion

Announcements

  • Code update from keras to brulee coming by end of day tomorrow
  • Appendix added to better explain splits objects, then generalized to their use in nested cross-validation
  • Office hours are claimed for a student defense; available by appointment tomorrow and Monday
  • Running right out after class to a meeting with a surgeon

General

  • Learn more about the algorithms you use by reading package documentation

Terminology and Prediction

  • root node
  • internal nodes
  • terminal nodes or leaves
  • branches

How do you make predictions with a decision tree?

Trees

  • When we split the data in a tree by a variable, how is the threshold specified? Does the algorithm determine it internally?

    • RSS (regression trees)
    • impurity (e.g., Gini) for classification trees
      • compute first for each child node
      • then combine as a weighted average across the child nodes
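
A minimal sketch of the impurity bullets above (pure Python, with made-up labels): compute Gini impurity for each child node, then combine the children as a size-weighted average. The algorithm scores every candidate threshold this way and keeps the one with the lowest weighted impurity.

```python
from collections import Counter

def gini(labels):
    """Gini impurity of one node: 1 - sum of squared class proportions."""
    n = len(labels)
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())

def weighted_gini(left, right):
    """Score a candidate split: each child's impurity weighted by the
    fraction of observations that child receives."""
    n = len(left) + len(right)
    return (len(left) / n) * gini(left) + (len(right) / n) * gini(right)

# Made-up binary labels at a parent node and after one candidate split
parent = ["pass"] * 5 + ["fail"] * 5
left   = ["pass"] * 4 + ["fail"]
right  = ["fail"] * 4 + ["pass"]

print(gini(parent))                # 0.5: maximally impure parent
print(weighted_gini(left, right))  # ~0.32: this split reduces impurity
```
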
  • Can we discuss further how tree-models can handle nonlinear effects and interactions natively?

    • Interactions (draw): hours studied (high/low) x sleep quality (high/low) for exam score
    • non-linear effects?
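
To make the interaction point concrete, here is a pure-Python sketch with made-up exam-score cell means: extra hours studied only pay off when sleep quality is high. A hand-built depth-2 tree captures this through nested splits, with no explicit hours-by-sleep term ever specified.

```python
# Made-up cell means: (sleep_quality, hours_studied) -> mean exam score.
# Hours studied barely help when sleep is poor -- an interaction.
cell_means = {
    ("low", "low"): 60, ("low", "high"): 62,
    ("high", "low"): 65, ("high", "high"): 90,
}

def tree_predict(sleep, hours):
    """A hand-built depth-2 tree: split on sleep first, then on hours
    within the high-sleep branch only. The nested split IS the
    interaction -- the effect of hours depends on the value of sleep."""
    if sleep == "low":
        return 61  # leaf: average of the two low-sleep cells
    return 65 if hours == "low" else 90  # leaves inside the high-sleep branch

print(tree_predict("high", "high"))  # 90: big payoff from studying
print(tree_predict("low", "high"))   # 61: studying barely helps here
```
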
  • What determines the depth of trees that we set in decision trees and random forest?

  • How do trees handle missing data? Do we ever want to impute missing data in our recipes?

  • In what situations might additional feature engineering (e.g., alternative handling of missing values or categorical aggregation) still improve decision tree performance?

    • missing values, categorical aggregation, nzv, other dimensionality reduction
    • Why isn’t feature engineering required as extensively when it comes to decision trees?
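
One illustration of why trees need less feature engineering (a standard property of tree splits, sketched here in pure Python with made-up data): a split depends only on the *ordering* of a predictor, so monotone transformations such as log or standardization leave the chosen split unchanged, and those recipe steps buy nothing.

```python
import math

x = [1, 2, 4, 8, 16, 32]      # sorted predictor values (made up)
y = [0, 0, 0, 1, 1, 1]        # made-up binary outcome

def split_error(xs, ys, cut):
    """Misclassifications if each side of the cut predicts its majority class."""
    left  = [v for u, v in zip(xs, ys) if u <= cut]
    right = [v for u, v in zip(xs, ys) if u > cut]
    err = lambda side: min(sum(side), len(side) - sum(side))
    return err(left) + err(right)

def best_cut_index(xs, ys):
    """Try every midpoint between consecutive values; return the best one's index."""
    cuts = [(xs[i - 1] + xs[i]) / 2 for i in range(1, len(xs))]
    errors = [split_error(xs, ys, c) for c in cuts]
    return errors.index(min(errors))

# The same split is chosen before and after a log transform, because
# log preserves the ordering of x.
print(best_cut_index(x, y))                           # index 2: cut between 4 and 8
print(best_cut_index([math.log(v) for v in x], y))    # index 2 again
```
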

Bagging

  • A further explanation on what a base learner is

  • When and why does bagging improve model performance?

    • What are the benefits of bagging?
    • Reduces variance; no impact on bias
    • Interpretability loss
    • Computational cost
  • Can you talk more about bagging and how it utilizes bootstrapping techniques?

  • Can you explain more about why we should de-correlate the bagged trees? Do we still need to de-correlate if we are not using bagging?
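
A bare-bones sketch of how bagging uses bootstrapping (pure Python, made-up noisy data, a depth-1 regression "stump" as the base learner): draw bootstrap resamples with replacement, fit one base learner per resample, then average their predictions. Averaging shrinks variance across resamples; each stump's bias is untouched, which is why bagging has no impact on bias.

```python
import random

random.seed(0)
data = [(x, 2 * x + random.gauss(0, 1)) for x in range(20)]  # noisy line (made up)

def fit_stump(train):
    """Base learner: a regression stump that splits at the median x and
    predicts the mean y on each side of the cut."""
    xs = sorted(x for x, _ in train)
    cut = xs[len(xs) // 2]
    left  = [y for x, y in train if x <  cut] or [0.0]
    right = [y for x, y in train if x >= cut] or [0.0]
    lmean, rmean = sum(left) / len(left), sum(right) / len(right)
    return lambda x: lmean if x < cut else rmean

def fit_bagged(train, n_trees=50):
    """Bagging: fit one stump per bootstrap resample (sampled WITH
    replacement, same size as the training set), then average the
    predictions of all stumps."""
    stumps = [fit_stump(random.choices(train, k=len(train)))
              for _ in range(n_trees)]
    return lambda x: sum(s(x) for s in stumps) / len(stumps)

model = fit_bagged(data)
print(model(5.0), model(15.0))  # averaged predictions; higher x -> higher y
```
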

Random Forest

Understanding mtry
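
A minimal sketch (pure Python, made-up predictor names) of the one tweak random forests add on top of bagging: at *each* split, only a random subset of mtry predictors is eligible. Sampling predictors per split keeps a few strong predictors from dominating every tree, which de-correlates the trees, so averaging them removes more variance.

```python
import random

random.seed(1)
predictors = ["hours", "sleep", "caffeine", "attendance"]  # p = 4 (made up)

def candidate_predictors(mtry):
    """At each split, a random forest considers only mtry randomly chosen
    predictors instead of all p, de-correlating the ensemble's trees."""
    return random.sample(predictors, mtry)

# Common defaults: mtry ~ sqrt(p) for classification, ~ p / 3 for regression.
# Setting mtry = p lets every split see all predictors, i.e., plain bagging.
for split in range(3):
    print(candidate_predictors(mtry=2))  # a different subset each split
```
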

Hyperparameters and other cross method issues

  • Role of tree_depth (trees), min_n (both), cost_complexity (trees), mtry (rf), and number of trees (rf) in decision trees and random forests

  • Can we go over the advantages and disadvantages of the different tree algorithms, like we did in class with QDA, LDA, KNN, logistic regression, and RDA?

    • simple tree (e.g., CART)
    • bagged trees
    • random forest and XGBoost