##########################################
#### Week 3 Lab: Two Parameter Models ####
####   Friday, September 22nd, 2016   ####
##########################################

########################################################
#### Download lab files into your working directory ####
########################################################

library(lmSupport)

# Set your working directory

# We're using the bias dataset from last week

# Check out the dataset
# How could you achieve the same thing as head() using bracket notation?

# some() lives in a package we don't really want to load
# (why load an entire package for just one function?)
# We can also use bracket notation to do the same thing as some(). Let's use Google.

###########################################
#### Read page 1 of Lab3_Exercise.docx ####
###########################################

# What's the experiment about?
# What is the independent variable?
# What is the dependent variable?

# Prediction:
# In other words: Does concern score explain variance in week 4 IAT scores?

################################################################
#### What models should we compare to test this hypothesis? ####
################################################################

# DATA = MODEL + ERROR

# Model C:
# Model A:

###################################
#### Prepare data for analysis ####
###################################

# We have four concern items; we need to create an average score for each participant

# Which of the items are reverse-coded?
# It appears the reverse-scored item from last week (item 4) has already been adjusted.

# Why varScore() and not rowMeans()?

################################################
#### The Compact Model: One-Parameter Model ####
################################################

# Fit a one-parameter model
# What is our one parameter?

# We can ask for the values of y that are predicted by our model
# What is the number we're predicting for everyone?

# We can ask for the residuals

# See that the predicted values (our model) plus the error equal
# the data themselves!

# And we can look at the coefficients, or parameter estimates, themselves

# If we want to ask questions about model fit, we need to calculate SSE:
# Does this value alone tell us whether the model fits the data well?

##################################################
#### The Augmented Model: Two-Parameter Model ####
##################################################

# Fit a two-parameter model
# What is our second parameter?

# We can ask for the values of y that are predicted by our model

# We can also ask for the errors

# And we can ask for just the coefficients, or parameter estimates, themselves
# Wk4IAT =

# If we want to ask questions about model fit, we need to calculate SSE:

# Model Comparison
# What does the p-value tell us? (generally, not this specific p)
# p-Value:

modelSummary(mA, t=FALSE)
# How is this related to what we just did?
# Remember: R is giving us the results of two different model comparisons.
# Each line in the summary is associated with a different comparison.
# What is the interpretation of the second coefficient, b1? What two models are being compared?
# What is the interpretation of the first coefficient, b0? What two models are being compared?

# Coefficients from Model C
# Why is b0 different between Model A and Model C?

# Confidence Intervals

# Effect Sizes

# Calculate PRE / partial eta-squared for b1
# Or just use modelEffectSizes() to get partial eta-squared
# What does partial eta-squared represent?
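
# ---------------------------------------------------------------------------
# A minimal sketch of the code the section above walks through. The data file
# name ("Lab3Data.dat"), the item names (Concern1-Concern4), and the 1-5 item
# range are assumptions, not the lab's actual names; substitute the names from
# your own lab files.
# ---------------------------------------------------------------------------

library(lmSupport)

d = dfReadDat("Lab3Data.dat") # hypothetical file name for last week's bias dataset

head(d)                    # first six rows
d[1:6, ]                   # the same thing with bracket notation
car::some(d)               # call some() without loading all of car
d[sample(nrow(d), 10), ]   # bracket-notation version of some(): 10 random rows

# Average the four concern items (assumed already reverse-scored where needed).
# varScore() beats rowMeans() because it range-checks the items and prorates
# missing responses. It returns a sum, so divide by 4 to get the mean.
d$ConcernM = varScore(d, Forward = c("Concern1", "Concern2", "Concern3", "Concern4"),
                      Range = c(1, 5)) / 4

# Compact model: one parameter (b0, the grand mean of Wk4IAT)
mC = lm(Wk4IAT ~ 1, data = d)
predict(mC)                        # the same value (the mean) predicted for everyone
residuals(mC)                      # error = data - model
head(predict(mC) + residuals(mC))  # model + error...
head(d$Wk4IAT)                     # ...reproduces the data
coef(mC)                           # b0
sseC = sum(residuals(mC)^2)        # SSE alone says little without a comparison model

# Augmented model: two parameters (b0 = intercept, b1 = slope for ConcernM)
mA = lm(Wk4IAT ~ ConcernM, data = d)
coef(mA)                           # Wk4IAT-hat = b0 + b1*ConcernM
sseA = sum(residuals(mA)^2)

# Model comparison: did adding b1 reduce error enough to justify the extra parameter?
modelCompare(mC, mA)
modelSummary(mA, t = FALSE)        # each coefficient row is its own model comparison
coef(mC)                           # compare b0 across Model C and Model A

confint(mA)                        # confidence intervals for b0 and b1

# PRE (partial eta-squared) by hand, then via lmSupport
(sseC - sseA) / sseC
modelEffectSizes(mA)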
#####################################
#### Graphing our Data and Model ####
#####################################

# You will encounter us using the terms "quick and dirty" plot and "publication
# quality" plot. Generally speaking, the latter means ggplot. Quick and dirty
# plots are for your own understanding and ability to look at the data visually.

# Based on last week:

# Generating the regression line with CI bands using the effects package
library(effects) # used for a quick and dirty view of models. VERY IMPORTANT.

# Why are the CI bands not linear?
# But are these the error bands we want?

# Publication quality graphing
# (a hedged sketch of this full pipeline appears at the end of this file)

# Load ggplot2
library(ggplot2)

# From Help: ggplot() is typically used to construct a plot incrementally,
# using the + operator to add layers to the existing ggplot object. This is
# advantageous in that the code is explicit about which layers are added and
# the order in which they are added. For complex graphics with multiple layers,
# initialization with ggplot is recommended.

# Generating predicted data (necessary for confidence interval bands)

# Creating a data frame for predictor values; the first two numbers are the
# range of the predictor. The result is a data frame containing just one
# variable, ConcernM, representing many of the possible values of ConcernM.

# Use modelPredictions() to get the standard errors of the Y-hats
# What did this add?

# Graph a scatterplot of the data with ConcernM on the x-axis and Wk4IAT on the y-axis
# Now we add layers to this plot as we go:

# Now let's add a layer to "plot" that will graph the regression line.
# Can we just use the default parameters?

# Finally, let's do just a couple of things to make it look nice:

# Everything at once (combining the previous code into a single plot)

# If you were prepping this graph for publication, what else would you want to do?
# We will learn how to do all these things later!

######################
#### Lab exercise ####
######################

# Do numbers 1-7 in small groups. Then stop, and we'll reconvene to do graphing.

#####################
#### Extra Stuff ####
#####################

# Calculate the SD of the errors
SSEA = sum(residuals(mA)^2) # SSE from the augmented model (defined so the next line runs)
dfD = df.residual(mA)       # denominator (error) degrees of freedom: n - 2
SE = sqrt(SSEA/dfD)
SE # Residual standard error (or standard error of the estimate)

# Correlation: when you want the relationship between two quantitative variables
cor(d$Wk4IAT, d$ConcernM)      # r - an effect size indicator (= the square root of partial eta-squared)
cor.test(d$Wk4IAT, d$ConcernM) # r has a p-value
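
# ---------------------------------------------------------------------------
# A minimal sketch of the graphing pipeline from the graphing section above,
# assuming the d and mA objects defined earlier and a 1-5 range for ConcernM.
# The Predicted/CILo/CIHi column names are what lmSupport's modelPredictions()
# typically returns; check names(pY) if your version differs.
# ---------------------------------------------------------------------------

library(effects)
library(ggplot2)
library(lmSupport)

# Quick and dirty: the effects package draws the model with CI bands in one line
plot(effect("ConcernM", mA))

# Publication quality: generate predicted values across the range of the predictor
pX = data.frame(ConcernM = seq(1, 5, length = 100)) # many possible ConcernM values
pY = modelPredictions(mA, pX)                       # adds Predicted, CILo, and CIHi

# Build the figure in layers: scatterplot first, then the regression line with
# its CI band, then a couple of things to make it look nice
plot = ggplot(d, aes(x = ConcernM, y = Wk4IAT)) +
  geom_point() +
  geom_smooth(data = pY, aes(y = Predicted, ymin = CILo, ymax = CIHi),
              stat = "identity") +
  labs(x = "Concern (mean of 4 items)", y = "Week 4 IAT score") +
  theme_bw()
plot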