########################################################
####  Lab 12: Categorical 2 X 3 / 3 X 2 designs     ####
####       and Intro to Repeated Measures:          ####
####                   1 Dec 17                     ####
########################################################


# OVERVIEW OF TODAY #

# 1. Contrast and Dummy coding in a 2 (Between-subjects) X 3 (Between-subjects) design
# 2. Repeated measures intro with a 2 (WITHIN-subjects) X 3 (Between-subjects) design

library(lmSupport)
library(ggplot2)
library(car)


# 1. Contrast and Dummy coding in a 2 (Between-subjects) X 3 (Between-subjects) design

# For our first example: 

# A group of educational psychologists devises an intervention that they believe will help 
# boost the performance of students in a large University class. The educational psychologists 
# also wonder if this intervention would be more effective for minority students in the class,
# in effect reducing the achievement gap.
# The codebook for this dataset is given below.


# Codebook for "Lab12_DemoData1.dat" 

# Column   Variable   Description                    Values

# 1        id         Student ID                     1 - 120
# 2        cond       Experimental condition         "control", "intervention"
# 3        race       Student race                   "black", "hispanic", "white"
# 4        perf       Student's class performance    0 - 100 (DV)

# Get a sense of what's going on


# For the sake of time, we will assume the data are clean and that all model assumptions are met

# Let's see if we can get a rough sense of how things are behaving based on the means


# So let's test some hypotheses

# Hypothesis 1: There will be an overall effect of Condition such that performance is better in the intervention group
# than in the control group, averaged across all races

# Hypothesis 2: There is a Race x Condition interaction such that the intervention has a different effect among White students
# than among non-White (minority) students (i.e., Race moderates the Condition effect)


# We want to translate our hypotheses into contrasts for both condition and race. 
# We can actually create our contrasts in a way that will allow us to test both hypotheses! 

# first see what data type condition is


# Sometimes your data will already read in as factors and sometimes it will not. It is generally good practice 
# to set your factors yourself. If for no other reason, this will allow you to order the levels in the way you want


# Up until now, you have mostly been manually centering your variables. 
# However, we know from last lab that we can use varContrasts to set contrasts for this factor

# we can always see these labels by using contrasts 


# Note: we did not change the name of the factor but it is now centered. If you think you might forget if a variable is 
# centered or not, you can always change the name of it or make a new variable with the regressors coded. 
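
# A minimal base-R sketch of the factor and contrast steps above.
# Assumptions: toy_cond is a hypothetical stand-in for the real cond column, and
# contrasts<-() is swapped in here for lmSupport's varContrasts().
toy_cond <- factor(c("control", "intervention", "control", "intervention"),
                   levels = c("control", "intervention"))  # set the level order ourselves
class(toy_cond)                                            # confirm it is a factor

# Centered contrast with a one-unit range: control = -0.5, intervention = +0.5
contrasts(toy_cond) <- matrix(c(-0.5, 0.5), ncol = 1,
                              dimnames = list(levels(toy_cond), "int_v_ctrl"))
contrasts(toy_cond)                                        # view the codes attached to the factor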


## Create Race Contrasts ##


# We can now see these contrasts 
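
# One possible set of orthogonal race contrasts, sketched in base R.
# Assumptions: toy_race is a hypothetical stand-in for the race column, and the
# contrast names (min_v_white, black_v_hisp) are made up for illustration.
toy_race <- factor(c("black", "hispanic", "white"),
                   levels = c("black", "hispanic", "white"))
race_con <- cbind(min_v_white  = c(1/3, 1/3, -2/3),   # minority students vs White students
                  black_v_hisp = c(1/2, -1/2, 0))     # Black students vs Hispanic students
rownames(race_con) <- levels(toy_race)
contrasts(toy_race) <- race_con
contrasts(toy_race)

# Orthogonality check: each column sums to 0 and the cross-product of the two columns is 0
colSums(race_con)
sum(race_con[, "min_v_white"] * race_con[, "black_v_hisp"])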


# Run the interactive model.


# But what if we want to get the effect sizes for the individual contrasts?
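
# A hedged sketch of one way to get an effect size for a single contrast: code each
# contrast as its own numeric regressor, then compare the full model to a model that
# drops just that regressor. Partial eta-squared = (SSE_reduced - SSE_full) / SSE_reduced.
# Assumptions: toy is simulated data (NOT the lab data), all regressor names are
# hypothetical, and base R is used in place of lmSupport's effect size function.
set.seed(12)
toy <- expand.grid(cond = c("control", "intervention"),
                   race = c("black", "hispanic", "white"),
                   rep  = 1:20)
toy$perf <- rnorm(nrow(toy), mean = 70, sd = 10)          # simulated DV

toy$cond_c       <- ifelse(toy$cond == "intervention", 0.5, -0.5)
toy$min_v_white  <- ifelse(toy$race == "white", -2/3, 1/3)
toy$black_v_hisp <- ifelse(toy$race == "black", 0.5,
                           ifelse(toy$race == "hispanic", -0.5, 0))

m_full <- lm(perf ~ cond_c * (min_v_white + black_v_hisp), data = toy)
m_red  <- update(m_full, . ~ . - cond_c:min_v_white)      # drop one interaction contrast

sse_full <- sum(resid(m_full)^2)
sse_red  <- sum(resid(m_red)^2)
peta2    <- (sse_red - sse_full) / sse_red                # partial eta-squared for that contrast
peta2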


#### Interpret the results ####


# Now say you submit this study for peer review and Reviewer 1 asks if the effect of the intervention
# differs for Black vs. Hispanic individuals. Then Reviewer 2 asks if the effect of the intervention differs 
# for Black vs. White individuals. THEN Reviewer 3 asks if the effect of the intervention differs for Hispanic 
# vs White individuals. You realize that you or others could reasonably have had planned hypotheses that line up
# with these tests. How can you test all these contrasts? 


# However, if you were going to plan these tests, what would you need to do?


# How could you do this correction? 


# We need to see if race by condition interaction is significant
# As we learned last week, we can use the Anova function 
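
# A base-R sketch of the 2-df omnibus test of the race by condition interaction, done
# as a comparison of the additive and interactive models. For this highest-order term,
# the F test should match what car's Anova() reports for the interaction.
# Assumption: toy is simulated data (NOT the lab data).
set.seed(12)
toy <- expand.grid(cond = c("control", "intervention"),
                   race = c("black", "hispanic", "white"),
                   rep  = 1:20)
toy$perf <- rnorm(nrow(toy), mean = 70, sd = 10)

m_add <- lm(perf ~ cond + race, data = toy)   # no interaction
m_int <- lm(perf ~ cond * race, data = toy)   # adds the 2 interaction terms
omnibus <- anova(m_add, m_int)
omnibus                                       # row 2 holds the 2-df interaction test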


# Do we satisfy the condition of Fisher LSD? 


### Interaction of condition and white v black and white v hispanic ###

# Fit a model that tests the reviewers' questions; White students
# are the reference group

# We'll want the individual contrast variables in our data frame to get
# an effect size for each of them, so code them out




# Remember this gives you the same estimates as 


# But this one does not allow you to get the individual effect sizes 
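
# A sketch showing that hand-coded dummy regressors and a factor with White as the
# reference level give the same estimates. Only the hand-coded version gives each
# contrast its own column in the data frame.
# Assumptions: toy is simulated data (NOT the lab data); black_d, hisp_d, and cond_c
# are hypothetical names.
set.seed(12)
toy <- expand.grid(cond = c("control", "intervention"),
                   race = c("black", "hispanic", "white"),
                   rep  = 1:20)
toy$perf   <- rnorm(nrow(toy), mean = 70, sd = 10)
toy$cond_c <- ifelse(toy$cond == "intervention", 0.5, -0.5)

toy$race    <- relevel(toy$race, ref = "white")        # White becomes the reference group
toy$black_d <- as.numeric(toy$race == "black")         # 1 = Black, 0 = otherwise
toy$hisp_d  <- as.numeric(toy$race == "hispanic")      # 1 = Hispanic, 0 = otherwise

m_codes  <- lm(perf ~ cond_c * (black_d + hisp_d), data = toy)
m_factor <- lm(perf ~ cond_c * race, data = toy)
cbind(codes = coef(m_codes), factor = coef(m_factor))  # same estimates, different labels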

# Interpret the relevant coefficients


# To get the third test, we have to make a different racial group the reference group


# Obtain the third test of whether the effect of the intervention varies across the different
# pairwise comparisons of racial groups
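
# A sketch of changing the reference group to get the Hispanic vs Black comparison.
# Assumptions: toy is simulated data (NOT the lab data); race_w and race_b are
# hypothetical names for the releveled copies of race.
set.seed(12)
toy <- expand.grid(cond = c("control", "intervention"),
                   race = c("black", "hispanic", "white"),
                   rep  = 1:20)
toy$perf   <- rnorm(nrow(toy), mean = 70, sd = 10)
toy$cond_c <- ifelse(toy$cond == "intervention", 0.5, -0.5)

toy$race_w <- relevel(toy$race, ref = "white")   # White as reference
toy$race_b <- relevel(toy$race, ref = "black")   # Black as reference

m_white_ref <- lm(perf ~ cond_c * race_w, data = toy)
m_black_ref <- lm(perf ~ cond_c * race_b, data = toy)

# Does the intervention effect differ for Hispanic vs Black students?
summary(m_black_ref)$coefficients["cond_c:race_bhispanic", ]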



# Note: Alternatively, you could have skipped testing the main effect and used the Holm-Bonferroni approach
# by correcting the p values of the tests (see last week's demo)
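
# A minimal sketch of the Holm-Bonferroni correction using base R's p.adjust().
# Assumption: the three p values below are MADE UP for illustration; in practice you
# would take them from the three pairwise interaction tests.
toy_p <- c(black_v_hisp = 0.012, black_v_white = 0.030, hisp_v_white = 0.240)
p_holm <- p.adjust(toy_p, method = "holm")
p_holm   # compare each adjusted p value to .05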


#### Bar plot of the main effect of race ####

#windows()


#### 2. Repeated measures intro with a 2 (WITHIN-subjects) X 3 (Between-subjects) design ####

# Note: The repeated measures data we will work with today will already be in wide format: 
# a participant's repeated responses are in a single row, and each response is in a separate column.

# Some data, especially repeated measures data, will often start in long format: 
# each row is one time point per participant. So each participant will have data in multiple rows. 
# Any variables that don't change across time will have the same value in all the rows.

# Next week we will show you how to go from wide format to long format in R.

# Clear all objects from workspace



# For your intro to repeated measures, we will use real (modified) data from a study in John and Daniel's 
# lab. Evidence suggests that the pharmacological properties of alcohol reduce anxiety about 
# unpredictably bad events more so than anxiety about predictably bad events. In this experiment, 
# participants were divided into three groups and given an alcoholic beverage, a placebo beverage 
# (deceptively told alcohol but only got an alcohol-flavored juice drink), or a control beverage (truthfully 
# told no alcohol and actually got a regularly flavored juice drink). All participants were hooked up to 
# electrodes and given mild electric shocks whenever they saw cues come up on a computer screen. 
# Some cues signaled predictable shock (shock would be of a specified intensity that was previewed to 
# the participants before the study started) and some cues signaled unpredictable shock (shock intensity 
# would be of some unknown level). Cue types were counterbalanced. At the end of the experiment, 
# participants rated how anxious they were (scale of 0-5) after seeing each type of cue. 


# Hypotheses:
# 1. Alcohol will significantly reduce anxiety about unpredictable shock. 
# 2. Furthermore, the effect of alcohol on self-reported anxiety to unpredictable shock is pharmacological,
# so there should not be an expectancy effect from the participants thinking they drank alcohol (placebo group).  


# Codebook for "Lab12_DemoData2.dat" 

# Column   Variable        Description                                     Values

# 1        Unpredictable   Self-reported anxiety to Unpredictable shock    0 - 4.75
# 2        Predictable     Self-reported anxiety to Predictable shock      0 - 4.25
# 3        BG              Participant's beverage group assignment         Control (no alcohol), Alcohol, Placebo


# Get a sense of what's going on


# Again, for the sake of time, we will assume the data are clean and that all model assumptions are met

# Hypothesis: Alcohol but not placebo drinks will reduce self-reported anxiety to unpredictable shock. 

# First let's make sure BG is a factor


# yep

# What are its levels?


# What set of orthogonal contrasts can we make to test this hypothesis?
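
# One possible answer, sketched in base R: Alcohol vs the two no-alcohol groups, and
# Placebo vs Control.
# Assumptions: toy_bg is a hypothetical stand-in for BG, the level labels and their
# order may differ in the real data, and the contrast names are made up.
toy_bg <- factor(c("Control", "Alcohol", "Placebo"),
                 levels = c("Control", "Alcohol", "Placebo"))
bg_con <- cbind(alc_v_other = c(-1/3, 2/3, -1/3),   # Alcohol vs (Control + Placebo)
                plac_v_ctrl = c(-1/2, 0, 1/2))      # Placebo vs Control
rownames(bg_con) <- levels(toy_bg)
contrasts(toy_bg) <- bg_con
contrasts(toy_bg)

# Orthogonality check: the cross-product of the two columns should be 0
sum(bg_con[, "alc_v_other"] * bg_con[, "plac_v_ctrl"])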


# Our questions are about the effect of alcohol on unpredictable shock vs predictable shock, but 
# there may be some designs where you first would want to test for a general, "main effect" of your focal variable 
# (in this case alcohol) on your dependent variable (in this case self-reported anxiety). 
# To do that, we can first create a new variable that averages the two repeated measures together. 
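
# A sketch of the averaging step: build the mean of the two repeated measures and
# regress it on the beverage group contrasts.
# Assumptions: toy_rm is simulated data (NOT the lab data); anx_avg, alc_v_other, and
# plac_v_ctrl are hypothetical names.
set.seed(12)
toy_rm <- data.frame(
  BG = factor(rep(c("Control", "Alcohol", "Placebo"), each = 20),
              levels = c("Control", "Alcohol", "Placebo")),
  Unpredictable = runif(60, 0, 4.75),
  Predictable   = runif(60, 0, 4.25))
toy_rm$alc_v_other <- ifelse(toy_rm$BG == "Alcohol", 2/3, -1/3)
toy_rm$plac_v_ctrl <- ifelse(toy_rm$BG == "Placebo", 0.5,
                             ifelse(toy_rm$BG == "Control", -0.5, 0))

toy_rm$anx_avg <- (toy_rm$Unpredictable + toy_rm$Predictable) / 2
m_avg <- lm(anx_avg ~ alc_v_other + plac_v_ctrl, data = toy_rm)
coef(m_avg)
# With these centered codes, the intercept is the unweighted mean of the three group
# means, and b1 is the Alcohol mean minus the average of the Control and Placebo means.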



# Interpret the intercept 


# Interpret b1 



# Interpret b2 



# To test our hypotheses, we will need to make a difference score to remove the "dependence problem" 
# of the self-reported anxiety data. In other words, if our hypothesis is about an effect of alcohol 
# on self-reported anxiety and a moderating effect of type of shock (i.e., a greater effect of alcohol 
# on self-reported anxiety to unpredictable vs predictable shock), we need to create a new variable that 
# is a difference score for unpredictable vs predictable shock 

# It may make interpretation easier to subtract the smaller score from the larger score



# Note: you could also just do the math within the lm function,
# and that would give you the same result
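
# A sketch of the difference score model, fit two ways to show they agree.
# Assumptions: toy_rm is simulated data (NOT the lab data); anx_diff, alc_v_other, and
# plac_v_ctrl are hypothetical names.
set.seed(12)
toy_rm <- data.frame(
  BG = factor(rep(c("Control", "Alcohol", "Placebo"), each = 20),
              levels = c("Control", "Alcohol", "Placebo")),
  Unpredictable = runif(60, 0, 4.75),
  Predictable   = runif(60, 0, 4.25))
toy_rm$alc_v_other <- ifelse(toy_rm$BG == "Alcohol", 2/3, -1/3)
toy_rm$plac_v_ctrl <- ifelse(toy_rm$BG == "Placebo", 0.5,
                             ifelse(toy_rm$BG == "Control", -0.5, 0))

# Version 1: compute the difference score as a new variable
toy_rm$anx_diff <- toy_rm$Unpredictable - toy_rm$Predictable
m_diff <- lm(anx_diff ~ alc_v_other + plac_v_ctrl, data = toy_rm)

# Version 2: do the subtraction inside lm() with I()
m_inline <- lm(I(Unpredictable - Predictable) ~ alc_v_other + plac_v_ctrl, data = toy_rm)

cbind(new_variable = coef(m_diff), inline = coef(m_inline))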

# What if we wanted to get effect size estimates for each contrast?

# We need to code regressors 


# Interpret the intercept 


# Interpret b1 


# Interpret b2



# So how do we test the simple effects of the alcohol vs control/placebo contrasts for each type of shock? 
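
# A sketch of the simple effects: fit the same contrasts separately to each type of
# shock. Each coefficient in the difference score model equals the unpredictable
# coefficient minus the predictable coefficient.
# Assumptions: toy_rm is simulated data (NOT the lab data); alc_v_other and
# plac_v_ctrl are hypothetical names.
set.seed(12)
toy_rm <- data.frame(
  BG = factor(rep(c("Control", "Alcohol", "Placebo"), each = 20),
              levels = c("Control", "Alcohol", "Placebo")),
  Unpredictable = runif(60, 0, 4.75),
  Predictable   = runif(60, 0, 4.25))
toy_rm$alc_v_other <- ifelse(toy_rm$BG == "Alcohol", 2/3, -1/3)
toy_rm$plac_v_ctrl <- ifelse(toy_rm$BG == "Placebo", 0.5,
                             ifelse(toy_rm$BG == "Control", -0.5, 0))

m_unp  <- lm(Unpredictable ~ alc_v_other + plac_v_ctrl, data = toy_rm)  # unpredictable shock only
m_pred <- lm(Predictable   ~ alc_v_other + plac_v_ctrl, data = toy_rm)  # predictable shock only
m_diff <- lm(I(Unpredictable - Predictable) ~ alc_v_other + plac_v_ctrl, data = toy_rm)

cbind(unpredictable = coef(m_unp), predictable = coef(m_pred), difference = coef(m_diff))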


# Interpret b1 


# Interpret b2 





# Interpret b1 


# Interpret b2 



# We will give you ggplot code to graph repeated measures data next week! 

# Final question to think about as you go on to your homework: What if our focal variable were unpredictable vs predictable shock 
# and we only wanted to test for significant differences in unpredictable vs predictable shock anxiety within each of the
# three beverage groups? What coding scheme would we use?