# Homework Key for Week 4

# dfReadDat(), varDescribe(), modelSummary(), modelEffectSizes() and
# modelPredictions() are assumed here to come from the lmSupport package.
library(lmSupport)

# 1. Read in and inspect the data.
d = dfReadDat('Salary.dat')
varDescribe(d)
summary(d)
View(d)

# 2. Is the mean faculty salary significantly greater (or smaller) than $50,000/year?
# Report the corresponding F-statistic, df and p-value.
d$sal50 = d$SALARY - 50000         # center salary on the comparison value
m1 = lm(sal50 ~ 1, data=d)         # intercept-only model: tests mean(sal50) against 0
modelSummary(m1, t=FALSE)          # t=FALSE reports F instead of t
# It is significantly greater, F(1,61) = 15.26, df = 61, p < 0.001.
# Most researchers would actually report t in this case, since
# this is a one-sample t-test.

# 3. Fit a linear model predicting salary from number of publications. Test if the number of
# publications significantly predicts salary.
m2 = lm(SALARY ~ 1 + PUBS, data=d)
modelSummary(m2, t=FALSE)
confint(m2)
# For each additional publication, predicted salary increases by about $351,
# F(1,60) = 20.66, df = 60, p < 0.001. 95% confidence interval for the slope: [196.44, 505.16].

# 4. Report a variance-based indicator of effect size, along with its interpretation.
modelEffectSizes(m2)
# partial eta squared = 0.256. Number of publications explains 25.6% of the variance in salary.

# 5. To better understand this effect, you'll now calculate some predicted salaries based on
# number of publications.
# a. 5 publications?
48439.09 + 350.80*5
# $50193.09
# b. 15 publications?
48439.09 + 350.80*15
# $53701.09
## This could also be done by recentering the publication variable first around 5 and then
## around 15.

# 6. Create a scatterplot for Salary by Publications. Include the regression line of the model
# as well as lines representing the standard error of the point estimate. Paste this plot into
# your Word document.
library(ggplot2)

# Predicted values (and their bounds) across the observed range of PUBS
pY1 = data.frame(PUBS=seq(1, 69, length.out=62))
pY1 = modelPredictions(m2, pY1)

# Build the plot one layer at a time, printing after each step
plot1 = ggplot(d, aes(x = PUBS, y = SALARY))
plot1
plot1 = plot1 + geom_point(data=d, aes(x = PUBS, y = SALARY), color = 'black')
plot1
plot1 = plot1 + geom_smooth(aes(ymin=CILo, ymax=CIHi, x=PUBS, y=Predicted),
                            data=pY1, stat='identity', color='red')
plot1
plot1 = plot1 + theme_bw(base_size = 14) +
  labs(x = 'Number of publications', y = 'Salary')
plot1

# 7.
plot1 = plot1 + scale_x_continuous(breaks = seq(0, 70, 10)) +
  theme(panel.grid.major = element_blank())
plot1
# changed x axis ticks and removed (some) gridlines

# Fun packages with Mitch: To save myself some time doing the theme things, I often use the
# following package:
install.packages('cowplot')   # only needed once per machine
library(cowplot)
??cowplot
# Note: since cowplot 1.0, loading the package no longer changes the default ggplot2 theme;
# add + theme_cowplot() to the plot if the cowplot look does not appear.
plot1 = ggplot(d, aes(x = PUBS, y = SALARY)) +
  geom_point(data=d, aes(x = PUBS, y = SALARY), color = 'black') +
  geom_smooth(aes(ymin=CILo, ymax=CIHi, x=PUBS, y=Predicted),
              data=pY1, stat='identity', color='red') +
  labs(x = 'Number of publications', y = 'Salary') +
  scale_x_continuous(breaks = seq(0, 70, 10))
plot1
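
# Optional check (a minimal sketch, not part of the original key): the Question 5
# predictions can be reproduced from the reported coefficients alone, without
# Salary.dat. The names b0, b1 and predict_salary are hypothetical helpers, and
# b1 = 350.80 is an assumption taken from the midpoint of the reported
# confidence interval [196.44, 505.16].
b0 = 48439.09                                    # intercept used in Question 5
b1 = 350.80                                      # slope (assumed, see note above)
predict_salary = function(pubs) b0 + b1 * pubs   # predicted salary for a given PUBS
predict_salary(c(5, 15))
# With the fitted model still in memory, predict(m2, data.frame(PUBS = c(5, 15)))
# should agree with these values up to rounding of the coefficients.

# For a test with 1 numerator df, t is the square root of F, so the Question 2
# result could also be reported as a t-statistic with 61 df:
sqrt(15.26)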