Answer Key - Homework 04

Here are the answer key files for Homework 4:

You are also welcome to refer to my Github Repository for Homework 4 Answer Key https://github.com/melindahiggins2000/N736Fall2017_Homework4Key

Assignment - Homework 04 - due 10/24/2017

For Homework 04, you will be using the HELP dataset, learn more at:

Complete the following:

  1. Perform a Simple Linear Regression for:
    • OUTCOME variable cesd: “Center for Epidemiological Studies-Depression (CESD) total score - Baseline”
    • PREDICTOR variable indtot: ““Inventory of Drug Use Consequences (InDue) total score - Baseline”"
    • decide if you want to transform either variable cesd or indtot and if so, what transformation you applied and why - you can also decide not to transform (i.e. tradeoffs between model fit and interpretability of your results) - discuss your reasoning
  2. Perform regression diagnostics:
    • check the normality of the residuals (histogram and Q-Q plots)
    • check for linearity - is there any systematic relationship between the residuals and the predicted (or fitted) values?
    • homoscedasticity - plot of standardized residuals versus fitted values - this is known as a “Scale-Location” graph.
    • check for outliers and data points with high leverage or influence: outliers are often identified with standardized residuals > 3 (or <-3) and influential observations are often identified using Cook’s D
  3. Provide a summary of the regression results.
    • provide a FIGURE of the model, in this case a scatterplot with the fitted line overlaid and 95% confidence intervals if you can
    • Make a TABLE presenting the fitted regression model (coefficients and tests of significance for those coefficients)
    • describe the variance explained by the model (based on r2)
    • describe the model itself based on the y-intercept and slope terms
    • note any limitations or issues with the model fit or interpretation of the model
  4. Perform a One-way ANOVA for:
    • OUTCOME variable cesd: “Center for Epidemiological Studies-Depression (CESD) total score - Baseline”
    • GROUP variable racegrp: “Racial Group of Respondent”
    • options - you can use either an ANOVA or GLM modeling approach
    • if the GROUP variable is significant, also perform post hoc tests - use some kind of pairwise error rate adjustment (i.e. bonferroni, sidak, Tukey’s HSD, etc) - be sure to report which one you used and why
  5. Perform model diagnostics:
    • homoscedasticity - look at a test for equal variance (Levene’s test or Bartlett’s test or equivalent).
    • if this test of equal variances fails, you may want to report a modified F-test (e.g. Welch’s test)
  6. Present a summary of the ANOVA results.
    • Make a FIGURE of the group mean differences - either an error-bar plot or a series of boxplots one for each group to show the group differences in the outcome
    • Make a TABLE presenting the ANOVA results
    • describe the model results - was the GROUP (racegrp) significant?
    • If GROUP is significant, what did the post hoc tests reveal?

Variables in HELP dataset to be used for Homework 04

Use these variables from HELP dataset for Homework 04
Variable Label
cesd CESD total score - Baseline
racegrp Racial Group of Respondent
indtot Inventory of Drug Use Consequences (InDue) total score - Baseline

Copyright © Melinda Higgins, Ph.D.. All contents under (CC) BY-NC-SA license,CC-BY-NC-SA unless otherwise noted.

Feedback, Comments (email me)?