Assignment - Homework 04 - due 10/24/2017
For Homework 04, you will be using the HELP dataset, learn more at:
Complete the following:
- Perform a Simple Linear Regression for:
- OUTCOME variable cesd: “Center for Epidemiological Studies-Depression (CESD) total score - Baseline”
- PREDICTOR variable indtot: “Inventory of Drug Use Consequences (InDUC) total score - Baseline”
- decide whether to transform either variable (cesd or indtot). If you transform one, state which transformation you applied and why. You can also decide not to transform. Either way, discuss your reasoning (e.g. the tradeoff between model fit and interpretability of your results)
- Perform regression diagnostics:
- check the normality of the residuals (histogram and Q-Q plots)
- check for linearity - is there any systematic relationship between the residuals and the predicted (or fitted) values?
- homoscedasticity - plot the square root of the absolute standardized residuals versus the fitted values - this is known as a “Scale-Location” graph.
- check for outliers and for data points with high leverage or influence: outliers are often flagged by standardized residuals > 3 (or < -3), and influential observations are often identified using Cook’s D
- Provide a summary of the regression results.
- provide a FIGURE of the model, in this case a scatterplot with the fitted line overlaid and 95% confidence intervals if you can
- Make a TABLE presenting the fitted regression model (coefficients and tests of significance for those coefficients)
- describe the variance explained by the model (based on r²)
- describe the model itself based on the y-intercept and slope terms
- note any limitations or issues with the model fit or interpretation of the model
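
As a rough illustration of the regression quantities asked for above (slope, y-intercept, r², standardized residuals, Cook’s D), here is a minimal Python sketch using only the standard library. Everything in it is an assumption made for illustration: the data are randomly generated stand-ins, not the HELP data, the simulated intercept and slope are arbitrary, and the names simple_ols, x, y and fit are hypothetical. A statistics package would normally provide these values, and the diagnostic plots, directly.

```python
import math
import random


def simple_ols(x, y):
    """Fit y = b0 + b1 * x by least squares; return estimates and diagnostics."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))

    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    fitted = [intercept + slope * xi for xi in x]
    resid = [yi - fi for yi, fi in zip(y, fitted)]

    sse = sum(e ** 2 for e in resid)
    sst = sum((yi - mean_y) ** 2 for yi in y)
    mse = sse / (n - 2)  # residual variance with 2 estimated parameters

    # Leverage of each point in a one-predictor model.
    leverage = [1 / n + (xi - mean_x) ** 2 / sxx for xi in x]
    # Standardized (internally studentized) residuals.
    std_resid = [e / math.sqrt(mse * (1 - h)) for e, h in zip(resid, leverage)]
    # Cook's distance with p = 2 parameters (intercept and slope).
    cooks_d = [r ** 2 * h / (2 * (1 - h)) for r, h in zip(std_resid, leverage)]

    return {
        "intercept": intercept,
        "slope": slope,
        "slope_se": math.sqrt(mse / sxx),
        "r_squared": 1 - sse / sst,
        "fitted": fitted,
        "std_resid": std_resid,
        "cooks_d": cooks_d,
    }


# Synthetic stand-ins for indtot (x) and cesd (y); NOT the HELP data.
random.seed(4)
x = [random.uniform(0, 45) for _ in range(200)]
y = [20 + 0.4 * xi + random.gauss(0, 10) for xi in x]

fit = simple_ols(x, y)
print(f"intercept = {fit['intercept']:.2f}, slope = {fit['slope']:.2f}")
print(f"t for slope = {fit['slope'] / fit['slope_se']:.2f}")
print(f"r-squared = {fit['r_squared']:.3f}")

# Flag possible outliers (|standardized residual| > 3) and the most influential point.
outliers = [i for i, r in enumerate(fit["std_resid"]) if abs(r) > 3]
most_influential = max(range(len(x)), key=lambda i: fit["cooks_d"][i])
print("possible outliers:", outliers)
print("largest Cook's D at index", most_influential)
```

The fitted and std_resid lists are the inputs for the residual-versus-fitted and Scale-Location plots. The sketch leaves the plotting itself, and the p-value for the slope, to whatever software you use for the homework.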
- Perform a One-way ANOVA for:
- OUTCOME variable cesd: “Center for Epidemiological Studies-Depression (CESD) total score - Baseline”
- GROUP variable racegrp: “Racial Group of Respondent”
- options - you can use either an ANOVA or GLM modeling approach
- if the GROUP variable is significant, also perform post hoc tests - use some kind of pairwise error rate adjustment (e.g. Bonferroni, Šidák, Tukey’s HSD) - be sure to report which one you used and why
- Perform model diagnostics:
- homoscedasticity - look at a test for equal variance (Levene’s test or Bartlett’s test or equivalent).
- if this test rejects the assumption of equal variances, you may want to report a modified F-test (e.g. Welch’s test)
- Present a summary of the ANOVA results.
- Make a FIGURE of the group mean differences - either an error-bar plot or a series of boxplots one for each group to show the group differences in the outcome
- Make a TABLE presenting the ANOVA results
- describe the model results - was the GROUP (racegrp) significant?
- If GROUP is significant, what did the post hoc tests reveal?
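
The ANOVA steps above can be sketched in the same spirit. The Python below computes the one-way ANOVA F statistic, Levene’s statistic with median centering (the Brown-Forsythe variant, swapped in here as one common choice), and a Bonferroni-adjusted per-comparison alpha. The group labels, sample sizes and simulated means are all made up for illustration and are not taken from the HELP data. The function names one_way_anova, levene_statistic and bonferroni_alpha are hypothetical.

```python
import random
from itertools import combinations
from statistics import mean, median


def one_way_anova(groups):
    """Return (F, df_between, df_within) for a dict mapping group -> list of values."""
    all_values = [v for values in groups.values() for v in values]
    grand_mean = mean(all_values)
    ss_between = 0.0
    ss_within = 0.0
    for values in groups.values():
        group_mean = mean(values)
        ss_between += len(values) * (group_mean - grand_mean) ** 2
        ss_within += sum((v - group_mean) ** 2 for v in values)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


def levene_statistic(groups):
    """Levene's W with median centering (Brown-Forsythe variant).

    It equals the one-way ANOVA F computed on |value - group median|.
    """
    deviations = {}
    for name, values in groups.items():
        center = median(values)
        deviations[name] = [abs(v - center) for v in values]
    return one_way_anova(deviations)


def bonferroni_alpha(groups, alpha=0.05):
    """Per-comparison alpha after a Bonferroni correction over all group pairs."""
    n_pairs = len(list(combinations(groups, 2)))
    return alpha / n_pairs


# Synthetic outcome values for three placeholder groups; NOT the HELP data.
random.seed(4)
groups = {
    "group_a": [random.gauss(33, 12) for _ in range(60)],
    "group_b": [random.gauss(30, 12) for _ in range(40)],
    "group_c": [random.gauss(36, 12) for _ in range(50)],
}

f_stat, df1, df2 = one_way_anova(groups)
w_stat, _, _ = levene_statistic(groups)
print(f"ANOVA: F({df1}, {df2}) = {f_stat:.2f}")
print(f"Levene (median-centered): W = {w_stat:.2f}")
print(f"Bonferroni per-comparison alpha = {bonferroni_alpha(groups):.4f}")
```

Turning F or W into a p-value requires the F distribution, which the standard library does not provide. If SciPy is available, scipy.stats.f.sf(f_stat, df1, df2) gives the upper-tail probability.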
Variables in HELP dataset to be used for Homework 04

| Variable | Description |
|---|---|
| cesd | CESD total score - Baseline |
| racegrp | Racial Group of Respondent |
| indtot | Inventory of Drug Use Consequences (InDUC) total score - Baseline |
Copyright © Melinda Higgins, Ph.D. All contents under (CC) BY-NC-SA license, unless otherwise noted.