Log - Log - November 5, 2017


SAVE OUTFILE='C:\MyGithub\N736Fall2017_lesson20\helpmkh.sav'
* Encoding: UTF-8.
* ============================================.
* N736 - LESSON 20
* Poisson Regression and Negative Binomial Regression
* Melinda Higgins, PhD
* dated 11/5/2017
* working with the helpmkh dataset
* ============================================.

* ============================================.
* look at distribution of d1
* number of times hospitalized for medical problems
* this is a good count variable
* check mean and standard deviation
* variance (SD squared) well above the mean indicates overdispersion
* ============================================.
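
The check described in the comments above can be sketched in a few lines of Python. This is only an illustration, assuming the mean (3.06) and standard deviation (6.188) that the Frequencies output reports for d1; the variable names are made up for the example. A Poisson variable has variance equal to its mean, so a variance-to-mean ratio far above 1 suggests overdispersion.

```python
# Dispersion check for d1 using the reported summary statistics.
mean_d1 = 3.06   # reported mean of d1
sd_d1 = 6.188    # reported standard deviation of d1

# Poisson assumes variance == mean, so compare the two directly.
variance_d1 = sd_d1 ** 2
dispersion_ratio = variance_d1 / mean_d1

print(f"variance        = {variance_d1:.2f}")
print(f"variance / mean = {dispersion_ratio:.2f}")
```

For an intercept-only Poisson model this ratio is the same quantity as Pearson Chi-Square divided by its df, so it can be compared with the Value/df column of the null model's Goodness of Fit table.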


Frequencies - Statistics - November 5, 2017
Statistics: d1 How many times hospitalized for medical problems (lifetime)
N Valid          453
N Missing        0
Mean             3.06
Std. Deviation   6.188
Minimum          0
Maximum          100
Percentile 25    1.00
Percentile 50    2.00
Percentile 75    3.50
Frequencies - d1 How many times hospitalized for medical problems (lifetime) - November 5, 2017
d1 How many times hospitalized for medical problems (lifetime)
Value  Frequency  Percent  Valid Percent  Cumulative Percent
0      92         20.3     20.3           20.3
1      120        26.5     26.5           46.8
2      92         20.3     20.3           67.1
3      36         7.9      7.9            75.1
4      37         8.2      8.2            83.2
5      18         4.0      4.0            87.2
6      12         2.6      2.6            89.8
7      5          1.1      1.1            90.9
8      11         2.4      2.4            93.4
9      2          .4       .4             93.8
10     11         2.4      2.4            96.2
12     1          .2       .2             96.5
13     1          .2       .2             96.7
14     2          .4       .4             97.1
15     3          .7       .7             97.8
17     1          .2       .2             98.0
20     5          1.1      1.1            99.1
22     1          .2       .2             99.3
36     1          .2       .2             99.6
40     1          .2       .2             99.8
100    1          .2       .2             100.0
Total  453        100.0    100.0
Frequencies - Histogram - November 5, 2017
[Histogram of d1. X axis: How many times hospitalized for medical problems (lifetime). Y axis: Frequency.]


* ============================================.
* look at correlations of d1 with demographics
* and other predictors - we'll focus on pcs
* ============================================.

  /VARIABLES=d1 age female pss_fr pcs mcs indtot sexrisk

Correlations - Correlations - November 5, 2017
Pearson correlations (r) with 2-tailed significance. N = 453 for every pair.
Variable labels:
d1       How many times hospitalized for medical problems (lifetime)
age      Age at baseline (in years)
female   Gender of respondent
pss_fr   Perceived Social Support - friends
pcs      SF36 Physical Composite Score - Baseline
mcs      SF36 Mental Composite Score - Baseline
indtot   Inventory of Drug Use Consequences (InDue) total score - Baseline
sexrisk  Risk Assessment Battery (RAB) sex risk score - Baseline

                d1       age      female   pss_fr   pcs      mcs      indtot   sexrisk
d1       r      1        .161**   .038     -.048    -.258**  -.093*   .032     .036
         Sig.            .001     .415     .310     .000     .049     .492     .442
age      r      .161**   1        .043     .080     -.229**  .045     .026     -.120*
         Sig.   .001              .358     .088     .000     .343     .575     .011
female   r      .038     .043     1        .067     -.157**  -.119*   -.261**  .092
         Sig.   .415     .358              .155     .001     .011     .000     .052
pss_fr   r      -.048    .080     .067     1        .077     .138**   -.198**  -.113*
         Sig.   .310     .088     .155              .104     .003     .000     .016
pcs      r      -.258**  -.229**  -.157**  .077     1        .110*    -.135**  .024
         Sig.   .000     .000     .001     .104              .019     .004     .612
mcs      r      -.093*   .045     -.119*   .138**   .110*    1        -.381**  -.106*
         Sig.   .049     .343     .011     .003     .019              .000     .024
indtot   r      .032     .026     -.261**  -.198**  -.135**  -.381**  1        .113*
         Sig.   .492     .575     .000     .000     .004     .000              .016
sexrisk  r      .036     -.120*   .092     -.113*   .024     -.106*   .113*    1
         Sig.   .442     .011     .052     .016     .612     .024     .016
**. Correlation is significant at the 0.01 level (2-tailed).
*. Correlation is significant at the 0.05 level (2-tailed).

  /VARIABLES=d1 age female pss_fr pcs mcs indtot sexrisk

Nonparametric Correlations - Correlations - November 5, 2017
Kendall's tau_b and Spearman's rho with 2-tailed significance. N = 453 for every pair.
Variable labels:
d1       How many times hospitalized for medical problems (lifetime)
age      Age at baseline (in years)
female   Gender of respondent
pss_fr   Perceived Social Support - friends
pcs      SF36 Physical Composite Score - Baseline
mcs      SF36 Mental Composite Score - Baseline
indtot   Inventory of Drug Use Consequences (InDue) total score - Baseline
sexrisk  Risk Assessment Battery (RAB) sex risk score - Baseline

Kendall's tau_b
                d1       age      female   pss_fr   pcs      mcs      indtot   sexrisk
d1       tau_b  1.000    .162**   .100*    -.069*   -.238**  -.141**  .124**   .051
         Sig.   .        .000     .016     .048     .000     .000     .000     .152
age      tau_b  .162**   1.000    .033     .043     -.139**  .020     .045     -.077*
         Sig.   .000     .        .398     .198     .000     .533     .166     .022
female   tau_b  .100*    .033     1.000    .054     -.138**  -.099*   -.216**  .047
         Sig.   .016     .398     .        .178     .000     .010     .000     .240
pss_fr   tau_b  -.069*   .043     .054     1.000    .047     .088**   -.133**  -.100**
         Sig.   .048     .198     .178     .        .147     .007     .000     .003
pcs      tau_b  -.238**  -.139**  -.138**  .047     1.000    .089**   -.092**  .021
         Sig.   .000     .000     .000     .147     .        .005     .004     .530
mcs      tau_b  -.141**  .020     -.099*   .088**   .089**   1.000    -.238**  -.074*
         Sig.   .000     .533     .010     .007     .005     .        .000     .024
indtot   tau_b  .124**   .045     -.216**  -.133**  -.092**  -.238**  1.000    .088**
         Sig.   .000     .166     .000     .000     .004     .000     .        .009
sexrisk  tau_b  .051     -.077*   .047     -.100**  .021     -.074*   .088**   1.000
         Sig.   .152     .022     .240     .003     .530     .024     .009     .

Spearman's rho
                d1       age      female   pss_fr   pcs      mcs      indtot   sexrisk
d1       rho    1.000    .220**   .114*    -.095*   -.327**  -.199**  .166**   .069
         Sig.   .        .000     .015     .043     .000     .000     .000     .141
age      rho    .220**   1.000    .040     .062     -.207**  .029     .066     -.110*
         Sig.   .000     .        .399     .190     .000     .541     .163     .019
female   rho    .114*    .040     1.000    .063     -.169**  -.121*   -.258**  .055
         Sig.   .015     .399     .        .178     .000     .010     .000     .241
pss_fr   rho    -.095*   .062     .063     1.000    .067     .126**   -.185**  -.137**
         Sig.   .043     .190     .178     .        .157     .007     .000     .004
pcs      rho    -.327**  -.207**  -.169**  .067     1.000    .144**   -.134**  .031
         Sig.   .000     .000     .000     .157     .        .002     .004     .514
mcs      rho    -.199**  .029     -.121*   .126**   .144**   1.000    -.343**  -.104*
         Sig.   .000     .541     .010     .007     .002     .        .000     .027
indtot   rho    .166**   .066     -.258**  -.185**  -.134**  -.343**  1.000    .126**
         Sig.   .000     .163     .000     .000     .004     .000     .        .007
sexrisk  rho    .069     -.110*   .055     -.137**  .031     -.104*   .126**   1.000
         Sig.   .141     .019     .241     .004     .514     .027     .007     .
**. Correlation is significant at the 0.01 level (2-tailed).
*. Correlation is significant at the 0.05 level (2-tailed).

* ============================================.
* run Poisson regression - intercept only model
* this is the NULL model - no predictors
* ============================================.

* Generalized Linear Models.

Generalized Linear Models - Model Information - November 5, 2017
Dependent Variable        d1 How many times hospitalized for medical problems (lifetime)
Probability Distribution  Poisson
Link Function             Log

Generalized Linear Models - Case Processing Summary - November 5, 2017
          N    Percent
Included  453  100.0%
Excluded  0    0.0%
Total     453  100.0%

Generalized Linear Models - Continuous Variable Information - November 5, 2017
Role                Variable  N    Minimum  Maximum  Mean  Std. Deviation
Dependent Variable  d1        453  0        100      3.06  6.188
d1 = How many times hospitalized for medical problems (lifetime)

Generalized Linear Models - Goodness of Fit - November 5, 2017
Goodness of Fit (a)
                                      Value      df   Value/df
Deviance                              2261.889   452  5.004
Scaled Deviance                       2261.889   452
Pearson Chi-Square                    5656.091   452  12.513
Scaled Pearson Chi-Square             5656.091   452
Log Likelihood (b)                    -1638.126
Akaike's Information Criterion (AIC)  3278.252
Finite Sample Corrected AIC (AICC)    3278.261
Bayesian Information Criterion (BIC)  3282.368
Consistent AIC (CAIC)                 3283.368
Dependent Variable: How many times hospitalized for medical problems (lifetime)
Model: (Intercept)
a. Information criteria are in smaller-is-better form.
b. The full log likelihood function is displayed and used in computing information criteria.
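
As a rough cross-check, the information criteria in this table can be rebuilt from the full log likelihood. The sketch below assumes the usual textbook definitions (AIC = -2LL + 2k, AICC = AIC + 2k(k+1)/(n-k-1), BIC = -2LL + k ln n, CAIC = -2LL + k(ln n + 1)), with k = 1 estimated parameter (the intercept) and n = 453 cases. The variable names are illustrative only.

```python
import math

log_likelihood = -1638.126  # full log likelihood, intercept-only Poisson model
k = 1                       # estimated parameters (intercept only; scale is fixed)
n = 453                     # cases included

neg2ll = -2 * log_likelihood
aic = neg2ll + 2 * k
aicc = aic + (2 * k * (k + 1)) / (n - k - 1)
bic = neg2ll + k * math.log(n)
caic = neg2ll + k * (math.log(n) + 1)

for name, value in [("AIC", aic), ("AICC", aicc), ("BIC", bic), ("CAIC", caic)]:
    print(f"{name:5s} {value:.3f}")
```

The same formulas can be applied to the later models by changing the log likelihood and k (for example, k = 2 for the Poisson model with pcs).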
Generalized Linear Models - Omnibus Test - November 5, 2017
Omnibus Test (a)
Likelihood Ratio Chi-Square  df  Sig.
.000                         .   .
Dependent Variable: How many times hospitalized for medical problems (lifetime)
Model: (Intercept)
a. Compares the fitted model against the intercept-only model.
Generalized Linear Models - Tests of Model Effects - November 5, 2017
Tests of Model Effects (Type III)
Source       Wald Chi-Square  df  Sig.
(Intercept)  1733.278         1   .000
Dependent Variable: How many times hospitalized for medical problems (lifetime)
Model: (Intercept)
Generalized Linear Models - Parameter Estimates - November 5, 2017
Parameter Estimates
Parameter    B      Std. Error  95% Wald CI (Lower, Upper)  Wald Chi-Square  df  Sig.  Exp(B)  95% Wald CI for Exp(B) (Lower, Upper)
(Intercept)  1.118  .0269       1.066, 1.171                1733.278         1   .000  3.060   2.903, 3.225
(Scale)      1 (a)
Dependent Variable: How many times hospitalized for medical problems (lifetime)
Model: (Intercept)
a. Fixed at the displayed value.
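
With a log link and no predictors, exponentiating the intercept should give back the sample mean of d1. The sketch below illustrates that using the rounded B and Std. Error from the table, so the results agree with the Exp(B) columns only up to rounding. The 1.959964 multiplier for a 95% Wald interval is an assumption of the example.

```python
import math

b0 = 1.118      # intercept (B), rounded as displayed
se_b0 = 0.0269  # standard error of the intercept
z = 1.959964    # standard normal quantile for a 95% Wald interval (assumed)

# Back-transform from the log scale to the count scale.
exp_b0 = math.exp(b0)
ci_lower = math.exp(b0 - z * se_b0)
ci_upper = math.exp(b0 + z * se_b0)

print(f"Exp(B) = {exp_b0:.3f}, 95% CI ({ci_lower:.3f}, {ci_upper:.3f})")
```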

* ============================================.
* Poisson Regression - pcs as predictor for d1
* ============================================.

* Generalized Linear Models.

Generalized Linear Models - Model Information - November 5, 2017
Dependent Variable        d1 How many times hospitalized for medical problems (lifetime)
Probability Distribution  Poisson
Link Function             Log

Generalized Linear Models - Case Processing Summary - November 5, 2017
          N    Percent
Included  453  100.0%
Excluded  0    0.0%
Total     453  100.0%

Generalized Linear Models - Continuous Variable Information - November 5, 2017
Role                Variable  N    Minimum           Maximum           Mean                Std. Deviation
Dependent Variable  d1        453  0                 100               3.06                6.188
Covariate           pcs       453  14.0742912292480  74.8063278198242  48.048541551131024  10.784602685414228
d1 = How many times hospitalized for medical problems (lifetime)
pcs = SF36 Physical Composite Score - Baseline

Generalized Linear Models - Goodness of Fit - November 5, 2017
Goodness of Fit (a)
                                      Value      df   Value/df
Deviance                              1899.702   451  4.212
Scaled Deviance                       1899.702   451
Pearson Chi-Square                    3178.676   451  7.048
Scaled Pearson Chi-Square             3178.676   451
Log Likelihood (b)                    -1457.032
Akaike's Information Criterion (AIC)  2918.065
Finite Sample Corrected AIC (AICC)    2918.091
Bayesian Information Criterion (BIC)  2926.296
Consistent AIC (CAIC)                 2928.296
Dependent Variable: How many times hospitalized for medical problems (lifetime)
Model: (Intercept), pcs
a. Information criteria are in smaller-is-better form.
b. The full log likelihood function is displayed and used in computing information criteria.
Generalized Linear Models - Omnibus Test - November 5, 2017
Omnibus Test (a)
Likelihood Ratio Chi-Square  df  Sig.
362.187                      1   .000
Dependent Variable: How many times hospitalized for medical problems (lifetime)
Model: (Intercept), pcs
a. Compares the fitted model against the intercept-only model.
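
The likelihood ratio chi-square in the omnibus test can be recovered, up to rounding, from the two Poisson fits reported in this log: twice the difference in log likelihoods, or equivalently the drop in deviance. The sketch below is illustrative only. The p-value uses the identity that a chi-square variable with 1 df is a squared standard normal, which is an assumption of the example, not something SPSS prints.

```python
import math

ll_null, ll_pcs = -1638.126, -1457.032  # log likelihoods: intercept-only vs. pcs model
dev_null, dev_pcs = 2261.889, 1899.702  # deviances for the same two models

lr_from_ll = 2 * (ll_pcs - ll_null)     # likelihood ratio chi-square
lr_from_deviance = dev_null - dev_pcs   # same statistic via the deviance drop

# Upper-tail probability for chi-square with 1 df: P(Z^2 > x) = erfc(sqrt(x / 2)).
p_value = math.erfc(math.sqrt(lr_from_ll / 2))

print(f"LR chi-square (log likelihoods) = {lr_from_ll:.3f}")
print(f"LR chi-square (deviances)       = {lr_from_deviance:.3f}")
print(f"p-value is below .001: {p_value < 0.001}")
```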
Generalized Linear Models - Tests of Model Effects - November 5, 2017
Tests of Model Effects (Type III)
Source       Wald Chi-Square  df  Sig.
(Intercept)  916.659          1   .000
pcs          365.062          1   .000
Dependent Variable: How many times hospitalized for medical problems (lifetime)
Model: (Intercept), pcs
Generalized Linear Models - Parameter Estimates - November 5, 2017
Parameter Estimates
Parameter    B      Std. Error  95% Wald CI (Lower, Upper)  Wald Chi-Square  df  Sig.  Exp(B)  95% Wald CI for Exp(B) (Lower, Upper)
(Intercept)  3.206  .1059       2.998, 3.414                916.659          1   .000  24.679  20.054, 30.372
pcs          -.046  .0024       -.051, -.041                365.062          1   .000  .955    .950, .959
(Scale)      1 (a)
Dependent Variable: How many times hospitalized for medical problems (lifetime)
Model: (Intercept), pcs
a. Fixed at the displayed value.
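
One way to read these estimates: under the log link, Exp(B) for pcs is a rate ratio, the multiplicative change in the expected number of hospitalizations per one-point increase in pcs. The sketch below uses the rounded coefficients from the table, so results are approximate. The helper `expected_count` is a hypothetical name introduced for the example.

```python
import math

b0 = 3.206                    # intercept (B), rounded as displayed
b_pcs = -0.046                # pcs coefficient (B), rounded as displayed
mean_pcs = 48.048541551131024 # sample mean of pcs from the output

def expected_count(pcs):
    """Expected d1 under the fitted Poisson model with log link."""
    return math.exp(b0 + b_pcs * pcs)

rate_ratio_1 = math.exp(b_pcs)        # per 1-point increase in pcs
rate_ratio_10 = math.exp(10 * b_pcs)  # per 10-point increase in pcs

print(f"rate ratio per 1 point   = {rate_ratio_1:.3f}")
print(f"rate ratio per 10 points = {rate_ratio_10:.3f}")
print(f"expected d1 at mean pcs  = {expected_count(mean_pcs):.2f}")
```

Because the model is multiplicative, the ratio of expected counts for any two pcs values ten points apart equals the 10-point rate ratio.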

* ============================================.
* Negative Binomial Regression - pcs as predictor for d1
* compare goodness of fit stats
* ============================================.

* Generalized Linear Models.

Generalized Linear Models - Model Information - November 5, 2017
Dependent Variable        d1 How many times hospitalized for medical problems (lifetime)
Probability Distribution  Negative binomial (MLE)
Link Function             Log

Generalized Linear Models - Case Processing Summary - November 5, 2017
          N    Percent
Included  453  100.0%
Excluded  0    0.0%
Total     453  100.0%

Generalized Linear Models - Continuous Variable Information - November 5, 2017
Role                Variable  N    Minimum           Maximum           Mean                Std. Deviation
Dependent Variable  d1        453  0                 100               3.06                6.188
Covariate           pcs       453  14.0742912292480  74.8063278198242  48.048541551131024  10.784602685414228
d1 = How many times hospitalized for medical problems (lifetime)
pcs = SF36 Physical Composite Score - Baseline

Generalized Linear Models - Goodness of Fit - November 5, 2017
Goodness of Fit (a)
                                      Value      df   Value/df
Deviance                              475.653    450  1.057
Scaled Deviance                       475.653    450
Pearson Chi-Square                    730.027    450  1.622
Scaled Pearson Chi-Square             730.027    450
Log Likelihood (b)                    -984.664
Akaike's Information Criterion (AIC)  1975.329
Finite Sample Corrected AIC (AICC)    1975.382
Bayesian Information Criterion (BIC)  1987.676
Consistent AIC (CAIC)                 1990.676
Dependent Variable: How many times hospitalized for medical problems (lifetime)
Model: (Intercept), pcs
a. Information criteria are in smaller-is-better form.
b. The full log likelihood function is displayed and used in computing information criteria.
Generalized Linear Models - Omnibus Test - November 5, 2017
Omnibus Test (a)
Likelihood Ratio Chi-Square  df  Sig.
80.697                       1   .000
Dependent Variable: How many times hospitalized for medical problems (lifetime)
Model: (Intercept), pcs
a. Compares the fitted model against the intercept-only model.
Generalized Linear Models - Tests of Model Effects - November 5, 2017
Tests of Model Effects (Type III)
Source       Wald Chi-Square  df  Sig.
(Intercept)  180.959          1   .000
pcs          83.302           1   .000
Dependent Variable: How many times hospitalized for medical problems (lifetime)
Model: (Intercept), pcs
Generalized Linear Models - Parameter Estimates - November 5, 2017
Parameter Estimates
Parameter            B      Std. Error  95% Wald CI (Lower, Upper)  Wald Chi-Square  df  Sig.  Exp(B)  95% Wald CI for Exp(B) (Lower, Upper)
(Intercept)          3.127  .2324       2.671, 3.582                180.959          1   .000  22.796  14.455, 35.950
pcs                  -.044  .0049       -.054, -.035                83.302           1   .000  .957    .948, .966
(Scale)              1 (a)
(Negative binomial)  .910   .0855       .757, 1.094
Dependent Variable: How many times hospitalized for medical problems (lifetime)
Model: (Intercept), pcs
a. Fixed at the displayed value.
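
To compare goodness of fit, the figures reported for the Poisson and negative binomial models with pcs can be set side by side. The sketch below is illustrative only. It assumes the common negative binomial variance function Var(Y) = mu + k * mu^2, with k the "(Negative binomial)" ancillary parameter from the table; the dictionary and function names are made up for the example.

```python
# Reported fit statistics for the two models with pcs as the predictor.
poisson = {"deviance_df": 4.212, "pearson_df": 7.048, "aic": 2918.065, "bic": 2926.296}
negbin = {"deviance_df": 1.057, "pearson_df": 1.622, "aic": 1975.329, "bic": 1987.676}

for stat in poisson:
    print(f"{stat:12s} Poisson = {poisson[stat]:9.3f}   NegBin = {negbin[stat]:9.3f}")

aic_drop = poisson["aic"] - negbin["aic"]  # smaller-is-better, so a positive drop favors NegBin
print(f"AIC drop from Poisson to negative binomial = {aic_drop:.3f}")

k = 0.910  # estimated ancillary (dispersion) parameter

def negbin_variance(mu, k):
    """Assumed variance function: Poisson variance mu plus the extra term k * mu^2."""
    return mu + k * mu ** 2

mu = 3.06  # overall mean of d1, used only as an example value
print(f"Poisson variance at mu = {mu}: {mu:.2f}")
print(f"NegBin variance at mu = {mu}: {negbin_variance(mu, k):.2f}")
```

Value/df close to 1 and lower information criteria point toward the negative binomial model for these data, which is consistent with the overdispersion noted at the start of the log.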
