Linear Regression

EXAMPLE Analysis of Gentoo penguins

Let’s load the penguins data from the palmerpenguins package, and also load the dplyr and ggplot2 packages. Other packages are loaded in the R chunks below as needed.

R Packages Used for this Example on Linear and Logistic Regression:

  • palmerpenguins
  • dplyr
  • ggplot2
  • easystats
  • gtsummary
  • car
  • broom
  • olsrr
  • ROCR
  • DescTools

# load packages needed
library(dplyr)
library(palmerpenguins)
library(ggplot2)

For this homework EXAMPLE, let’s pull out the data for the Gentoo penguins, and we’ll remove the NA’s here at the beginning for the penguins with missing body size measurements.

pg1 <- penguins %>%
  select(species, body_mass_g, bill_depth_mm,
         bill_length_mm, flipper_length_mm) %>%
  filter(species == "Gentoo") %>%
  filter(complete.cases(.))

dim(pg1)
## [1] 123   5
head(pg1)
## # A tibble: 6 × 5
##   species body_mass_g bill_depth_mm bill_length_mm flipper_length_mm
##   <fct>         <int>         <dbl>          <dbl>             <int>
## 1 Gentoo         4500          13.2           46.1               211
## 2 Gentoo         5700          16.3           50                 230
## 3 Gentoo         4450          14.1           48.7               210
## 4 Gentoo         5700          15.2           50                 218
## 5 Gentoo         5400          14.5           47.6               215
## 6 Gentoo         4550          13.5           46.5               210

Bivariate Correlations

Before we build the regression model, let’s take a moment and look at the bivariate correlations between body mass and the body dimensions for bill depth and bill and flipper length. We’ll put body mass in the 1st column since this is the variable we’ll focus on for the linear regression.

library(easystats)
pg1 %>%
  select(body_mass_g, bill_depth_mm,
         bill_length_mm, flipper_length_mm) %>%
  correlation() %>%
  summary(redundant = TRUE) %>%
  display()
Correlation Matrix (pearson-method)
Parameter         | body_mass_g | bill_depth_mm | bill_length_mm | flipper_length_mm
body_mass_g       |             | 0.72***       | 0.67***        | 0.70***
bill_depth_mm     | 0.72***     |               | 0.64***        | 0.71***
bill_length_mm    | 0.67***     | 0.64***       |                | 0.66***
flipper_length_mm | 0.70***     | 0.71***       | 0.66***        |

p-value adjustment method: Holm (1979)

Fit Model for body_mass_g from other 3 dimensional measurements

NOTE: Since this example focuses on model interpretation, let’s fit the model to all available data. For now, we will not split the data into training and testing datasets.

Let’s fit the lm linear model.

lm1 <- lm(body_mass_g ~ bill_depth_mm +
            bill_length_mm + flipper_length_mm,
          data = pg1)

View Model Results

To view and format the model results in a nice table format, let’s use the gtsummary package and the tbl_regression() function. I have also added some other formatting features from the package. To learn more, see https://www.danieldsjoberg.com/gtsummary/articles/tbl_regression.html.

NOTE: add_glance_source_note() function comes from the broom::glance() function from the broom package which is loaded with gtsummary. See https://www.danieldsjoberg.com/gtsummary/articles/tbl_regression.html#gtsummary-functions-to-add-information.

library(gtsummary)

lm1 %>%
  tbl_regression(
    estimate_fun = ~ style_number(.x, digits = 2)
      ) %>%
  add_glance_source_note(
    include = c(r.squared, adj.r.squared)
  )
Characteristic    | Beta   | 95% CI        | p-value
bill_depth_mm     | 182.81 | 97.20, 268.41 | <0.001
bill_length_mm    | 41.23  | 15.52, 66.94  | 0.002
flipper_length_mm | 22.12  | 8.91, 35.34   | 0.001
Abbreviation: CI = Confidence Interval
R² = 0.625; Adjusted R² = 0.615

Check Normality of Residuals - histogram and QQplot

One of the assumptions of linear regression is that the residuals should be normally distributed. So, let’s make a histogram of the residuals to look for non-normal tendencies like skewness, and let’s also look for non-linear trends or outliers with the normal probability plot - the QQ plot.

# histogram of the residuals
hist(lm1$residuals)

# get QQ plot of residuals
# the normal probability plot
library(car)
car::qqPlot(lm1)

## [1] 14 18

From the QQ plot, it looks like penguin cases on rows #14 and #18 might be outliers we should look at.

Residual Plots & Tukey’s test for nonadditivity

In the car package, we can also use the residualPlots() function to look for any non-linear trends in the plots of the residuals against each predictor. We are looking for anything that deviates from a horizontal line at Y = 0. Look for any lines that trend up or down over the range of the predictor, and for any curvature in the lines.

After the plots, a table is produced showing a curvature test for each predictor along with Tukey’s test for nonadditivity. If the test is significant (p<.05) for a given predictor, then that predictor may not meet the assumption of a linear relationship to the outcome variable - in other words, you may need to add a quadratic or higher-order term for that predictor, like \(X^2\) or \(X^3\).

# diagnostics for the first model 
# with 3 independent variables
# function from car package
car::residualPlots(lm1)

##                   Test stat Pr(>|Test stat|)   
## bill_depth_mm       -1.5994         0.112402   
## bill_length_mm      -1.5169         0.131965   
## flipper_length_mm   -2.0273         0.044889 * 
## Tukey test          -2.7184         0.006561 **
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

As you can see, there is some curvature in the plots above, but the test is only statistically significant for flipper_length_mm.
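As a follow-up on the significant curvature test for flipper_length_mm, one option (a sketch, not part of the original analysis; lm2 is a hypothetical name) is to refit the model with a quadratic term and see whether it is significant:

```r
# Sketch: refit with a quadratic term for flipper length
# (I() protects the ^2 inside the formula)
lm2 <- lm(body_mass_g ~ bill_depth_mm + bill_length_mm +
            flipper_length_mm + I(flipper_length_mm^2),
          data = pg1)

# look at the estimate and test for the quadratic term
summary(lm2)$coefficients["I(flipper_length_mm^2)", ]
```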

Learn more in the BOOK: “Companion for Applied Regression” by John Fox (who also wrote the car package for R), see Chapter 6 https://us.sagepub.com/sites/default/files/upm-binaries/38503_Chapter6.pdf.

Use broom package to review other model output

Next we can use the broom package https://broom.tidymodels.org/, which is useful for looking at model effects, predictions, and residuals and other related statistics and fit metrics.

We’re going to use the augment() function to pull out the .fitted variable for the predicted values from the model. Then we’re going to compare the predicted values to the original values.

library(broom)
alm1 <- augment(lm1)

Review fitted values (i.e. the values predicted by the model)

Before we make the plot, let’s also look at the summary statistics (mean (SD) and the min and max) and compare the original values and the model .fitted values.

alm1 %>%
  select(body_mass_g, .fitted) %>%
  tbl_summary(statistic = list(
      all_continuous() ~ "{mean} ({sd}); [{min} - {max}]"
    ),
    digits = all_continuous() ~ 2) %>%
  modify_footnote(
    all_stat_cols() ~ "Mean (SD) [min - max]"
  )
Characteristic | N = 123¹
body_mass_g    | 5,076.02 (504.12); [3,950.00 - 6,300.00]
.fitted        | 5,076.02 (398.39); [4,419.09 - 6,227.09]
¹ Mean (SD) [min - max]

What do you notice?

  • The means do match.
  • But the standard deviation of the .fitted values is smaller than that of the original values, indicating that the model is not capturing all of the variability in the original outcome variable.
  • Also, the range (min - max) of the .fitted values is tighter and doesn’t cover the full range of the original values, again suggesting that the model may not be capturing the full variability of the outcome and may have limited ability to make predictions at the low and high ends of the outcome.
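As a side note, the shrinkage in the spread of the fitted values is exactly what R² measures: for a linear model with an intercept, the variance of the fitted values divided by the variance of the original outcome equals R². A quick check using the summary statistics above:

```r
# ratio of variances (fitted vs. original) recovers R-squared
round((398.39 / 504.12)^2, 3)  # ~0.625, matching the model R-squared
```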

Plot of Model Predictions vs Original Values

Note: To see how well the values match up, be sure to plot the x and y axes on the same limits and add a Y=X reference line. So, what do you see - how good are the predictions?

# find min and max of original body_mass_g
# and .fitted values
lower <- min(alm1$body_mass_g, alm1$.fitted)
upper <- max(alm1$body_mass_g, alm1$.fitted)

ggplot(alm1,
       aes(x = body_mass_g,
           y = .fitted)) +
  geom_point() +
  geom_abline(intercept = 0, slope = 1,
              color = "red") +
  xlim(lower, upper) +
  ylim(lower, upper) +
  xlab("Original Body Mass (g)") +
  ylab("Model Predicted Body Mass (g)") +
  ggtitle("Regression Diagnostic Plot - Fitted vs Original")

In the plot above, for the most part the points do lie along the Y=X line with some deviations at the low and high end of the outcome values. On the lower end the model is predicting values higher than the original body mass measurements.

The spread in the values shows a little more variability for the higher body mass values.

Added Variable Plots

Another helpful set of diagnostic plots are looking at the added variable plots. Each of these plots looks at how well the given variable is correlated with (predicts) the outcome after adjusting for the other variables already included in the model.

In other words, if a plot shows a horizontal line, that variable does not “add” anything to the model and is perhaps redundant with other variables already in the model. This can help with final variable selection.

car::avPlots(lm1)

However, as we see, all 3 of these variables do appear to add something to the model even after considering the other variables in the model.

Also notice that the penguins with row #s 18 and 14 show up as potential outliers in all 3 plots.

Outlier Tests

If we want to identify outliers in the dataset, we can run the car::outlierTest() function to find cases in the dataset that are unusual enough that they may be affecting the model fit.

#run Bonferroni test for outliers
car::outlierTest(lm1)
## No Studentized residuals with Bonferroni p < 0.05
## Largest |rstudent|:
##    rstudent unadjusted p-value Bonferroni p
## 18 3.463134         0.00074468     0.091595

From the output, the penguin on row #18 might be an outlier case, although the Bonferroni adjusted p-value is > 0.05.

Influence Index Plots

We can also look at other plots looking for cases that may be outliers or may have extra influence on the final fit of the model. The following “Influence Index Plots” shows plots for:

  • Cook’s D distances
  • Studentized Residuals
  • Bonferroni p-values
  • and "hat" values

# identify highly influential points
influenceIndexPlot(lm1)

From the plots above, it looks like

  • penguins on rows #25, #80 (Cook’s D) may be influential data points,
  • #18 and #14 (residuals and p-values) may be outliers, and
  • penguins on rows #34 and #38 might have extra influence/leverage on the model fit (hat values).

See Section 6.3 for “Unusual Data” in Chapter 6 https://us.sagepub.com/sites/default/files/upm-binaries/38503_Chapter6.pdf of the Companion for Applied Regression BOOK.

Influence Plot

Another kind of influence plot looks at the points that may both be outliers and also have extra influence on the fit of the model, by plotting the Studentized Residuals against the Hat-values, with bubble size proportional to Cook’s D. From the plot and the table below, it looks like:

#make influence plot
influencePlot(lm1)

##       StudRes        Hat      CookD
## 14  3.0006088 0.01949839 0.04194110
## 18  3.4631345 0.01243676 0.03456583
## 25  1.6850051 0.06282071 0.04685561
## 34 -0.6090638 0.14042810 0.01523138
## 38 -0.5589119 0.14796662 0.01364114
## 80  1.9481519 0.06877872 0.06847047
  • cases #18 and #14 have low hat values but higher residuals, indicating they may be outliers - their bubbles are also larger, indicating higher Cook’s D values and thus more influence;
  • cases #25 and #80 also have larger bubbles (higher Cook’s D values), with mid-range hat values and lower residuals;
  • cases #34 and #38 have smaller bubbles (Cook’s D) and residuals close to 0, but they have higher hat values.

In general, it might be worth running the models again with and without these cases to see how much the final model fit and coefficients may be influenced by these data points.
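For example, a quick sensitivity check (a sketch; the row numbers and the lm1_sub name are taken from the diagnostics above and are not part of the original analysis) could refit the model without the flagged cases and compare coefficients:

```r
# refit without the flagged cases and compare coefficients side by side
flagged <- c(14, 18, 25, 34, 38, 80)
lm1_sub <- lm(body_mass_g ~ bill_depth_mm +
                bill_length_mm + flipper_length_mm,
              data = pg1[-flagged, ])
cbind(full = coef(lm1), without_flagged = coef(lm1_sub))
```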

To learn more, see Figure 6.6 in https://us.sagepub.com/sites/default/files/upm-binaries/38503_Chapter6.pdf.

Test for Heteroskedasticity (equal variance assumption)

Another assumption of linear regression is that the variance in the outcome is equal across all levels of the predictors. One way to check this is to run the test for heteroskedasticity.

# test for heteroskedasticity
ncvTest(lm1)
## Non-constant Variance Score Test 
## Variance formula: ~ fitted.values 
## Chisquare = 0.02063264, Df = 1, p = 0.88578

We are looking for a non-significant p-value which we have here - so we cannot reject the assumption of equal variance - this is good.

Variance Inflation Factor (multicollinearity)

Another assumption for linear regression (really for any regression model) is that the predictors are independent of one another. To check this, we can compute the VIF (variance inflation factors) for the predictors.

Typically we want the VIFs < 10 and ideally < 5. I actually start to get worried about non-independence if the VIFs get much larger than 2.

# we want these values < 10 or < 5
vif(lm1)
##     bill_depth_mm    bill_length_mm flipper_length_mm 
##          2.244212          1.996625          2.336683

All of the VIFs here look fine - they are all < 2.34 which is acceptable.
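If it helps to see where a VIF comes from, each predictor’s VIF is 1/(1 - R²) from an auxiliary regression of that predictor on the other predictors. A sketch for bill depth (r2_bd is a hypothetical name):

```r
# VIF by hand: regress one predictor on the others,
# then compute 1 / (1 - R^2) of that auxiliary regression
r2_bd <- summary(lm(bill_depth_mm ~ bill_length_mm +
                      flipper_length_mm, data = pg1))$r.squared
1 / (1 - r2_bd)  # should match vif(lm1)["bill_depth_mm"] = 2.24
```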

OPTIONAL - Also use olsrr package to check for collinearity

library(olsrr)

# get VIFs and Tolerances
ols_vif_tol(lm1)
##           Variables Tolerance      VIF
## 1     bill_depth_mm 0.4455908 2.244212
## 2    bill_length_mm 0.5008452 1.996625
## 3 flipper_length_mm 0.4279571 2.336683
# Get condition index
ols_eigen_cindex(lm1)
##     Eigenvalue Condition Index    intercept bill_depth_mm bill_length_mm
## 1 3.9957057835         1.00000 4.803244e-05  0.0001180776   0.0001302349
## 2 0.0025388092        39.67178 1.398025e-01  0.1744807916   0.1847332517
## 3 0.0014987177        51.63413 1.459059e-04  0.6127508792   0.7113057460
## 4 0.0002566896       124.76496 8.600036e-01  0.2126502515   0.1038307674
##   flipper_length_mm
## 1      2.367401e-05
## 2      1.159089e-02
## 3      3.076338e-04
## 4      9.880778e-01

Ideally, we want the Condition Index to be < 30, especially for the last eigenvalue (the last row). Notice that the VIFs are all < 3, but the Condition Index for the last eigenvalue (row 4) is 124.76, which is >> 30, indicating we may have a multicollinearity problem here.
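The Condition Index itself is just the square root of the largest eigenvalue divided by each eigenvalue. Using the eigenvalues printed above:

```r
# condition index = sqrt(largest eigenvalue / each eigenvalue)
ev <- c(3.9957057835, 0.0025388092, 0.0014987177, 0.0002566896)
round(sqrt(ev[1] / ev), 2)  # 1.00 39.67 51.63 124.76
```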


Logistic Regression

EXAMPLE Analysis of Gentoo penguins

Let’s load the Penguins data. For this homework EXAMPLE, let’s pull out the data for the Gentoo penguins.

Let’s use logistic regression to see how well using these same 3 dimensional measurements (bill depth, bill and flipper length measurements) differentiates the male and female penguins.

NOTE: We need to do a filter here since there are penguins missing biological sex. So, let’s re-select the data and remove all NAs including those for sex to have a complete dataset before we fit the models. It is important to remove the NAs now, since they will cause errors for some of the later prediction functions.

pg2 <- penguins %>%
  select(species, sex, bill_depth_mm,
         bill_length_mm, flipper_length_mm) %>%
  filter(species == "Gentoo") %>%
  filter(complete.cases(.))

Get Summary Statistics for Males vs Females

Similar to performing bivariate analyses of each potential predictor with the outcome for a linear regression, it is useful to evaluate potential predictors for a logistic regression model by running t-tests for continuous predictors (or chi-square tests for any categorical predictors) across the 2 categories of the logistic regression outcome (male vs female biological sex for the penguins in this example).

I added custom output below and ran two-sample t-tests assuming equal variances for the 2-group comparisons. Notice that all of the tests are significant, with the males having much larger dimensions than the females.

library(gtsummary)

pg2 %>%
  tbl_summary(
    by = sex,
    include = c(bill_depth_mm,
                bill_length_mm, 
                flipper_length_mm),
    type = all_continuous() ~ "continuous2",
    statistic = list(
      all_continuous() ~ c(
        "{mean} ({sd})",
        "{median} ({p25}, {p75})",
        "[{min} - {max}]"
        )
      ),
    digits = all_continuous() ~ 2
    ) %>%
  add_p(
    test = everything() ~ "t.test",
    test.args = all_tests("t.test") ~ list(var.equal = TRUE)
  )
Characteristic      | female, N = 58          | male, N = 61            | p-value¹
bill_depth_mm       |                         |                         | <0.001
    Mean (SD)       | 14.24 (0.54)            | 15.72 (0.74)            |
    Median (Q1, Q3) | 14.25 (13.80, 14.60)    | 15.70 (15.20, 16.10)    |
    [Min - Max]     | [13.10 - 15.50]         | [14.10 - 17.30]         |
bill_length_mm      |                         |                         | <0.001
    Mean (SD)       | 45.56 (2.05)            | 49.47 (2.72)            |
    Median (Q1, Q3) | 45.50 (43.80, 46.90)    | 49.50 (48.10, 50.50)    |
    [Min - Max]     | [40.90 - 50.50]         | [44.40 - 59.60]         |
flipper_length_mm   |                         |                         | <0.001
    Mean (SD)       | 212.71 (3.90)           | 221.54 (5.67)           |
    Median (Q1, Q3) | 212.00 (210.00, 215.00) | 221.00 (218.00, 225.00) |
    [Min - Max]     | [203.00 - 222.00]       | [208.00 - 231.00]       |
¹ Two Sample t-test

Fit Logistic Regression Model

We will fit the logistic regression model using the glm() function. This is pretty much the same syntax as we used above for lm(), but we need to remember to set family = binomial; otherwise the default is family = gaussian, which essentially runs a linear (normal) regression model.

NOTE: sex here is a factor variable where 1=Female and 2=Male, in that order, so the model is predicting the probability of being Male, using Female as the reference category.
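If you want to confirm which level is being predicted, you can check the factor levels directly - for a binomial glm() with a factor outcome, the second level is the one the model predicts:

```r
# the second level of the factor is the one glm() predicts
levels(pg2$sex)  # "female" "male" -> model predicts P(male)
```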

glm1 <- glm(sex ~ bill_depth_mm +
              bill_length_mm + flipper_length_mm,
            family = binomial,
            data = pg2)

View Model Coefficients (Odds Ratios) and Tests

We can again use the tbl_regression() function from the gtsummary package to get a nice table for the model. It is important to remember to set exponentiate = TRUE to get the odds ratios instead of the original raw betas.

You’ll also notice that I added add_glance_source_note() which adds some of the model fit statistics such as the AIC, BIC and other deviance statistics and sample size (n=119).

gtsummary::tbl_regression(glm1,
                          exponentiate = TRUE) %>%
  add_glance_source_note()
Characteristic    | OR   | 95% CI     | p-value
bill_depth_mm     | 19.0 | 5.05, 110  | <0.001
bill_length_mm    | 1.76 | 1.19, 2.85 | 0.010
flipper_length_mm | 1.21 | 1.01, 1.48 | 0.044
Abbreviations: CI = Confidence Interval, OR = Odds Ratio
Null deviance = 165; Null df = 118; Log-likelihood = -24.6; AIC = 57.1; BIC = 68.3; Deviance = 49.1; Residual df = 115; No. Obs. = 119

Notice that all predictors are statistically significant, but the odds ratio for Bill Depth is very large: 19.0. Remember, the interpretation of an odds ratio is that for every 1-unit increase (a 1 mm deeper bill), the odds of the penguin being a Male instead of a Female are 19.0 times higher.
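If you want to see where those odds ratios come from, they are just the exponentiated model coefficients - a quick check against the table above:

```r
# odds ratios are the exponentiated raw (log-odds) coefficients
round(exp(coef(glm1)), 2)
```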

Logistic Regression Model Predictions - Confusion Matrix

In linear regression we computed the “fitted” (or “predicted” values) from the model and compared them back to the original values to get an idea of how well the model did in predicting the original data.

We can do the same thing with a logistic regression model. However, it is important to remember that the “original” outcome data for logistic regression are binary, often coded with numbers like 0 versus 1, but possibly categories like “apples” versus “oranges”. So, the logistic regression model actually predicts the probability of each outcome category, i.e., the probability of predicting a “1” versus a “0”.

Let’s look at the predicted probabilities from this model. We can use the broom::augment() function, but we need to add type.predict = "response" to get the probabilities in the .fitted variable.

Learn more at https://broom.tidymodels.org/reference/augment.glm.html.

library(broom)
aglm1 <- augment(glm1,
                 type.predict="response")

Look at the top of this augmented output dataset.

head(aglm1)
## # A tibble: 6 × 10
##   sex    bill_depth_mm bill_length_mm flipper_length_mm .fitted  .resid     .hat
##   <fct>          <dbl>          <dbl>             <int>   <dbl>   <dbl>    <dbl>
## 1 female          13.2           46.1               211 0.00122 -0.0495 0.00267 
## 2 male            16.3           50                 230 1.000    0.0228 0.000860
## 3 female          14.1           48.7               210 0.0585  -0.347  0.0464  
## 4 male            15.2           50                 218 0.938    0.357  0.0341  
## 5 male            14.5           47.6               215 0.220    1.74   0.0394  
## 6 female          13.5           46.5               210 0.00305 -0.0782 0.00501 
## # ℹ 3 more variables: .sigma <dbl>, .cooksd <dbl>, .std.resid <dbl>

Notice that the 1st penguin is a female and the probability is 0.0012225 which is really low, close to 0; whereas the 2nd penguin is a male and their probability was 0.9997399 which is essentially 1 (or 100%).

So the model is predicting whether a given penguin is a male or a female based on the model inputs of bill_depth_mm + bill_length_mm + flipper_length_mm.
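To make the link to the model explicit, each .fitted probability is the inverse logit of the linear predictor. A sketch for the first penguin, using its measurements from the output shown earlier:

```r
# predicted probability by hand via the inverse logit
b <- coef(glm1)
x <- c(1, 13.2, 46.1, 211)       # intercept, bill depth, bill length, flipper
1 / (1 + exp(-sum(b * x)))       # should match the .fitted value ~0.00122
```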

Let’s use a simple cutscore of 0.5 and classify all penguins with a probability (.fitted) > 0.5 as “males” and those with probabilities (.fitted) <= 0.5 to be “females”.

aglm1 <- aglm1 %>%
  mutate(predictedsex = ifelse(
    .fitted > 0.5, "predicted male",
    "predicted female"
  ))

Let’s look at the table of original sex (in the columns) compared to the sex predicted by the model (in the rows).

This table is the “confusion matrix”.

# make the confusion matrix
aglm1 %>%
  with(table(predictedsex, sex))
##                   sex
## predictedsex       female male
##   predicted female     52    6
##   predicted male        6   55

So, it looks like the model did pretty well. Here are the statistics you can pull from this “confusion matrix”:

  • overall correct classification rate = (52 females predicted correctly + 55 males predicted correctly) / 119 penguins = 89.9% correct. Likewise, the misclassification rate = 10.1%.
  • False Negatives (predicted females that were actually males) = 6 predicted females / 61 total males = 9.8%.
  • The complement of this is “sensitivity”, the proportion of males correctly predicted by the model = 55/61 = 90.2%.
  • False Positives (predicted males that were actually females) = 6 predicted males / 58 total females = 10.3%.
  • The complement of this is “specificity”, the proportion of females correctly predicted by the model = 52/58 = 89.7%.
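The bullet-point calculations above can be reproduced directly from the confusion matrix counts:

```r
# classification statistics from the confusion matrix counts
tp <- 55; tn <- 52; fp <- 6; fn <- 6  # males are the "positive" class
accuracy    <- (tp + tn) / (tp + tn + fp + fn)
sensitivity <- tp / (tp + fn)
specificity <- tn / (tn + fp)
round(c(accuracy = accuracy,
        sensitivity = sensitivity,
        specificity = specificity), 3)  # 0.899 0.902 0.897
```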

Keep in mind this is ALL based on the cutpoint we chose of 0.5. We could use a different cutscore and we will get different correct and incorrect predictions. Let’s try a cutscore of 0.4.

aglm1 <- aglm1 %>%
  mutate(predictedsex_0.4 = ifelse(
    .fitted > 0.4, "predicted male",
    "predicted female"
  ))

# make the confusion matrix
aglm1 %>%
  with(table(predictedsex_0.4, sex))
##                   sex
## predictedsex_0.4   female male
##   predicted female     50    3
##   predicted male        8   58

Notice by setting the cutscore probability a little lower, we now have more females misclassified as males but fewer males misclassified. Try different cutscores and see what happens.

If you’d like an easy way to check these calculations - check out this website for computing “contingency tables” https://statpages.info/ctab2x2.html. Treat the “positive or present” category as the one the logistic regression model is predicting - in this case the “males” are the “present” or “positive” category.

Logistic Regression Model Fit Statistics - AUC

Instead of having to try a bunch of different cutscore probabilities, we can compute the misclassification rates over the whole range of cutscores, from close to 0 all the way up to almost 1. We can then capture the “sensitivity” (True Positive Rate) and “1 - specificity” (False Positive Rate), plot the receiver operating characteristic (ROC) curve, and compute the “area under the curve” (AUC).

One of the best model fit statistics to report for a logistic regression is the AUC - the area under the receiver operating characteristic (ROC) curve - which captures the classification trade-off between false positives and false negatives, i.e., how well the model does in making predictions. It plays a role similar to R² for linear regression.

Ideally we want the AUC > 0.7 at a minimum, and AUC > 0.8 is good and AUC > 0.9 is very good.

As we can see below, the AUC here is really good, AUC = 0.973, which is not surprising for body size measurements used for predicting biological sex.

# Also compute the AUC and
# plot an ROC
library(ROCR)
p <- predict(glm1, newdata=pg2, 
             type="response")
pr <- ROCR::prediction(p, as.numeric(pg2$sex))
prf <- ROCR::performance(pr, measure = "tpr", 
                         x.measure = "fpr")
#plot(prf)
#abline(0, 1, col = "red")

# compute AUC, area under the curve
# also called the C-statistic
auc <- ROCR::performance(pr, measure = "auc")
auc <- auc@y.values[[1]]
#auc

# ideally you want AUC as close to 1 as possible
# AUC > 0.9 is great
# AUC > 0.8 is good
# AUC > 0.7 is ok
# AUC < 0.7 are not very good
# AUC around 0.5 is no better than flipping a fair coin
# which is a useless model

# also - add title to plot with AUC in title
plot(prf,
     main = paste("ROC Curve, AUC = ", round(auc, 3)))
abline(0, 1, col="red")

OPTIONAL - Pseudo-R2 values

Let’s also take a look at the pseudo-R2 values for a logistic regression, using the PseudoR2() function from the DescTools package.

Learn more at https://andrisignorell.github.io/DescTools/reference/PseudoR2.html.

From help("PseudoR2", package = "DescTools")

Although there’s no commonly accepted agreement on how to assess the fit of a logistic regression, there are some approaches. The goodness of fit of the logistic regression model can be expressed by some variants of pseudo R squared statistics, most of which being based on the deviance of the model.

Details:

  • Cox and Snell’s R2 is based on the log likelihood for the model compared to the log likelihood for a baseline model. However, with categorical outcomes, it has a theoretical maximum value of less than 1, even for a “perfect” model.

  • Nagelkerke’s R2 (also sometimes called Cragg-Uhler) is an adjusted version of the Cox and Snell’s R2 that adjusts the scale of the statistic to cover the full range from 0 to 1.

  • McFadden’s R2 is another version, based on the log-likelihood kernels for the intercept-only model and the full estimated model.

  • Veall and Zimmermann concluded that from a set of six widely used measures the measure suggested by McKelvey and Zavoina had the closest correspondence to ordinary least square R2. The Aldrich-Nelson pseudo-R2 with the Veall-Zimmermann correction is the best approximation of the McKelvey-Zavoina pseudo-R2. Efron, Aldrich-Nelson, McFadden and Nagelkerke approaches severely underestimate the “true R2”.

library(DescTools)

PseudoR2(glm1,
         which = "all")
##        McFadden     McFaddenAdj        CoxSnell      Nagelkerke   AldrichNelson 
##       0.7020133       0.6534971       0.6219583       0.8294536       0.4930936 
## VeallZimmermann           Efron McKelveyZavoina            Tjur             AIC 
##       0.8489486       0.7274389       0.8936440       0.7377126      57.1360369 
##             BIC          logLik         logLik0              G2 
##      68.2525309     -24.5680184     -82.4466954     115.7573538

Based on the values reported above, most of the pseudo-R2 values look pretty good - roughly 0.6 to 0.9, depending on the variant.
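As a quick check, McFadden’s R2 can be reproduced by hand from the log-likelihoods printed above:

```r
# McFadden's pseudo-R2 = 1 - logLik(full model) / logLik(null model)
1 - (-24.5680184) / (-82.4466954)  # ~0.702, matching McFadden above
```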