In logistic regression, the goal is to predict the probability of an outcome such as YES vs. NO, or the probability that an observation belongs to group A rather than group B.
The linear logistic-regression model, or linear logit model, is given by this equation:
\[ \pi_i = \frac{1}{1+exp[-(\alpha + \beta X_i)]} \]
where \(\pi_i\) is the probability of the desired outcome.
Given this equation, it follows that
\[ Odds = \frac{\pi}{1 - \pi} \]
and
\[ Logit = log_e(\frac{\pi}{1 - \pi}) \]
and
\[ Logit = \alpha + \beta X_i \]
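To see how these pieces fit together, start from the logit equation, exponentiate both sides, and solve for \(\pi_i\); this recovers the probability equation we started with:
\[ \frac{\pi_i}{1 - \pi_i} = e^{\alpha + \beta X_i} \quad \Rightarrow \quad \pi_i = \frac{e^{\alpha + \beta X_i}}{1 + e^{\alpha + \beta X_i}} = \frac{1}{1+exp[-(\alpha + \beta X_i)]} \]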
So, let’s look at a hypothetical logistic regression equation where
\[ Logit = 2 + 1.5 X_i \]
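For example, plugging in \(X_i = -2\) gives
\[ Logit = 2 + 1.5(-2) = -1, \qquad Odds = e^{-1} \approx 0.368, \qquad \pi = \frac{0.368}{1 + 0.368} \approx 0.269 \]
which matches row 31 of the table below.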
We’ll evaluate this function for X values ranging from -5 to 2 in steps of 0.1, computing the Logit, Odds, and Probability at each step, and then we’ll make a plot of each one.
x <- seq(from=-5, to=2, by=0.1)               # X values from -5 to 2 in steps of 0.1
y.logit <- 2 + 1.5*x                          # Logit = the linear predictor
y.odds <- exp(y.logit)                        # Odds = exp(Logit)
y.prob <- y.odds/(1+y.odds)                   # Probability = Odds/(1+Odds)
x.df <- data.frame(x,y.logit,y.odds,y.prob)   # collect everything in a data frame
knitr::kable(x.df[30:50,],
             col.names = c("X",
                           "Logit = 2+1.5x",
                           "Odds = P/(1-P)",
                           "Probability P"))
| | X | Logit = 2+1.5x | Odds = P/(1-P) | Probability P |
|---|---|---|---|---|
30 | -2.1 | -1.15 | 0.3166368 | 0.2404891 |
31 | -2.0 | -1.00 | 0.3678794 | 0.2689414 |
32 | -1.9 | -0.85 | 0.4274149 | 0.2994329 |
33 | -1.8 | -0.70 | 0.4965853 | 0.3318122 |
34 | -1.7 | -0.55 | 0.5769498 | 0.3658644 |
35 | -1.6 | -0.40 | 0.6703200 | 0.4013123 |
36 | -1.5 | -0.25 | 0.7788008 | 0.4378235 |
37 | -1.4 | -0.10 | 0.9048374 | 0.4750208 |
38 | -1.3 | 0.05 | 1.0512711 | 0.5124974 |
39 | -1.2 | 0.20 | 1.2214028 | 0.5498340 |
40 | -1.1 | 0.35 | 1.4190675 | 0.5866176 |
41 | -1.0 | 0.50 | 1.6487213 | 0.6224593 |
42 | -0.9 | 0.65 | 1.9155408 | 0.6570105 |
43 | -0.8 | 0.80 | 2.2255409 | 0.6899745 |
44 | -0.7 | 0.95 | 2.5857097 | 0.7211152 |
45 | -0.6 | 1.10 | 3.0041660 | 0.7502601 |
46 | -0.5 | 1.25 | 3.4903430 | 0.7772999 |
47 | -0.4 | 1.40 | 4.0552000 | 0.8021839 |
48 | -0.3 | 1.55 | 4.7114702 | 0.8249137 |
49 | -0.2 | 1.70 | 5.4739474 | 0.8455347 |
50 | -0.1 | 1.85 | 6.3598195 | 0.8641271 |
# The logit is a straight line in X
plot(x,y.logit,
     xlab = "X values",
     ylab = "Logit = 2 + 1.5*X")
lines(x,y.logit)

# The odds grow exponentially with X
plot(x,y.odds,
     xlab = "X values",
     ylab = "Odds = exp(2 + 1.5*X)")
lines(x,y.odds)

# The probability follows an S-shaped (sigmoid) curve bounded by 0 and 1
plot(x,y.prob,
     xlab = "X values",
     ylab = "Probability = Odds/(1+Odds)")
lines(x,y.prob)
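As a side note, base R provides these transformations directly: plogis() is the inverse-logit (it computes \(1/(1+e^{-q})\)) and qlogis() is the logit itself, so the probabilities above can be computed in one step. A minimal check:

y.prob2 <- plogis(2 + 1.5*x)   # plogis() is the inverse-logit: 1/(1 + exp(-q))
all.equal(y.prob, y.prob2)     # TRUE - matches the step-by-step calculation above
qlogis(y.prob2)                # recovers the logit scale, 2 + 1.5*x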
So, when we “fit” a logistic regression model, we are solving for the best-fitting values of \(\alpha\) and \(\beta\) in this equation:
\[ Logit = log_e(\frac{\pi}{1 - \pi}) = \alpha + \beta X_i \]
where the “logit” function LINKS the outcome (more precisely, a mathematical transformation of the outcome probability \(\pi\)) to the linear predictor \(\alpha + \beta X_i\).
Similarly, if we exponentiate both sides of this equation we get:
\[ Odds = \frac{\pi}{1 - \pi} = e^{\alpha + \beta X_i} \]
This is why logistic regression yields “ODDS RATIOS”; some software lists these as “exp(B)”.
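To see where the odds ratio comes from, compare the odds at \(X_i + 1\) to the odds at \(X_i\); everything except \(e^{\beta}\) cancels:
\[ OR = \frac{e^{\alpha + \beta (X_i + 1)}}{e^{\alpha + \beta X_i}} = e^{\beta} \]
So \(e^{\beta}\) (the “exp(B)”) is the multiplicative change in the odds for a one-unit increase in X.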
The LOGIT (logistic regression) is your first introduction to the “Generalized” Linear Model. There are several LINK functions that are useful to know for other kinds of outcomes (a short glm() sketch follows the table below):
Family | Link | Function | Type of Outcome |
---|---|---|---|
Gaussian | Identity | \(\mu_i\) | Continuous - Normal |
Binomial | Logit | \(log_e(\frac{\pi}{1 - \pi})\) | Dichotomous; 2 categories |
Poisson | Log | \(log_e(\mu_i)\) | Count |
Gamma | Inverse | \(\mu_i^{-1}\) | Time to event - Survival |
Inverse Gaussian | Inverse-square | \(\mu_i^{-2}\) | Positive, right-skewed continuous |
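In R, each of these families can be requested through the family argument of glm(). A minimal sketch, assuming a data frame dat with an outcome y and a predictor x (hypothetical names used only for illustration):

fit.gauss   <- glm(y ~ x, data = dat, family = gaussian(link = "identity"))        # continuous - Normal
fit.logit   <- glm(y ~ x, data = dat, family = binomial(link = "logit"))           # dichotomous
fit.poisson <- glm(y ~ x, data = dat, family = poisson(link = "log"))              # count
fit.gamma   <- glm(y ~ x, data = dat, family = Gamma(link = "inverse"))            # positive continuous
fit.invgaus <- glm(y ~ x, data = dat, family = inverse.gaussian(link = "1/mu^2"))  # positive, right-skewed
exp(coef(fit.logit))   # odds ratios ("exp B") from the logistic model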
For the Poisson distribution of a count variable, the probability of observing any given count \(k\) is given by this equation, where \(\lambda\) is the expected (mean) count and \(k! = k \times (k-1) \times (k-2) \times \dots \times 3 \times 2 \times 1\):
\[ P(Y=k) = \frac{\lambda^k e^{- \lambda}}{k!} \]
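In R, dpois() computes this probability directly. For example, with a mean count of \(\lambda = 3\), the probability of observing exactly \(k = 2\) events is:

dpois(2, lambda = 3)              # about 0.224
3^2 * exp(-3) / factorial(2)      # the same value from the formula above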
So, when we fit a Poisson regression equation, we are modeling the log of the expected count \(\lambda_i = E(Y_i)\):
\[ log_e(\lambda_i) = \alpha + \beta_1 X_i \]
So the expected count is equal to
\[ \lambda_i = e^{\alpha + \beta_1 X_i} \]
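As with the logistic model, exponentiating a Poisson regression coefficient gives a multiplicative effect on the expected count (a rate ratio). A minimal sketch using simulated, made-up data purely for illustration:

set.seed(42)
x.sim <- rnorm(200)                                   # hypothetical predictor
y.sim <- rpois(200, lambda = exp(0.5 + 0.8*x.sim))    # counts with log expected count = 0.5 + 0.8*x
fit.sim <- glm(y.sim ~ x.sim, family = poisson)       # Poisson regression with the log link
exp(coef(fit.sim))                                    # estimates of e^alpha and e^beta (rate ratios)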