
Using the "Divide by 4 Rule" to Interpret Logistic Regression Coefficients

I was recently reading about logistic regression in Gelman and Hill's book on hierarchical/multilevel modeling when I first learned about the "divide by 4 rule" for quickly interpreting coefficients in a logistic regression model in terms of the predicted probabilities of the outcome. The idea is pretty simple. The logistic curve (of predicted probabilities) is steepest at its center, where α + βx = 0 and logit⁻¹(α + βx) = 0.5. See the plot below (or use the R code to plot it yourself).

# Plot the inverse logit (logistic) curve
x <- seq(-5, 5, 0.01)
invlogit <- function(x) exp(x)/(1 + exp(x))
y <- invlogit(x)
plot(x, y, pch = 16, ylab = expression(paste(logit^{-1}, (x))))
abline(v = 0)    # vertical line at the curve's center
abline(h = 0.5)  # the predicted probability there is 0.5
text(0.55, 0.55, expression(paste("Slope is ", beta/4)), adj = c(0, 0))

The slope of this curve (the first derivative of the logistic function with respect to x) is βe^(α+βx)/(1+e^(α+βx))², which is maximized at α + βx = 0, where it takes on the value:

βe⁰/(1+e⁰)²

= β(1)/(1+1)²

= β/4
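
As a quick numerical sanity check, here's a short R sketch (base R only) that compares a finite-difference estimate of the curve's slope at α + βx = 0 against β/4; the value of β is made up for illustration:

invlogit <- function(x) exp(x)/(1 + exp(x))
b <- 0.8                                   # illustrative coefficient, chosen arbitrarily
h <- 1e-6                                  # step size for the central difference
(invlogit(b*h) - invlogit(-b*h)) / (2*h)   # numeric slope at the center, ~0.2
b/4                                        # the divide-by-4 value, exactly 0.2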

So you can take the logistic regression coefficients (not including the intercept) and divide them by 4 to get an upper bound on the predictive difference in the probability of the outcome y = 1 per unit increase in x. The approximation is best near the midpoint of x, where predicted probabilities are close to 0.5, which is where most of the data tend to lie anyhow.

So if your regression coefficient is 0.8, a rough approximation using the β/4 rule is that a 1-unit increase in x corresponds to at most a 0.8/4 = 0.2 increase, or 20 percentage points, in the probability that y = 1.
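
To see how the rule behaves in an actual fitted model, here's a hedged R sketch: it simulates data from a logistic model with a made-up true coefficient of 0.8, fits it with glm(), and compares the fitted coefficient divided by 4 to the exact change in predicted probability over a one-unit interval centered at x = 0 (all values here are illustrative, not from Gelman and Hill):

invlogit <- function(x) exp(x)/(1 + exp(x))
set.seed(1)                              # arbitrary seed, for reproducibility
x <- rnorm(1000)
y <- rbinom(1000, 1, invlogit(0.8*x))    # simulate outcomes; true beta = 0.8
fit <- glm(y ~ x, family = binomial)
b <- coef(fit)["x"]
b/4                                      # divide-by-4 approximation
# exact predicted-probability difference for a 1-unit change centered at 0:
predict(fit, newdata = data.frame(x = 0.5), type = "response") -
  predict(fit, newdata = data.frame(x = -0.5), type = "response")

Both quantities should come out near 0.2, with the exact difference slightly below b/4, consistent with the rule giving an upper bound.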