## Logistic glm: marginal predicted probabilities and log-odds coefficients support different hypotheses

I’m running a fixed effects logistic regression in R. The model consists of a binary outcome and two binary predictors, with no interaction term. On the log-odds scale, and as an odds-ratio, the coefficient for one of the predictors (`carbf`

in the mocked-up example below) indicates that the expected probability of Y=1 (“success”) is different between the two levels of the factor (i.e., the effect is significant).

When I use the `effects`

package to get marginal predicted probabilities, the 95% CIs for the two levels of `carbf`

overlap considerably, indicating there is no evidence of a difference in the expected probability of Y=1 between the two factor levels.

When I use the `mfx`

package to get average marginal effects for the coefficients (i.e., for the expected *difference* in the probability of Y=1 between the two factor levels), I do get a significant difference.

**I’m confused as to whether this discrepancy is because:**

**1) the output from the model and the mfx package is an expected difference in the probability of Y=1 between factor levels, rather than predicted probabilities for each level.**

**2) of the way the effects package is calculating the marginal effect.**

In an effort to determine this, I modified the source code from the `mfx`

package to give me average marginal effects for each level of the `carbf`

factor. The 95% CIs for these predictions *do not* overlap, indicating a significant difference. This makes me wonder why I get such different results using the `effects`

package. Or is it that I’m just confused about the difference between marginal effects for coefficients and for predicted probabilities?

```
#####################################
# packages
library(effects)
library(mfx)
library(ggplot2)
# data
data(mtcars)
carsdat <- mtcars
carsdat$carb <- ifelse(carsdat$carb %in% 1:3, 0, 1)
facvars <- c("vs", "am", "carb")
carsdat[, paste0(facvars, "f")] <- lapply(carsdat[, facvars], factor)
# model
m1 <- glm(vsf ~ amf + carbf,
family = binomial(link = "logit"),
data = carsdat)
summary(m1)
#####################################
# effects package
eff <- allEffects(m1)
plot(eff, rescale.axis = FALSE)
eff_df <- data.frame(eff[["carbf"]])
eff_df
# carbf fit se lower upper
# 1 0 0.607 0.469 0.3808 0.795
# 2 1 0.156 0.797 0.0375 0.469
#####################################
# mfx package marginal effects (at mean)
mfx1 <-logitmfx(vsf ~ amf + carbf, data = carsdat, atmean = TRUE, robust = FALSE)
mfx1
# dF/dx Std. Err. z P>|z|
# amf1 0.217 0.197 1.10 0.2697
# carbf1 -0.450 0.155 -2.91 0.0037
# mfx package marginal effects (averaged)
mfx2 <-logitmfx(vsf ~ amf + carbf, data = carsdat, atmean = FALSE, robust = FALSE)
mfx2
# dF/dx Std. Err. z P>|z|
# amf1 0.177 0.158 1.12 0.2623
# carbf1 -0.436 0.150 -2.90 0.0037
#####################################
# mfx source code
fit <- m1
x1 = model.matrix(fit)
be = as.matrix(na.omit(coef(fit)))
k1 = length(na.omit(coef(fit)))
fxb = mean(plogis(x1 %*% be)*(1-plogis(x1 %*% be)))
vcv = vcov(fit)
# data frame for predictions
mfx_pred <- data.frame(mfx = rep(NA, 4), se = rep(NA, 4),
row.names = c("amf0", "amf1", "carbf0", "carbf1"))
disc <- rownames(mfx_pred)
# hard coded prediction estimates and SE
disx0c <- disx1c <- disx0a <- disx1a <- x1
disx1a[, "amf1"] <- max(x1[, "amf1"])
disx0a[, "amf1"] <- min(x1[, "amf1"])
disx1c[, "carbf1"] <- max(x1[, "carbf1"])
disx0c[, "carbf1"] <- min(x1[, "carbf1"])
mfx_pred["amf0", 1] <- mean(plogis(disx0a %*% be))
mfx_pred["amf1", 1] <- mean(plogis(disx1a %*% be))
mfx_pred["carbf0", 1] <- mean(plogis(disx0c %*% be))
mfx_pred["carbf1", 1] <- mean(plogis(disx1c %*% be))
# standard errors
gr0a <- as.numeric(dlogis(disx0a %*% be)) * disx0a
gr1a <- as.numeric(dlogis(disx1a %*% be)) * disx1a
gr0c <- as.numeric(dlogis(disx0c %*% be)) * disx0c
gr1c <- as.numeric(dlogis(disx1c %*% be)) * disx1c
avegr0a <- as.matrix(colMeans(gr0a))
avegr1a <- as.matrix(colMeans(gr1a))
avegr0c <- as.matrix(colMeans(gr0c))
avegr1c <- as.matrix(colMeans(gr1c))
mfx_pred["amf0", 2] <- sqrt(t(avegr0a) %*% vcv %*% avegr0a)
mfx_pred["amf1", 2] <- sqrt(t(avegr1a) %*% vcv %*% avegr1a)
mfx_pred["carbf0", 2] <- sqrt(t(avegr0c) %*% vcv %*% avegr0c)
mfx_pred["carbf1", 2] <- sqrt(t(avegr1c) %*% vcv %*% avegr1c)
mfx_pred$pred <- rownames(mfx_pred)
mfx_pred$lcl <- mfx_pred$mfx - (mfx_pred$se * 1.96)
mfx_pred$ucl <- mfx_pred$mfx + (mfx_pred$se * 1.96)
# mfx se pred lcl ucl
# amf0 0.366 0.101 amf0 0.1682 0.563
# amf1 0.543 0.122 amf1 0.3041 0.782
# carbf0 0.601 0.107 carbf0 0.3916 0.811
# carbf1 0.165 0.105 carbf1 -0.0412 0.372
ggplot(mfx_pred, aes(x = pred, y = mfx)) +
geom_point() +
geom_errorbar(aes(ymin = lcl, ymax = ucl)) +
theme_bw()
```