Model Design
You have a 2 (between; \(A\)) x 5 (within; \(B\)) design with a binary outcome. Probably the easiest way to handle this situation is with a multi-level model. First, let’s code some variables. I will use a set of orthogonal contrasts here.
\(A\) is easy. It has two levels, so we will simply use a single contrast code:
\[ AC =
\begin{cases}
-.5 & \quad \text{if } A \text{ is level 1} \\
.5 & \quad \text{if } A \text{ is level 2} \\
\end{cases}
\]
Depending on the desired interpretation of \(B\), a variety of codes could be used. But given my ignorance of that variable, I will specify a set of orthogonal polynomial contrast codes for it. These would be most useful if \(B\) represented something like time. In particular, the linear code would represent growth (or shrinkage) from the beginning of the time series to the end, and the quadratic code would represent curvature in the measurements over time. In the examples below, I will use this coding scheme and focus on the linear trend. Here is a set of such codes:
| Level of \(B\) | BC1: Linear trend | BC2: Quadratic trend | BC3: Cubic trend | BC4: Quartic trend |
|---|---|---|---|---|
| 1 | -2 | 2 | -1 | 1 |
| 2 | -1 | -1 | 2 | -4 |
| 3 | 0 | -2 | 0 | 6 |
| 4 | 1 | -1 | -2 | -4 |
| 5 | 2 | 2 | 1 | 1 |
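For reference, base R’s contr.poly() generates orthogonal polynomial contrasts for a factor. Its columns are scaled to unit length rather than to the integer values above, but they represent the same linear through quartic trends, so either version tests the same questions:

contr.poly(5) # normalized orthogonal polynomial contrasts for a 5-level factor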
Other orthogonal codes exist that might be more appropriate. For example, if the question of interest is the difference between, say, levels 1 and 2, then Helmert codes might help more. In this case, the BC1 code would test the tendency for levels 1 and 2 to differ:
| Level of \(B\) | BC1: L1 vs. L2 | BC2: L1 & L2 vs. L3 | BC3: L1, L2, & L3 vs. L4 | BC4: L1, L2, L3, & L4 vs. L5 |
|---|---|---|---|---|
| 1 | -1 | -1 | -1 | -1 |
| 2 | 1 | -1 | -1 | -1 |
| 3 | 0 | 2 | -1 | -1 |
| 4 | 0 | 0 | 3 | -1 |
| 5 | 0 | 0 | 0 | 4 |
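These are exactly the codes that base R’s contr.helmert() produces for a five-level factor, if you would rather not type them by hand:

contr.helmert(5) # columns match BC1 through BC4 above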
For all examples below, I use the first set of polynomial codes. I also assume that \(B\) is a time factor and that the effect of interest is the linear trend. This is only because it is easier to write about with some meaning imbued, but any (correctly specified orthogonal) code or interpretation would work.
Now we can construct the model. Here are the equations that represent this model, where \(j\) stands for each measurement within person (here, each level of \(B\)) and \(i\) stands for each person in the sample.
Link function
First, we need a way of taking our observed decisions and modeling them continuously (I’m skipping basically all the math details of logistic regression here because I don’t think that is the question). Here’s the logit function for completeness:
\[\eta_{ij} = log\left(\frac{P(guilty)_{ij}}{1-P(guilty)_{ij}}\right)\]
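Probabilities are recovered from the logits through the inverse link, so any estimate below can be translated back to the probability scale:
\[P(guilty)_{ij} = \frac{e^{\eta_{ij}}}{1 + e^{\eta_{ij}}}\]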
Level 1:
\[\eta_{ij} = \beta_{0i} + \beta_{1i} * BC1_{ij} + \beta_{2i} * BC2_{ij} + \beta_{3i} * BC3_{ij} + \beta_{4i} * BC4_{ij}\]
There is no level-1 error term in this equation because, in logistic regression, the residual variability is determined by the binomial distribution rather than estimated separately.
Level 2:
\[\beta_{0i} = \gamma_{00} + \gamma_{01} * AC_i + u_{0i}\] \[\beta_{1i} = \gamma_{10} + \gamma_{11} * AC_i + u_{1i}\] \[\beta_{2i} = \gamma_{20} + \gamma_{21} * AC_i + u_{2i}\] \[\beta_{3i} = \gamma_{30} + \gamma_{31} * AC_i + u_{3i}\] \[\beta_{4i} = \gamma_{40} + \gamma_{41} * AC_i + u_{4i}\]
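Substituting the level-2 equations into the level-1 equation gives the combined form of the model, which is the version that maps most directly onto the glmer() formula used below:
\[\eta_{ij} = (\gamma_{00} + \gamma_{01} * AC_i + u_{0i}) + (\gamma_{10} + \gamma_{11} * AC_i + u_{1i}) * BC1_{ij} + \dots + (\gamma_{40} + \gamma_{41} * AC_i + u_{4i}) * BC4_{ij}\]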
This is equivalent to a logistic mixed model ANOVA.
Interpretation
I am still skipping logistic regression nuances (parameters as odds ratios, etc.). The question was about testing the simple effects, so I give a really quick description of the main effects and interaction and then move on to the method for the simple effects.
Main effects and the interaction on average
I am assuming the orthogonal polynomial codes here, but hopefully this explanation will allow interpretation of any set of properly-specified codes.
This framework allows us to test focused 1 degree of freedom questions. For example, if \(B\) were time of measurement, then the coefficient for the \(BC1\) code, \(\beta_{1i}\), represents the predicted increase in logits for each unit of time that passes (assuming measurements were taken approximately the same amount of time apart, etc.). The estimate of \(\gamma_{10}\) (that is, the intercept of the level 2 equation modeling the linear effect of time as the outcome) represents the average of the \(\beta_{1i}\)s across the entire sample. Because \(AC\) is a centered (sum-to-zero) contrast code, \(\gamma_{10}\) is the main effect of the linear trend (the average linear trend across the levels of \(A\)). If it is significant and positive, we conclude that people choose guilty more often over time on average. Equivalently, the odds of choosing guilty increase from the first measurement to the last on average. A negative estimate (that is significantly different from zero) would imply that people choose guilty less often over time on average.
The interaction question is captured by the estimate of \(\gamma_{11}\). This is the predicted difference in \(\beta_{1i}\) between the levels of \(A\) (in logits). Because \(AC\) is coded -.5 and +.5, the estimate of \(\gamma_{11}\) is exactly the logit difference in the effect. If significant and positive, then the linear time trend is larger for level A2 than for level A1. A significantly negative slope indicates the opposite pattern.
Simple effects test
Note that a test of simple effects is, at best, exploratory unless you have a significant interaction! Obligatory caveat.
The original question was about simple effects. This model allows a test of any simple effect simply by changing the codes we use. In particular, we want to test the simple effects of \(B\) at each level of \(A\). I will continue to reference \(BC1\), or the linear effect of “time,” as an example. This was captured by the \(\gamma_{10}\) parameter, which is the intercept of the level 2 model with \(\beta_{1i}\) as its outcome. Like any intercept, it can be interpreted as the predicted value of the outcome (the linear time effect here) at the 0 point of the predictor! The predictor here is \(AC\), and we can change its zero point by simply recoding:
\[ AL1 =
\begin{cases}
0 & \quad \text{if } A \text{ is level 1} \\
1 & \quad \text{if } A \text{ is level 2} \\
\end{cases}
\]
\[ AL2 =
\begin{cases}
1 & \quad \text{if } A \text{ is level 1} \\
0 & \quad \text{if } A \text{ is level 2} \\
\end{cases}
\]
Then, simply estimate the model above, but use \(AL1\) for testing the simple effect of linear time at level A1, and \(AL2\) to test the simple effect at level A2. For each of these “re-runs,” simply interpret \(\gamma_{10}\) and its significance as before.
Sample Code
Here is some untested sample code for an implementation of this model in lme4. Note that the data, d, should be in long form, with 1 row for each measurement from each participant. Participants are differentiated by d$pid.
library(lme4)
library(lmerTest) # not required, but it is pretty nice
d <- within(d, {
# make the contrast codes here
BC1 <- (
-2*(B=='l1') - 1*(B=='l2') + 0*(B=='l3') + 1*(B=='l4') + 2*(B=='l5')
)
BC2 <- (
2*(B=='l1') - 1*(B=='l2') - 2*(B=='l3') - 1*(B=='l4') + 2*(B=='l5')
)
# the cubic and quartic codes are constructed the same way
BC3 <- (
-1*(B=='l1') + 2*(B=='l2') + 0*(B=='l3') - 2*(B=='l4') + 1*(B=='l5')
)
BC4 <- (
1*(B=='l1') - 4*(B=='l2') + 6*(B=='l3') - 4*(B=='l4') + 1*(B=='l5')
)
AC <- (
-.5 * (A=='l1') + .5 * (A=='l2')
)
})
mainMod <- glmer( # glmer, not lmer, because we need logistic regression
guilty ~ BC1 * AC + BC2 * AC + BC3 * AC + BC4 * AC + # Every B code interacts with A
(BC1 + BC2 + BC3 + BC4 | pid), # specified random effects for B
data = d,
family = binomial
)
summary(mainMod)
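If odds ratios are easier to communicate than logits, the fixed effects (and Wald confidence intervals) can be exponentiated; like the rest of the code here, this is an untested sketch:

exp(fixef(mainMod)) # fixed-effect estimates as odds ratios
exp(confint(mainMod, parm = "beta_", method = "Wald")) # Wald CIs for the fixed effects, on the odds-ratio scale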
Then for simple effects:
d <- within(d, {
AL1 <- (
0 * (A=='l1') + 1 * (A=='l2')
)
})
al1Mod <- glmer( # glmer, not lmer, because we need logistic regression
guilty ~ BC1 * AL1 + BC2 * AL1 + BC3 * AL1 + BC4 * AL1 + # Every B code interacts with A
(BC1 + BC2 + BC3 + BC4 | pid), # specified random effects for B
data = d,
family = binomial
)
summary(al1Mod)
And likewise for level 2 of \(A\).
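For completeness, here is a sketch (again untested) of that analogous model, using the \(AL2\) code defined above:

d <- within(d, {
AL2 <- (
1 * (A=='l1') + 0 * (A=='l2')
)
})
al2Mod <- glmer(
guilty ~ BC1 * AL2 + BC2 * AL2 + BC3 * AL2 + BC4 * AL2 + # Every B code interacts with A
(BC1 + BC2 + BC3 + BC4 | pid), # random slopes for the B codes
data = d,
family = binomial
)
summary(al2Mod)

As before, the BC1 coefficient (\(\gamma_{10}\)) is now the simple linear trend at level A2, because level A2 is where \(AL2\) equals zero.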