# Simulation parameters
set.seed(2024)
<- 4
k <- rep(1/k, k)
probs0 <- 100 # sample size
N <- N/2
n <- 100 # number of trials
nt <- log(3)
b1 <- 0.5 # intercept standard deviation
sb0 <- c(0, 1) # group
group <- prob_to_alpha(probs0, link = "logit")
alpha
# Data structure
<- dat1 <- expand.grid(id = 1:n, trial = 1:nt)
dat0 $group <- 0
dat0$group <- 1
dat1$id <- rep(1:n, nt)
dat0$id <- rep((n + 1):N, nt)
dat1<- rbind(dat0, dat1)
dat
<- rnorm(N, 0, sb0) alphai
Random-effects Models
We can extended the base CM model by including random-effects. For example, if the same participant responds to the same item/trial multiple times or responds to multiple items we need to take into account the nested data-structure. Agresti (2010) formalized the random-intercept CM model in Equation 1.
\[ P(Y \leq k) = g^{-1}[(\alpha_k + u_i) - \mathbf{X}\boldsymbol{\beta}] \tag{1}\]
Where \(u_i\) is the by-subject adjustment to the overall intercept-threshold. As in standard mixed-effects models, the random-effect of the intercept is sampled from a normal distribution with \(\mu = 0\) and standard deviation \(\sigma_{u}\), thus \(u_i \sim \mathcal{N}(0, \sigma_u)\). We can use the same simulation strategies (sampling from multinomial distribution or latent approach) of the paper but introducing the random-effect. We simulate \(N = 100\) subjects divided into two groups. Each subject performs \(100\) trial responding on a ordinal item.
Sampling from multinomial distribution
set.seed(2024)
<- lapply(alpha, function(a) plogis(with(dat, (a + alphai[id]) - (b1 * group))))
cump <- t(apply(cbind(0, data.frame(cump), 1), 1, diff))
p names(p) <- paste0("py", 1:k)
<- cbind(dat, p)
dat
$y <- apply(dat[, 4:ncol(dat)], 1, function(p) sample(1:k, 1, replace = TRUE, prob = p))
dat$y <- ordered(dat$y)
dat
<- clmm(y ~ group + (1|id), data = dat, link = "logit")
fit summary(fit)
Cumulative Link Mixed Model fitted with the Laplace approximation
formula: y ~ group + (1 | id)
data: dat
link threshold nobs logLik AIC niter max.grad cond.H
logit flexible 10000 -12676.72 25363.43 324(1014) 4.61e-03 1.4e+02
Random effects:
Groups Name Variance Std.Dev.
id (Intercept) 0.2582 0.5081
Number of groups: id 100
Coefficients:
Estimate Std. Error z value Pr(>|z|)
group 1.1178 0.1086 10.29 <2e-16 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Threshold coefficients:
Estimate Std. Error z value
1|2 -1.14160 0.07813 -14.612
2|3 -0.04105 0.07696 -0.533
3|4 1.06992 0.07777 13.757
Sampling from the latent variable
set.seed(2024)
<- with(dat, alphai[id] + (b1*group)) + rlogis(nrow(dat), 0, 1)
ystar $y2 <- ordered(findInterval(ystar, alpha) + 1)
dat<- clmm(y2 ~ group + (1|id), data = dat, link = "logit")
fit2 summary(fit2)
Cumulative Link Mixed Model fitted with the Laplace approximation
formula: y2 ~ group + (1 | id)
data: dat
link threshold nobs logLik AIC niter max.grad cond.H
logit flexible 10000 -12795.34 25600.68 302(910) 1.32e-03 1.5e+02
Random effects:
Groups Name Variance Std.Dev.
id (Intercept) 0.2729 0.5224
Number of groups: id 100
Coefficients:
Estimate Std. Error z value Pr(>|z|)
group 1.0646 0.1111 9.58 <2e-16 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Threshold coefficients:
Estimate Std. Error z value
1|2 -1.07709 0.07988 -13.484
2|3 0.01468 0.07885 0.186
3|4 1.11972 0.07967 14.054
In both simulations we are able to recover the parameters. For fitting the model we use the clmm()
function that allows specifying the random-effects structure. The syntax is the same as the lme4
package for standard mixed-effects models.
We can extend the simulation including also random slopes. The group
effect need to be a within effect now, thus we can imagine to have two conditions in the experiment with 100 trials each. We can extend the Equation 1 with Equation 2. \(\beta_{1_i}\) are the by-subjects adjustments to the overall \(\beta_1\) effect, still sampled from a normal distribution.
\[ P(Y \leq k) = g^{-1}[(\alpha_k + u_i) - (\beta_1 + \beta_{1_i})X_1] \tag{2}\]
Let’s use the latent formulation directly:
set.seed(2024)
<- expand.grid(id = 1:N, trials = 1:nt, cond = c(0, 1))
dat <- 0.2 # slope standard deviation
sb1 <- rnorm(N, 0, sb0)
alphai <- rnorm(N, 0, sb1)
b1i
<- with(dat, alphai[id] + ((b1 + b1i[id]) * cond)) + rlogis(nrow(dat), 0, 1)
ystar $y <- ordered(findInterval(ystar, alpha) + 1)
dat<- clmm(y ~ cond + (cond|id), data = dat, link = "logit", )
fit summary(fit)
Cumulative Link Mixed Model fitted with the Laplace approximation
formula: y ~ cond + (cond | id)
data: dat
link threshold nobs logLik AIC niter max.grad cond.H
logit flexible 20000 -25434.35 50882.69 626(2508) 1.09e-02 1.1e+02
Random effects:
Groups Name Variance Std.Dev. Corr
id (Intercept) 0.29342 0.5417
cond 0.06647 0.2578 -0.023
Number of groups: id 100
Coefficients:
Estimate Std. Error z value Pr(>|z|)
cond 1.10553 0.03729 29.65 <2e-16 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Threshold coefficients:
Estimate Std. Error z value
1|2 -1.08130 0.05828 -18.553
2|3 0.03904 0.05758 0.678
3|4 1.15537 0.05818 19.858