Random-effects Models

Authors

Affiliation

Filippo Gambarota

Department of Developmental Psychology and Socialization, University of Padova

Gianmarco Altoè

We can extended the base CM model by including random-effects. For example, if the same participant responds to the same item/trial multiple times or responds to multiple items we need to take into account the nested data-structure. Agresti (2010) formalized the random-intercept CM model in Equation 1.

\[ P(Y \leq k) = g^{-1}[(\alpha_k + u_i) - \mathbf{X}\boldsymbol{\beta}] \tag{1}\]

Where \(u_i\) is the by-subject adjustment to the overall intercept-threshold. As in standard mixed-effects models, the random-effect of the intercept is sampled from a normal distribution with \(\mu = 0\) and standard deviation \(\sigma_{u}\), thus \(u_i \sim \mathcal{N}(0, \sigma_u)\). We can use the same simulation strategies (sampling from multinomial distribution or latent approach) of the paper but introducing the random-effect. We simulate \(N = 100\) subjects divided into two groups. Each subject performs \(100\) trial responding on a ordinal item.

# Simulation parameters
set.seed(2024)
k <- 4
probs0 <- rep(1/k, k)
N <- 100 # sample size
n <- N/2
nt <- 100 # number of trials
b1 <- log(3)
sb0 <- 0.5 # intercept standard deviation
group <- c(0, 1) # group
alpha <- prob_to_alpha(probs0, link = "logit")

# Data structure
dat0 <- dat1 <- expand.grid(id = 1:n, trial = 1:nt)
dat0$group <- 0
dat1$group <- 1
dat0$id <- rep(1:n, nt)
dat1$id <- rep((n + 1):N, nt)
dat <- rbind(dat0, dat1)

alphai <- rnorm(N, 0, sb0)

Sampling from multinomial distribution

set.seed(2024)
cump <- lapply(alpha, function(a) plogis(with(dat, (a + alphai[id]) - (b1 * group))))
p <- t(apply(cbind(0, data.frame(cump), 1), 1, diff))
names(p) <- paste0("py", 1:k)
dat <- cbind(dat, p)

dat$y <- apply(dat[, 4:ncol(dat)], 1, function(p) sample(1:k, 1, replace = TRUE, prob = p))
dat$y <- ordered(dat$y)

fit <- clmm(y ~ group + (1|id), data = dat, link = "logit")
summary(fit)

Cumulative Link Mixed Model fitted with the Laplace approximation

formula: y ~ group + (1 | id)
data:    dat

 link  threshold nobs  logLik    AIC      niter     max.grad cond.H 
 logit flexible  10000 -12676.72 25363.43 324(1014) 4.61e-03 1.4e+02

Random effects:
 Groups Name        Variance Std.Dev.
 id     (Intercept) 0.2582   0.5081  
Number of groups:  id 100 

Coefficients:
      Estimate Std. Error z value Pr(>|z|)    
group   1.1178     0.1086   10.29   <2e-16 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Threshold coefficients:
    Estimate Std. Error z value
1|2 -1.14160    0.07813 -14.612
2|3 -0.04105    0.07696  -0.533
3|4  1.06992    0.07777  13.757

Sampling from the latent variable

set.seed(2024)
ystar <- with(dat, alphai[id] + (b1*group)) + rlogis(nrow(dat), 0, 1)
dat$y2 <- ordered(findInterval(ystar, alpha) + 1)
fit2 <- clmm(y2 ~ group + (1|id), data = dat, link = "logit")
summary(fit2)

Cumulative Link Mixed Model fitted with the Laplace approximation

formula: y2 ~ group + (1 | id)
data:    dat

 link  threshold nobs  logLik    AIC      niter    max.grad cond.H 
 logit flexible  10000 -12795.34 25600.68 302(910) 1.32e-03 1.5e+02

Random effects:
 Groups Name        Variance Std.Dev.
 id     (Intercept) 0.2729   0.5224  
Number of groups:  id 100 

Coefficients:
      Estimate Std. Error z value Pr(>|z|)    
group   1.0646     0.1111    9.58   <2e-16 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Threshold coefficients:
    Estimate Std. Error z value
1|2 -1.07709    0.07988 -13.484
2|3  0.01468    0.07885   0.186
3|4  1.11972    0.07967  14.054

In both simulations we are able to recover the parameters. For fitting the model we use the clmm() function that allows specifying the random-effects structure. The syntax is the same as the lme4 package for standard mixed-effects models.

We can extend the simulation including also random slopes. The group effect need to be a within effect now, thus we can imagine to have two conditions in the experiment with 100 trials each. We can extend the Equation 1 with Equation 2. \(\beta_{1_i}\) are the by-subjects adjustments to the overall \(\beta_1\) effect, still sampled from a normal distribution.

\[ P(Y \leq k) = g^{-1}[(\alpha_k + u_i) - (\beta_1 + \beta_{1_i})X_1] \tag{2}\]

Let’s use the latent formulation directly:

set.seed(2024)
dat <- expand.grid(id = 1:N, trials = 1:nt, cond = c(0, 1))
sb1 <- 0.2 # slope standard deviation
alphai <- rnorm(N, 0, sb0)
b1i <- rnorm(N, 0, sb1)

ystar <- with(dat, alphai[id] + ((b1 + b1i[id]) * cond)) + rlogis(nrow(dat), 0, 1)
dat$y <- ordered(findInterval(ystar, alpha) + 1)
fit <- clmm(y ~ cond + (cond|id), data = dat, link = "logit", )
summary(fit)

Cumulative Link Mixed Model fitted with the Laplace approximation

formula: y ~ cond + (cond | id)
data:    dat

 link  threshold nobs  logLik    AIC      niter     max.grad cond.H 
 logit flexible  20000 -25434.35 50882.69 626(2508) 1.09e-02 1.1e+02

Random effects:
 Groups Name        Variance Std.Dev. Corr   
 id     (Intercept) 0.29342  0.5417          
        cond        0.06647  0.2578   -0.023 
Number of groups:  id 100 

Coefficients:
     Estimate Std. Error z value Pr(>|z|)    
cond  1.10553    0.03729   29.65   <2e-16 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Threshold coefficients:
    Estimate Std. Error z value
1|2 -1.08130    0.05828 -18.553
2|3  0.03904    0.05758   0.678
3|4  1.15537    0.05818  19.858

References

Agresti, A. (2010). Analysis of ordinal categorical data. John Wiley & Sons, Inc. https://doi.org/10.1002/9780470594001