4.5 Measurement model for the D-score

4.5.1 What are measurement models?

From Section 3.5 we quote:

The measurement model specifies the relations between the data and the latent variable.

The term Item Response Theory (IRT) refers to the scientific theory of measurement models. Good introductory works include Wright and Masters (1982), Embretson and Reise (2000) and Engelhard Jr. (2013).

IRT models enable quantification of the locations of both items (milestones) and persons on the latent variable. We reserve the term item for generic properties, and milestone for child development. In general, items are part of the measurement instrument, persons are the objects to be measured.

An IRT model has three major structural components:

Specification of the underlying latent variable(s). In this work, we restrict ourselves to models with just one latent variable. Multi-dimensional IRT models do have their uses, but they are complicated to fit and not widely used;
For a given item, a specification of the probability of success given a value on the latent variable. This specification can take many forms. Section 4.6 focuses on this in more detail;
Specification of how the probability models for the different items should be combined. In this work, we restrict ourselves to models that assume local independence of the probabilities. In that case, for a person with a given value on the latent variable, the probability of passing two items equals the product of the two success probabilities.
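
The second and third components can be illustrated with a minimal sketch. It assumes a logistic (Rasch-type) response function, which is only one of the many possible forms; the function names and the numeric values are hypothetical and chosen for illustration only.

```python
import math


def pass_probability(ability, difficulty):
    """Probability of passing one item, assuming a logistic (Rasch-type) curve.

    Both arguments are locations on the same latent variable.
    """
    return 1.0 / (1.0 + math.exp(-(ability - difficulty)))


def joint_pass_probability(ability, difficulties):
    """Probability of passing all listed items, assuming local independence.

    Given the person's location, the item outcomes are treated as independent,
    so the joint probability is the product of the per-item probabilities.
    """
    prob = 1.0
    for difficulty in difficulties:
        prob *= pass_probability(ability, difficulty)
    return prob


# Hypothetical person at location 0.0 and two items at locations 0.0 and 1.0.
p_easy = pass_probability(0.0, 0.0)
p_hard = pass_probability(0.0, 1.0)
p_both = joint_pass_probability(0.0, [0.0, 1.0])
```

Under these assumptions, `p_both` equals `p_easy * p_hard`. The product rule holds only conditionally on the person's location; across persons with different locations, the outcomes on two items are still correlated.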

4.5.2 Adapt the model? Or adapt the data?

The measurement model induces a predictable pattern in the observed items. We can test this pattern against the observed data. When there is misfit between the expected and observed data, we can follow two strategies:

Make the measurement model more general;
Discard items (and sometimes persons) to make the model fit.

These are very different strategies that have led to heated debates among psychometricians. See Engelhard Jr. (2013) for an overview.

In this work, we opt for the rigorous Rasch model (Rasch 1960) and adapt the data to reduce discrepancies between model and data. Section 4.8 gives the arguments for this choice.
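
To give an impression of what comparing expected and observed data can look like, the sketch below computes, per item, the observed proportion of passes and the proportion expected under a logistic Rasch-type model. It is a simplified illustration, not the fit procedure used in this work: the person and item locations are treated as known, whereas in practice they are estimated, and all names and numbers are hypothetical.

```python
import math


def pass_probability(ability, difficulty):
    """Pass probability under an assumed logistic (Rasch-type) curve."""
    return 1.0 / (1.0 + math.exp(-(ability - difficulty)))


def item_fit_table(responses, abilities, difficulties):
    """Return one row per item: (item index, observed, expected, difference).

    responses[i][j] is 1 if person i passed item j and 0 otherwise.
    'observed' is the proportion of persons who passed the item;
    'expected' is the mean model-implied pass probability over persons.
    """
    n_persons = len(abilities)
    rows = []
    for j, difficulty in enumerate(difficulties):
        observed = sum(responses[i][j] for i in range(n_persons)) / n_persons
        expected = sum(pass_probability(a, difficulty) for a in abilities) / n_persons
        rows.append((j, observed, expected, observed - expected))
    return rows


# Hypothetical data: four persons, two items.
abilities = [-1.0, 0.0, 1.0, 2.0]
difficulties = [0.0, 1.0]
responses = [
    [0, 0],
    [1, 0],
    [1, 1],
    [1, 0],
]

fit = item_fit_table(responses, abilities, difficulties)
for item, observed, expected, difference in fit:
    print(f"item {item}: observed={observed:.2f} expected={expected:.2f} "
          f"difference={difference:+.2f}")
```

An item with a large difference between observed and expected proportions is a candidate for closer inspection. Under the strategy chosen here, such an item might be discarded rather than accommodated by a more general model. Formal fit statistics are more refined than this raw difference.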