1.2 What this volume is about

Applying the D-score methodology, as explained in Chapter I: Turning milestones into measurement, is straightforward when all measurements come from a single instrument. In practice, however, we often need to deal with multiple, partially overlapping tools. For example, our data may contain

  • different versions of the same instrument (e.g., Bayley I, II and III);
  • different language versions of the same tool;
  • different tools administered to the same sample;
  • different tools administered to different samples;
  • and so on.

Since there are over 150 different instruments to measure child development (L. C. H. Fernald et al. 2017), chances are high that our data contain measurements made with multiple tools.

It is not apparent how to obtain comparable scores from different instruments. Tools may have idiosyncratic instructions to calculate total scores, distinctive domain definitions, unique compositions of norm groups, different floors and ceilings, or combinations of these.

This chapter addresses the problem of how to define and calculate the D-score from data that come from multiple sources, using various instruments administered at varying ages. We explain techniques that systematically exploit the overlap between tools to create comparable scores. For example, many instruments have variations on milestones like "Can stack two blocks", "Can stand" or "Says baba". By carefully mapping out the similarities between instruments, we can construct a constrained measurement model informed by subject matter knowledge. As a result, we can map different instruments onto the same scale.
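
To make the idea of exploiting overlap concrete, here is a minimal sketch in Python. It only illustrates the bookkeeping step of finding milestones shared between instruments; it is not the measurement model itself. The instrument names (tool_a, tool_b, tool_c), the item identifiers, and the function names are hypothetical. Matching items on an identical label is a simplifying assumption, since in practice subject matter experts judge whether two items are similar enough to be treated as the same milestone.

```python
from collections import defaultdict

# Hypothetical item catalogue: (instrument, item id) -> milestone label.
# All names are invented for illustration only.
ITEMS = {
    ("tool_a", "a01"): "stacks two blocks",
    ("tool_a", "a02"): "stands",
    ("tool_b", "b07"): "stacks two blocks",
    ("tool_b", "b11"): "says baba",
    ("tool_c", "c03"): "says baba",
    ("tool_c", "c09"): "walks alone",
}


def shared_milestones(items):
    """Group items by milestone label; keep groups spanning 2+ instruments."""
    groups = defaultdict(list)
    for (instrument, item_id), label in items.items():
        groups[label].append((instrument, item_id))
    return {
        label: members
        for label, members in groups.items()
        if len({instrument for instrument, _ in members}) >= 2
    }


def linked_instruments(items):
    """Return the sets of instruments connected through shared milestones."""
    neighbours = {instrument: set() for instrument, _ in items}
    for members in shared_milestones(items).values():
        instruments = {instrument for instrument, _ in members}
        for instrument in instruments:
            neighbours[instrument] |= instruments - {instrument}

    seen, components = set(), []
    for start in sorted(neighbours):
        if start in seen:
            continue
        # Depth-first search collects every instrument reachable from start.
        stack, component = [start], set()
        while stack:
            node = stack.pop()
            if node in component:
                continue
            component.add(node)
            stack.extend(neighbours[node] - component)
        seen |= component
        components.append(component)
    return components


if __name__ == "__main__":
    for label, members in shared_milestones(ITEMS).items():
        print(label, members)
    print(linked_instruments(ITEMS))
```

In this toy catalogue, tool_a and tool_c share no milestone directly, but both overlap with tool_b, so all three end up in one linked set. An instrument that falls into a separate set has no overlap to exploit, and this kind of check cannot say anything about how to place it on the common scale.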

Many of the techniques are well known within psychometrics and educational research. This chapter translates the concepts to the field of child development.