The dscore() function estimates the D-score,
a numeric score that measures child development, from PASS/FAIL
observations on milestones.
dscore(
  data,
  items = names(data),
  xname = "age",
  xunit = c("decimal", "days", "months"),
  key = NULL,
  itembank = dscore::builtin_itembank,
  metric = c("dscore", "logit"),
  prior_mean = NULL,
  prior_sd = NULL,
  transform = NULL,
  qp = -10:100,
  population = NULL,
  dec = c(2L, 3L),
  relevance = c(-Inf, Inf)
)

dscore_posterior(
  data,
  items = names(data),
  xname = "age",
  xunit = c("decimal", "days", "months"),
  key = NULL,
  itembank = dscore::builtin_itembank,
  metric = c("dscore", "logit"),
  prior_mean = NULL,
  prior_sd = NULL,
  transform = NULL,
  qp = -10:100,
  population = NULL,
  dec = c(2L, 3L),
  relevance = c(-Inf, Inf)
)
data
A data.frame with the data.
A row collects all observations made on a child on a set of
milestones administered at a given age. The function calculates
a D-score for each row. Different rows correspond to different
children or different ages.
items
A character vector containing the names of the items to be
included in the D-score calculation. Milestone scores are coded
1 (pass) and
0 (fail). By default, the
D-score calculation is done on all items found in the data
that have a difficulty parameter under the specified key.
xname
A string with the name of the age variable in
data. The default is "age".
xunit
A string specifying the unit in which age is measured
(either "decimal", "days" or "months"). The default (
"decimal") means decimal age in years.
key
A string that selects a subset in the itembank that
makes up the key, the set of difficulty
estimates from a fitted Rasch model.
The built-in keys include
"dutch". Since version 1.5.0,
key = "gsed"
selects the latest key starting with the string "gsed".
Use key = "" to use all item names,
which should only be done if there are no duplicate item names
in the itembank.
itembank
A data.frame with columns
label. Only columns
tau are required.
The function uses dscore::builtin_itembank by default.
metric
A string, either
"dscore" (default) or
"logit", signalling the metric in which ability is estimated.
prior_mean
A string specifying where the mean of the
prior for the D-score calculation should come from. It could be
a column name in
data (when you want your own prior for every row),
but normally this is a keyword such as
".dutch", ".gcdg" or ".phase1".
The default depends on the key. If
key == "dutch" then
prior_mean = ".dutch". The choice
prior_mean = ".dutch" calculates
prior_mean from the Count model coded in
dscore:::count_mu_dutch(). If key is "gcdg" then
prior_mean = ".gcdg".
This setting calculates an age-dependent prior mean internally according
to dscore:::count_mu_gcdg(). In other cases,
prior_mean = ".phase1",
which uses the corresponding internal Count model function.
Normally, you should not touch this parameter, but feel free to use
prior_mean to override the automatic choices.
prior_sd
A string specifying a column name in data
with the standard deviation of the prior for the D-score calculation.
If not specified, the standard deviation is taken as 5 for every row.
transform
Vector of length 2, signalling the intercept
and slope respectively of the linear transform that converts an
observation in the logit scale to the D-score scale. Only relevant when
metric == "logit".
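The linear transform described above can be sketched in a few lines of base R. The intercept and slope below are made-up values chosen for illustration, not the coefficients of any built-in key, and the snippet does not call dscore itself.

```r
# Illustrative values only; NOT the coefficients of any built-in key.
transform <- c(intercept = 50, slope = 2)

# Abilities expressed in the logit scale
logit <- c(-1.5, 0, 2.25)

# Linear transform from the logit scale to the D-score scale
d <- transform[["intercept"]] + transform[["slope"]] * logit
d

# The inverse transform recovers the logit values
back <- (d - transform[["intercept"]]) / transform[["slope"]]
back
```

Because the transform is linear, distances between children keep the same ordering in both metrics; only the origin and the unit change.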
qp
Numeric vector of equally spaced quadrature points.
This vector should span the range of all D-score values. The default
(qp = -10:100) is suitable for the age range 0-4 years.
population
A string describing the population. Currently
dec
A vector of two integers specifying the number of
decimals for rounding the D-score and DAZ, respectively.
The default is
dec = c(2L, 3L).
relevance
A numeric vector of length 2 with the lower and
upper bounds of the relevance interval. The procedure calculates
a dynamic EAP for each item. If the difficulty level (tau) of the
next item is outside the relevance interval around the EAP, the procedure
ignores the score on the item. The default,
c(-Inf, +Inf), does not ignore any scores.
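To make the relevance rule concrete, here is a minimal base R sketch of a grid-based update that skips items whose difficulty falls outside the window around the running EAP. It is not the package's implementation: the function name eap_with_relevance(), the unit scale factor in the Rasch-type likelihood, the fixed prior mean of 10, and the reading of the window as EAP + relevance are all assumptions made for illustration.

```r
# Sketch only, not dscore's internal code.
# Assumptions: unit scale factor, window taken as EAP + relevance,
# items processed in the order given.
eap_with_relevance <- function(scores, tau, relevance = c(-Inf, Inf),
                               qp = -10:100, prior_mean = 10, prior_sd = 5) {
  post <- dnorm(qp, mean = prior_mean, sd = prior_sd)
  post <- post / sum(post)
  used <- logical(length(scores))
  for (i in seq_along(scores)) {
    if (is.na(scores[i]) || is.na(tau[i])) next
    eap <- sum(qp * post)  # dynamic EAP before seeing item i
    # Skip the item if its difficulty lies outside the relevance window
    if (tau[i] < eap + relevance[1] || tau[i] > eap + relevance[2]) next
    p <- plogis(qp - tau[i])  # pass probability at each quadrature point
    post <- post * (if (scores[i] == 1) p else 1 - p)
    post <- post / sum(post)
    used[i] <- TRUE
  }
  list(d = sum(qp * post), used = used)
}

# The third item (tau = 40) lies far above the running EAP and is skipped
res <- eap_with_relevance(c(1, 0, 1), tau = c(9, 10, 40), relevance = c(-5, 5))
res$used
```

With the default infinite interval, the same call uses all three items.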
The dscore() function returns a data.frame with
nrow(data) rows and the following columns:
|a|Decimal age|
|n|Number of items with valid (0/1) data|
|p|Proportion of passed milestones|
|d|Ability estimate, mean of posterior|
|sem|Standard error of measurement, standard deviation of the posterior|
|daz|D-score corrected for age, calculated in Z-scale|
The dscore_posterior() function returns a numeric matrix with
nrow(data) rows and
length(qp) columns with the
density at each quadrature point. Each row represents the full
posterior ability distribution of one observation. If no valid responses were obtained,
dscore_posterior() returns the prior.
The algorithm is based on the method by Bock and Mislevy (1982). The method uses Bayes' rule to update a prior ability into a posterior ability.
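The updating step can be illustrated with a small grid approximation in base R. This is a sketch under stated assumptions, not the package's code: the helper eap_sketch() is hypothetical, the likelihood is a Rasch-type logistic curve with a unit scale factor, and the prior mean of 10 is arbitrary. Its numbers will therefore not reproduce dscore() output.

```r
# Sketch of EAP estimation on a grid of quadrature points.
# Assumptions: Rasch-type likelihood with unit scale factor, normal prior.
eap_sketch <- function(scores, tau, qp = -10:100,
                       prior_mean = 10, prior_sd = 5) {
  # Start from the prior, normalised over the quadrature points
  post <- dnorm(qp, mean = prior_mean, sd = prior_sd)
  post <- post / sum(post)
  for (i in seq_along(scores)) {
    if (is.na(scores[i]) || is.na(tau[i])) next  # nothing to learn
    p <- plogis(qp - tau[i])                     # P(pass | ability)
    lik <- if (scores[i] == 1) p else 1 - p
    post <- post * lik                           # Bayes' rule: prior x likelihood
    post <- post / sum(post)                     # renormalise
  }
  d <- sum(qp * post)                  # posterior mean (EAP)
  sem <- sqrt(sum((qp - d)^2 * post))  # posterior standard deviation
  list(d = d, sem = sem, posterior = post)
}

# One pass and one fail on two items with difficulties 8.61 and 8.47
fit <- eap_sketch(scores = c(1, 0), tau = c(8.61, 8.47))
fit$d
fit$sem
```

When every score is missing the loop never updates, so the returned posterior equals the prior, which mirrors the behaviour described for dscore_posterior().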
The item names should correspond to the
A key is defined by the set of estimated item difficulties.
|1||direct||Van Buuren, 2014/2020|
|20||mixed||GSED Team, 2019|
|22||mixed||GSED Team, 2022|
|22||mixed||GSED Team, 2022|
|22||mixed||GSED Team, 2022|
|1||direct||GSED Team, 2022|
|1||caregiver||GSED Team, 2022|
As a general rule, one should only compare D-scores
that are calculated using the same key and the same
set of quadrature points. For calculating D-scores on new data,
the advice is to use the default, which currently links to
The default starting prior has a mean calculated from a so-called
"Count model" that describes mean D-score as a function of age. The
Count models are stored as internal functions, such as
dscore:::count_mu_dutch(). The spread of the starting prior
is 5 D-score points around this mean D-score, which corresponds to
approximately 1.5 to 2 times the normal spread of children of a given age. The
starting prior is thus somewhat informative for low numbers of
valid items, and uninformative for large numbers of items (say >10 items).
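The claim about the prior's influence can be checked with a toy calculation in base R. The helper grid_eap() is hypothetical and uses a Rasch-type likelihood with a unit scale factor, so the numbers are illustrative rather than what dscore() would return. It compares how far the estimate moves when the prior mean shifts from 5 to 15, once for a single item and once for twelve items.

```r
# Posterior mean on a grid under a normal prior and Rasch-type likelihood.
# Assumption: unit scale factor; this is not dscore's internal code.
grid_eap <- function(scores, tau, prior_mean, prior_sd = 5, qp = -10:100) {
  post <- dnorm(qp, mean = prior_mean, sd = prior_sd)
  for (i in seq_along(scores)) {
    p <- plogis(qp - tau[i])
    post <- post * (if (scores[i] == 1) p else 1 - p)
  }
  sum(qp * post) / sum(post)
}

# One item versus twelve items, all with difficulty 10
one  <- list(scores = 1, tau = 10)
many <- list(scores = rep(c(1, 0), 6), tau = rep(10, 12))

# Shift in the estimate when the prior mean moves from 5 to 15
diff_one  <- grid_eap(one$scores, one$tau, 15) - grid_eap(one$scores, one$tau, 5)
diff_many <- grid_eap(many$scores, many$tau, 15) - grid_eap(many$scores, many$tau, 5)
c(one_item = diff_one, twelve_items = diff_many)
```

The shift is much smaller with twelve items than with one, which is the sense in which the starting prior becomes uninformative as the number of valid items grows.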
Bock RD, Mislevy RJ (1982). Adaptive EAP Estimation of Ability in a Microcomputer Environment. Applied Psychological Measurement, 6(4), 431-444.
Van Buuren S (2014). Growth charts of human development. Stat Methods Med Res, 23(4), 346-368. https://stefvanbuuren.name/publication/van-buuren-2014-gc/
Weber AM, Rubio-Codina M, Walker SP, van Buuren S, Eekhout I, Grantham-McGregor S, Caridad Araujo M, Chang SM, Fernald LCH, Hamadani JD, Hanlon A, Karam SM, Lozoff B, Ratsifandrihamanana L, Richter L, Black MM (2019). The D-score: a metric for interpreting the early development of infants and toddlers across global settings. BMJ Global Health, 4(6), e001724. https://gh.bmj.com/content/bmjgh/4/6/e001724.full.pdf
library(dscore)

data <- data.frame(
  age = rep(round(21 / 365.25, 4), 10),
  ddifmd001 = c(NA, NA, 0, 0, 0, 1, 0, 1, 1, 1),
  ddicmm029 = c(NA, NA, NA, 0, 1, 0, 1, 0, 1, 1),
  ddigmd053 = c(NA, 0, 0, 1, 0, 0, 1, 1, 0, 1)
)
items <- names(data)[2:4]

# third item is not part of default key
get_tau(items)
#> ddifmd001 ddicmm029 ddigmd053
#>      8.61      8.47        NA

# calculate D-score
dscore(data)
#>         a n   p     d      sem    daz
#> 1      NA 0  NA    NA       NA     NA
#> 2      NA 0  NA    NA       NA     NA
#> 3  0.0575 1 0.0  6.61 2.763004 -2.019
#> 4  0.0575 2 0.0  5.60 2.459750 -2.235
#> 5  0.0575 2 0.5  9.09 1.695326 -1.447
#> 6  0.0575 2 0.5  9.09 1.695326 -1.447
#> 7  0.0575 2 0.5  9.09 1.695326 -1.447
#> 8  0.0575 2 0.5  9.09 1.695326 -1.447
#> 9  0.0575 2 1.0 15.30 3.851173  0.277
#> 10 0.0575 2 1.0 15.30 3.851173  0.277

# calculate full posterior
p <- dscore_posterior(data)

# plot posterior for row 7
plot(
  x = -10:100, y = p[7, ], type = "l",
  xlab = "D-score", ylab = "Density", xlim = c(0, 30)
)