Domain-specific D-score

ddomain(
  data,
  set,
  domain = NULL,
  vote_weight = NULL,
  items = names(data),
  key = NULL,
  population = NULL,
  xname = "age",
  xunit = c("decimal", "days", "months"),
  prepend = NULL,
  itembank = NULL,
  metric = c("dscore", "logit"),
  prior_mean = NULL,
  prior_mean_NA = NULL,
  prior_sd = NULL,
  prior_sd_NA = NULL,
  transform = NULL,
  qp = NULL,
  dec = c(2L, 3L),
  relevance = c(-Inf, Inf),
  algorithm = c("current", "1.8.7"),
  verbose = FALSE
)

Arguments

data

A data.frame or matrix with the data. A row collects all observations made on a child on a set of milestones administered at a given age. The function calculates a D-score for each row. Different rows can correspond to different children or ages.

set

String. The name of the set of domains to use. See with(builtin_domaintable, table(set, domain)) for the domain names in each set.

domain

Character vector with the name(s) of the domain(s) for which to compute the domain score. By default, all domains in the set are used.

vote_weight

Minimum proportion of votes (weight) that an item needs to have for a domain in order to count for that domain.
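As a rough illustration of how such a threshold could act, the base R sketch below filters a small, made-up table of item-to-domain vote proportions. The data frame votes, its column names and the item names are hypothetical; they do not reflect the layout of builtin_domaintable or the package internals, and treating the threshold as inclusive (>=) is an assumption.

```r
# Hypothetical vote table: one row per item-domain pair, with the
# proportion of votes that assign the item to that domain.
votes <- data.frame(
  item   = c("item_a", "item_a", "item_b", "item_c"),
  domain = c("language", "cognitive", "language", "cognitive"),
  weight = c(0.7, 0.3, 1.0, 0.5),
  stringsAsFactors = FALSE
)

# Keep only item-domain pairs that reach the minimum proportion
# (inclusive comparison is an assumption of this sketch).
vote_weight <- 0.5
counted <- votes[votes$weight >= vote_weight, ]

# Items that count for each domain under this threshold.
items_by_domain <- split(counted$item, counted$domain)
```

In this toy table, item_a counts for language but not for cognitive, because only 30% of its votes go to the cognitive domain.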

items

A character vector containing names of items to be included in the D-score calculation. Milestone scores are coded numerically as 1 (pass) and 0 (fail). By default, D-score calculation is done on all items found in the data that have a difficulty parameter under the specified key.

key

String. The key identifies 1) the difficulty estimates pertaining to a particular Rasch model, and 2) the prior mean and standard deviation of the prior distribution for calculating the D-score. The default key NULL sets key = "gsed2406". View builtin_keys for an overview of the available keys.

population

String. The name of the reference population used to calculate DAZ. Use with(builtin_references, table(key, population)) to see which built-in references are available for each key-population combination. If not specified, the function sets the default population to builtin_keys$base_population[key == builtin_keys$key].

xname

A string with the name of the age variable in data. The default is "age". Do not round age.

xunit

A string specifying the unit in which age is measured (either "decimal", "days" or "months"). The default "decimal" corresponds to decimal age in years.
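The relation between the three units can be sketched in base R as follows. The helper to_decimal_years is hypothetical and not part of the package; the divisor 365.25 follows the conversion used in the Examples section, and 12 months per year is an assumption of this sketch.

```r
# Hypothetical helper: convert age to decimal years.
# Divisors (365.25 days, 12 months per year) are assumptions.
to_decimal_years <- function(age, xunit = c("decimal", "days", "months")) {
  xunit <- match.arg(xunit)
  switch(xunit,
    decimal = age,
    days    = age / 365.25,
    months  = age / 12
  )
}

age_from_days   <- to_decimal_years(c(50, 306, 811), xunit = "days")
age_from_months <- to_decimal_years(18, xunit = "months")
```

Passing unrounded ages matters because the prior mean and DAZ depend on age, so rounding to whole months or years shifts both.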

prepend

Character vector with column names in data that will be prepended to the returned data frame. This is useful for copying columns from data into the result, e.g., for matching.

itembank

A data.frame with at least three columns named key, item and tau. By default, the function uses dscore::builtin_itembank. If you specify your own itembank, then you should also provide the relevant transform and qp arguments.

metric

A string, either "dscore" (default) or "logit", signalling the metric in which ability is estimated. daz is not calculated for the logit scale.

prior_mean

NULL (default), a string, a numeric scalar, or a numeric vector with nrow(data) elements. The default value NULL will consult the base_population field in builtin_keys, and use the corresponding median of that reference as prior mean for the D-score. The string should refer to a column name in data that contains user-supplied values of the prior mean for each observation. A numeric scalar will be expanded to all observations. A numeric vector will be used as is.
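The three non-NULL forms can be pictured with a small base R sketch. The helper expand_prior is hypothetical and only mimics the behaviour described above for a data.frame; the NULL case, which looks up the reference median, is not covered.

```r
# Hypothetical helper: turn a prior specification into one value per row.
expand_prior <- function(prior, data) {
  n <- nrow(data)
  if (is.character(prior) && length(prior) == 1L) {
    # A string names a column in data holding per-observation values.
    stopifnot(prior %in% colnames(data))
    return(as.numeric(data[[prior]]))
  }
  if (is.numeric(prior) && length(prior) == 1L) {
    # A scalar is expanded to all observations.
    return(rep(prior, n))
  }
  if (is.numeric(prior) && length(prior) == n) {
    # A vector of length nrow(data) is used as is.
    return(prior)
  }
  stop("prior must be a column name, a scalar, or a vector of length nrow(data)")
}

# Toy data with a user-supplied prior mean column named pm (made up).
toy <- data.frame(age = c(0.5, 1, 2), pm = c(30, 45, 60))
from_column <- expand_prior("pm", toy)
from_scalar <- expand_prior(50, toy)
from_vector <- expand_prior(c(28, 44, 61), toy)
```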

prior_mean_NA

NULL (default) or a scalar numeric, representing the prior mean for observations with missing ages. By default, D-scores with missing ages will be NA. We suggest setting prior_mean_NA = 50 as a reasonable choice for samples between 0-3 years. The argument is ignored if prior_mean is specified per observation, which gives you full control of priors for observations with missing ages.

prior_sd

NULL (default), a string, a numeric scalar, or a numeric vector with nrow(data) elements. The default (NULL) uses a value of 5 for all ages. The string should refer to a column name in data that contains user-supplied values of the prior sd for each observation. A numeric scalar will be expanded to all observations. A numeric vector will be used as is.

prior_sd_NA

NULL (default) or a scalar numeric, representing the prior sd for observations with missing ages. By default, D-scores with missing ages will be NA. We suggest setting prior_sd_NA = 20 as a reasonable choice for samples between 0-3 years. The argument is ignored if prior_sd is specified per observation, which gives you full control of priors for observations with missing ages.

transform

Numeric vector, length 2, containing the intercept and slope of the linear transform from the logit scale into the D-score scale. The default (NULL) searches builtin_keys for intercept and slope values.
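A linear transform of this kind can be written as D = intercept + slope * logit. The base R sketch below uses made-up intercept and slope values purely for illustration; the helper names are hypothetical, and the actual values for a given key are stored in builtin_keys.

```r
# Hypothetical intercept and slope, for illustration only.
transform <- c(intercept = 50, slope = 4)

# Logit scale to D-score scale.
logit_to_dscore <- function(logit, transform) {
  transform[[1]] + transform[[2]] * logit
}

# Inverse: D-score scale back to logit scale.
dscore_to_logit <- function(d, transform) {
  (d - transform[[1]]) / transform[[2]]
}

d_values   <- logit_to_dscore(c(-2, 0, 2.5), transform)
round_trip <- dscore_to_logit(d_values, transform)
```

With these illustrative values, a logit of 0 maps to a D-score of 50, and one logit corresponds to 4 D-score units.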

qp

Numeric vector of equally spaced quadrature points. This vector should span the range of all D-score or logit values. The default (NULL) creates seq(from, to, by), taking from, to and by from builtin_keys.

dec

A vector of two integers specifying the number of decimals for rounding the D-score and DAZ, respectively. The default is dec = c(2L, 3L).

relevance

A numeric vector of length 2 with the lower and upper bounds of the relevance interval. The procedure calculates a dynamic EAP for each item. If the difficulty level (tau) of the next item is outside the relevance interval around the EAP, the procedure ignores the score on that item. The default, c(-Inf, +Inf), does not ignore any scores.
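The idea of a dynamic EAP with a relevance interval can be sketched in base R. This is a simplified illustration, not the package's implementation: the function eap_sketch, the grid, the item difficulties and the slope are all hypothetical, and the Rasch pass probability plogis((qp - tau) / slope) is an assumption about how difficulties on the D-score scale relate to the logit scale.

```r
# Sketch of a dynamic EAP with a relevance interval (assumed Rasch model
# on the D-score scale; not the package's implementation).
eap_sketch <- function(scores, tau, qp, prior_mean, prior_sd,
                       slope, relevance = c(-Inf, Inf)) {
  # Start from a normal prior, discretised on the quadrature points.
  post <- dnorm(qp, mean = prior_mean, sd = prior_sd)
  post <- post / sum(post)
  eap <- sum(qp * post)
  used <- 0L
  for (i in seq_along(scores)) {
    # Skip missing scores and items outside the interval around the EAP.
    if (is.na(scores[i])) next
    if (tau[i] < eap + relevance[1] || tau[i] > eap + relevance[2]) next
    # Probability of passing the item at each quadrature point.
    p_pass <- plogis((qp - tau[i]) / slope)
    lik <- if (scores[i] == 1) p_pass else 1 - p_pass
    # Update the posterior and the running EAP.
    post <- post * lik
    post <- post / sum(post)
    eap <- sum(qp * post)
    used <- used + 1L
  }
  # Posterior mean, posterior standard deviation, and items actually used.
  sem <- sqrt(sum(post * (qp - eap)^2))
  list(d = eap, sem = sem, n_used = used)
}

qp <- seq(0, 100, by = 0.5)  # hypothetical quadrature grid
all_items <- eap_sketch(c(1, 1, 0), tau = c(48, 52, 90), qp = qp,
                        prior_mean = 50, prior_sd = 5, slope = 4)
near_only <- eap_sketch(c(1, 1, 0), tau = c(48, 52, 90), qp = qp,
                        prior_mean = 50, prior_sd = 5, slope = 4,
                        relevance = c(-5, 5))
```

In the second call the item with difficulty 90 lies far above the running EAP, so its score is ignored and only two items contribute to the estimate.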

algorithm

Computational method, for backward compatibility. Either "current" (default) or "1.8.7" (deprecated).

verbose

Logical. Print settings.

Value

The ddomain() function returns a list of data.frame objects, each with nrow(data) rows. The names of the list elements are the names of the domains. Each data.frame consists of the following columns:

Name  Label
a     Decimal age (years)
n     Number of items with valid (0/1) data
p     Proportion of passed milestones
d     D-score, mean of posterior distribution
sem   Standard error of measurement, standard deviation of the posterior
daz   D-score corrected for age, calculated in Z-scale (for metric "dscore")

The D-score in column d is a linear scale, with values usually ranging from 0 to 100. The D-score is NA if age is missing or if age is lower than -1/12. It is possible to calculate D-scores for cases with missing ages by setting prior_mean_NA and prior_sd_NA to some reasonable value, e.g., prior_mean_NA = 50 and prior_sd_NA = 20, for the sample at hand.

The SEM is a positive number that quantifies the uncertainty of the D-score. It is NA if the D-score is NA.

The DAZ in column daz is a Z-score that corrects the D-score for age. It is NA when there are no reference values for the given age, or when the D-score is extremely unlikely to be valid at the given age.
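Because the result is a list with one data frame per domain, it is often convenient to reshape it before analysis. The base R sketch below works on a mock list with the same shape as the ddomain() output. The values are copied from the first two rows of the example output in the Examples section, only a subset of columns is kept, and the object names res, long and wide are arbitrary.

```r
# Mock result shaped like the ddomain() output: a named list of data frames.
res <- list(
  grossmotor = data.frame(a = c(2.2204, 2.4586), d = c(72.65, 70.31),
                          daz = c(1.328, -0.020)),
  finemotor  = data.frame(a = c(2.2204, 2.4586), d = c(73.65, 71.44),
                          daz = c(1.603, 0.284))
)

# Stack the domains into one long data frame with a domain column.
long <- do.call(rbind, Map(function(df, nm) {
  cbind(domain = nm, row = seq_len(nrow(df)), df)
}, res, names(res)))
rownames(long) <- NULL

# Or put the D-scores side by side, one column per domain.
wide <- as.data.frame(lapply(res, function(df) df$d))
```

The long layout suits plotting by domain, while the wide layout makes it easy to compare domain scores for the same row of data.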

Examples

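# Load the built-in sample and rename its items from the gsed2 to the gsed3 lexicon
sample <- dscore::gsample
colnames(sample) <- dscore::rename_vector(colnames(sample), lexin = "gsed2", lexout = "gsed3")
# Keep the identifier, age in days and the items starting with "gs1"; add decimal age in years
sample <- sample |>
  dplyr::select(subjid, agedays, starts_with("gs1")) |>
  dplyr::mutate(age = agedays / 365.25)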
ddomain(sample, set = "GFCLS")
#> $grossmotor
#>         a  n      p     d      sem    daz
#> 1  2.2204  6 1.0000 72.65 3.691257  1.328
#> 2  2.4586  6 0.8333 70.31 3.399467 -0.020
#> 3  0.5558 28 0.6071 39.15 2.110025  1.115
#> 4  2.6448  6 0.8333 74.26 3.307301  0.566
#> 5  2.1081  9 0.6667 67.82 2.728499  0.342
#> 6  0.8378 28 0.6071 40.93 2.103936 -1.196
#> 7  3.3238  7 1.0000 80.47 3.770707  1.104
#> 8  1.9767 11 0.8182 66.28 2.772352  0.341
#> 9  0.3587 23 0.4348 22.55 2.026521 -1.409
#> 10 0.1369 16 0.5625 20.29 2.335652  0.907
#> 
#> $finemotor
#>         a  n      p     d      sem    daz
#> 1  2.2204  7 1.0000 73.65 3.566725  1.603
#> 2  2.4586  7 0.8571 71.44 3.326522  0.284
#> 3  0.5558 19 0.5789 36.63 2.226232  0.324
#> 4  2.6448  8 0.7500 74.57 2.869917  0.651
#> 5  2.1081 12 0.7500 71.86 2.583802  1.466
#> 6  0.8378 18 0.7222 43.21 2.634703 -0.536
#> 7  3.3238  9 0.8889 78.94 3.087618  0.694
#> 8  1.9767 11 0.7273 64.96 2.733852 -0.025
#> 9  0.3587 16 0.4375 23.54 2.443604 -1.142
#> 10 0.1369  9 0.5556 17.63 2.871447  0.012
#> 
#> $language
#>         a  n      p     d      sem    daz
#> 1  2.2204 20 0.6500 65.91 1.827420 -0.508
#> 2  2.4586 29 0.6552 69.43 1.610327 -0.253
#> 3  0.5558 11 0.8182 40.16 3.492057  1.432
#> 4  2.6448 25 0.8000 75.35 1.798865  0.863
#> 5  2.1081 34 0.3824 64.05 1.517986 -0.667
#> 6  0.8378 12 0.7500 43.64 3.168877 -0.407
#> 7  3.3238 25 0.7600 75.03 1.770642 -0.325
#> 8  1.9767 26 0.5000 63.17 1.611405 -0.509
#> 9  0.3587 19 0.8947 27.87 3.024516  0.128
#> 10 0.1369 17 0.5882 15.46 1.980980 -0.676
#> 
#> $cognitive
#>         a  n      p     d      sem    daz
#> 1  2.2204 11 0.7273 68.12 2.538955  0.078
#> 2  2.4586 20 0.5500 69.13 1.832777 -0.331
#> 3  0.5558 14 0.8571 41.71 3.006924  1.915
#> 4  2.6448 21 0.7143 74.60 1.758835  0.659
#> 5  2.1081 27 0.4444 67.34 1.655714  0.210
#> 6  0.8378 15 0.8000 44.32 2.920619 -0.200
#> 7  3.3238 22 0.6818 74.52 1.715425 -0.452
#> 8  1.9767 18 0.3889 61.19 1.985434 -1.020
#> 9  0.3587 15 0.4667 24.39 2.452043 -0.906
#> 10 0.1369  9 0.5556 18.62 2.735954  0.340
#> 
#> $social
#>         a  n      p     d      sem    daz
#> 1  2.2204 13 0.8462 69.55 2.638940  0.469
#> 2  2.4586 18 0.5556 66.70 2.053824 -0.946
#> 3  0.5558 10 0.8000 37.23 3.520380  0.511
#> 4  2.6448 17 0.7647 74.87 2.117452  0.732
#> 5  2.1081 24 0.3750 63.52 1.867752 -0.802
#> 6  0.8378 11 0.8182 44.33 3.324143 -0.197
#> 7  3.3238 17 0.8235 77.08 2.311132  0.202
#> 8  1.9767 17 0.7647 65.69 2.269435  0.176
#> 9  0.3587 16 1.0000 31.78 3.780745  1.359
#> 10 0.1369 14 0.5000 13.27 2.115120 -1.316
#> 
ddomain(sample, set = "GFCLS", domain = c("finemotor", "grossmotor"))
#> $finemotor
#>         a  n      p     d      sem    daz
#> 1  2.2204  7 1.0000 73.65 3.566725  1.603
#> 2  2.4586  7 0.8571 71.44 3.326522  0.284
#> 3  0.5558 19 0.5789 36.63 2.226232  0.324
#> 4  2.6448  8 0.7500 74.57 2.869917  0.651
#> 5  2.1081 12 0.7500 71.86 2.583802  1.466
#> 6  0.8378 18 0.7222 43.21 2.634703 -0.536
#> 7  3.3238  9 0.8889 78.94 3.087618  0.694
#> 8  1.9767 11 0.7273 64.96 2.733852 -0.025
#> 9  0.3587 16 0.4375 23.54 2.443604 -1.142
#> 10 0.1369  9 0.5556 17.63 2.871447  0.012
#> 
#> $grossmotor
#>         a  n      p     d      sem    daz
#> 1  2.2204  6 1.0000 72.65 3.691257  1.328
#> 2  2.4586  6 0.8333 70.31 3.399467 -0.020
#> 3  0.5558 28 0.6071 39.15 2.110025  1.115
#> 4  2.6448  6 0.8333 74.26 3.307301  0.566
#> 5  2.1081  9 0.6667 67.82 2.728499  0.342
#> 6  0.8378 28 0.6071 40.93 2.103936 -1.196
#> 7  3.3238  7 1.0000 80.47 3.770707  1.104
#> 8  1.9767 11 0.8182 66.28 2.772352  0.341
#> 9  0.3587 23 0.4348 22.55 2.026521 -1.409
#> 10 0.1369 16 0.5625 20.29 2.335652  0.907
#> 
ddomain(sample, set = "GFCLS", domain = c("language"))
#> $language
#>         a  n      p     d      sem    daz
#> 1  2.2204 20 0.6500 65.91 1.827420 -0.508
#> 2  2.4586 29 0.6552 69.43 1.610327 -0.253
#> 3  0.5558 11 0.8182 40.16 3.492057  1.432
#> 4  2.6448 25 0.8000 75.35 1.798865  0.863
#> 5  2.1081 34 0.3824 64.05 1.517986 -0.667
#> 6  0.8378 12 0.7500 43.64 3.168877 -0.407
#> 7  3.3238 25 0.7600 75.03 1.770642 -0.325
#> 8  1.9767 26 0.5000 63.17 1.611405 -0.509
#> 9  0.3587 19 0.8947 27.87 3.024516  0.128
#> 10 0.1369 17 0.5882 15.46 1.980980 -0.676
#>