Translates names between different lexicons (naming schemes).

rename_vector(
  input,
  lexin = c("phase2", "phase1", "short1", "short2", "gsed", "gsed2", "gsed3", "gsed4"),
  lexout = c("gsed4", "gsed3", "gsed2", "gsed", "short2", "short1", "phase1", "phase2"),
  notfound = "copy",
  contains = c("", "Ma_SF_", "Ma_LF_", "bsid_"),
  underscore = TRUE,
  trim = "Ma_",
  lowercase = TRUE,
  force_subjid_agedays = FALSE
)

Arguments

input

A character vector with names to be translated

lexin

A string indicating the input lexicon. One of "phase1", "phase2", "short1", "short2", "gsed", "gsed2", "gsed3" or "gsed4". The default is "phase2", which orders item names according to the published 2023 version of the SF and LF instruments.

lexout

A string indicating the output lexicon. One of "phase1", "phase2", "short1", "short2", "gsed", "gsed2", "gsed3" or "gsed4". Default is "gsed4". The default output "gsed4" applies instrument codes sf_ and lf_, which can be understood by the dscore package.

notfound

A string indicating what to do when some input value is not found in lexin. The default notfound = "copy" copies the input values into the output value. In other cases (e.g. "" or NA_character_), the function uses the string specified in notfound as a replacement value.

contains

A string to filter the translation table prior to matching. Needed to prevent double matches. The default ("") does not filter.

underscore

If TRUE, replaces space (" ") and dash ("-") by underscore ("_").

trim

A substring to be removed from input. Defaults to "Ma_".

lowercase

If TRUE, converts all names to lower case.

force_subjid_agedays

If TRUE, forces the output to have "subjid" and "agedays" as names for the "ID" and "age", respectively.

Value

A character vector of the same length as input with processed names.

Details

The recommended approach for reading new data is to name the columns according to the names defined by "short2" and then apply rename_vector() to translate the names to the "gsed4" lexicon.

The lexicons "phase1", "short1", "gsed" and "gsed2" are included for backward compatibility, and are not recommended for use with new data.
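
The recommended workflow for new data might look like the sketch below. It is a minimal illustration, not a definitive recipe: it assumes that rename_vector() is exported by the gsedread package, and the data frame raw with its column names is a hypothetical stand-in for freshly read data named in the "short2" lexicon.

```r
# Assumption: rename_vector() is exported by the gsedread package.
library(gsedread)

# Hypothetical raw data whose item columns follow the "short2" lexicon
raw <- data.frame(
  GSED_ID = c("id-001", "id-002"),
  SF001 = c(1, 0),
  SF002 = c(1, 1)
)

# Translate the column names from "short2" to the "gsed4" lexicon
names(raw) <- rename_vector(names(raw), lexin = "short2", lexout = "gsed4")

# Inspect the translated names
names(raw)
```

Renaming the columns in place keeps the data untouched and changes only the names, so the result can be passed on to tools that expect "gsed4" item names.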

Examples

# Using Ma_SF_Cxx as input names, 2023 SF/LF version
input <- c("file", "GSED_ID", "Ma_SF_Parent ID", "Ma_SF_C01", "Ma_SF_C02")
rename_vector(input)
#> [1] "file"         "gsed_id"      "sf_parent_id" "sf_sec001"    "sf_moc002"   
rename_vector(input, lexout = "short2", lowercase = FALSE)
#> [1] "file"         "GSED_ID"      "SF_Parent_ID" "SF001"        "SF002"       
rename_vector(input, lexout = "gsed3", trim = "Ma_SF_")
#> [1] "file"      "gsed_id"   "parent_id" "gs1sec001" "gs1moc002"

# Convert short names to gsed names
input <- c("file", "GSED_ID", "Ma_SF_Parent ID", paste0("SF00", 1:3))
rename_vector(input, lexin = "short2", lowercase = TRUE)
#> [1] "file"         "gsed_id"      "sf_parent_id" "sf_sec001"    "sf_moc002"   
#> [6] "sf_sec003"
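
# Sketch (output not shown): flag names that are not found in lexin as
# missing instead of copying them. "not_an_item" is a hypothetical name
# that is assumed to be absent from the translation table.
input <- c("GSED_ID", "Ma_SF_C01", "not_an_item")
out <- rename_vector(input, notfound = NA_character_)
# Names without a translation can then be located with is.na()
input[is.na(out)]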