Normalize scholarly identifiers

Vectorized normalizer that converts supported scholarly identifier values to a canonical form (e.g., removing URL prefixes, labels, or separators).

Normalization requires that inputs match the expected identifier structure. For identifier types with checksum algorithms (ORCID, ROR, ISNI, ISBN, ISSN), normalization also requires checksum-valid values. Inputs that do not meet these requirements yield NA_character_.
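The structure-plus-checksum requirement can be illustrated with a minimal base R sketch for ORCID iDs. This is not the package's implementation: orcid_checksum_ok() and sketch_normalize_orcid() are hypothetical helpers, and the accepted URL prefix and regular expression are assumptions made for illustration. The check character follows the ISO 7064 MOD 11-2 scheme used by ORCID.

```r
# Hypothetical helper (not part of the package): verify the ORCID check
# character with ISO 7064 MOD 11-2.
orcid_checksum_ok <- function(id) {
  chars <- strsplit(gsub("-", "", id, fixed = TRUE), "")[[1]]
  if (length(chars) != 16) return(FALSE)
  digits <- suppressWarnings(as.integer(chars[1:15]))
  if (anyNA(digits)) return(FALSE)
  total <- 0
  for (d in digits) total <- (total + d) * 2
  check <- (12 - (total %% 11)) %% 11
  expected <- if (check == 10) "X" else as.character(check)
  identical(toupper(chars[16]), expected)
}

# Hypothetical normalizer sketch: strip an assumed URL prefix, require the
# hyphenated 16-character structure, then require a valid checksum.
sketch_normalize_orcid <- function(x) {
  stripped <- sub("^https?://orcid\\.org/", "", trimws(as.character(x)))
  structural <- grepl("^[0-9]{4}-[0-9]{4}-[0-9]{4}-[0-9]{3}[0-9Xx]$", stripped)
  ok <- structural &
    vapply(stripped, orcid_checksum_ok, logical(1), USE.NAMES = FALSE)
  # Anything that fails either test becomes NA_character_.
  out <- rep(NA_character_, length(stripped))
  out[ok] <- toupper(stripped[ok])
  out
}

# The first element is kept; the second fails the checksum and the third
# fails the structural test, so both become NA.
sketch_normalize_orcid(c(
  "https://orcid.org/0000-0002-1825-0097",
  "0000-0002-1825-0098",
  "not-an-orcid"
))
```

The sketch returns a character vector of the same length as its input, mirroring the contract described for normalize_scholid(); the real function may accept more input forms than the single URL prefix assumed here.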

Normalized outputs are canonical, type-specific representations of valid identifiers.

Use is_scholid() to test whether already-canonical values are valid identifiers of a given type. Both functions apply checksum verification where applicable; normalization additionally accepts wrapped input forms and returns canonical strings.

Usage

normalize_scholid(x, type)

Arguments

x: A vector of values to normalize.
type: A single string giving the identifier type. See scholid_types() for supported values.

Value

A character vector with the same length as x. Invalid, checksum-failing, or structurally non-matching inputs yield NA_character_.

Examples

normalize_scholid("https://doi.org/10.1000/182", "doi")
#> [1] "10.1000/182"
normalize_scholid("https://orcid.org/0000-0002-1825-0097", "orcid")
#> [1] "0000-0002-1825-0097"
