scholid provides lightweight, dependency-free utilities for working with scholarly identifiers in R. The package is designed as a small, well-tested foundation that can be safely reused by other packages and data workflows.
Scope
The package focuses on common identifier systems used in scholarly communication:
- DOI
- ORCID iD
- ISBN
- ISSN
- arXiv
- PubMed (PMID)
- PubMed Central (PMCID)
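
Several of these systems embed a check digit, so a string can look right and still be invalid. As an illustration only, here is a minimal base R sketch of the ISO 7064 MOD 11-2 check used by ORCID iDs. The helper `orcid_checksum_ok()` is hypothetical and is not part of scholid; it does not describe how the package validates identifiers internally.

```r
# Hypothetical helper, NOT part of scholid: verifies the ORCID iD check
# character using the ISO 7064 MOD 11-2 algorithm. Handles one value at a time.
orcid_checksum_ok <- function(x) {
  # Drop hyphens and upper-case a trailing "x".
  compact <- gsub("-", "", toupper(x), fixed = TRUE)

  # An ORCID iD has 15 digits followed by a digit or "X".
  if (!grepl("^[0-9]{15}[0-9X]$", compact)) {
    return(FALSE)
  }

  # Fold the first 15 digits into a running total.
  digits <- as.integer(strsplit(substr(compact, 1, 15), "")[[1]])
  total <- 0
  for (d in digits) {
    total <- (total + d) * 2
  }

  # Derive the expected check character; a value of 10 is written as "X".
  check <- (12 - total %% 11) %% 11
  expected <- if (check == 10) "X" else as.character(check)

  identical(substr(compact, 16, 16), expected)
}

orcid_checksum_ok("0000-0002-1825-0097")
orcid_checksum_ok("0000-0002-1825-0098")
```

To apply the sketch to a vector, wrap it in `vapply(x, orcid_checksum_ok, logical(1), USE.NAMES = FALSE)`.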
Interface
User-available functions:
| Function | Purpose |
|---|---|
| `scholid_types()` | List supported scholarly identifier types |
| `is_scholid(x, type)` | Test whether values conform to a given identifier type |
| `normalize_scholid(x, type)` | Normalize identifiers to canonical form |
| `extract_scholid(text, type)` | Extract identifiers of a given type from free text |
| `classify_scholid(x)` | Guess the identifier type of each input value |
| `detect_scholid_type(x)` | Detect identifier types from canonical or wrapped input values |
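
To make the idea of extraction from free text concrete, the following base R sketch pulls DOI-like strings out of a sentence with a regular expression. It is an assumption-laden illustration: `extract_doi_sketch()` is a hypothetical name, the pattern is a common approximation of modern DOI syntax rather than a full specification, and none of it is taken from scholid's own implementation.

```r
# Hypothetical helper, NOT part of scholid: finds DOI-like substrings in a
# single character string using a common approximate pattern.
extract_doi_sketch <- function(text) {
  # "10." + a 4 to 9 digit registrant code + "/" + a suffix of allowed characters.
  pattern <- "10\\.[0-9]{4,9}/[-._;()/:A-Za-z0-9]+"

  # gregexpr() locates every match; regmatches() returns the matched text.
  hits <- regmatches(text, gregexpr(pattern, text))[[1]]

  # Sentence punctuation directly after a DOI is matched by the suffix class,
  # so trim it from the end of each hit.
  sub("[.,;:]+$", "", hits)
}

extract_doi_sketch("See https://doi.org/10.1000/182 for details.")
extract_doi_sketch("No identifier here.")
```

Trimming trailing punctuation is a trade-off: it avoids swallowing the full stop at the end of a sentence, at the cost of truncating the rare DOI that genuinely ends in one of those characters.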
Examples
```r
# list supported scholarly identifier types
scholid::scholid_types()

# test whether values match a given identifier type
scholid::is_scholid(
  x = "10.1000/182",
  type = "doi"
)

# normalize identifiers to canonical form
scholid::normalize_scholid(
  x = "https://doi.org/10.1000/182",
  type = "doi"
)

# extract identifiers of a given type from free text
scholid::extract_scholid(
  text = "See https://doi.org/10.1000/182 for details.",
  type = "doi"
)

# classify the identifier type of each input value
scholid::classify_scholid(
  x = c(
    "10.1000/182",
    "0000-0002-1825-0097",
    "not an id"
  )
)

# detect identifier types from canonical or wrapped input values
scholid::detect_scholid_type(
  x = c(
    "https://doi.org/10.1000/182",
    "ORCID: 0000-0002-1825-0097",
    "arXiv:2101.00001",
    "not an id"
  )
)
```

For more detailed usage patterns, including extraction from text and classification of mixed identifier columns, see the Get started vignette.