Especially useful for visualizations. standardize_titles() applies common English-language title conventions: it converts underscores to spaces, capitalizes important words, removes leading articles, and drops subtitles.
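To make these conventions concrete, here is a minimal base-R sketch of the same kind of cleanup. It is not the package's implementation. The helper name sketch_standardize(), the list of "minor" words left in lowercase, the rule that a subtitle begins at the first colon or semicolon, and the sample inputs are all assumptions made for illustration.

```r
# Hypothetical helper, NOT the package's code: a rough approximation of the
# title conventions described above, using base R only.
sketch_standardize <- function(x) {
  # Assumed list of words kept lowercase unless they begin the title
  minor <- c("a", "an", "and", "as", "at", "but", "by", "for", "in",
             "of", "on", "or", "the", "to")

  x <- gsub("_", " ", x, fixed = TRUE)                 # underscores to spaces
  x <- sub("[[:space:]]*[:;].*$", "", x)               # drop an assumed subtitle
  x <- sub("^(the|an|a)[[:space:]]+", "", x,
           ignore.case = TRUE)                         # remove a leading article
  x <- trimws(x)

  vapply(strsplit(x, "[[:space:]]+"), function(words) {
    if (length(words) == 0) return("")                 # empty input stays empty
    lower <- tolower(words)
    keep_lower <- lower %in% minor
    keep_lower[1] <- FALSE                             # always capitalize the first word
    capitalized <- paste0(toupper(substring(words, 1, 1)), substring(words, 2))
    words[!keep_lower] <- capitalized[!keep_lower]
    words[keep_lower] <- lower[keep_lower]
    paste(words, collapse = " ")
  }, character(1), USE.NAMES = FALSE)
}

# Made-up identifiers in the underscore style the description mentions
sketch_standardize(c("the_dead", "ivy_day_in_the_committee_room"))
```

The real function may handle more cases than this sketch, so treat the sketch only as a reading aid for the description.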
Value
A data frame with one column adjusted. If data is a character vector rather than a data frame, a character vector is returned.
Examples
if (FALSE) { # \dontrun{
dubliners <- get_gutenberg_corpus(2814) |>
  load_texts() |>
  identify_by(part)

##### Standardizing strings #####
# Before `standardize_titles()`
unique(dubliners$doc_id)

# After `standardize_titles()`
unique(dubliners$doc_id) |>
  standardize_titles()

##### Standardizing a data frame #####
dubliners_measured <- dubliners |>
  add_vocabulary()

# Before `standardize_titles()`
dubliners_measured |>
  plot_vocabulary(labeling = "inline")

# After `standardize_titles()`
dubliners_measured |>
  standardize_titles() |>
  plot_vocabulary(labeling = "inline")
} # }
