Plot bar graphs of frequent features

Usage

plot_doc_word_bars(
  df,
  rows = 1:10,
  by = doc_id,
  feature = word,
  inorder = TRUE,
  reorder_y = NULL,
  color_y = FALSE,
  percents = TRUE,
  label = FALSE,
  label_tweak = 2,
  label_inside = FALSE,
  na_rm = TRUE
)

Arguments

df

A tidy data frame, potentially containing columns called "doc_id" and "word"

rows

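The rows of features to show for each document, given as positions in the frequency ranking (the default 1:10 shows the first ten)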

by

The column used for document grouping, with doc_id as the default

feature

The column to measure, such as "word" or "lemma"

inorder

Whether to retain the factor order of the "by" column

reorder_y

Whether to reorder the Y-values by facet

color_y

Whether bars should be filled by Y-values

percents

Whether to display word frequencies as percentages instead of raw counts

label

Whether to show the value as a label with each bar

label_tweak

The numeric value by which to tweak the label, if shown. For percentages, this value adjusts the decimal-point precision. For raw counts, this value adjusts the labels' offset from the bars.

label_inside

Whether to show the value as a label inside each bar

na_rm

Whether to drop empty features

Value

A ggplot object

Examples

dubliners <- get_gutenberg_corpus(2814) |>
  load_texts(lemma = TRUE) |>
  identify_by(part) |>
  standardize_titles()

dubliners |>
  plot_doc_word_bars(rows = 1:4)


dubliners |>
  dplyr::filter(doc_id %in% c("The Sisters", "The Dead")) |>
  plot_doc_word_bars(feature = lemma, rows = 1:20)
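
The labeling arguments can be combined with either display mode. The sketch below assumes the `dubliners` object from the first example is still in the session and that the package providing plot_doc_word_bars() is attached. The object names `p_counts` and `p_pct` are illustrative only, and the value passed to label_tweak is an arbitrary choice rather than a recommended setting.

```r
# Raw counts instead of percentages, with each bar's value shown as a label
p_counts <- dubliners |>
  plot_doc_word_bars(
    rows = 1:5,
    percents = FALSE,
    label = TRUE
  )

# Percentages (the default) with labels; per the argument description,
# label_tweak adjusts decimal-point precision when percents = TRUE
p_pct <- dubliners |>
  plot_doc_word_bars(
    rows = 1:5,
    label = TRUE,
    label_tweak = 1
  )

# Both results are ggplot objects, so printing them draws the plots
p_counts
p_pct
```

Because the return value is a ggplot object, further layers or themes can in principle be added with `+` before printing.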