Get the inverse document frequency of each value in a vector x, where the documents are defined by the categories in a second vector by.

Usage

get_idf_by(x, by)

Arguments

x

A vector, such as a column of character strings

by

A vector of categories, such as a column of document identifiers

Value

A numeric vector the same length as x, giving the inverse document frequency of each element of x across the categories in by.

Examples

my_values <- c(
  "the", "cat", "was", "bad",
  "the", "dog", "was", "very", "good",
  "the", "lizard", "is", "the", "most", "bad")
my_docs <- c(
  "A", "A", "A", "A",
  "B", "B", "B", "B", "B",
  "C", "C", "C", "C", "C", "C")

get_idf_by(my_values, my_docs)
#>  [1] 0.0000000 1.0986123 0.4054651 0.4054651 0.0000000 1.0986123 0.4054651
#>  [8] 1.0986123 1.0986123 0.0000000 1.0986123 1.0986123 0.0000000 1.0986123
#> [15] 0.4054651
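
The printed values are consistent with the common definition idf = log(N / n), where N is the number of distinct categories in by and n is the number of categories in which a value occurs, using the natural logarithm. For instance, "the" occurs in all three documents, so log(3 / 3) = 0, while "was" occurs in two, so log(3 / 2) is about 0.405. The base R sketch below recomputes the result under that assumption. The helper idf_by_sketch is hypothetical, written only for illustration; it is not part of the package and may differ from how get_idf_by is actually implemented.

```r
# Hypothetical helper (not part of the package): recompute IDF by hand,
# ASSUMING idf = log(number of documents / number of documents containing the value).
idf_by_sketch <- function(x, by) {
  n_docs <- length(unique(by))
  # Keep one row per distinct (value, document) pair so repeats within a document count once
  pairs <- unique(data.frame(x = x, by = by, stringsAsFactors = FALSE))
  # Number of distinct documents in which each value appears
  doc_freq <- table(pairs$x)
  # Look up the document frequency of every element of x, then take the natural log
  log(n_docs / as.numeric(doc_freq[as.character(x)]))
}

my_values <- c(
  "the", "cat", "was", "bad",
  "the", "dog", "was", "very", "good",
  "the", "lizard", "is", "the", "most", "bad")
my_docs <- c(
  "A", "A", "A", "A",
  "B", "B", "B", "B", "B",
  "C", "C", "C", "C", "C", "C")

sketch_idf <- idf_by_sketch(my_values, my_docs)
# Rounded for comparison with the get_idf_by() output shown above
round(sketch_idf, 7)
```

Deduplicating the (value, document) pairs before counting is what keeps the second "the" in document C from inflating the document frequency.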