Combine ngram columns

Usage

combine_ngrams(df, feature = word, keep = FALSE)

Arguments

df

A tidy data frame, potentially containing columns called "word_1", "word_2", etc.

feature

The unquoted column-name prefix shared by the numbered n-gram columns. The default, word, matches columns called "word_1", "word_2", etc.

keep

Whether to keep the original columns called "word_1", "word_2", etc., alongside the new "ngram" column.

Value

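A data frame with a column called "ngram" that joins the values of the numbered columns in each row. Unless keep = TRUE, the numbered columns are dropped.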
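
As a rough illustration of what combining the numbered columns amounts to, the sketch below uses only base R on a small, hypothetical data frame (toy) shaped like the output of add_ngrams(collapse = FALSE). It is not the package's implementation, and the single-space separator is an assumption based on the example output on this page.

```r
# Hypothetical toy data with numbered columns sharing the "word" prefix
toy <- data.frame(
  doc_id = c("a", "a", "a"),
  word_1 = c("there", "was", "no"),
  word_2 = c("was", "no", "hope"),
  stringsAsFactors = FALSE
)

# Find the numbered columns that share the prefix
ngram_cols <- grep("^word_[0-9]+$", names(toy), value = TRUE)

# Paste them row-wise into one "ngram" column
# (assumption: values are separated by a single space)
toy$ngram <- do.call(paste, c(toy[ngram_cols], sep = " "))

# Mimic keep = FALSE by dropping the numbered columns;
# skipping this step would correspond to keep = TRUE
toy[ngram_cols] <- NULL

toy
```

After this runs, toy holds only the doc_id and ngram columns, which is the shape combine_ngrams() is documented to return with its default keep = FALSE.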

See also

Other n-gram helpers: add_ngrams(), plot_bigrams(), separate_ngrams()

Examples

if (FALSE) { # \dontrun{
  my_corpus <- load_texts(n = 2)

  my_bigrams <- my_corpus |>
    add_ngrams(collapse = FALSE) |>
    combine_ngrams()
} # }

dubliners <- get_gutenberg_corpus(2814) |>
  load_texts() |>
  identify_by(part) |>
  standardize_titles()

dubliners |>
  add_ngrams(2) |>
  combine_ngrams() |>
  head()
#> # A tibble: 6 × 5
#>   doc_id      title     author       part        ngram    
#>   <fct>       <chr>     <chr>        <chr>       <chr>    
#> 1 The Sisters Dubliners Joyce, James THE SISTERS there was
#> 2 The Sisters Dubliners Joyce, James THE SISTERS was no   
#> 3 The Sisters Dubliners Joyce, James THE SISTERS no hope  
#> 4 The Sisters Dubliners Joyce, James THE SISTERS hope for 
#> 5 The Sisters Dubliners Joyce, James THE SISTERS for him  
#> 6 The Sisters Dubliners Joyce, James THE SISTERS him this