get_gutenberg_corpus() should do less, and now it does. Other functionality is available via gutenbergr.
tmtyro 0.4.0
New function contextualize() shows terms in a window of context
New function add_index() adds a column showing word indices within each document
load_texts() can now keep original capitalization and punctuation alongside the tokenized word column, using the keep_original argument. This process does not work in all instances, so the option defaults to FALSE.
add_dictionary() includes an option to keep original terms. This is useful for n-gram dictionaries, where a match might otherwise span multiple rows.
add_ngrams() supports negative ranges, for building context windows
New function add_partitions() adds a partition column, useful for getting same-sized samples
identify_by() now works with multiple columns, and it keeps existing metadata columns. This is especially useful with the new add_partitions() column, using something like my_corpus |> add_partitions() |> identify_by(title, partition) before continuing to work with partitioned documents. To return framing to unpartitioned data, use identify_by(title) or whatever other column is most relevant.
When the ggraph package is loaded, plot_bigrams() now uses a color scale on edges, rather than spot color on nodes, with full support for change_colors()
Improved documentation with website articles for customizing colors and showing code comparisons
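For readers new to the indexing and partitioning ideas in this release, here is a rough base R sketch of the concepts. It is not tmtyro's implementation and does not call tmtyro. The toy corpus, the column names word_index and doc_part, and the partition_size value are all assumptions made for illustration.

```r
# Toy tidy corpus: one row per word, standing in for tokenized text.
# (Hypothetical data; not produced by load_texts().)
corpus <- data.frame(
  doc_id = c(rep("a", 5), rep("b", 3)),
  word   = c("the", "quick", "brown", "fox", "jumps", "over", "lazy", "dogs"),
  stringsAsFactors = FALSE
)

# Word position within each document: the idea behind an index column.
corpus$word_index <- ave(seq_along(corpus$doc_id), corpus$doc_id, FUN = seq_along)

# Assumed partition size in words, kept tiny to suit the toy data.
partition_size <- 2L

# Partition number within each document: words 1-2 -> 1, words 3-4 -> 2, and so on.
corpus$partition <- (corpus$word_index - 1L) %/% partition_size + 1L

# Combine two columns into one identifier, analogous to identifying
# documents by both a title-like column and a partition column.
corpus$doc_part <- paste(corpus$doc_id, corpus$partition, sep = "_")

# Count words per partitioned document.
table(corpus$doc_part)
```

The last partition of a document may be shorter than the rest, as it is for "a" and "b" here, which is worth keeping in mind when comparing same-sized samples.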
tmtyro 0.1.0
First “public” release! 🎉
Unnecessary components removed and dependencies reduced
New function identify_by() to simplify using something other than doc_id
Improved internal linking within documentation
tmtyro (development version 0.0.6.9000)
visualize() now works better as a generic function with supported methods
Improved change_colors() with added support for the Okabe-Ito colorset and the option of starting with something other than the first color of a palette. With these changes, color options have been removed from other visualization functions to consolidate them within change_colors().
When a data set includes only one unique doc_id, visualizations are no longer divided into facets.
In an effort to reduce the number of dependencies, many packages have been removed from “Imports” (geomtextpath, ggrepel, glue, NLP, openNLP, plotly, RColorBrewer, stopwords, textstem, wordcloud). Where appropriate, these have been shifted to “Suggests” or dropped entirely.