Once installed, stylo2gg will interface with data recorded by the stylo package. The examples below introduce functionality using the eighty-five Federalist Papers, originally published pseudonymously in 1787 and 1788.
Principal component analysis
As called here, the stylo package limits words to those common to at least 75% of the texts (using the culling... arguments), saves the data in an object called federalist_mfw, and plots the texts based on their word usage with principal component analysis:
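library(stylo)
library(stylo2gg)  # needed for the stylo2gg() calls that follow

federalist_mfw <-
  stylo(gui = FALSE,
        corpus.dir = system.file("extdata/federalist", package = "stylo2gg"),
        analysis.type = "PCR",            # PCA on the correlation matrix
        pca.visual.flavour = "symbols",
        analyzed.features = "w",          # analyze words
        ngram.size = 1,
        display.on.screen = TRUE,
        sampling = "no.sampling",
        culling.max = 75,                 # keep words found in at least 75% of texts
        culling.min = 75,
        mfw.min = 900,                    # use the 900 most frequent words
        mfw.max = 900)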
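To make the 75% culling threshold concrete, here is a minimal base R sketch on a toy frequency table. The matrix values and word choices are invented for illustration, and the code only mimics the idea of culling; it is not stylo's internal implementation.

```r
# Toy relative-frequency table: rows are texts, columns are words.
# All values are made up for illustration.
freqs <- matrix(
  c(5.1, 4.8, 5.3, 4.9,   # "the": present in all four texts
    1.2, 0.0, 0.9, 1.1,   # "upon": present in three of four texts (75%)
    0.0, 0.0, 0.4, 0.0),  # "whilst": present in one of four texts (25%)
  nrow = 4,
  dimnames = list(paste0("text_", 1:4), c("the", "upon", "whilst"))
)

# Share of texts in which each word occurs at least once.
share_present <- colMeans(freqs > 0)

# Keep only the words found in at least 75% of the texts.
culled <- freqs[, share_present >= 0.75, drop = FALSE]
colnames(culled)
```

In this toy table, "whilst" is dropped because it appears in only one of the four texts, while the other two words meet the threshold.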
By default, the stylo2gg() function uses both the data and visualization settings from federalist_mfw.
Other settings are explained in the article on principal component analysis.
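For intuition about what the plotted coordinates represent, the following base R sketch runs a principal component analysis on a toy frequency table with prcomp(). The data are invented, and the sketch assumes that the "PCR" analysis type corresponds to PCA on the correlation matrix (standardized word frequencies); it does not reproduce what stylo or stylo2gg do internally.

```r
# Made-up relative frequencies: four texts (rows) by three words (columns).
toy_freqs <- matrix(
  c(5.1, 4.8, 5.3, 4.9,
    1.2, 0.3, 0.9, 1.1,
    0.2, 0.7, 0.4, 0.1),
  nrow = 4,
  dimnames = list(paste0("text_", 1:4), c("the", "upon", "whilst"))
)

# scale. = TRUE standardizes each word, so the PCA works on correlations.
pca <- prcomp(toy_freqs, center = TRUE, scale. = TRUE)

# Coordinates of each text on the first two principal components,
# which is the kind of information a two-dimensional PCA plot displays.
scores <- pca$x[, 1:2]
scores
```

Each row of scores gives one text's position on the two axes; texts with similar word usage land near each other.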
Hierarchical clustering
In addition to plotting two-dimensional relationships with principal components, stylo can draw a dendrogram for cluster analysis, which groups texts by their distance from each other.
federalist_mfw2 <-
  stylo(gui = FALSE,
        corpus.dir = system.file("extdata/federalist", package = "stylo2gg"),
        custom.graph.title = "Federalist Papers",
        analysis.type = "CA",             # cluster analysis
        analyzed.features = "w",
        ngram.size = 1,
        display.on.screen = TRUE,
        sampling = "no.sampling",
        culling.max = 75,
        culling.min = 75,
        mfw.min = 900,
        mfw.max = 900)
This federalist_mfw2 object can then be piped into stylo2gg():
federalist_mfw2 |>
  stylo2gg()
Alternatively, passing the earlier federalist_mfw object to stylo2gg() with the option viz = "CA" creates a similar cluster analysis:
federalist_mfw |>
  stylo2gg(viz = "CA",
           shapes = FALSE)
Additional settings for visualizing clusters with dendrograms are explained in the article on hierarchical clustering.
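As a rough illustration of how a dendrogram is built from distances between texts, this base R sketch clusters a toy frequency table with dist() and hclust(). The data are invented, and Euclidean distance with Ward linkage is used here only as a stand-in; the distance measure and linkage that stylo applies may differ.

```r
# Made-up relative frequencies: four texts (rows) by three words (columns).
toy_freqs <- matrix(
  c(5.1, 5.0, 3.2, 3.3,
    1.2, 1.1, 0.2, 0.3,
    0.1, 0.2, 0.9, 0.8),
  nrow = 4,
  dimnames = list(paste0("text_", 1:4), c("the", "upon", "whilst"))
)

# Pairwise distances between texts, computed on standardized frequencies.
distances <- dist(scale(toy_freqs), method = "euclidean")

# Agglomerative clustering; the resulting tree is what a dendrogram draws.
tree <- hclust(distances, method = "ward.D2")

# Cut the tree into two groups to see which texts cluster together.
groups <- cutree(tree, k = 2)
groups
```

Calling plot(tree) would draw the corresponding dendrogram with base graphics.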