Changelog
Source: NEWS.md
audubon 0.5.2
CRAN release: 2024-04-27
- Corrected probabilistic IDF calculation by `global_idf3`.
- Refactored `bind_tf_idf2`.
  - Changed behavior when `norm=TRUE`. Cosine normalization is now performed on `tf_idf` values as in the RMeCab package.
  - Added `tf="itf"` and `idf="df"` options.
- Refactored `pack` for performance.
audubon 0.5.0
CRAN release: 2023-03-04
- Added `bind_lr` function which can calculate the ‘LR’ value of bigrams.
- `pack` now always returns a tibble, not a data.frame.
audubon 0.4.0
CRAN release: 2022-12-15
- Added some new functions.
  - `bind_tf_idf2` can calculate and bind the term frequency, inverse document frequency, and tf-idf of the tidy text dataset.
  - `collapse_tokens`, `mute_tokens`, and `lexical_density` can be used for handling a tidy text dataset of tokens.
- `strj_tokenize` now preserves the original order of text names.
- `prettify` now accepts a `delim` argument.
audubon 0.3.0
CRAN release: 2022-07-22
- Updated `strj_fill_iter_mark` function.
  - `strj_fill_iter_mark` now replaces a sequence of iteration marks recursively.
- Updated `strj_tokenize` function.
  - `strj_tokenize` now accepts an `engine` argument to switch tokenizers for splitting text into tokens.
audubon 0.2.0
CRAN release: 2022-05-24
- Updated `ngram_tokenizer` function.
- Added a wrapper function of the ‘TinySegmenter’ written by Taku Kudo.