audubon 0.5.2

CRAN release: 2024-04-27

  • Corrected the probabilistic IDF calculation performed by global_idf3.
  • Refactored bind_tf_idf2.
    • Changed behavior when norm=TRUE. Cosine normalization is now performed on tf_idf values, as in the RMeCab package.
    • Added tf="itf" and idf="df" options.
  • Refactored pack for performance.
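
As a rough illustration of what cosine normalization on tf_idf values means, the following base R sketch rescales each document's tf-idf vector to unit Euclidean length. It is not the package's implementation: the toy data, the column names (doc, term, n, tf_idf_norm), and the plain log(N / df) definition of IDF are assumptions made for this example only.

```r
# Toy tidy-text counts: one row per (document, term) pair (hypothetical data).
counts <- data.frame(
  doc  = c("a", "a", "b", "b", "c", "c", "c"),
  term = c("x", "y", "x", "z", "y", "z", "w"),
  n    = c(3, 4, 1, 2, 2, 1, 2)
)

# Term frequency: share of each term within its document.
counts$tf <- counts$n / ave(counts$n, counts$doc, FUN = sum)

# Inverse document frequency, here assumed to be log(N / df).
n_docs <- length(unique(counts$doc))
doc_freq <- ave(rep(1, nrow(counts)), counts$term, FUN = sum)
counts$idf <- log(n_docs / doc_freq)

counts$tf_idf <- counts$tf * counts$idf

# Cosine normalization: divide each tf-idf value by the L2 norm of
# its document's tf-idf vector, guarding against all-zero documents.
l2_norm <- sqrt(ave(counts$tf_idf^2, counts$doc, FUN = sum))
counts$tf_idf_norm <- ifelse(l2_norm > 0, counts$tf_idf / l2_norm, 0)

# Sum of squared normalized weights per document.
unit_check <- tapply(counts$tf_idf_norm^2, counts$doc, sum)
```

After normalization, the squared weights within each document sum to 1, so documents of different lengths become directly comparable by dot product.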

audubon 0.5.1

CRAN release: 2023-05-02

  • Refactored tokenize_mecab and tokenize_sudachipy.

audubon 0.5.0

CRAN release: 2023-03-04

  • Added the bind_lr function, which calculates the ‘LR’ value of bigrams.
  • pack now always returns a tibble, not a data.frame.

audubon 0.4.0

CRAN release: 2022-12-15

  • Added some new functions.
    • bind_tf_idf2 can calculate and bind the term frequency, inverse document frequency, and tf-idf of a tidy text dataset.
    • collapse_tokens, mute_tokens, and lexical_density can be used for handling a tidy text dataset of tokens.
  • strj_tokenize now preserves the original order of text names.
  • prettify now accepts a delim argument.

audubon 0.3.0

CRAN release: 2022-07-22

  • Updated strj_fill_iter_mark function.
    • strj_fill_iter_mark now replaces a sequence of iteration marks recursively.
  • Updated strj_tokenize function.
    • strj_tokenize now accepts an engine argument to switch between tokenizers for splitting text into tokens.

audubon 0.2.0

CRAN release: 2022-05-24

  • Updated ngram_tokenizer function.
  • Added a wrapper function for ‘TinySegmenter’, written by Taku Kudo.

audubon 0.1.2

CRAN release: 2022-04-02

  • Updated pack function.
    • Switched the argument order of the pack function. pack now accepts pull as its second argument and n as its third argument.
    • pull can now accept a symbol.

audubon 0.1.1

CRAN release: 2022-02-14

  • Updated documentation.

audubon 0.1.0

  • Relicensed as Apache License, Version 2.0.
  • Added a NEWS.md file to track changes to the package.