
Package index
- bind_lr() - Bind importance of bigrams
- bind_tf_idf2() - Bind term frequency and inverse document frequency
- collapse_tokens() - Collapse sequences of tokens by condition
- get_dict_features() - Get dictionary's features
- hiroba - Whole tokens of 'Porano no Hiroba' written by Miyazawa Kenji from Aozora Bunko
- lex_density() - Calculate lexical density
- mute_tokens() - Mute tokens by condition
- ngram_tokenizer() - N-gram tokenizer
- pack() - Pack a data.frame of tokens
- polano - Whole text of 'Porano no Hiroba' written by Miyazawa Kenji from Aozora Bunko
- prettify() - Prettify tokenized output
- read_rewrite_def() - Read a rewrite.def file
- strj_fill_iter_mark() - Fill Japanese iteration marks
- strj_hiraganize() - Hiraganize Japanese characters
- strj_katakanize() - Katakanize Japanese characters
- strj_normalize() - Convert text following the rules of 'NEologd'
- strj_rewrite_as_def() - Rewrite text using rewrite.def
- strj_romanize() - Romanize Japanese Hiragana and Katakana
- strj_segment() - Segment text into tokens
- strj_tinyseg() - Segment text into phrases
- strj_tokenize() - Split text into tokens
- strj_transcribe_num() - Transcribe Arabic numerals to Kansuji