All functions

bind_lr()
Bind importance of bigrams
bind_tf_idf2()
Bind term frequency and inverse document frequency
collapse_tokens()
Collapse sequences of tokens by condition
get_dict_features()
Get dictionary's features
hiroba
Whole tokens of 'Porano no Hiroba' written by Miyazawa Kenji from Aozora Bunko
lex_density()
Calculate lexical density
mute_tokens()
Mute tokens by condition
ngram_tokenizer()
N-gram tokenizer
pack()
Pack a data.frame of tokens
polano
Whole text of 'Porano no Hiroba' written by Miyazawa Kenji from Aozora Bunko
prettify()
Prettify tokenized output
read_rewrite_def()
Read a rewrite.def file
strj_fill_iter_mark()
Fill Japanese iteration marks
strj_hiraganize()
Hiraganize Japanese characters
strj_katakanize()
Katakanize Japanese characters
strj_normalize()
Convert text following the rules of 'NEologd'
strj_rewrite_as_def()
Rewrite text using rewrite.def
strj_romanize()
Romanize Japanese Hiragana and Katakana
strj_segment()
Segment text into tokens
strj_tinyseg()
Segment text into phrases
strj_tokenize()
Split text into tokens
strj_transcribe_num()
Transcribe Arabic numerals to Kansuji