Package index
-
bind_lr()
- Bind importance of bigrams
-
bind_tf_idf2()
- Bind term frequency and inverse document frequency
-
collapse_tokens()
- Collapse sequences of tokens by condition
-
get_dict_features()
- Get a dictionary's features
-
hiroba
- All tokens of 'Porano no Hiroba' by Miyazawa Kenji, from Aozora Bunko
-
lex_density()
- Calculate lexical density
-
mute_tokens()
- Mute tokens by condition
-
ngram_tokenizer()
- N-gram tokenizer
-
pack()
- Pack a data.frame of tokens
-
polano
- Full text of 'Porano no Hiroba' by Miyazawa Kenji, from Aozora Bunko
-
prettify()
- Prettify tokenized output
-
read_rewrite_def()
- Read a rewrite.def file
-
strj_fill_iter_mark()
- Fill Japanese iteration marks
-
strj_hiraganize()
- Hiraganize Japanese characters
-
strj_katakanize()
- Katakanize Japanese characters
-
strj_normalize()
- Convert text following the rules of 'NEologd'
-
strj_rewrite_as_def()
- Rewrite text using rewrite.def
-
strj_romanize()
- Romanize Japanese Hiragana and Katakana
-
strj_segment()
- Segment text into tokens
-
strj_tinyseg()
- Segment text into phrases
-
strj_tokenize()
- Split text into tokens
-
strj_transcribe_num()
- Transcribe Arabic numerals to Kansuji