Function reference
-
read_aozora()
- Download a text file from Aozora Bunko
-
read_ja_text8()
- Read the ja.text8 corpus
-
read_jrte()
- Read the JRTE Corpus
-
read_ldnws()
- Read the Livedoor News Corpus
-
clean_emoji()
- Remove emojis
-
clean_url()
- Remove URLs
-
download_unidic()
- Download and unzip 'UniDic'
-
is_within_era()
- Check if dates are within a Japanese era
-
jrte_rte_files()
- Data for Textual Entailment
-
ldnws_categories()
- List of categories of the Livedoor News Corpus
-
parse_jrte_reasoning()
- Parse reasoning column of 'rte.*.tsv'
-
parse_to_jdate()
- Parse dates to Japanese dates
-
unidic_availables()
- List of available 'UniDic' versions
-
AozoraBunkoSnapshot
- Metadata of text files published on Aozora Bunko
-
NekoText
- Full text of ‘Wagahai Wa Neko Dearu’ by Natsume Souseki, from Aozora Bunko