Mute tokens by condition — mute

Replaces tokens in a tidy text dataset with a string scalar, but only in the rows that match a condition.

Usage

mute_tokens(tbl, condition, .as = NA_character_)

Arguments

tbl: A tidy text dataset.
condition: <data-masked> A logical expression evaluated in the context of tbl. Tokens in rows that match it are replaced.
.as: String that replaces the tokens matched by condition. The default value is NA_character_.

Value

A data.frame.

Examples

df <- prettify(head(hiroba), col_select = "POS1")
mute_tokens(df, POS1 %in% c("\u52a9\u8a5e", "\u52a9\u52d5\u8a5e"))
#>   doc_id sentence_id token_id    token   POS1
#> 1      1           1        1 ポラーノ   名詞
#> 2      1           1        2     <NA>   助詞
#> 3      1           1        3     広場   名詞
#> 4      2           2        1     宮沢   名詞
#> 5      2           2        2     賢治   名詞
#> 6      3           3        1       前 接頭詞
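
The replacement behavior can be illustrated without the package. The following is a minimal base R sketch, not the package's implementation: `mute_sketch()` and the `toy` data frame are hypothetical names made up for illustration, and treating NA conditions as unmatched is an assumption.

```r
# Hypothetical stand-in for mute_tokens(); not the package's actual code.
mute_sketch <- function(tbl, condition, .as = NA_character_) {
  # Evaluate the unquoted condition with the columns of `tbl` in scope.
  matched <- eval(substitute(condition), tbl, parent.frame())
  # Assumption: rows where the condition is NA count as unmatched.
  matched <- !is.na(matched) & matched
  # Overwrite only the matched tokens; every other column is left as-is.
  tbl$token[matched] <- .as
  tbl
}

# Hypothetical toy dataset with the same kind of columns as the example above.
toy <- data.frame(
  doc_id = c(1L, 1L, 1L),
  token_id = 1:3,
  token = c("polano", "no", "hiroba"),
  POS1 = c("noun", "particle", "noun"),
  stringsAsFactors = FALSE
)

# Default: matched tokens become NA_character_.
muted <- mute_sketch(toy, POS1 %in% c("particle", "auxiliary"))

# Custom replacement string supplied through `.as`.
masked <- mute_sketch(toy, POS1 == "particle", .as = "<MASK>")
```

In this sketch the input is never modified in place; each call returns a copy in which only the `token` column differs, which mirrors the example output above where `POS1` is unchanged for the muted row.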