
Create a list of tokens

Usage

as_tokens(
  tbl,
  token_field = "token",
  pos_field = get_dict_features()[1],
  nm = NULL
)

Arguments

tbl

A tibble of tokens returned by tokenize().

token_field

<data-masked> Column containing tokens.

pos_field

Column containing features that will be kept as the names of tokens. If you don't need them, pass NULL to this argument.

nm

Names of the returned list. If NULL, the doc_id field of tbl is used instead.

Value

A named list of tokens.
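
Each element is a character vector of tokens for one document; when pos_field is supplied, the tokens carry the corresponding features as names, and the list names come from the doc_id field or nm. The sketch below is illustrative only (the token values are invented):

# Shape of the return value; token values invented for illustration
list(
  `1` = c(名詞 = "銀河", 助詞 = "の", 名詞 = "夜"),
  `2` = c(名詞 = "ジョバンニ", 助詞 = "は", 動詞 = "言っ")
)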

Examples

if (FALSE) {
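# Tokenize sections 5-8 of `ginga`, expand the POS1 feature with prettify(),
# then collapse the result into a named list of tokens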
tokenize(
  data.frame(
    doc_id = seq_along(ginga[5:8]),
    text = ginga[5:8]
  )
) |>
  prettify(col_select = "POS1") |>
  as_tokens()
}
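
A further usage sketch, not taken from the package documentation: assuming that pos_field = NULL simply drops the token names, the prettify() step can be skipped and custom list names passed through nm (the "sec" labels are arbitrary).

if (FALSE) {
# Not run, as above; pos_field = NULL keeps the tokens unnamed
tokenize(
  data.frame(
    doc_id = seq_along(ginga[5:8]),
    text = ginga[5:8]
  )
) |>
  as_tokens(pos_field = NULL, nm = paste0("sec", 5:8))
}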