Search for regex patterns in a text column and display matching tokens,
counts, and highlighted text. Unlike other pipeline functions, edstr_view()
does not save results — it is meant for iterating on patterns before
extraction.
Usage
edstr_view(
data,
text_input = getOption("edstr_text"),
id = NULL,
replace = NULL,
pattern,
ngrams = 1,
...
)
Arguments
- data
<data.frame> The data to search.
- text_input
<character(1)> Name of the text column. Defaults to the edstr_text option set by edstr_config().
- id
<character(1)> Name of the unique identifier column. Auto-detected if not provided.
- replace
A named character vector or list of named character vectors. Optional regex replacements applied to the text before matching (see edstr_clean() for details).
- pattern
<character(1)> Regex pattern to search for.
- ngrams
<integer(1)> Total n-gram window size including the matched token (default 1). For example, ngrams = 3 with pattern = "diabete" matches "diabete type 2".
- ...
Additional arguments passed to stringr::str_view().
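The effect of ngrams can be pictured with a small base R sketch. This is not the edstr implementation: ngram_window() is a hypothetical helper, and it assumes the window is the matched token plus up to ngrams - 1 following whitespace-separated tokens, which is consistent with the example above but not confirmed by it.

```r
# Hypothetical helper, not part of edstr: approximates an n-gram window by
# extending each matched token with up to (ngrams - 1) following tokens.
ngram_window <- function(text, pattern, ngrams = 1) {
  # \\S* on both sides widens the hit to the whole token containing it;
  # the bounded group then appends the following tokens, if any exist.
  rx <- paste0("\\S*", pattern, "\\S*(?:\\s+\\S+){0,", ngrams - 1, "}")
  regmatches(text, gregexpr(rx, text, perl = TRUE))
}

notes <- c("diabete type 2", "bilan normal", "diabete gestationnel")

# One list element per note; notes without a match give character(0).
hits <- ngram_window(notes, "diabete", ngrams = 3)
unlist(hits)
```

Under this assumption a note with fewer than ngrams tokens after the match still yields a shorter window rather than no match, so "diabete gestationnel" is kept as a two-token result.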
Value
Invisibly returns a list with three elements:
- match
A tibble of all matches with the id and match columns.
- count
A tibble of distinct matches with their frequency.
- text
Output of stringr::str_view() for visual inspection.
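How count relates to match can be illustrated in base R. The data frame below is a hand-written stand-in for the match element, not actual edstr output, and the tabulation step is an assumption about the relationship between the two elements rather than the package's own code.

```r
# Stand-in for the `match` element: one row per match, with its document id.
matches <- data.frame(
  id = c(1L, 3L),
  match = c("diabete type 2", "diabete gestationnel"),
  stringsAsFactors = FALSE
)

# Tabulate distinct matches and store the frequency in a column named `n`.
counts <- as.data.frame(
  table(match = matches$match),
  responseName = "n",
  stringsAsFactors = FALSE
)
counts
```

The result has one row per distinct match, so its row count corresponds to the "Distinct" figure and the sum of n to the "Total" figure in the printed summary.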
Examples
df <- data.frame(
id = 1:3,
note = c("diabete type 2", "bilan normal", "diabete gestationnel")
)
edstr_view(data = df, text_input = "note", pattern = "diabete", ngrams = 3)
#>
#> ── edstr_view ──────────────────────────────────────────────────────────────────
#>
#> # A tibble: 2 × 2
#>   match                    n
#>   <chr>                <int>
#> 1 diabete gestationnel     1
#> 2 diabete type 2           1
#>
#> ────────────────────────────────────────────────────────────────────────────────
#>
#> Full steps: 0.041 sec elapsed
#>
#> ℹ Documents: 3 id
#>
#> ℹ Matches
#> • Total: 2 across 2 id (66.7% id)
#> • Distinct: 2
#>
#> ────────────────────────────────────────────────────────────────────────────────
#>