edstr_config() sets global options that all downstream functions rely on. It must be called before any other edstr_* function.
Required arguments
Two arguments are mandatory: the output directory and the file name prefix.
edstr_config(
  edstr_dirname = "output/my_study",
  edstr_filename = "my_study"
)

The directory is created automatically if it does not exist. Output files are named {edstr_filename}_{step}.parquet for import and clean steps (e.g. my_study_import.parquet, my_study_clean.parquet), and {edstr_filename}_{step}.rds for the extract step.
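The naming rule can be sketched in base R. The helper expected_output_path() below is hypothetical and not part of edstr; it only mirrors the pattern described above, and it assumes the step names are "import", "clean" and "extract".

```r
# Hypothetical helper, not part of edstr: rebuilds the output path that the
# naming rule above describes for a given step.
expected_output_path <- function(dirname, filename, step) {
  # Assumption: the steps are named "import", "clean" and "extract".
  step <- match.arg(step, c("import", "clean", "extract"))
  # Per the rule above: parquet for import and clean, rds for extract.
  ext <- if (step == "extract") "rds" else "parquet"
  file.path(dirname, paste0(filename, "_", step, ".", ext))
}

expected_output_path("output/my_study", "my_study", "clean")
#> [1] "output/my_study/my_study_clean.parquet"
expected_output_path("output/my_study", "my_study", "extract")
#> [1] "output/my_study/my_study_extract.rds"
```

A helper like this can be handy for checking which cached files exist before re-running a pipeline.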
Glue syntax
Both edstr_dirname and edstr_filename support glue expressions, which are evaluated at call time.
edstr_config(
  edstr_dirname = "output/{Sys.Date()}",
  edstr_filename = "study_{format(Sys.Date(), '%Y%m%d')}"
)

Text column
Setting edstr_text avoids repeating the column name in edstr_clean(), edstr_extract(), and edstr_view().
edstr_config(
  edstr_dirname = "output/my_study",
  edstr_filename = "my_study",
  edstr_text = "note_text"
)

If omitted, the text column must be passed explicitly to each function.
Caching behaviour
The edstr_overwrite option controls what happens when an output file already exists.
| Value | Behaviour |
|---|---|
| TRUE | Overwrite without prompting |
| FALSE | Load the cached file without prompting |
| NULL | Show an interactive menu (load / overwrite / cancel) |
NULL is the default, which suits interactive development. Set TRUE when re-running a full pipeline, and FALSE when resuming work on a cached dataset.
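The three values can be read as a small decision rule. The sketch below is an illustration in base R, not edstr's actual implementation: resolve_cache_action() and its return values are hypothetical, and it assumes that a step simply runs when no output file exists yet.

```r
# Hypothetical helper, not part of edstr: maps an overwrite setting and the
# state of the output file to one of "run", "load" or "cancel".
resolve_cache_action <- function(path, overwrite = NULL) {
  if (!file.exists(path)) return("run")   # assumption: nothing cached, so run
  if (isTRUE(overwrite)) return("run")    # TRUE: overwrite without prompting
  if (isFALSE(overwrite)) return("load")  # FALSE: load the cached file
  # NULL: ask the user. utils::menu() returns 0 when the user exits the menu
  # and raises an error in a non-interactive session.
  choice <- utils::menu(c("load", "overwrite", "cancel"),
                        title = "Output file already exists:")
  switch(as.character(choice), "1" = "load", "2" = "run", "cancel")
}

# Simulate an existing cached file with an empty temporary file.
cached <- tempfile(fileext = ".parquet")
file.create(cached)

resolve_cache_action(cached, overwrite = TRUE)
#> [1] "run"
resolve_cache_action(cached, overwrite = FALSE)
#> [1] "load"
```

Because the NULL branch needs a prompt, scripts run in batch mode are usually better served by an explicit TRUE or FALSE.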
edstr_config(
  edstr_dirname = "output/my_study",
  edstr_filename = "my_study",
  edstr_overwrite = TRUE
)

Additional options
Extra named arguments are passed directly to options(), which can be useful for setting project-level options alongside edstr.
edstr_config(
  edstr_dirname = "output/my_study",
  edstr_filename = "my_study",
  dplyr.summarise.inform = FALSE
)
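
Since the extra arguments are forwarded to options(), the effect on a non-edstr option should match a plain options() call. The sketch below uses base R only and does not call edstr_config(); the variable names old and inform_setting are illustrative.

```r
# Base R equivalent of forwarding dplyr.summarise.inform = FALSE to options().
# options() returns the previous values, which makes it easy to restore them.
old <- options(dplyr.summarise.inform = FALSE)

# Read the option back the same way any package would.
inform_setting <- getOption("dplyr.summarise.inform")
inform_setting
#> [1] FALSE

# Restore whatever was set before, e.g. at the end of a script.
options(old)
```

After edstr_config() has run, getOption() can be used in the same way to confirm that a forwarded option took effect.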