edstr_import() executes a SQL query against an Oracle database and saves the result as a Parquet file. It requires edstr_config() to be called first.
Prerequisites
edstr_config(
  edstr_dirname = "output/my_study",
  edstr_filename = "my_study",
  edstr_text = "note_text",
  edstr_overwrite = FALSE
)
Running a query
The query argument accepts either a SQL string or a path to a .sql file. Connection parameters (user, password, tns) configure access to the Oracle database.
df_import <- edstr_import(
  query = "SELECT * FROM clinical_notes WHERE rownum <= 1000",
  user = "my_user"
)
Using a .sql file keeps queries out of R scripts and makes them easier to version:
df_import <- edstr_import(
  query = "sql/clinical_notes.sql",
  user = "my_user"
)
Limiting rows
The head argument appends FETCH FIRST ... ROWS ONLY to the query. This is useful for development and testing without modifying the SQL itself.
df_import <- edstr_import(
  query = "sql/clinical_notes.sql",
  head = 500,
  user = "my_user"
)
Loading cached data
When the output file already exists, edstr_import() skips the database query entirely and loads from cache. The behaviour depends on the edstr_overwrite option set in edstr_config().
df_import <- edstr_import()
Column names
By default, lower = TRUE converts all column names to lowercase after import. Set lower = FALSE to preserve the original casing from the database.
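As a minimal sketch of that option, the call below keeps the database's original column casing. The query path and user name are the placeholder values used earlier on this page, and the call assumes edstr_config() has already been run and that the remaining connection parameters are available as in the previous examples.

```r
# Sketch only: placeholder query path and user, as in the examples above.
# lower = FALSE keeps column names exactly as the database returns them.
df_import <- edstr_import(
  query = "sql/clinical_notes.sql",
  lower = FALSE,
  user = "my_user"
)

# Inspect the resulting column names to check their casing.
names(df_import)
```

Oracle typically returns unquoted identifiers in uppercase, so downstream code that refers to lowercase column names would need adjusting when lower = FALSE is used.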