--- title: "Using sqlhelper in packages" output: rmarkdown::html_vignette vignette: > %\VignetteIndexEntry{Using sqlhelper in packages} %\VignetteEngine{knitr::rmarkdown} %\VignetteEncoding{UTF-8} --- ```{r, include = FALSE} knitr::opts_chunk$set( collapse = TRUE, comment = "#>" ) ``` `sqlhelper` was written to streamline SQL integration in packages designed to meet organizations' particular analytical needs, in stable data contexts. The pattern of integrating SQL with R code in analysis packages is intended to: * make use of the data-wrangling capacity of databases and SQL; * take advantage of R's packaging system to write reliable, understandable code; * keep all the code for a task in one place. This note describes how to use `sqlhelper` safely inside a package. ## Connections If more than one connection is needed for a project, the easiest approach will be to define them in a yaml file as described in `vignette("connections")`. The yaml file is placed under `inst` in the root of the package and accessed with `devtools::system.file()` (see [r-pkgs](https://r-pkgs.org/misc.html#sec-misc-inst) on using installed files). ```r sqlhelper::connect( system.file( "sqlhelper_connection_conf.yml" ), exclusive=TRUE ) ``` Connections that cannot be defined this way may be defined within the package. How this is handled will depend on considerations such as how long each connection will be needed for, how widely it will need to be shared between other package components, and whether it will need to be exposed to users. If a connection is needed once and may be used and closed during the execution of a single function it may be sufficient to define it within the namespace of that function and close it on or before the function exits, for example: ```r get_some_data <- function(){ conn <- DBI::dbConnect(a_driver, "a connection string") d <- sqlhelper::run_files( system.file("SQL/my_sql_file.SQL"), default.conn = conn ) DBI::dbDisconnect(conn) d } ``` A connection that needs to be shared across functions or function calls, but not exposed to users, may be stored in an environment in the package's top-level namespace, for example: ```r assign("connection_store", new.env(parent = emptyenv()), environment()) connect <- function(){ assign( "c1", DBI::dbConnect(a_driver, "a connection string"), envir = connection_store ) } get_some_data <- function(){ sqlhelper::run_files( system.file("SQL/my_sql_file.SQL"), default.conn = connection_store$c1 ) } # This is a bit belt-and-braces, but thorough. disconnect <- function(){ DBI::dbDisconnect(connection_store$c1) connection_store$c1 <- NULL rm(list=c("c1), envir=connection_store) } ``` (this approach is in fact more or less the one take by `sqlhelper` itself) ## SQL files If multiple connections are required but cannot be defined in yaml it will somewhat diminish the ability to control execution on a query-by-query basis and may produce a need to split queries into more files than would otherwise be necessary. mf a package contains many SQL files, it may be useful to store them in a SQL directory under `inst/`. They may be accessed in the same way, with `system.file()`, either `system.file("SQL", "file_name.SQL")` or `system.file("SQL/file_name.SQL")` (the latter _may_ be somewhat less portable). It is often convenient to define SQL parameters in a different scope to the calling scope of `run_files()`. In this case it is important to ensure that the package does not interfere with the user's global environment. The easiest way to do this is to store them in an environment (e.g. in the same way as illustrated for connections, above) and pass them to the `values` parameter of `run_files()`, `run_queries()` or `prepare_sql()`. See `run_queries()` or `vignette("execution")` for examples. ## Exit It may be desirable to close `sqlhelper`'s connections when a calling package has completed it's operations. This can be achieved easily with `sqlhelper::disconnect()`.