R
Use of the RStudio IDE is recommended.
Structure
Packages
R code should be structured as a package - follow the conventions detailed in the R Packages e-book. This ensures the code follows a standard set of structural conventions (e.g. code goes in the R/ folder, data goes in the data/ folder). Make sure you complete the DESCRIPTION file, particularly for project dependencies.
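As a rough illustration of the layout described above, the following base R sketch builds a throwaway package skeleton in a temporary directory and reads its DESCRIPTION back. The package name examplepkg, the field values and the listed dependencies are hypothetical placeholders, not part of this guide.

```r
# Hypothetical package skeleton, created in a temporary directory.
pkg_dir <- file.path(tempdir(), "examplepkg")
dir.create(file.path(pkg_dir, "R"), recursive = TRUE, showWarnings = FALSE)
dir.create(file.path(pkg_dir, "data"), showWarnings = FALSE)

# DESCRIPTION is a DCF file; Imports lists the package's runtime dependencies.
# All field values here are made up for illustration.
description <- data.frame(
  Package = "examplepkg",
  Title = "Example Analysis Package",
  Version = "0.0.1",
  Description = "Placeholder description of what the package does.",
  Imports = "dplyr, ggplot2",
  stringsAsFactors = FALSE
)
write.dcf(description, file = file.path(pkg_dir, "DESCRIPTION"))

# Read the file back to confirm the fields were recorded.
desc <- read.dcf(file.path(pkg_dir, "DESCRIPTION"))
desc[, c("Package", "Imports")]
```

Keeping the Imports field up to date means anyone installing the package gets its dependencies without having to read the code to find them.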
Documentation
Document functions using roxygen2. Also use it to manage the NAMESPACE file.
#' The length of a string (in characters).
#'
#' @param string input character vector
#' @return numeric vector giving number of characters in each element of the
#' character vector. Missing strings have missing length.
#' @seealso \code{\link{nchar}} which this function wraps
#' @export
#' @examples
#' str_length(letters)
#' str_length(c("i", "like", "programming", NA))
str_length <- function(string) {
  string <- check_string(string)
  nc <- nchar(string, allowNA = TRUE)
  is.na(nc) <- is.na(string)
  nc
}
Unit tests
Implement unit tests (with the testthat framework) - you needn't test every function, but aim to get a code coverage of around 75% (using the covr package).
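A minimal testthat sketch for the str_length() function documented earlier might look like the following. It assumes testthat is installed. The function is redefined here with as.character() standing in for the check_string() helper, which is not shown in this guide, so that the snippet runs on its own. In a package, the tests would normally live under tests/testthat/.

```r
library(testthat)

# Simplified copy of str_length(); as.character() replaces the
# check_string() helper, which is assumed and not shown in this guide.
str_length <- function(string) {
  string <- as.character(string)
  nc <- nchar(string, allowNA = TRUE)
  is.na(nc) <- is.na(string)
  nc
}

test_that("str_length counts characters per element", {
  expect_equal(str_length(c("i", "like", "programming")), c(1L, 4L, 11L))
})

test_that("str_length returns NA for missing strings", {
  expect_true(is.na(str_length(NA)))
})
```

Coverage for the whole package can then be measured with covr::package_coverage().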
Secrets
Store secrets in the environment and encrypt them at rest (see Secrets). See Hadley Wickham's guide on storing secrets in R.
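A minimal sketch of reading a secret from the environment, assuming it has been set outside the code (for example in a .Renviron file). The helper get_secret() and the variable name MY_API_KEY are hypothetical.

```r
# Hypothetical helper: read a secret from an environment variable and
# fail loudly if it is missing, rather than falling back to a hard-coded value.
get_secret <- function(name) {
  value <- Sys.getenv(name, unset = NA_character_)
  if (is.na(value) || !nzchar(value)) {
    stop("Environment variable '", name, "' is not set.", call. = FALSE)
  }
  value
}

# Usage: MY_API_KEY is an assumed variable name, normally set in .Renviron
# or by the deployment environment, and never committed to version control.
# api_key <- get_secret("MY_API_KEY")
```

Failing early on a missing variable makes a misconfigured environment obvious, instead of surfacing later as an authentication error.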
Packrat
Use packrat to log and store your project dependencies.
Parameters
Avoid hard-coded parameters wherever possible: use a separate file (params.R or params.json) or some other single place to set the parameters you're likely to change. Detailing the parameters as a table in README.md can also be useful.
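One possible shape for this, sketched with base R only: a params.R file defines a single list, and analysis code sources it instead of embedding literals. The parameter names and values below are made up for illustration, and the file is written to a temporary directory so the snippet is self-contained.

```r
# Contents a hypothetical params.R might hold (written to a temp file here
# so the example runs on its own).
params_file <- file.path(tempdir(), "params.R")
writeLines(c(
  "params <- list(",
  "  start_date = as.Date('2020-01-01'),",
  "  min_sample_size = 30,",
  "  output_dir = 'output'",
  ")"
), params_file)

# Load the parameters into their own environment to avoid cluttering
# the global workspace.
param_env <- new.env()
source(params_file, local = param_env)
params <- param_env$params

# Analysis code refers to params$... rather than hard-coded values.
has_enough_data <- function(n, params) {
  n >= params$min_sample_size
}

has_enough_data(45, params)
```

Passing params explicitly to functions, rather than reading a global, keeps them easy to test with different settings.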
Code
Style
Follow the r-pkg style guide and use lintr to check your code for stylistic errors. Use comments as needed.
Sensible defaults
Default to using the Tidyverse family of packages. Reference the R for Data Science e-book for usage.
- Prefer tibbles to data.frames
- Use modern plotting packages (e.g. ggplot2) rather than base graphics
- Prefer purrr iteration tools over apply.
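To illustrate the preferences above, here is a small sketch using the built-in mtcars dataset. It assumes the tibble, purrr and ggplot2 packages are installed; the choice of dataset and columns is arbitrary.

```r
library(tibble)
library(purrr)
library(ggplot2)

# tibble instead of data.frame: as_tibble() drops row names by default,
# so keep them explicitly in a column.
cars <- as_tibble(mtcars, rownames = "model")

# purrr instead of sapply(): map_dbl() always returns a double vector
# and errors if any result is not a single number.
numeric_cols <- cars[, c("mpg", "hp", "wt")]
col_means <- map_dbl(numeric_cols, mean)

# ggplot2 instead of base plot().
p <- ggplot(cars, aes(x = wt, y = mpg)) +
  geom_point() +
  labs(x = "Weight (1000 lbs)", y = "Miles per gallon")
```

The type-stable return value of map_dbl() is the main practical gain over sapply(), whose output type depends on the input.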
Moving away from the defaults is okay when the project requires it. For example, if you're developing a package that has to be entirely stable, then resorting to base R may be preferred. It becomes a judgement call on whether the functionality that the non-base packages provide outweighs the risk that breaking changes may be introduced in the future.