Specialized formats for big targets #977

Merged Aug 5, 2019 (28 commits)
48df658
Add failing tests
wlandau-lilly Aug 4, 2019
4f04692
Tweak failing tests
wlandau-lilly Aug 4, 2019
52ce217
Add decorated storr and temporarily deactivate new tests
wlandau-lilly Aug 4, 2019
d433cd3
Work around a lintr parse issue
wlandau-lilly Aug 4, 2019
873c755
Restore some coverage
wlandau-lilly Aug 5, 2019
064efed
Write S3 method placeholders
wlandau-lilly Aug 5, 2019
ca6abb1
Get ready to fill in decorated storr methods
wlandau-lilly Aug 5, 2019
b663c3f
Document cache decoration
wlandau-lilly Aug 5, 2019
2d66a8a
Repair a lint
wlandau-lilly Aug 5, 2019
bf2f0c1
Document return_*() functions
wlandau-lilly Aug 5, 2019
364b4f4
Refine return_*() functions
wlandau-lilly Aug 5, 2019
a4bc721
Fix a pkg check
wlandau-lilly Aug 5, 2019
36dd3c8
Always decorate the storr
wlandau-lilly Aug 5, 2019
ecb0ea5
Spelling
wlandau-lilly Aug 5, 2019
26ce71b
Make progress on return_rds()
wlandau-lilly Aug 5, 2019
91de206
Fix a lint
wlandau-lilly Aug 5, 2019
b6125c3
Unskip a test on cran
wlandau-lilly Aug 5, 2019
6964cae
Use custom column for format
wlandau-lilly Aug 5, 2019
777825a
Fix lints
wlandau-lilly Aug 5, 2019
82c12fd
Try to repair a test
wlandau-lilly Aug 5, 2019
19bc97c
Check for illegal formats
wlandau-lilly Aug 5, 2019
9c65592
Log a message with the format if given
wlandau-lilly Aug 5, 2019
6d9d2bd
Activate fst format
wlandau-lilly Aug 5, 2019
616f72f
Skip some tests below R version 3.5.0
wlandau-lilly Aug 5, 2019
1d51d5c
Refactor skipping and redoc
wlandau-lilly Aug 5, 2019
f0ed56e
Make version checking more concise
wlandau-lilly Aug 5, 2019
f96ebd8
Activate keras format
wlandau-lilly Aug 5, 2019
49af834
Test keras format
wlandau-lilly Aug 5, 2019
2 changes: 2 additions & 0 deletions DESCRIPTION
@@ -99,10 +99,12 @@ Suggests:
curl (>= 2.7),
datasets,
downloader,
fst,
future,
ggplot2,
ggraph,
grDevices,
keras,
knitr,
lubridate,
networkD3,
1 change: 1 addition & 0 deletions NEWS.md
@@ -8,6 +8,7 @@

## New features

- Support specialized data storage via a decorated cache and `return_*()` functions (#971). This allows users to leverage faster ways to save and load targets, such as `write_fst()` for data frames and `save_model_hdf5()` for Keras models. It also improves memory because it prevents `storr` from making a serialized in-memory copy of large data objects.
- Add `tidyselect` functionality for `...` in `progress()`, analogous to `loadd()`, `build_times()`, and `clean()`.
- Support S3 for user-defined generics (#959). If the generic `do_stuff()` and the method `stuff.your_class()` are defined in `envir`, and if `do_stuff()` has a call to `UseMethod("stuff")`, then `drake`'s code analysis will detect `stuff.your_class()` as a dependency of `do_stuff()`.
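
The decorated-cache idea in the first bullet can be illustrated with a minimal base R sketch. This is not the code from this pull request: `new_toy_store()`, `decorate_toy_store()`, `mark_rds()`, and the `toy_*` class names are all hypothetical, a plain environment stands in for a real `storr`, and `saveRDS()` stands in for a specialized writer such as `write_fst()`. The early return for an already-decorated store is also an assumption, suggested by how often the diff calls `decorate_storr()` on caches that may already be decorated.

```r
# Hypothetical marker: tag a value so the cache stores it in a special format.
mark_rds <- function(value) {
  structure(list(value = value), class = "toy_format_rds")
}

# Toy key-value store standing in for a real storr (base R only).
new_toy_store <- function() {
  data <- new.env(parent = emptyenv())
  list(
    set = function(key, value) assign(key, value, envir = data),
    get = function(key) base::get(key, envir = data, inherits = FALSE),
    exists = function(key) base::exists(key, envir = data, inherits = FALSE),
    list = function() ls(data)
  )
}

# Decorator: wraps the inner store and intercepts tagged values.
# Assumes keys are simple strings that are safe to use as file names.
decorate_toy_store <- function(store, dir = tempfile("toy_cache_")) {
  if (inherits(store, "toy_decorated_store")) {
    return(store) # Assumed behavior: decorating twice is a no-op.
  }
  dir.create(dir, recursive = TRUE, showWarnings = FALSE)
  set_value <- function(key, value) {
    if (inherits(value, "toy_format_rds")) {
      # Write the payload straight to a file and keep only a small
      # reference in the inner store.
      path <- file.path(dir, paste0(key, ".rds"))
      saveRDS(value$value, path)
      value <- structure(list(path = path), class = "toy_file_ref")
    }
    store$set(key, value)
  }
  get_value <- function(key) {
    value <- store$get(key)
    if (inherits(value, "toy_file_ref")) {
      return(readRDS(value$path)) # Load the payload from its file.
    }
    value
  }
  structure(
    list(
      storr = store, # The inner store stays reachable.
      set = set_value,
      get = get_value,
      exists = store$exists,
      list = store$list
    ),
    class = "toy_decorated_store"
  )
}

cache <- decorate_toy_store(new_toy_store())
cache$set("small", 1:3)                        # Stored as usual.
cache$set("big", mark_rds(data.frame(x = 1:3))) # Stored via the file path.
cache$get("big")
```

Callers use the same `set()` and `get()` interface either way, and the inner store remains available as `cache$storr`, mirroring the `drake_cache()$storr` access documented in `R/cache.R`.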

1 change: 1 addition & 0 deletions R/build_times.R
@@ -45,6 +45,7 @@ build_times <- function(
if (is.null(cache)) {
return(weak_as_tibble(empty_times()))
}
cache <- decorate_storr(cache)
eval(parse(text = "require(methods, quietly = TRUE)")) # needed for lubridate
targets <- c(as.character(match.call(expand.dots = FALSE)$...), list)
if (requireNamespace("tidyselect", quietly = TRUE)) {
25 changes: 21 additions & 4 deletions R/cache.R
@@ -85,6 +85,7 @@ readd <- function(
if (is.null(cache)) {
stop("cannot find drake cache.")
}
cache <- decorate_storr(cache)
if (!character_only) {
target <- as.character(substitute(target))
}
@@ -224,6 +225,7 @@ loadd <- function(
if (is.null(cache)) {
stop("cannot find drake cache.")
}
cache <- decorate_storr(cache)
if (is.null(namespace)) {
namespace <- cache$default_namespace
}
@@ -503,6 +505,7 @@ read_drake_seed <- function(
if (is.null(cache)) {
stop("cannot find drake cache.")
}
cache <- decorate_storr(cache)
if (cache$exists(key = "seed", namespace = "session")) {
cache$get(key = "seed", namespace = "session")
} else {
@@ -579,6 +582,7 @@ cached <- function(
if (is.null(cache)) {
return(character(0))
}
cache <- decorate_storr(cache)
if (is.null(namespace)) {
namespace <- cache$default_namespace
}
@@ -628,6 +632,9 @@ is_imported_cache <- Vectorize(function(target, cache) {
#' You can also supply your own `storr` cache to the `cache`
#' argument of `make()`. The `drake_cache()` function retrieves
#' this cache.
#' @details `drake_cache()` actually returns a *decorated* `storr`,
#' an object that *contains* a `storr` (plus bells and whistles).
#' To get the *actual* inner `storr`, use `drake_cache()$storr`.
#' @seealso [new_cache()], [drake_config()]
#' @export
#' @return A drake/storr cache in a folder called `.drake/`,
@@ -650,9 +657,12 @@ is_imported_cache <- Vectorize(function(target, cache) {
#' load_mtcars_example() # Get the code with drake_example("mtcars").
#' make(my_plan) # Run the project, build the targets.
#' x <- drake_cache() # Now, there is a cache.
- #' y <- storr::storr_rds(".drake") # Equivalent.
+ #' y <- storr::storr_rds(".drake") # Nearly equivalent.
#' # List the objects readable from the cache with readd().
#' x$list()
#' # drake_cache() actually returns a *decorated* storr.
#' # The *real* storr is inside.
#' drake_cache()$storr
#' }
#' })
#' }
@@ -732,8 +742,10 @@ find_cache <- function(
#' Use `storr` to customize your caches instead.
#' @param hash_algorithm Name of a hash algorithm to use.
#' See the `algo` argument of the `digest` package for your options.
- #' @param short_hash_algo Deprecated on 2018-12-12. Use `hash_algorithm` instead.
- #' @param long_hash_algo Deprecated on 2018-12-12. Use `hash_algorithm` instead.
+ #' @param short_hash_algo Deprecated on 2018-12-12.
+ #' Use `hash_algorithm` instead.
+ #' @param long_hash_algo Deprecated on 2018-12-12.
+ #' Use `hash_algorithm` instead.
#' @param ... other arguments to the cache constructor.
#' @examples
#' \dontrun{
@@ -772,6 +784,7 @@ new_cache <- function(
mangle_key = FALSE,
hash_algorithm = hash_algorithm
)
cache <- decorate_storr(cache)
writeLines(
text = c("*", "!/.gitignore"),
con = file.path(path, ".gitignore")
@@ -816,7 +829,7 @@ drake_fetch_rds <- function(path) {
if (!file.exists(path)) {
return(NULL)
}
- storr::storr_rds(path = path)
+ decorate_storr(storr::storr_rds(path = path))
}

cache_vers_stop <- function(cache){
@@ -905,6 +918,7 @@ drake_get_session_info <- function(
if (is.null(cache)) {
stop("No drake::make() session detected.")
}
cache <- decorate_storr(cache)
return(cache$get("sessionInfo", namespace = "session"))
}

@@ -999,6 +1013,7 @@ drake_cache_log <- function(
)
)
}
cache <- decorate_storr(cache)
out <- lightly_parallelize(
X = cache$list(),
FUN = single_cache_log,
@@ -1091,6 +1106,7 @@ diagnose <- function(
if (is.null(cache)) {
return(character(0))
}
cache <- decorate_storr(cache)
if (!character_only) {
target <- as.character(substitute(target))
}
@@ -1248,6 +1264,7 @@ progress <- function(
if (is.null(cache)) {
return(weak_tibble(target = character(0), progress = character(0)))
}
cache <- decorate_storr(cache)
if (!is.null(no_imported_objects)) {
warning(
"Argument `no_imported_objects` of progress() is deprecated. ",
3 changes: 3 additions & 0 deletions R/clean.R
@@ -100,6 +100,7 @@ clean <- function(
if (is.null(cache)) {
return(invisible())
}
cache <- decorate_storr(cache)
targets <- c(as.character(match.call(expand.dots = FALSE)$...), list)
if (requireNamespace("tidyselect", quietly = TRUE)) {
targets <- drake_tidyselect_cache(
@@ -260,6 +261,7 @@ drake_gc <- function(
) {
deprecate_search(search)
if (!is.null(cache)) {
cache <- decorate_storr(cache)
cache$gc()
rm_bad_cache_filenames(cache)
}
@@ -333,6 +335,7 @@ rescue_cache <- function(
if (is.null(cache)) {
return(invisible())
}
cache <- decorate_storr(cache)
for (namespace in cache$list_namespaces()) {
X <- cache$list(namespace = namespace)
if (!is.null(targets)) {