Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Custom key mangling #89

Open
wants to merge 2 commits into
base: master
Choose a base branch
from
Open
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
2 changes: 2 additions & 0 deletions .Rbuildignore
Original file line number Diff line number Diff line change
Expand Up @@ -16,3 +16,5 @@ Rprof\.out
^scripts$
^vignettes_src$
^appveyor\.yml$
^.*\.Rproj$
^\.Rproj\.user$
2 changes: 1 addition & 1 deletion DESCRIPTION
Original file line number Diff line number Diff line change
Expand Up @@ -27,5 +27,5 @@ Suggests:
rbenchmark,
testthat (>= 1.0.0)
VignetteBuilder: knitr
RoxygenNote: 6.0.1
RoxygenNote: 6.1.0
Encoding: UTF-8
67 changes: 59 additions & 8 deletions R/driver_rds.R
Original file line number Diff line number Diff line change
Expand Up @@ -76,6 +76,16 @@
##'
##' @param default_namespace Default namespace (see
##' \code{\link{storr}}).
##'
##' @param mangle_key_encode Optional function for mangling keys.
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am thinking about the interface here and I think that we should offer a single argument here; the existing mangle_key which can take string arguments

  • none - equivalent to the current FALSE
  • base64 - equivalent to the current TRUE
  • xxxx - a key for a "registered" mangler

Then we need a new function:

register_mangler <- function(name, encode, decode, overwrite = FALSE) {...}

which adds a new mangling strategy. Then drake would call as part of initialisation (.onLoad), storr::register_mangler("drake::lazy_base64", drake::lazy_base64_encode, drake::lazy_base64_decode)

We'll save the functions into the class much as you have here.

My thinking here is that storr archives need to be openable by previous and future versions of storr, and that the options should be saved into the archive itself to avoid corrupting them (there's some fairly nasty logic going on already for that).

Copy link
Contributor Author

@wlandau wlandau Nov 6, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like that idea the best so far. Implementation and back compatibility are much easier this way. I will start a new branch and PR tomorrow.

##' Only used if `mangle_key` is `TRUE`.
##' Should accept arguments `key` and `pad`.
##'
##' @param mangle_key_decode Optional function for unmangling keys.
##' Inverse operation of `mangle_key_encode`.
##' Only used if `mangle_key` is `TRUE`.
##' Should accept arguments `key` and `error`.
##'
##' @export
##' @examples
##'
Expand Down Expand Up @@ -110,16 +120,21 @@
##' st2$destroy()
storr_rds <- function(path, compress = NULL, mangle_key = NULL,
mangle_key_pad = NULL, hash_algorithm = NULL,
default_namespace = "objects") {
storr(driver_rds(path, compress, mangle_key, mangle_key_pad, hash_algorithm),
default_namespace = "objects",
mangle_key_encode = NULL,
mangle_key_decode = NULL) {
storr(driver_rds(path, compress, mangle_key, mangle_key_pad, hash_algorithm,
mangle_key_encode, mangle_key_decode),
default_namespace)
}

##' @export
##' @rdname storr_rds
driver_rds <- function(path, compress = NULL, mangle_key = NULL,
mangle_key_pad = NULL, hash_algorithm = NULL) {
R6_driver_rds$new(path, compress, mangle_key, mangle_key_pad, hash_algorithm)
mangle_key_pad = NULL, hash_algorithm = NULL,
mangle_key_encode = NULL, mangle_key_decode = NULL) {
R6_driver_rds$new(path, compress, mangle_key, mangle_key_pad, hash_algorithm,
mangle_key_encode, mangle_key_decode)
}

R6_driver_rds <- R6::R6Class(
Expand All @@ -133,10 +148,13 @@ R6_driver_rds <- R6::R6Class(
mangle_key = NULL,
mangle_key_pad = NULL,
hash_algorithm = NULL,
mangle_key_encode = NULL,
mangle_key_decode = NULL,
traits = list(accept = "raw"),

initialize = function(path, compress, mangle_key, mangle_key_pad,
hash_algorithm) {
hash_algorithm,
mangle_key_encode, mangle_key_decode) {
is_new <- !file.exists(file.path(path, "config"))
dir_create(path)
dir_create(file.path(path, "data"))
Expand All @@ -157,6 +175,14 @@ R6_driver_rds <- R6::R6Class(
write_if_missing("TRUE", driver_rds_config_file(path, "mangle_key_pad"))
write_if_missing("TRUE", driver_rds_config_file(path, "compress"))
write_if_missing("md5", driver_rds_config_file(path, "hash_algorithm"))
write_if_missing(
deparse(mangle_key_encode),
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One downside of this approach is that the deparse route is not going to result in something that can be parsed back out as a usable function later

driver_rds_config_file(path, "mangle_key_encode")
)
write_if_missing(
deparse(mangle_key_decode),
driver_rds_config_file(path, "mangle_key_decode")
)
}
## Then write out the version number:
write_if_missing(as.character(packageVersion("storr")),
Expand All @@ -175,6 +201,28 @@ R6_driver_rds <- R6::R6Class(
driver_rds_config(path, "mangle_key_pad", mangle_key_pad,
FALSE, TRUE)

if (!is.null(mangle_key_encode)){
assert_function(mangle_key_encode)
mangle_key_encode <- deparse(mangle_key_encode)
}
self$mangle_key_encode <-
driver_rds_config(path, "mangle_key_encode",
mangle_key_encode, deparse(encode64), TRUE)
self$mangle_key_encode <- eval(
parse(text = self$mangle_key_encode, keep.source = FALSE)
)

if (!is.null(mangle_key_decode)){
assert_function(mangle_key_decode)
mangle_key_decode <- deparse(mangle_key_decode)
}
self$mangle_key_decode <-
driver_rds_config(path, "mangle_key_decode",
mangle_key_decode, deparse(decode64), TRUE)
self$mangle_key_decode <- eval(
parse(text = self$mangle_key_decode, keep.source = FALSE)
)

if (!is.null(compress)) {
assert_scalar_logical(compress)
}
Expand Down Expand Up @@ -245,7 +293,8 @@ R6_driver_rds <- R6::R6Class(
path <- file.path(self$path, "keys", namespace)
files <- dir(path)
if (self$mangle_key) {
ret <- decode64(files, error = FALSE)
decode_fun <- self$mangle_key_decode %||% decode64
ret <- decode_fun(files, error = FALSE)
if (anyNA(ret)) {
message_corrupted_rds_keys(namespace, path, files[is.na(ret)])
ret <- ret[!is.na(ret)]
Expand All @@ -268,7 +317,8 @@ R6_driver_rds <- R6::R6Class(
if (self$mangle_key) {
path <- file.path(self$path, "keys", namespace)
files <- dir(path)
i <- is.na(decode64(files, error = FALSE))
decode_fun <- self$mangle_key_decode %||% decode64
i <- is.na(decode_fun(files, error = FALSE))
if (any(i)) {
res <- file.remove(file.path(path, files[i]))
message(sprintf("Removed %d of %d corrupt %s",
Expand All @@ -287,7 +337,8 @@ R6_driver_rds <- R6::R6Class(

name_key = function(key, namespace) {
if (self$mangle_key) {
key <- encode64(key, pad = self$mangle_key_pad)
encode_fun <- self$mangle_key_encode %||% encode64
key <- encode_fun(key, pad = self$mangle_key_pad)
}
file.path(self$path, "keys", namespace, key)
}
Expand Down
5 changes: 3 additions & 2 deletions R/driver_remote.R
Original file line number Diff line number Diff line change
Expand Up @@ -16,7 +16,8 @@
##'
##' @param ... Arguments to pass through to \code{\link{driver_rds}},
##' including \code{compress}, \code{mangle_key},
##' \code{mangle_key_pad} and \code{hash_algorithm}.
##' \code{mangle_key_pad}, \code{hash_algorithm},
##' \code{mangle_key_encode}, and \code{mangle_key_decode}.
##'
##' @param path_local Path to a local cache. This can be left as
##' \code{NULL}, in which case a per-session cache will be used.
Expand Down Expand Up @@ -134,7 +135,7 @@ R6_driver_remote <- R6::R6Class(
return(character(0))
}
ret <- self$ops$list_dir(path)
if (self$rds$mangle_key) decode64(ret, TRUE) else ret
if (self$rds$mangle_key) self$rds$mangle_key_decode(ret, TRUE) else ret
},

## These functions could be done better if driver_rds takes a
Expand Down
1 change: 0 additions & 1 deletion R/utils.R
Original file line number Diff line number Diff line change
Expand Up @@ -103,7 +103,6 @@ assert_raw <- function(x, name = deparse(substitute(x))) {
}
}


assert_probably_storr_driver <- function(x, name = deparse(substitute(x))) {
expected <- c("type", "get_hash", "set_hash", "get_object",
"set_object", "exists_hash", "exists_object",
Expand Down
3 changes: 2 additions & 1 deletion man/driver_remote.Rd

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

18 changes: 15 additions & 3 deletions man/storr_rds.Rd

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.