-
Notifications
You must be signed in to change notification settings - Fork 129
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Different graphs when using pipes or traditional #335
Comments
It looks like you have at least two different versions of |
Nope - the resulting configs are different: > config_pipe <- plan %>%
+ drake_config(verbose = TRUE)
> config <- drake_config(plan, verbose = TRUE)
> sapply(names(config), function(n){identical(config[[n]], config_piupe[[n]])})
plan targets envir cache
TRUE TRUE FALSE FALSE
cache_path fetch_cache parallelism jobs
TRUE TRUE TRUE TRUE
verbose hook prepend prework
TRUE TRUE TRUE TRUE
command args recipe_command graph
TRUE TRUE TRUE FALSE
short_hash_algo long_hash_algo seed trigger
TRUE TRUE TRUE TRUE
timeout cpu elapsed retries
TRUE TRUE TRUE TRUE
imports_only skip_imports skip_safety_checks log_progress
TRUE TRUE TRUE TRUE
lazy_load session_info cache_log_file caching
TRUE TRUE TRUE TRUE
evaluator keep_going
TRUE TRUE
> |
And a reproducible example: > plan <- drake_plan(x = 1)
> config <- drake_config(plan, verbose = TRUE)
> config_pipe <- plan %>%
+ drake_config(verbose = TRUE)
> sapply(names(config), function(n){identical(config[[n]], config_piupe[[n]])})
plan targets envir cache cache_path fetch_cache
FALSE FALSE FALSE FALSE TRUE TRUE
parallelism jobs verbose hook prepend prework
TRUE TRUE TRUE TRUE TRUE TRUE
command args recipe_command graph short_hash_algo long_hash_algo
TRUE TRUE TRUE FALSE TRUE TRUE
seed trigger timeout cpu elapsed retries
TRUE TRUE TRUE TRUE TRUE TRUE
imports_only skip_imports skip_safety_checks log_progress lazy_load session_info
TRUE TRUE TRUE TRUE TRUE TRUE
cache_log_file caching evaluator keep_going
TRUE TRUE TRUE TRUE |
I think that is because plan <- drake_plan(x = 1)
envir <- environment()
config <- drake_config(plan, envir = envir)
config_pipe <- plan %>%
drake_config(envir = envir) You can expect the |
It can get even worse: > library(drake)
> library(magrittr)
> expose_imports(graphics)
<environment: R_GlobalEnv>
> plan <- drake_plan(x = plot(1))
> config_pipe <- plan %>%
+ drake_config(verbose = TRUE)
> config <- drake_config(plan, verbose = TRUE)
Error in validObject(.Object) :
invalid class “ScriptNodeInfo” object: invalid object for slot "removes" in class "ScriptNodeInfo": got class "list", should be or extend class "character" Is there any way of excluding pipes, or raising an error if |
Sounds like a problem with the code analysis. But either way, I strongly discourage |
Don't worry - I am not using it. It was just an example. One could actually check if the package is a base package and than throw a warning in the I am using But am I correct in assuming, that |
It may work if you also pass an explicit envir <- environment()
"your_pkg" %>% expose_imports(envir = envir) |
Re: #335 (comment): this is a problem in CodeDepends::getInputs(body(graphics::close.screen))
#> Warning: replacing previous import 'graph::addNode' by 'XML::addNode' when
#> loading 'CodeDepends'
#> Warning: replacing previous import 'graph::plot' by 'graphics::plot' when
#> loading 'CodeDepends'
#> Error in validObject(.Object): invalid class "ScriptNodeInfo" object: invalid object for slot "removes" in class "ScriptNodeInfo": got class "list", should be or extend class "character" |
Confirmed: explicitly passing library(drake)
#> Warning: replacing previous import 'graph::addNode' by 'XML::addNode' when
#> loading 'CodeDepends'
#> Warning: replacing previous import 'graph::plot' by 'graphics::plot' when
#> loading 'CodeDepends'
library(magrittr)
expose_imports(graphics)
#> <environment: namespace:graphics>
envir <- environment()
plan <- drake_plan(x = plot(1))
config <- drake_config(plan = plan, envir = envir)
#> Error in validObject(.Object): invalid class "ScriptNodeInfo" object: invalid object for slot "removes" in class "ScriptNodeInfo": got class "list", should be or extend class "character"
config <- plan %>% drake_config(envir = envir)
#> Error in validObject(.Object): invalid class "ScriptNodeInfo" object: invalid object for slot "removes" in class "ScriptNodeInfo": got class "list", should be or extend class "character" |
Update: with duncantl/CodeDepends@dfe31e7, the error from #335 (comment) should go away. |
OK - so that one should be fixed. But I think it is a bit worrying, that in a tool like drake, which aims at reproducability, the use of pipe versus classical programming makes a difference - even if the difference is only in the graphs created, and the targets remain the same. I think that should be addressed. One solution would be to internally use a pipe to make the traditional behave as the pipe? Rename drake_config <- function(plan, ARGUMENTS) {
plan %>%
.drake_config(ARGUMENTS) %>%
return
} |
I think reproducibility is a different set of issues:
If the code changes, we should expect that the results may change, even in a reproducible workflow.
The issue is not with the pipe, but the default value of the |
This is given, if I assume the code is the code which
is exactly the same.
OK.
No - as pipes are assumed to be equivalent to traditional R - and this is here not the case (I also see the dependency graph as a result which should be reproducible - not only the targets).
This is a critical issue here. As far as I am aware, pipes and traditional R programming are considered to be equivalent - which they are not here in drake. Setting the |
From the Basic piping I will see if the tidyverse folks think edge cases with environments are worth fixing. |
Thanks. I surely think they are, as subtle errors can creep in which will be quite difficult to identify. |
Please see last three graphs in attached html - am I missing something, or are pipes and traditional handled differently?
DrakeMake.html.zip
The text was updated successfully, but these errors were encountered: