Consider building targets in a separate environment. #296
Could be an alternative to #253.
Maybe it's worthwhile to make this a pkgconfig-controlled option too.
Possibly. Right now, though, I have a preference for a clean switch. With the …
I agree with @wlandau -- a clean switch to a separate environment would be the cleanest option.
It just occurred to me: we now need to distinguish between the …
Expose some of the code used for expose_imports(). Will use it to create a special execution environment.
This may break user scripts if they access targets after running make(). I think we need to clearly communicate that this is a potentially breaking change.
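For context, here is a minimal sketch of the cache-based workflow that would keep working even if make() never leaves targets in the calling environment (this assumes the existing drake_plan()/readd()/loadd() interface):

```r
library(drake)

plan <- drake_plan(small = 1:3, big = sum(small))
make(plan)

readd(big)    # read a target back from the cache
#> [1] 6
loadd(small)  # or load it into the current environment explicitly
small
#> [1] 1 2 3
```

Scripts that instead rely on `big` and `small` simply appearing in the workspace after `make()` are the ones that would break.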
An option would be to have an argument export_targets which exports the targets wanted by the user after the isolated evaluation.
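A hypothetical sketch of that idea (the helper, its arguments, and export_targets are made up for illustration, not drake's API): build in a throwaway environment, then copy only the requested targets back to the caller.

```r
build_isolated <- function(commands, export_targets = character(0),
                           envir = parent.frame()) {
  # Evaluate each command in a sandbox that can still see globalenv().
  sandbox <- new.env(parent = globalenv())
  for (target in names(commands)) {
    assign(target, eval(commands[[target]], envir = sandbox), envir = sandbox)
  }
  # Copy only the requested targets back into the caller's environment.
  for (target in export_targets) {
    assign(target, get(target, envir = sandbox), envir = envir)
  }
  invisible(sandbox)
}

build_isolated(
  list(small = quote(1:3), big = quote(sum(small))),
  export_targets = "big"
)
big
#> [1] 6
```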
- Make sure the "future"/globalenv() testing scenario actually uses the global environment.
- Repair a testing scenario: "future" backend with globalenv(). Need to pass globals explicitly in a named list rather than let `future` just guess at the values.
- Use a clumsy workaround to make sure globals passed to the "future" backend do not conflict with anything `drake` needs. When #296 is solved, the point will be moot.
- Set future.globals = FALSE in `future_lapply()` to reduce overhead. This is possible since the "future_lapply" backend assumes file system access and relies completely on the cache.
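For reference, this is roughly what passing globals explicitly as a named list looks like on the `future` side (a standalone illustration, not drake internals):

```r
library(future)
plan(multisession)

build_target <- function(x) x + 1

f <- future(
  build_target(41),
  globals = list(build_target = build_target)  # explicit list, no guessing
)
value(f)
#> [1] 42
```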
Re: #330 (comment) (cc @rkrug): a separate R session for …
@krlmlr, what do you think?
That sounds like a plan, but why not use …

Overhead: one could add an argument to disable this behaviour and use the existing session instead?

#330: why does the call that compiles a target have any information about whether it was called from within a knitr document? If the compilation takes place in a brand new R session, why should there be a problem?
Sure.
Maybe. I am not yet sure if this is worth complicating the interface.
I was just referring to one of your later comments in #330, not the main issue itself. Your code in this comment did not call …
I suppose …
Just realized that the original bit about environments in #296 should be solved as planned before this session business. This way, it will be easier to pass "globals" to …
I like the idea of running … We could implement a two-step process: …

These two steps should work when run from separate sessions. Then, the user can just do …

I was planning to implement something like this for redrake; not sure if and when, though.
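Not something drake offers, but the separate-session part can be prototyped today with the callr package (a sketch; it assumes both calls run in the same working directory, so they share the .drake cache):

```r
# Step 1: build everything in a fresh R process; results land in the cache.
callr::r(function() {
  plan <- drake::drake_plan(small = 1:3, big = sum(small))
  drake::make(plan)
})

# Step 2: retrieve a result in another fresh process (or in the current one).
callr::r(function() drake::readd(big))
#> [1] 6
```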
@krlmlr I disagree - make should provide reproducibility as far as possible. Other tools handle R package version management very well, so that is not the business of drake. There is also docker, which addresses the OS and R version issue. But within one session, make should be consistent and always produce the same results (which, by the way, should also include the same documentation of the make file, i.e. the graphs - see #335). Putting the burden of the rather complex issue of environment and session management onto the user's shoulders, when it could more easily be solved by a simple argument (for the user), does drake a disservice, as there are many surprises waiting (see e.g. #335 for me). If drake wants to achieve true reproducibility at the project level, the use of a clean R session (--vanilla) would be essential. This would also make calling drake from a bash makefile possible and consistent with running …
I deliberately designed …
Could we take the RStudio approach? If the user opts to use the "knit" button in the UI, it's done in a new session. This doesn't prevent them from using …

This might mean that drake needs an RStudio addin/extension? Anyone who isn't on RStudio wouldn't benefit, but they are probably okay with knowing that they are responsible for managing their own sessions.
This would be a first step, but I still think that one should easily be able to run …
For this and #253, I am leaning towards just removing targets from the environment after …
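A rough sketch of that cleanup idea, assuming a hypothetical wrapper (the helper name and behaviour are made up, not drake's API):

```r
# Run make() against the user's environment, then delete whatever targets
# it left behind, so only the cache keeps the results.
make_and_tidy <- function(plan, envir = parent.frame(), ...) {
  before <- ls(envir)
  drake::make(plan, envir = envir, ...)
  created <- setdiff(ls(envir), before)
  rm(list = intersect(created, plan$target), envir = envir)
  invisible()
}
```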
I have the feeling that this is doctoring (no offense intended) and there will be other effects (side effects from the compilation of the targets, e.g. the dreaded …
I'll say that from a reproducibility / side-effect-minimization standpoint, I'm on team …
So the suggestion would be that …

Is this correct?
I'll agree with that suggestion, with the exception that I think the arguments for … That way, one could write:

```r
make(my_plan, session = r_copycat)  # run against the current session, for debugging
make(my_plan, session = r_vanilla)  # run against a clean session, for production
```
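If those session presets end up being plain functions (an assumption here, not settled API), callr already provides suitable candidates: callr::r_vanilla() starts a pristine session and callr::r_copycat() mimics the current one. A sketch, reusing my_plan from above:

```r
make_in_session <- function(plan, session = callr::r_vanilla) {
  # Launch the chosen kind of child R process and run make() there.
  session(function(plan) drake::make(plan), args = list(plan = plan))
}

make_in_session(my_plan)                               # clean session
make_in_session(my_plan, session = callr::r_copycat)   # copy of the current session
```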
Sounds perfect to me. This also opens up the option to define one's own …
In that case, I think we can close this and reopen #333. I will have new comments there.
I want to revisit the issue of the environment in which targets are built. Using the global environment is not reproducible, but in the past, my attempts to create a special execution environment have led to lexical scoping issues for functions. @dapperjapper's expose_imports() idea suggested a way forward, so here is a first attempt:
```r
# Copy the binding `name` from environment `old` to environment `new`.
# For closures, try re-creating the assignment inside `new`.
reassign_object <- function(name, old, new){
  object <- get(name, envir = old, inherits = FALSE)
  if (rlang::is_closure(object)){
    expr <- as.call(c(quote(`<-`), as.symbol(name), rlang::enexpr(object)))
    eval(expr, envir = new)
  } else {
    assign(x = name, value = object, envir = new, inherits = FALSE)
  }
  invisible()
}
```

Unfortunately, closures still scope the same way after reassignment: the constructed `<-` call just copies the already-created closure, so its enclosing environment remains `old`.

```r
old <- new.env(parent = globalenv())
old$y <- 1
eval(
  quote(
    f <- function(x){
      x + y
    }
  ),
  envir = old
)
new <- new.env(parent = globalenv())
reassign_object("f", old, new)
new$f(0) # Should be 1.
#> [1] 1
new$y <- 2
new$f(0) # Should be 2.
#> [1] 1
```
Looking at Hadley's chapter on evaluation, rebuilding the closure with rlang::new_function() seemed more promising:

```r
reassign_object <- function(name, old, new){
  object <- get(name, envir = old, inherits = FALSE)
  if (rlang::is_closure(object)){
    # Rebuild the closure so that `new` becomes its enclosing environment.
    object <- rlang::new_function(
      args = formals(object),
      body = body(object),
      env = new
    )
  }
  assign(x = name, value = object, envir = new, inherits = FALSE)
  invisible()
}
```

But that has a problem too: functions produced with function factories such as Vectorize() break, because rebuilding them discards the data captured in their original enclosing environment.

```r
old <- new.env(parent = globalenv())
old$y <- 1
eval(
  quote(
    f <- Vectorize(function(x){
      x + y
    }, "x")
  ),
  envir = old
)
new <- new.env(parent = globalenv())
reassign_object("f", old, new)
new$f(0)
#> Error in names %in% vectorize.args: object 'vectorize.args' not found
```

Similarly perverse edge cases may arise in users' custom code.
So that leaves us with …
I am revisiting this issue in response to a recent suggestion in the comments: create a special new environment and lock it. That way, impure functions error out. We could maintain a separate …
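A minimal base-R illustration of the locking idea (standalone, not drake code):

```r
exec_env <- new.env(parent = globalenv())
lockEnvironment(exec_env, bindings = TRUE)

pure   <- function() 1 + 1
impure <- function() assign("leak", 1, envir = exec_env)

eval(quote(pure()), envir = exec_env)
#> [1] 2
eval(quote(impure()), envir = exec_env)
#> Error in assign("leak", 1, envir = exec_env) :
#>   cannot add bindings to a locked environment
```

Commands that only read from the environment keep working; anything that tries to write into it fails loudly.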
Yet again, I experimented with a version of …
Maybe make() should execute with a new.env(parent = globalenv()) rather than globalenv() by default. I did not have the issues figured out early in development, but this might be possible using @dapperjapper's idea that made expose_imports() possible.
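The gist of that proposal, as a standalone sketch: commands evaluate in a child of the global environment, so they can still see the user's functions, but the targets they create never land in globalenv().

```r
double <- function(x) 2 * x               # a user-defined import in globalenv()
exec_env <- new.env(parent = globalenv())

eval(quote(result <- double(21)), envir = exec_env)
exec_env$result
#> [1] 42
exists("result", envir = globalenv(), inherits = FALSE)
#> [1] FALSE
```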