Consider building targets in a separate environment. #296

wlandau · 2018-03-05T15:09:35Z

Maybe make() should execute with a new.envir(parent = globalenv()) rather than globalenv() by default. I did not have the issues figured out early in development, but this might be possible using the @dapperjapper's idea that made expose_imports() possible.

The text was updated successfully, but these errors were encountered:

wlandau · 2018-03-05T16:49:29Z

Could be an alternative to #253.

krlmlr · 2018-03-07T00:09:14Z

Maybe it's worthwhile to make this a pkgconfig-controlled option too.

wlandau · 2018-03-07T01:50:51Z

Possibly. Right now, though, I have a preference for a clean switch. With the environment<- trick, I think we can do it.

rkrug · 2018-03-07T07:48:14Z

I agree with @wlandau -- a clean switch to an own environment would be the cleanest option.

wlandau-lilly · 2018-03-07T13:47:35Z

It just occurred to me: we now need to distinguish between the envir arguments to make() and drake_config() and the environment that will eventually reside in config$envir. Not a big deal, but worth noting.

Expose some of the code used for expose_imports(). Will use it to create a special execution environment.

krlmlr · 2018-03-10T09:26:46Z

This may break user scripts if they access targets after running make(). I think we need to clearly communicate that this is a potentially breaking change.

rkrug · 2018-03-10T12:21:38Z

An option would be to have an argument ecport_targets which exports the targets of wanted by the user after the isolated evaluation. Von meinem iPhone gesendet

…

Am 10.03.2018 um 10:26 schrieb Kirill Müller ***@***.***>: This may break user scripts if they access targets after running make(). I think we need to clearly communicate that this is a potentially breaking change. — You are receiving this because you commented. Reply to this email directly, view it on GitHub, or mute the thread.

wlandau · 2018-03-10T23:29:52Z

@krlmlr: so this should probably not be part of drake 5.1.0. Removing it from the milestone.

@rkrug: we could pursue something like that, but the main concern is changes to the default behavior that break compatibility. It gets complicated (see #253).

- Make sure the "future"/globalenv() testing scenario actually uses the global environment. - Repair a testing scenario: "future" backend with globalenv(). Need to pass globals explicitly in a named list rather than let `future` just guess at the values. - Use a clumsy workaround to make sure globals passed to the "future" backend do not conflict with anything `drake` needs. When #296 is solved, the point will be moot. - Set future.globals = FALSE in `future_lapply()` to reduce overhead. This is possible since the "future_lapply" backend assumes file system access and relies completely on the cache.

wlandau-lilly · 2018-03-20T13:25:12Z

Re: #330 (comment) (cc @rkrug): a separate R session for make() is worth considering. reprex uses callr::r_safe(), which is now just callr::r(). (Alternatively, we could just use callr::r_bg() to free up the current session.) It strikes me as a safer and more thorough solution to the problem #296 is trying to solve. But I do have concerns:

It adds annoying overhead.
Ultimately, we still get imports from the user's R session, so @rkrug's code from Creating plan in Rmd file - knit() same chunk names #330 (comment) would behave the same way.

@krlmlr, what do you think?

rkrug · 2018-03-20T13:39:21Z

That sounds like a plan, but why not use callr::r_vanilla() ? This would be even more reproducible as nothing is set by anybody than drake?

Overhead: One could add an argument to disable this behaviour and ude the existing session instead?

#330 : Why does the call to compile a target have any information if it was called from within a knitr document? If the compilation is taking place in a brand new R session, why should there be a problem?

wlandau-lilly · 2018-03-20T15:17:09Z

why not use callr::r_vanilla() ?

Sure.

Overhead: One could add an argument to disable this behaviour and use the existing session instead?

Maybe. I am not yet sure if this is worth complicating the interface.

#330 : Why does the call to compile a target have any information if it was called from within a knitr document? If the compilation is taking place in a brand new R session, why should there be a problem?

I was just referring to one of your later comments in #330, not the main issue itself. Your code in this comment did not call knitr specifically.

wlandau · 2018-03-20T16:13:19Z

I suppose withr::local_envvar() may be worth a mention too. Maybe it can be a low-overhead alternative to r_vanilla().

wlandau · 2018-03-20T17:27:38Z

Just realized that the original bit about environments in #296 should be solved as planned before this session business. This way, it will be easier to pass "globals" to callr. Will open a separate issue.

krlmlr · 2018-03-21T23:05:05Z

I like the idea of running make() in a separate session, but I'm not convinced drake should be handling this.

We could implement a two-step process:

Create plan, communicate it to drake(). The plan and all imports are recorded and stored.
Run make().

These two steps should work when run from separate sessions. Then, the user can just do callr::r(drake::make()) when the setup is finished.

I was planning to implement something like this for redrake, not sure if and when though.

rkrug · 2018-03-22T08:10:32Z

@krlmlr I disagree - make should provide reproducibility as far as possible. Other tools handle R package versions management very well, so this is not the business of drake. Also there is docker, which introduces the OS and R version issue.

But within one session, make should be consistent and always produce the same results (which, by the way, should also include the same documentation of the make file, i.e. the graphs - see #335).

To put the burden of the rather complex issue of environments and session management onto the shoulders, if it could be more easily be solved by a simple argument (for the user), is doing drake a disservice as there are many surprises waiting (see.g #335 for me).

If drake want's to achieve true reproducability on a project level, the use of a clean R session (--vanilla) would be essential. This would make calling drake from a bash makefile possible and consistent with running drake::make() in an "old and dirty" R session.

wlandau · 2018-03-22T14:39:44Z

I deliberately designed drake to be as R-focused as possible. That means it depends on your R workspace and session rather than on a bunch of files. Reproducibility principles and best practices for files, interactive sessions, and workspaces are already well-known and well-established. It is just a matter of applying them to make() just like you would to any other piece of R code. I think that what you bring up is a useful niche for redrake.

AlexAxthelm · 2018-03-26T03:57:27Z

Could we take the RStudio approach?

If the user opts to use the "knit" button in the UI, it's done in a new session. This doesn't prevent them from using rmarkdown::render() in the console, and running with their "dirty" session.

This might mean that drake needs an RStudio addin/extension? Anyone who isn't on RStudio wouldn't benefit, but they probably are okay with knowing that they are responsible for managing their own sessions.

rkrug · 2018-03-26T06:29:56Z

This would be a first step, but I still think that one should be easily be able to run make() in it's own session. And if the session is a clone of the current session, no additional files would be needed to set the environment up.

wlandau · 2018-03-27T19:18:35Z

For this and #253, I am leaning towards just removing targets from the environment after make(). Given that drake is designed to be immersed in the master session, I think this is the most consistent behavior. Thoughts?

rkrug · 2018-03-28T06:43:51Z

I have the feeling that this is doctoring (no offense intended) and there will be other effects (side effects from the compilation of the targets, e.g. the dreaded <<-) which can cause problems. I still think it would be the safest way of running the make process in it's own R session which is discarded afterwards. This can be a copy of the calling session (callr::r_vanilla()), but via an option also be a vanilla session (callr::r_vanilla()). But it could also be, if the overhang is the problem, executed in the current session (with all side effects).

AlexAxthelm · 2018-03-28T13:22:55Z

I'll say that from a reproduciblity/minimizing side effect standpoint, I'm on team vanilla, but I have enough situations where I want drake to import something that is already in the environment, that I feel that it should be user accessible option with a r_vanilla() default. Possibly also with an option to loadd the most recent targets automatically, to preserve current behavior (although I'm not sure this is behavior worth preserving).

rkrug · 2018-03-28T13:49:56Z

So the suggestion would be that

make() runs in it's own session, where two arguments / options are available:
session which can take the values
- vanilla (run in vanilla session),
- as_current (run in session which is configured as the current session)
- current (run in current session, i.e. the current behaviour),
- ?
return_targets which, if TRUE should loadd the targets (as in the current behaviour) and if FALSE does not loadd any targets. But I am here with @AlexAxthelm - this is probably not worth preserving.

Is this correct?

AlexAxthelm · 2018-03-28T14:25:16Z

I'll agree with that suggestion, with the exception that I think the arguments for session should be actual callr::r_*() functions (drake could re-export them, or at least a few), rather than re-interpreting and renaming.

That way, if callr expands its options, they are covered, since the API to pass to r_*() appears to be the same. This also give the user flexiblity if they want to do something along the lines of

make(my_plan, session = r_copycat) # run against current session for debugging
make(my_plan, session = r_vanilla) # run against clean session, for production

rkrug · 2018-03-28T15:13:58Z

Sounds perfect to me. This opens as well the option to define own session functions.

wlandau · 2018-03-28T17:32:23Z

In that case, I think we can close this and reopen #333. I will have new comments there.

wlandau · 2018-08-10T14:06:49Z

I want to revisit the issue of the environment in which targets are built. Using the global environment is not reproducible, but in the past, my attempts to create a special execution environment have lead to lexical scoping issues for functions. @dapperjapper's environment<- hack (he called it that first) was a potential solution, but I did not trust it enough. Now, with tidy evaluation, we could simply eval(enexpr()) any functions we pull from the global environment.

wlandau · 2018-08-10T15:37:14Z

rlang::enexpr() seemed like a candidate approach for taking the code for a function and assigning it in the new environment.

reassign_object <- function(name, old, new){
  object <- get(name, envir = old, inherits = FALSE)
  if (rlang::is_closure(object)){
    expr <- as.call(c(quote(`<-`), as.symbol(name), rlang::enexpr(object)))
    eval(expr, envir = new)
  } else {
    assign(x = name, value = object, envir = new, inherits = FALSE)
  }
  invisible()
}

Unfortunately, closures still scope the same way after reassignment.

old <- new.env(parent = globalenv())
old$y <- 1
eval(
  quote(
    f <- function(x){
      x + y
    }
  ),
  envir = old
)
new <- new.env(parent = globalenv())

reassign_object("f", old, new)
new$f(0) # Should be 1.
#> [1] 1
new$y <- 2
new$f(0) # Should be 2.
#> [1] 1

wlandau · 2018-08-10T15:44:38Z

Looking at Hadley's chapter on evaluation, new_function() looks like a great way to extract the good parts of a function without the baggage of the closure.

reassign_object <- function(name, old, new){
  object <- get(name, envir = old, inherits = FALSE)
  if (rlang::is_closure(object)){
    object <- rlang::new_function(
      args = formals(object),
      body = body(object),
      env = new
    )
  }
  assign(x = name, value = object, envir = new, inherits = FALSE)
  invisible()
}

But that has a problem too: functions produced with Vectorize() really do need their closures.

old <- new.env(parent = globalenv())
old$y <- 1
eval(
  quote(
    f <- Vectorize(function(x){
      x + y
    }, "x")
  ),
  envir = old
)
new <- new.env(parent = globalenv())
reassign_object("f", old, new)
new$f(0)
#> Error in names %in% vectorize.args: object 'vectorize.args' not found

Similarly perverse edge cases may arise in the custom code of drake's more aggressive users.

wlandau · 2018-08-10T15:47:16Z

So that leaves us with environment<- all over again, and at this point, I am still not willing to use it as a foundation for drake's internals. But at least my experimentation is in the thread for reference.

wlandau · 2018-12-15T18:21:13Z

Relevant question: https://stackoverflow.com/questions/53795952/force-an-r-function-call-to-be-pure

wlandau · 2018-12-15T19:29:47Z

I am revisiting this issue in response to a recent suggestion in the comments: create a special new environment and lock it. That way, impure functions error out. We could maintain a separate enclos argument to eval() to contain targets that get built over the course of make().

wlandau · 2018-12-15T19:53:50Z

Yet again, I experimented a version of expose_envir() that used the environment<- trick only on functions, and functions from Vectorize() lost their closures. We should probably just recommend make(session = callr::r_vanilla())`.

wlandau · 2018-12-15T21:39:02Z

See #619 and the prework #620.

wlandau mentioned this issue Mar 6, 2018

Symbols loaded after make() #253

Closed

krlmlr added the topic: api label Mar 7, 2018

wlandau added this to the CRAN release 5.1.0 milestone Mar 9, 2018

wlandau-lilly added a commit that referenced this issue Mar 9, 2018

Prepare for #296

f245e62

Expose some of the code used for expose_imports(). Will use it to create a special execution environment.

wlandau-lilly added a commit that referenced this issue Mar 9, 2018

Start #296

21d61c7

wlandau removed this from the CRAN release 5.1.0 milestone Mar 10, 2018

wlandau added internals difficulty: advanced and removed topic: api labels Mar 13, 2018

wlandau mentioned this issue Mar 18, 2018

More debugging and tweaking for futures #329

Merged

7 tasks

wlandau-lilly added the status: priority label Mar 19, 2018

wlandau-lilly changed the title ~~Consider executing inside a new.envir(parent = globalenv()) rather than globalenv() by default~~ Consider not modifying the user's R workspace/environment. Mar 20, 2018

wlandau mentioned this issue Mar 20, 2018

Creating plan in Rmd file - knit() same chunk names #330

Closed

wlandau removed the type: new feature label Mar 20, 2018

wlandau removed the status: priority label Mar 20, 2018

wlandau mentioned this issue Mar 20, 2018

Option to make targets in a new session #333

Closed

wlandau closed this as completed Mar 28, 2018

wlandau mentioned this issue Mar 29, 2018

make(session = callr::r_vanilla) #346

Merged

7 tasks

wlandau reopened this Aug 10, 2018

wlandau changed the title ~~Consider not modifying the user's R workspace/environment.~~ Consider building targets in a separate environment. Aug 10, 2018

wlandau closed this as completed Aug 10, 2018

wlandau mentioned this issue Dec 15, 2018

Targets get outdated when running make() in multiple R session #615

Closed

wlandau reopened this Dec 15, 2018

wlandau closed this as completed Dec 15, 2018

wlandau mentioned this issue Dec 15, 2018

Lock the user's enviornment during make() #619

Closed

7 tasks

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Consider building targets in a separate environment. #296

Consider building targets in a separate environment. #296

wlandau commented Mar 5, 2018 •

edited by wlandau-lilly

Loading

wlandau commented Mar 5, 2018

krlmlr commented Mar 7, 2018

wlandau commented Mar 7, 2018 •

edited

Loading

rkrug commented Mar 7, 2018

wlandau-lilly commented Mar 7, 2018

krlmlr commented Mar 10, 2018

rkrug commented Mar 10, 2018 via email

wlandau commented Mar 10, 2018

wlandau-lilly commented Mar 20, 2018

rkrug commented Mar 20, 2018

wlandau-lilly commented Mar 20, 2018 •

edited

Loading

wlandau commented Mar 20, 2018

wlandau commented Mar 20, 2018

krlmlr commented Mar 21, 2018

rkrug commented Mar 22, 2018

wlandau commented Mar 22, 2018 •

edited

Loading

AlexAxthelm commented Mar 26, 2018

rkrug commented Mar 26, 2018

wlandau commented Mar 27, 2018

rkrug commented Mar 28, 2018 •

edited

Loading

AlexAxthelm commented Mar 28, 2018

rkrug commented Mar 28, 2018

AlexAxthelm commented Mar 28, 2018

rkrug commented Mar 28, 2018

wlandau commented Mar 28, 2018

wlandau commented Aug 10, 2018

wlandau commented Aug 10, 2018 •

edited

Loading

wlandau commented Aug 10, 2018 •

edited

Loading

wlandau commented Aug 10, 2018

wlandau commented Dec 15, 2018

wlandau commented Dec 15, 2018

wlandau commented Dec 15, 2018

wlandau commented Dec 15, 2018

Consider building targets in a separate environment. #296

Consider building targets in a separate environment. #296

Comments

wlandau commented Mar 5, 2018 • edited by wlandau-lilly Loading

wlandau commented Mar 5, 2018

krlmlr commented Mar 7, 2018

wlandau commented Mar 7, 2018 • edited Loading

rkrug commented Mar 7, 2018

wlandau-lilly commented Mar 7, 2018

krlmlr commented Mar 10, 2018

rkrug commented Mar 10, 2018 via email

wlandau commented Mar 10, 2018

wlandau-lilly commented Mar 20, 2018

rkrug commented Mar 20, 2018

wlandau-lilly commented Mar 20, 2018 • edited Loading

wlandau commented Mar 20, 2018

wlandau commented Mar 20, 2018

krlmlr commented Mar 21, 2018

rkrug commented Mar 22, 2018

wlandau commented Mar 22, 2018 • edited Loading

AlexAxthelm commented Mar 26, 2018

rkrug commented Mar 26, 2018

wlandau commented Mar 27, 2018

rkrug commented Mar 28, 2018 • edited Loading

AlexAxthelm commented Mar 28, 2018

rkrug commented Mar 28, 2018

AlexAxthelm commented Mar 28, 2018

rkrug commented Mar 28, 2018

wlandau commented Mar 28, 2018

wlandau commented Aug 10, 2018

wlandau commented Aug 10, 2018 • edited Loading

wlandau commented Aug 10, 2018 • edited Loading

wlandau commented Aug 10, 2018

wlandau commented Dec 15, 2018

wlandau commented Dec 15, 2018

wlandau commented Dec 15, 2018

wlandau commented Dec 15, 2018

wlandau commented Mar 5, 2018 •

edited by wlandau-lilly

Loading

wlandau commented Mar 7, 2018 •

edited

Loading

wlandau-lilly commented Mar 20, 2018 •

edited

Loading

wlandau commented Mar 22, 2018 •

edited

Loading

rkrug commented Mar 28, 2018 •

edited

Loading

wlandau commented Aug 10, 2018 •

edited

Loading

wlandau commented Aug 10, 2018 •

edited

Loading