Symbols loaded after make() #253

krlmlr · 2018-02-14T05:48:24Z

When running make() for a target, it is available in the global environment only if it is built during this run of make() (with the default config). Can we implement a more consistent way of handling built targets?

library(drake)

plan <- drake_plan(
  target1 = 5
)

make(plan)
#> cache /tmp/RtmpMxWAKq/.drake
#> connect 1 import: plan
#> connect 1 target: target1
#> check 1 item: target1
#> target target1
target1
#> [1] 5

make(plan)
#> cache /tmp/RtmpMxWAKq/.drake
#> Unloading targets from environment:
#>   target1
#> connect 1 import: plan
#> connect 1 target: target1
#> check 1 item: target1
#> All targets are already up to date.
target1
#> Error in eval(expr, envir, enclos): object 'target1' not found

Created on 2018-02-14 by the reprex package (v0.2.0).

The text was updated successfully, but these errors were encountered:

wlandau · 2018-02-14T14:47:27Z

For now, users can get around this by creating a custom envir for make(). Going forward, we can definitely address this. By default, drake operates in the user's workspace. Early in development, I tried creating a special isolated execution environment for drake, but I could not solve the lexical scoping problems (nested imported functions were not seeing each other). We might be able to try again with something like expose_imports(). A little extra overhead, but it could be worth it. In fact, this would mean we no longer have to test everything separately for the global and custom environments, so it would cut the scaled-up testing workflow in half. cc @dapperjapper.

violetcereza · 2018-02-14T16:19:05Z

I think this specific issue could be solved by just loadding any targets in make(plan, targets) as a convenience to the user when all targets are already up to date and make was run in a way that would modify the global environment (parallelism != Makefile). Alternatively, drake could do the "Unloading targets from environment" thing to clean up after all make runs. I can see a downside to both options, because they require a little more overhead in pursuit of consistency. But I'm fine with that.

I'm not sure about the question of environments. I like way future deals with environments and scope, because it meshes with the normal use of environments and scope in R really naturally, by definition and design. The nature of drake (targets are stored on a flat key-value level, I can't see anyone using #228) precludes some of the complexity found in future, but I think the question of how drake deals with environments is very existentially central. (if that makes sense??)

wlandau · 2018-02-14T22:01:19Z

I think this specific issue could be solved by just loadding any targets in make(plan, targets) as a convenience to the user when all targets are already up to date and make was run in a way that would modify the global environment (parallelism != Makefile).

I'm not sure I agree. Some of these targets get really big, and automatically loading things could get really expensive. Sometimes people want to run make() just to check for outdated targets.

Alternatively, drake could do the "Unloading targets from environment" thing to clean up after all make runs.

Unfortunately, that needs to happen first because sometimes targets conflict with imports, especially when users load targets from the cache to interact with the results.

I'm not sure about the question of environments. I like way future deals with environments and scope, because it meshes with the normal use of environments and scope in R really naturally, by definition and design. The nature of drake (targets are stored on a flat key-value level, I can't see anyone using #228) precludes some of the complexity found in future, but I think the question of how drake deals with environments is very existentially central. (if that makes sense??)

Flatness is a good way to put it. It's a source source of a lot of struggle for drake, and thinking that way is keeping me from wrapping my head around #228 (both in picturing what users might want and trying to guess if it could work).

violetcereza · 2018-02-14T22:25:55Z

Unfortunately, that needs to happen first because sometimes targets conflict with imports, especially when users load targets from the cache to interact with the results.

Sorry, I meant both before and after all make runs!

wlandau · 2018-02-24T19:28:23Z

I thought about this one more, and I think our best options are the following (ranked according to my preference):

Unload all the targets at the end of make(). Consistent, but possibly inconvenient if you want to loadd() things all over again. Then again, this approach would help get things ready for a possible next make().
Leave any in-memory targets alone at the end of make() (current behavior). Possibly convenient, but inconsistent. And the next make() has more prep work to do.
Load the final targets at the end of make(), including ones that were skipped. It would be difficult to figure out which targets to load, and loading for the sake of loading may not be worth the time for large targets.

Just say the word and I will implement the first one. I may need more convincing on the other two.

krlmlr · 2018-02-24T19:59:43Z

Consistency is king, but it looks like the first option might break existing user code. Maybe a variant of option 3 combined with lazy loading (=active bindings)? It doesn't matter if we load more than necessary as long as we're consistent in what we load.

Anyway, this feels sufficiently low prio to me at this point, labeled accordingly.

wlandau · 2018-02-24T22:50:25Z

Maybe active bindings for all the targets in the cache with no revdeps?

krlmlr · 2018-02-28T13:11:23Z

I'm really not sure what's best at this point.

wlandau · 2018-03-02T14:31:14Z

At the very least, this is documented in the caution.Rmd vignette.

What about a console message that lists the newly-loaded targets?

krlmlr · 2018-03-02T14:34:44Z

I'm leaning towards loading always with active bindings. Do the active bindings work with caching, so that the second access is faster?

wlandau · 2018-03-02T14:52:16Z

You mean in-memory caching, right? I just made that happen in 2fc5932.

Creating active bindings for all non-file targets in the cache should cover it. But awkwardly, the bindings would need to be destroyed/unloaded before the next make() in the same session so we don't confuse targets with imports.

krlmlr · 2018-03-02T16:18:52Z

But unloading bindings is cheap, and reloading them too if they are in the in-memory cache?

wlandau · 2018-03-02T18:00:24Z

I think so.

wlandau · 2018-03-02T20:43:15Z

At this point, my only concern is the read-only nature of active bindings. Is there a way to relax that with bindr?

> make(drake_plan(x = 1))
> loadd(x, lazy = "bind")
> x <- 2
Error: Binding is read-only.

wlandau · 2018-03-02T20:59:28Z

Edit: just saw r-lib/R6#99. Not sure how users will feel about that. What about lazy loading with promises instead?

wlandau · 2018-03-02T21:12:50Z

Except that delayedAssign() fails if we clean the cache and make() again. The second make() tries to fulfill a promise based on a cache that may not even exist anymore.

> x <- drake_plan(a = 1)
> make(x)
> clean(destroy = TRUE)
> make(x)
Error: key 'a' ('objects') not found

So promises with delayedAssign() don't seem to be an option either.

krlmlr · 2018-03-02T21:36:43Z

The theory of functional programming, and also much of drake's philosophy, suggests that these variables should be read-only. But this turns the problem into something we may wish to do for a major release. (Also, I don't see how to fully support assignment with bindr. The variable would remain an active binding, and we'd have to store the value the user gave us somewhere. Yikes.)

wlandau · 2018-03-06T12:12:01Z

If we implement #296, no targets will be automatically loaded anymore, which will give us consistency here.

wlandau · 2018-03-08T18:09:13Z

Closing in favor of #296. I do not think it will be a major hit to back compatibility. Will reopen if we cannot implement #296.

wlandau · 2018-05-02T14:01:00Z

FYI: as a side effect of #369, the symbols loaded after make() are likely to go away for mclapply and parLapply parallelism when jobs > 1. In this case, I think it's worth the sudden change. I have a preference for also removing these symbols when jobs is 1.

krlmlr · 2018-05-02T23:29:58Z

Good news. Agree it's good to be consistent regardless of the jobs value.

wlandau mentioned this issue Feb 14, 2018

Subprojects? #228

Closed

wlandau added internals and removed requires a new separate tool labels Feb 22, 2018

krlmlr added difficulty: advanced undecided: may or may not fix labels Feb 24, 2018

wlandau mentioned this issue Mar 5, 2018

Consider building targets in a separate environment. #296

Closed

wlandau-lilly mentioned this issue Mar 7, 2018

Immutability of targets and pure functions krlmlr/remake-slides#8

Open

wlandau closed this as completed Mar 8, 2018

wlandau mentioned this issue Mar 29, 2018

make(session = callr::r_vanilla) #346

Merged

7 tasks

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Symbols loaded after make() #253

Symbols loaded after make() #253

krlmlr commented Feb 14, 2018

wlandau commented Feb 14, 2018 •

edited

Loading

violetcereza commented Feb 14, 2018

wlandau commented Feb 14, 2018

violetcereza commented Feb 14, 2018

wlandau commented Feb 24, 2018

krlmlr commented Feb 24, 2018

wlandau commented Feb 24, 2018

krlmlr commented Feb 28, 2018

wlandau commented Mar 2, 2018

krlmlr commented Mar 2, 2018

wlandau commented Mar 2, 2018

krlmlr commented Mar 2, 2018

wlandau commented Mar 2, 2018

wlandau commented Mar 2, 2018

wlandau commented Mar 2, 2018 •

edited

Loading

wlandau commented Mar 2, 2018

krlmlr commented Mar 2, 2018

wlandau commented Mar 6, 2018

wlandau commented Mar 8, 2018

wlandau commented May 2, 2018

krlmlr commented May 2, 2018

Symbols loaded after make() #253

Symbols loaded after make() #253

Comments

krlmlr commented Feb 14, 2018

wlandau commented Feb 14, 2018 • edited Loading

violetcereza commented Feb 14, 2018

wlandau commented Feb 14, 2018

violetcereza commented Feb 14, 2018

wlandau commented Feb 24, 2018

krlmlr commented Feb 24, 2018

wlandau commented Feb 24, 2018

krlmlr commented Feb 28, 2018

wlandau commented Mar 2, 2018

krlmlr commented Mar 2, 2018

wlandau commented Mar 2, 2018

krlmlr commented Mar 2, 2018

wlandau commented Mar 2, 2018

wlandau commented Mar 2, 2018

wlandau commented Mar 2, 2018 • edited Loading

wlandau commented Mar 2, 2018

krlmlr commented Mar 2, 2018

wlandau commented Mar 6, 2018

wlandau commented Mar 8, 2018

wlandau commented May 2, 2018

krlmlr commented May 2, 2018

wlandau commented Feb 14, 2018 •

edited

Loading

wlandau commented Mar 2, 2018 •

edited

Loading