MEMORY: Garbage collect future process after value has been collected #69
Comments
The simplest solution might be to wrap up the future expression:
expr_gc <- {
  value <- local(expr)
  gc()
  value
}
This can be used when [...]. Should we add a [...]?
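A minimal sketch of that wrapping in base R, assuming the future expression is available as an unevaluated language object. The helper name `wrap_with_gc()` is hypothetical and not part of the future package:

```r
# Hypothetical helper (not part of the future package): wrap an unevaluated
# expression so that the garbage collector runs after it has been evaluated.
wrap_with_gc <- function(expr) {
  bquote({
    value <- local(.(expr))  # evaluate 'expr' in a temporary environment
    gc()                     # temporaries created by 'expr' are now unreachable
    value
  })
}

expr <- quote({ x <- seq_len(1e6); sum(as.numeric(x)) })
expr_gc <- wrap_with_gc(expr)
res <- eval(expr_gc, envir = new.env())
```

Because `expr` is evaluated inside `local()`, its intermediate objects (here `x`) go out of scope before `gc()` is called, so only `value` is still referenced at that point.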
Added. For now, we'll leave it up to the user to specify it, which is somewhat sub-optimal but good enough until it is proven useful/needed. Also, it might be that some futures should be garbage collected whereas others should not. The developer can control this using:
x <- future({ expr }, gc=TRUE)
x %<-% { expr } %tweak% list(gc=TRUE)
If/when we collect memory and timing stats per future (Issue #59), we could use these to decide whether running the garbage collector is necessary (e.g. enough memory was allocated) and/or whether it would take only a small fraction of the future's total evaluation time (i.e. for long-running futures, the time the garbage collector consumes will be relatively small). For instance, we could have options controlling when the garbage collector should be run.
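As a rough illustration of such a rule, here is a sketch in base R. The function name, argument names, and thresholds are all hypothetical; nothing like this is confirmed to exist in the package, and the per-future stats are assumed to be supplied by the caller. This variant requires both conditions to hold, which is only one of the possible combinations:

```r
# Hypothetical heuristic (names and thresholds are assumptions): run the
# garbage collector only if enough memory was allocated AND the expected
# GC time is a small fraction of the future's evaluation time.
should_gc <- function(bytes_allocated, eval_seconds, gc_seconds = 0.1,
                      min_bytes = 100 * 1024^2, max_fraction = 0.05) {
  enough_memory <- bytes_allocated >= min_bytes
  cheap_enough <- eval_seconds > 0 &&
    (gc_seconds / eval_seconds) <= max_fraction
  enough_memory && cheap_enough
}

big_and_long   <- should_gc(bytes_allocated = 500 * 1024^2, eval_seconds = 60)
small_and_long <- should_gc(bytes_allocated = 1 * 1024^2, eval_seconds = 60)
big_and_short  <- should_gc(bytes_allocated = 500 * 1024^2, eval_seconds = 0.5)
```

The defaults could equally be read from `options()`, so that users can tune them without changing code.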
Issue
When using multiprocess futures that rely on PSOCK cluster nodes, background R sessions, or forked processes ("multicore"), there might be large objects left behind in those processes after we've collected the value. The processes stay alive in the background. Thus, if we run, say, 20 processes and 19 of them finish early while one keeps processing for a long time thereafter, we occupy memory (RAM) unnecessarily because of those 19 processes.
Suggestion
After retrieving the value of a future:
1. remove all objects, and
2. run the garbage collector
of the R environment in the process where the future was resolved.
We already do Step 1 before launching new futures for some of the multiprocess future types. We don't garbage collect explicitly anywhere. Also, Step 1 should not be done for "persistent" futures (persistent=TRUE).

It is not clear to me if it is possible to run code after the value has been retrieved for all types of futures. This might be an issue.
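The suggested cleanup could be sketched in base R roughly as follows. This assumes Step 1 means removing all objects from the worker's environment; the helper name `cleanup_after_value()` is hypothetical, and a plain `new.env()` stands in for the environment of the worker process:

```r
# Hypothetical helper: clean up an environment after a future's value has
# been retrieved. 'persistent' mirrors the persistent=TRUE case, where the
# objects must be kept.
cleanup_after_value <- function(envir, persistent = FALSE) {
  if (!persistent) {
    # Step 1: remove all objects, including those with dot-prefixed names
    rm(list = ls(envir = envir, all.names = TRUE), envir = envir)
  }
  # Step 2: run the garbage collector
  invisible(gc())
}

worker_env <- new.env()
assign("big", numeric(1e6), envir = worker_env)
cleanup_after_value(worker_env)

persistent_env <- new.env()
assign("state", 42, envir = persistent_env)
cleanup_after_value(persistent_env, persistent = TRUE)
```

How such a function would actually be invoked in the worker after the value has been sent back depends on the future type, which is the open question raised above.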