Use a different storr namespace for each status #380

wlandau · 2018-05-05T22:56:15Z

Summary

Here, I implement @krlmlr's suggestion to create a storr namespace for each persistent worker (ref: #369 (comment)). There are no speed improvements yet, but this approach to the cache is much better.

GitHub issues addressed

Ref: A quicker end to staged parallelism #369 (not done yet)

Checklist

I have read drake's code of conduct, and I agree to follow its rules.
I have read the guidelines for contributing.
I have listed any substantial changes in the development news.
I have added testthat unit tests to tests/testthat to confirm that any new features or functionality work correctly.
I have tested this pull request locally with devtools::check()
This pull request is ready for review.
I think this pull request is ready to merge.

codecov-io · 2018-05-05T23:17:21Z

Codecov Report

Merging #380 into master will not change coverage.
The diff coverage is 100%.

@@          Coverage Diff          @@
##           master   #380   +/-   ##
=====================================
  Coverage     100%   100%           
=====================================
  Files          65     65           
  Lines        5489   5508   +19     
=====================================
+ Hits         5489   5508   +19

Impacted Files	Coverage Δ
R/mclapply.R	`100% <100%> (ø)`	⬆️

Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update 97dabef...7b8105d.

wlandau · 2018-05-05T23:43:01Z

Before, getting/setting worker status meant reading/writing the contents of files. Now, getting status means checking if a file exists, and setting status means removing one file and creating another. I am not sure which is faster, and it probably depends on the user's file system. However, this design seems a lot safer when it comes to potential race conditions I may have missed. The code is also a little cleaner, and it has the potential to be faster. So even though speed appears just a hair worse for @krlmlr's example (we now have mclapply_staged parallelism for that), I will merge.
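For readers skimming the thread, here is a minimal sketch of the existence-based design described above. It is not drake's code: plain files stand in for storr keys, directories stand in for namespaces, and the names `statuses`, `root`, `marker()`, `get_status()`, and `set_status()` are all hypothetical.

```r
# Sketch only: directories play the role of storr namespaces (one per status),
# and an empty file named after the worker plays the role of a key.
statuses <- c("not_ready", "ready", "done")  # assumed set of statuses
root <- file.path(tempdir(), "status_sketch")
for (s in statuses) {
  dir.create(file.path(root, s), recursive = TRUE, showWarnings = FALSE)
}

# Path of the marker file for a worker under a given status (vectorized).
marker <- function(worker, status) file.path(root, status, worker)

# Getting status: check which marker file exists. No file contents are read.
get_status <- function(worker) {
  found <- statuses[file.exists(marker(worker, statuses))]
  if (length(found) == 0) NA_character_ else found[1]
}

# Setting status: remove any old marker, then create the new one.
set_status <- function(worker, status) {
  unlink(marker(worker, statuses))
  file.create(marker(worker, status))
  invisible(status)
}

set_status("worker1", "not_ready")
first <- get_status("worker1")
set_status("worker1", "ready")
second <- get_status("worker1")
```

In this sketch the worker briefly has no marker at all between the removal and the creation, so a reader polling at that moment would see `NA`.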

wlandau · 2018-05-06T00:25:21Z

So apparently, processes started hanging unpredictably in my tests. I think it has something to do with this PR, so I reverted it. The code is still in the worker_namespaces branch, but master no longer has it.

krlmlr

Trying to shed some light on the new failures...

krlmlr · 2018-05-06T07:37:26Z

R/mclapply.R

+        silent = TRUE
+      )
+    )
+    if (is.character(out)){


The try() pattern seems to be unsafe here:

    is.character(try(stop()))
    #> [1] TRUE
    class(try(stop()))
    #> [1] "try-error"

Created on 2018-05-06 by the reprex package (v0.2.0).
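To spell out the point, a small sketch (the variable names are made up for illustration): `is.character()` is `TRUE` both for a failed `try()` and for a legitimate character result, whereas `inherits(x, "try-error")` separates the two.

```r
# A failed call and a successful call that happens to return a string.
out_error <- try(stop("oops"), silent = TRUE)
out_string <- try("a legitimate character result", silent = TRUE)

# is.character() cannot tell them apart: a try-error is a character vector
# with class "try-error".
both_character <- c(is.character(out_error), is.character(out_string))

# Checking the class does distinguish failure from success.
is_failure <- c(
  inherits(out_error, "try-error"),
  inherits(out_string, "try-error")
)
```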

How about the following general function:

    set.seed(123)

    flaky <- function(p) {
      print("Side effect")
      if (p < runif(1)) stop("oops")
      p
    }

    retry <- function(code) {
      quo <- rlang::enquo(code)
      repeat {
        tryCatch(
          return(rlang::eval_tidy(quo)),
          error = identity
        )
        Sys.sleep(0.2)
      }
    }

    retry(flaky(0.1))
    #> [1] "Side effect"
    #> [1] "Side effect"
    #> [1] "Side effect"
    #> [1] "Side effect"
    #> [1] "Side effect"
    #> [1] "Side effect"
    #> [1] 0.1

Created on 2018-05-06 by the reprex package (v0.2.0).
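Since the thread is about processes hanging, one possible variant is a retry that gives up after a fixed number of attempts. This is only a sketch, not something proposed in the review: `retry_bounded()`, `max_tries`, `wait`, and `flaky_twice()` are hypothetical names, and it takes a function instead of a quoted expression so that it needs only base R.

```r
# Call f() up to max_tries times; return its value on the first success,
# or signal an error carrying the last failure message.
retry_bounded <- function(f, max_tries = 5, wait = 0) {
  for (i in seq_len(max_tries)) {
    # Wrap the value in a list so a successful result is never mistaken
    # for a condition object.
    result <- tryCatch(list(value = f()), error = function(e) e)
    if (!inherits(result, "error")) {
      return(result$value)
    }
    Sys.sleep(wait)
  }
  stop("all ", max_tries, " attempts failed: ", conditionMessage(result))
}

# Illustration: a function that fails on its first two calls.
attempts <- 0
flaky_twice <- function() {
  attempts <<- attempts + 1
  if (attempts < 3) stop("oops")
  "ok"
}

value <- retry_bounded(flaky_twice, max_tries = 5)
```

The trade-off is that the caller must decide what to do when all attempts fail, whereas the unbounded `repeat` loop never returns in that case.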

krlmlr · 2018-05-06T07:40:11Z

R/mclapply.R

-mc_set_not_ready <- function(worker, config){
-  mc_set_status(worker = worker, status = "not ready", config = config)
+mc_unset_status <- function(worker, config){
+  lapply(


Can we somehow make the old worker status available here, to avoid the iteration? This code could live on as a new mc_purge_status() function, perhaps useful for initialization.
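A rough sketch of the distinction, under the assumption that each status is a directory of marker files (plain files standing in for storr namespaces). `sketch_change_status()` and `sketch_purge_status()` are hypothetical names, not functions in the PR.

```r
statuses <- c("not_ready", "ready", "done")  # assumed set of statuses
root <- file.path(tempdir(), "purge_sketch")
for (s in statuses) {
  dir.create(file.path(root, s), recursive = TRUE, showWarnings = FALSE)
}
marker <- function(worker, status) file.path(root, status, worker)

# Transition with a known old status: touches exactly two files.
sketch_change_status <- function(worker, old, new) {
  unlink(marker(worker, old))
  file.create(marker(worker, new))
  invisible(new)
}

# Purge: iterate over every status. Only needed when the old status is
# unknown, for example at initialization.
sketch_purge_status <- function(worker) {
  lapply(statuses, function(s) unlink(marker(worker, s)))
  invisible(NULL)
}

sketch_purge_status("worker1")                 # start from a clean slate
file.create(marker("worker1", "not_ready"))    # initial status
sketch_change_status("worker1", "not_ready", "ready")
remaining <- statuses[file.exists(marker("worker1", statuses))]
```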

Use a different storr namespace for each status

7b8105d

wlandau self-assigned this May 5, 2018

wlandau requested a review from krlmlr May 5, 2018 22:56

wlandau merged commit 97973e2 into master May 5, 2018

wlandau deleted the worker_namespaces branch May 5, 2018 23:43

wlandau restored the worker_namespaces branch May 6, 2018 00:14

krlmlr reviewed May 6, 2018

wlandau deleted the worker_namespaces branch May 28, 2018 22:43
