Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use a different storr namespace for each status #380

Merged
merged 1 commit into from
May 5, 2018
Merged

Conversation

wlandau
Copy link
Member

@wlandau wlandau commented May 5, 2018

Summary

Here, I implement @krlmlr's suggestion to create a storr namespace for each persistent worker (ref: #369 (comment)). There are no speed improvements yet, but this approach to the cache is much better.

GitHub issues addressed

Checklist

  • I have read drake's code of conduct, and I agree to follow its rules.
  • I have read the guidelines for contributing.
  • I have listed any substantial changes in the development news.
  • I have added testthat unit tests to tests/testthat to confirm that any new features or functionality work correctly.
  • I have tested this pull request locally with devtools::check()
  • This pull request is ready for review.
  • I think this pull request is ready to merge.

@wlandau wlandau self-assigned this May 5, 2018
@wlandau wlandau requested a review from krlmlr May 5, 2018 22:56
@codecov-io
Copy link

codecov-io commented May 5, 2018

Codecov Report

Merging #380 into master will not change coverage.
The diff coverage is 100%.

Impacted file tree graph

@@          Coverage Diff          @@
##           master   #380   +/-   ##
=====================================
  Coverage     100%   100%           
=====================================
  Files          65     65           
  Lines        5489   5508   +19     
=====================================
+ Hits         5489   5508   +19
Impacted Files Coverage Δ
R/mclapply.R 100% <100%> (ø) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 97dabef...7b8105d. Read the comment docs.

@wlandau
Copy link
Member Author

wlandau commented May 5, 2018

Before, getting/setting worker status meant reading/writing the contents of files. Now, getting status means checking if a file exists, and setting status means removing one file and creating another. I am not sure which is faster, and it probably depends on the user's file system. However, this design seems a lot safer when it comes to potential race conditions I may have missed. The code is also a little cleaner, and it has the potential to be faster. So even though speed appears just a hair worse for @krlmlr's example (we now have mclapply_staged parallelism now for that) I will merge.

@wlandau wlandau merged commit 97973e2 into master May 5, 2018
@wlandau wlandau deleted the worker_namespaces branch May 5, 2018 23:43
@wlandau wlandau restored the worker_namespaces branch May 6, 2018 00:14
@wlandau
Copy link
Member Author

wlandau commented May 6, 2018

I so apparently, processes started hanging unpredictably in my tests. I think it has something to do with this PR, so I reverted it. The code is still in the workers_namespaces branch, but master no longer has it.

Copy link
Collaborator

@krlmlr krlmlr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Trying to shed some light on the new failures...

silent = TRUE
)
)
if (is.character(out)){
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The try() pattern seems to be unsafe here:

is.character(try(stop()))
#> [1] TRUE
class(try(stop()))
#> [1] "try-error"

Created on 2018-05-06 by the reprex package (v0.2.0).

How about the following general function:

set.seed(123)

flaky <- function(p) {
  print("Side effect")
  if (p < runif(1)) stop("oops")
  p
}


retry <- function(code) {
  quo <- rlang::enquo(code)
  repeat {
    tryCatch(
      return(rlang::eval_tidy(quo)),
      error = identity
    )
    Sys.sleep(0.2)
  }
}

retry(flaky(0.1))
#> [1] "Side effect"
#> [1] "Side effect"
#> [1] "Side effect"
#> [1] "Side effect"
#> [1] "Side effect"
#> [1] "Side effect"
#> [1] 0.1

Created on 2018-05-06 by the reprex package (v0.2.0).

mc_set_not_ready <- function(worker, config){
mc_set_status(worker = worker, status = "not ready", config = config)
mc_unset_status <- function(worker, config){
lapply(
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we somehow make the old worker status available here, to avoid the iteration? This code could live on as a new mc_purge_status() function, perhaps useful for initialization.

@wlandau wlandau deleted the worker_namespaces branch May 28, 2018 22:43
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants