
Refactor the jobs argument. #251

Closed
wlandau opened this issue Feb 10, 2018 · 10 comments


wlandau commented Feb 10, 2018

Reasons:

  1. "Jobs" was a poor choice of words to begin with. I actually meant "workers". Now that drake relies so much on future, their vocabularies should agree.
  2. Targets and imports are processed differently. Imports should always be processed locally, while targets may be deployed to a totally different computing environment. The parallelism needs are different in each case. Makefile parallelism can assign different numbers of workers, but all of drake should be capable of this, and it should all be cleaner.

wlandau commented Feb 10, 2018

We may just go with a workers argument if #227 becomes the only parallel backend. But a switch this dramatic would take a lot of time to transition safely.


wlandau commented Feb 11, 2018

The max_useful_jobs() function would need to be deprecated and renamed, as would most of the functions that take jobs. So I hesitate to jump into this one too quickly.


wlandau commented Feb 14, 2018

If the manual scheduler in #227 replaces the non-Makefile backends, there may not actually be anything to do here. But I predict that the staged parallelism mclapply backend will continue to be the best way to process imports because of its minimal overhead. (We might look into mcfork() with the manual scheduler eventually though.)

@wlandau wlandau changed the title Introduce distinct workers_imports and workers_targets arguments to replace jobs Set a different number of remote vs local workers for manual scheduling. Feb 23, 2018

wlandau commented Feb 23, 2018

What about just one single workers argument? make(..., workers = 8) would set an overarching max of 8 workers at a time, and make(..., workers = c(future = 100, callr = 4)) would use 100 remote workers and 4 local workers. We don't want to overload our local resources just because we want to submit 100 jobs to the cluster. This is not a problem for the old "future_lapply" and "Makefile" backends, but it is already a problem for the experimental new "future" backend and the upcoming "callr" backend (#227). Two major tasks:

  1. Bite the bullet and deprecate jobs in favor of workers.
  2. Implement the optional two-part workers argument for the existing parallel computing functionality.
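A rough sketch of how such a two-part workers argument could be normalized (the helper name, the backend names as vector slots, and the default of 1 for missing slots are hypothetical illustrations, not drake's actual API):

```r
# Hypothetical helper: normalize `workers` into one value per backend.
# An unnamed number caps each backend at that value; a named vector
# sets backends individually, with missing slots conservatively at 1.
normalize_workers <- function(workers) {
  backends <- c("future", "callr")
  if (is.null(names(workers))) {
    # Unnamed scalar: the same overarching max for each backend.
    return(stats::setNames(rep(workers[[1]], 2), backends))
  }
  out <- stats::setNames(rep(1, 2), backends)
  out[names(workers)] <- workers
  out
}

normalize_workers(8)                            # future = 8, callr = 8
normalize_workers(c(future = 100, callr = 4))   # future = 100, callr = 4
normalize_workers(c(future = 16))               # future = 16, callr = 1
```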

@wlandau wlandau changed the title Set a different number of remote vs local workers for manual scheduling. Refactor and rename the jobs argument. Feb 23, 2018

krlmlr commented Feb 23, 2018

Are you still going to support workers = c(imports = 16), perhaps defaulting to the number of available cores?


wlandau commented Feb 23, 2018

That's interesting. So if we name just one of the slots, the other should fall back to some default? For workers = c(future = 16), I was thinking imports should just be 1 if missing. I don't want to accidentally max out somebody's shared login node. On the other hand, I thought an unnamed workers = 16 would use a max of 16 for each of future and callr, or else an overarching max of 16 for both kinds of worker combined.

A note on names: I would actually rather make the distinction between local vs remote workers rather than targets vs imports. With the upcoming parallelism column in workflow plans, some targets may use callr and others may use future (with imports exclusively using callr).


wlandau commented Feb 25, 2018

Looking back at the Make man page, I remember now why I originally picked "jobs" rather than "workers".

$ man make

...
-j [jobs], --jobs[=jobs]
     Specifies the number of jobs (commands) to run simultaneously...

So "jobs" was not an entirely terrible choice of words. In any case, I am biased toward keeping it because it would be extremely painful to deprecate.

If/when we run multiple jobs per worker, we could make the behavior backward compatible: workers could be the max parallel workers, jobs could be the max concurrent commands, and we could let drake do the load balancing.

# 1 worker with 1 job (default)
make(...)

# 2 jobs with 2 workers to match (1 job per worker, back compatible)
make(..., jobs = 2)

# same
make(..., workers = 2)

# 4 concurrent jobs spread over 2 parallel workers 
make(..., workers = 2, jobs = 4) 

# 16 concurrent jobs spread over 4 parallel workers,
# plus 2 multicore jobs on the master process for imports, etc.
make(..., workers = 4, jobs = c(master = 2, workers = 16))

# 16 concurrent jobs spread over 16 parallel workers (one job per worker)
# 2 multicore jobs on the master process for imports, etc.
make(..., jobs = c(master = 2, workers = 16))

Thoughts? I could start implementing jobs = c(master = x, workers = y) fairly soon, and we won't have to create a workers argument until we are ready to have multiple jobs per worker.
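The load balancing described above could amount to something like this (a hypothetical illustration, not drake code): spread the concurrent commands as evenly as possible over the parallel workers.

```r
# Hypothetical sketch of the load balancing described above:
# spread `jobs` concurrent commands over `workers` parallel workers,
# giving each worker an (almost) equal share.
jobs_per_worker <- function(workers, jobs) {
  base <- jobs %/% workers   # every worker gets at least this many
  extra <- jobs %% workers   # the remainder goes to the first workers
  base + (seq_len(workers) <= extra)
}

jobs_per_worker(workers = 2, jobs = 4)   # 2 2
jobs_per_worker(workers = 4, jobs = 16)  # 4 4 4 4
jobs_per_worker(workers = 3, jobs = 4)   # 2 1 1
```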

@wlandau wlandau changed the title Refactor and rename the jobs argument. Refactor the jobs argument. Feb 25, 2018

wlandau commented Feb 27, 2018

Are you still going to support workers = c(imports = 16), perhaps defaulting to the number of available cores?

I have been thinking about this, and I decided I agree after all. jobs = c(imports = x, targets = y) gains us a lot of ground, and it is much easier to implement than what I was thinking of before.
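As an illustration of those semantics (the helper function and the default of 1 for a missing slot are hypothetical, not the actual implementation): a named jobs vector would give imports and targets separate levels of parallelism, while a bare number keeps the old single-number behavior.

```r
# Hypothetical illustration of jobs = c(imports = x, targets = y):
# a named vector sets each stage separately; a bare number applies
# to both stages (backward compatible with the old `jobs` argument).
jobs_for_stage <- function(jobs, stage = c("imports", "targets")) {
  stage <- match.arg(stage)
  if (is.null(names(jobs))) {
    return(jobs[[1]])  # old behavior: one number for everything
  }
  if (stage %in% names(jobs)) jobs[[stage]] else 1  # conservative default
}

jobs_for_stage(c(imports = 4, targets = 8), "targets")  # 8
jobs_for_stage(2, "imports")                            # 2
```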


wlandau commented Feb 27, 2018

Fixed via #278. May revisit when it is time to consider multiple jobs per worker.

@wlandau wlandau closed this as completed Feb 27, 2018

wlandau commented May 11, 2018

Update: because of #369 and its documentation, I am more comfortable leaving the name "jobs" alone. Yes, we actually have persistent workers in the lapply-like parallel backends, but the workers are transient (essentially per-target jobs) for the "future" and "Makefile" backends. So "jobs" can stay a catch-all for "jobs or workers".
