sequential job submission #105
Interesting. What about the … Also, I'm curious: are all your jobs independent? In other words, would you be able to submit all 5000 jobs at once if your sys admin let you?
Hi Marco,
That's likely because we use job arrays instead of submitting each job sequentially. I think …
I'm still wondering how common that is. How long does one function call last, and what is the maximum wall time your cluster allows? Related to #101.
This is currently not supported. The simplest workaround would be:

    result = list()
    for (i in seq(1, nrow(df), 500))
        result = c(result, Q_rows(df[i:min(i + 499, nrow(df)), ], ..., job_size = 1))
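Fleshed out into something runnable, that loop might look roughly like the sketch below; the parameter grid, the simulate() function, and the chunk size are placeholders, not part of the original comment:

```r
library(clustermq)

# hypothetical parameter grid; column names must match the arguments of the
# function passed to Q_rows()
df = expand.grid(replicate = 1:1000, roughness = c(0.2, 0.5, 0.8))

# stand-in for the real per-row computation
simulate = function(replicate, roughness) {
    data.frame(replicate = replicate, roughness = roughness, value = rnorm(1))
}

chunk = 500                      # how many jobs the queue accepts at once
result = list()
for (i in seq(1, nrow(df), chunk)) {
    rows = i:min(i + chunk - 1, nrow(df))
    # each Q_rows() call blocks until its jobs finish, so the next chunk
    # is only submitted once the previous 500 are done
    result = c(result, Q_rows(df[rows, ], simulate, job_size = 1))
}
```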
More than somewhat - I don't know how that works, but the scheduling is insanely fast (from pending to running, not only spawning the jobs).
I think I am using all of that in a quite unconventional way, so probably not that common. I am simulating and analysing artificial landscapes and that takes quite some time, so a single function call can last up to 4.5 hours. The max walltime I can get out of our multi-purpose queue is 48 h. If I have more than 10 times the amount of jobs I can submit at once, I reach that limit...
That makes sense, thanks 👍 @wlandau Yup, my jobs are all independent (or embarrassingly parallel 😋). I am doing a parameter space exploration, and the scheduling with LSF gives me a bonus if I send small jobs (getting an exclusive node takes a day or two here, vs getting a single core here and there is no problem at all).
Hi Michael,
thanks a ton for clustermq, it has been really nice to play around with it so far!
I have a question about how one could use Q to submit a high number of jobs. I don't know if that is specific to our HPC (which uses LSF), but if I use clustermq all the jobs are submitted instantly. This is a huge advantage over batchtools and reduces the computation time significantly. What I want to achieve is to submit each row of a data frame (with say 5000 rows) as a single job.
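Purely for illustration (the grid and function names below are made up, not from this issue), submitting each row as its own job could look like this with Q_rows, which is what runs into the job limit described next:

```r
library(clustermq)

# hypothetical 5000-row parameter grid; column names must match the worker
# function's arguments
df = expand.grid(seed = 1:1000, landscape_type = letters[1:5])

analyse_landscape = function(seed, landscape_type) {
    sum(rnorm(10, mean = seed))   # placeholder for the real simulation/analysis
}

# one call (= one data frame row) per job
result = Q_rows(df, analyse_landscape, job_size = 1)
```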
Apparently, I can only submit ~500 jobs at once to our HPC, so I would like to send 500 jobs there, and after they are finished, send the next 500 jobs to the HPC.
However, clustermq seems to hold jobs open and submits new rows within the jobs it opened at the function call. This means that a single job could come quite close to the walltime of our cluster.
Is it possible to submit, for example, 5000 jobs in chunks of 500 single jobs that are submitted at once, with Q sending 500 new jobs after the previous chunk has finished?
With future, I would do something like this to achieve that:
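(The original snippet did not survive extraction; below is only a rough guess at the kind of chunked future-based loop meant here, with a placeholder data frame, a placeholder analyse_row() function, and a local plan() instead of whatever LSF-backed plan the cluster uses.)

```r
library(future.apply)

# plan() would normally point at the LSF scheduler (e.g. via future.batchtools);
# multisession is used here only so the sketch runs anywhere
plan(multisession)

df = data.frame(a = runif(5000), b = runif(5000))   # stands in for the real 5000-row data frame
analyse_row = function(row) sum(unlist(row))        # placeholder per-row computation

results = list()
for (i in seq(1, nrow(df), 500)) {
    rows = i:min(i + 499, nrow(df))
    # future_lapply() blocks until this chunk of futures has resolved,
    # so the next 500 are only dispatched once the current 500 are done
    results = c(results, future_lapply(rows, function(r) analyse_row(df[r, ])))
}
```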
Cheers
Marco