Is there a way to disable drake's caching system? #384
Comments
Unfortunately,
I suspect You might also use custom files for all your targets and make sure the outputs of your commands are light. That way, |
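To illustrate the idea of keeping command outputs light, here is a minimal sketch in base R, not taken from the thread. `convert_csv` is a hypothetical helper, the row filtering is a placeholder for the real manipulation, and `saveRDS()` is only a stand-in for a feather writer so that the example has no package dependencies. The point is that the function writes its result to a file and returns just the path, so whatever stores the return value has very little to save.

```r
# Hypothetical helper: convert one CSV file and return only the output path.
convert_csv <- function(input_path, output_path) {
  data <- read.csv(input_path, stringsAsFactors = FALSE)
  # Placeholder manipulation: keep complete rows only.
  data <- data[stats::complete.cases(data), , drop = FALSE]
  # saveRDS() stands in for a feather writer so the sketch needs only base R.
  saveRDS(data, output_path)
  # Return the path, not the data frame, so the stored value stays small.
  output_path
}

# Usage with a small temporary file.
input <- tempfile(fileext = ".csv")
write.csv(
  data.frame(x = c(1, NA, 3), y = c("a", "b", "c")),
  input,
  row.names = FALSE
)
result <- convert_csv(input, tempfile(fileext = ".rds"))
```

In a drake plan, the paths would presumably be wrapped in `file_in()` and `file_out()` so that the files themselves are tracked; that wiring is left out here.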
@wlandau, thanks for your response.
From your reply, I understand that it is not possible to keep dependencies in memory between targets for parallel workflows. Is that correct? |
From looking at the console output of
Until that is fixed, there are a couple things we could try. Are you open to modifying your workflow so that it relies less on the cache? Is it possible to have most of the commands read in CSV files and output feather files? In your case, the more often you have
You could also try using an in-memory cache, but I do not generally recommend it for parallel processing because it is not thread safe.
    cache <- storr::storr_environment()
    make(cache = cache, jobs = 8) # May error, but could be worth a shot.
What is supposed to happen
Future work
I do realize that |
@guilhermealles after closer inspection of development drake's behavior, I am more convinced that the slowdown you see is because |
And back to your original question, unfortunately, disabling |
Ah, I now see how this could have been related to StarVZ. The development version of |
To clarify, "hasty mode" disables the caching system, but it does not skip up-to-date targets unless you come up with your own system for doing so (maybe in the |
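As a rough illustration of what "your own system" for skipping up-to-date targets could look like, here is a base R sketch based on file modification times. It is an assumption about one possible approach, not something drake provides: `needs_rebuild` and `build_if_stale` are hypothetical helpers, and a timestamp check only notices changed input files, not changed code.

```r
# Hypothetical helper: TRUE when the output must be (re)built.
needs_rebuild <- function(input_path, output_path) {
  if (!file.exists(output_path)) {
    return(TRUE)
  }
  # Rebuild when the input was modified after the output was written.
  file.mtime(input_path) > file.mtime(output_path)
}

# Hypothetical wrapper: run `build` only when the output is stale.
# Returns TRUE if the build ran and FALSE if it was skipped.
build_if_stale <- function(input_path, output_path, build) {
  if (needs_rebuild(input_path, output_path)) {
    build(input_path, output_path)
    return(TRUE)
  }
  FALSE
}

# Usage: the first call builds the output, the second call skips it.
input <- tempfile(fileext = ".csv")
output <- tempfile(fileext = ".rds")
write.csv(data.frame(x = 1:3), input, row.names = FALSE)
build <- function(input_path, output_path) {
  saveRDS(read.csv(input_path), output_path)
}
first_run <- build_if_stale(input, output, build)
second_run <- build_if_stale(input, output, build)
```

Comparing timestamps is cheap but coarse; hashing the input files would catch more cases at the cost of reading them again.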
Another approach: maybe try |
I am trying to use Drake's parallelism features to speed up a data manipulation workflow. My workflow reads some large .csv files from disk, manipulates them and outputs some optimized .feather files to be used by another application.
The problem is that, with Drake (even in parallel), the workflow is considerably slower to complete (twice as slow or even more) than its sequential version. My hypothesis is that it takes so long because Drake caches its targets to disk, which is not necessary in the specific case I am working with. Is there a way to disable Drake's caching system?