
Integration with futile.logger #268

Open
adrfantini opened this issue Dec 7, 2018 · 8 comments
@adrfantini

Hi and thanks for the awesome package.
I usually use futile.logger to log to the console, but unfortunately when using parallel futures, messages are relayed only at the end of the job. Is there a way to make the logging call flush directly to the console?

I opened a similar issue (zatonovo/futile.logger#84) on futile.logger's github page, and the author suggested to raise the matter here.
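
For the record, here is a minimal sketch of the behavior I mean (this assumes the future package is installed alongside futile.logger; it is an illustration, not a test case):

```r
# Sketch: with a parallel backend, log output produced on the worker is
# not shown while the future runs -- it is relayed only once the future
# is resolved and its value is collected.
library(future)
library(futile.logger)

plan(multisession)  # parallel backend; workers are separate R sessions

f <- future({
  flog.info("logging from the worker")  # not displayed immediately
  42L
})

value(f)  # any relayed output appears only here, at resolution
```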

@HenrikBengtsson
Collaborator

HenrikBengtsson commented Dec 7, 2018

Is there a way to make the logging call flush directly to the console?

In general, no, but it might be possible to add support for this for certain future backends (e.g. future.callr).

For more background, see the discussion in Issue #172 ("DESIGN: Future API - Minimal/Core/Essential API and Extended/Optional API"), which has pointers to other, closely related feature requests.

@adrfantini
Author

Thanks for the info. @muxspace, do you find anything in #172 that could help?

@muxspace

@adrfantini if the hooks were implemented, maybe that would work? Let's jump back to the zatonovo/futile.logger#84 issue in futile.logger, as I have a few questions for you.

@HenrikBengtsson
Collaborator

Thanks for the interest. Just for clarification: there are basically two kinds/levels of logging and progress frameworks people are asking for. It's useful to distinguish the two when discussing feature requests, because their implementations are quite different:

  1. The "easy" one is the one that is solely orchestrated from the local, main R session where we can imagine hook functions being signaled/called when a future is created, launched, queried ("are you resolved yet"?), and results are pulled back from the worker. I can also imagine global hook functions that are called for all futures and hook functions that are specific to a particular future (i.e. there might be a need for two different hook function APIs).

  2. The other one is much harder: users/developers want "live" updates from the worker, which requires a much more complicated, asynchronous orchestration framework. Getting this to work consistently across OSes and R environments (R terminal, Rgui, RStudio, ...) also needs to be considered.
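
To make (1) concrete, here is a rough, runnable sketch of what such local orchestration could look like today. Note that with_hooks() is my own throwaway helper for illustration, not an existing or proposed future API (this assumes the future package):

```r
# Everything below runs in the main R session, which is what makes this
# kind of hooking "easy": we control the points where the future is
# created/launched and where its value is collected.
library(future)
plan(multisession)

# Hypothetical helper (NOT part of the future package): wrap a function
# in user-supplied lifecycle callbacks.
with_hooks <- function(fn, on_launch, on_value) {
  on_launch()
  f <- future(fn())      # fn is exported to and evaluated on the worker
  v <- value(f)          # blocks until the worker is done
  on_value(v)
  v
}

y <- with_hooks(
  function() { Sys.sleep(1); 42L },
  on_launch = function() message("future launched"),
  on_value  = function(v) message("future resolved with value: ", v)
)
```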

@muxspace

I think @adrfantini just wants to have the log messages emitted from futile.logger emitted when executed. They appear to not be flushed until the future is fully resolved. That's my understanding.

@HenrikBengtsson
Collaborator

So, then that's the second and much harder kind of output relaying. Below are two minimal examples that illustrate the problems involved. These examples correspond to the future backends plan(multicore) and plan(multisession).

Example 1: Using forked processes (Unix & macOS only) displays the output from child processes immediately when running R in a terminal, e.g.

> y <- parallel::mclapply(1:2, FUN = print)
[1] 1    <= stdout from the child process
[1] 2    <= stdout from the child process
> str(y)
List of 2
 $ : int 1
 $ : int 2
>

yet you cannot capture it from the main R process;

> out <- capture.output(y <- parallel::mclapply(1:2, FUN = print))
> str(out)
 chr(0)
> str(y)
List of 2
 $ : int 1
 $ : int 2
>

and if called from, say, the RStudio Console, the output is completely lost:

> y <- parallel::mclapply(1:2, FUN = print)
> str(y)
List of 2
 $ : int 1
 $ : int 2
>

Example 2: Using a PSOCK cluster (all OSes), we get:

> cl <- parallel::makeCluster(1L)
> y <- parallel::parLapply(cl, X = 1:2, fun = print)
> str(y)
List of 2
 $ : int 1
 $ : int 2
>

regardless of whether R runs in a terminal or, say, RStudio. If running in the terminal (also the RStudio Console on Linux), we can do:

> cl <- parallel::makeCluster(1L, outfile = "")
starting worker pid=9000 on localhost:11851 at 20:05:19.612  <= worker startup message also goes to the terminal

> y <- parallel::parLapply(cl, X = 1:2, fun = print)
[1] 1    <= "terminal" stdout from the child process
[1] 2    <= "terminal" stdout from the child process
> str(y)
List of 2
 $ : int 1
 $ : int 2
>

Also, that stdout is going straight to the terminal and never reaches R, e.g.

> out <- capture.output(y <- parallel::parLapply(cl, X = 1:2, fun = print))
[1] 1
[1] 2
> str(out)
chr(0)
>

But if using RStudio Console on Windows, I think the above output is completely lost.

I don't see how it's possible to relay the output from print() in a "live" fashion in a consistent manner. This is why I say it may be done for some parallel backends and R setups, but not for all.

@adrfantini
Author

As far as my tests go (Linux only), live logging does not work in any case when logging to the console, but it does work when logging to a file, which is good and, I would say, covers many use cases.
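
In case it helps others, this is roughly what my file-based setup looks like (assuming the future.apply package; note that all workers append to the same file here, which has worked for me but is not guaranteed to interleave records cleanly):

```r
# Each worker writes log records to the file as it processes its chunk,
# so `tail -f run.log` shows progress while the futures are running.
library(future.apply)
library(futile.logger)

plan(multisession)

y <- future_lapply(1:4, function(i) {
  flog.appender(appender.file("run.log"))  # configure inside the worker
  flog.info("processing element %d", i)
  Sys.sleep(0.5)
  i^2
})
```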

@s-fleck

s-fleck commented Jan 5, 2019

Not a solution, but if you are on Linux you can just keep a terminal open with tail -f logfile.log. That's pretty close to real-time logging, though not to the R console, I guess :/
