Tracking user-defined functions that do not create targets #260

Closed
6 tasks done
arnold-c opened this issue Jan 6, 2021 · 1 comment

arnold-c commented Jan 6, 2021

Prework

  • Read and agree to the code of conduct and contributing guidelines.
  • If there is already a relevant issue, whether open or closed, comment on the existing thread instead of posting a new issue.
  • For any problems you identify, post a minimal reproducible example so the maintainer can troubleshoot. A reproducible example is:
    • Runnable: post enough R code and data so any onlooker can create the error on their own computer.
    • Minimal: reduce runtime wherever possible and remove complicated details that are irrelevant to the issue at hand.
    • Readable: format your code according to the tidyverse style guide.

Question

I'd like to track changes to user-defined functions that do not create targets but are used in a notebook that is part of a pipeline. For example, I have a quick plotting function that I use repeatedly, but I do not want to create a target for every single plot in a notebook, as this would clutter the _targets.R file. However, if I change that function, the pipeline still thinks the notebook is up to date, even though the function is sourced into the notebook.

I believe this is different to issues #241 and #239 as these relate to packages, not standalone functions.

One thought I had was to write these functions in a separate R file that I can track using format = "file" like below, and this seems to work, but I just wanted to make sure this is a reasonable thing to do and wouldn't lead to unintended consequences.

In _targets.R:

list(
    ...
    tar_target(
        user_functions,
        here::here("funs", "user-defined-functions.R"),
        format = "file"
    ),
    ...
    tar_render(analysis_notebook, "analysis-notebook.Rmd")
)

In my user-defined-functions.R file:

library(tidyverse)

df <- tibble(thresholds = 0:10, sensitivity = seq(0, 1, 0.1), specificity = rev(seq(0, 1, 0.1)))

my_function <- function(data){
  data %>%
    pivot_longer(cols = sensitivity:specificity, names_to = "metric", values_to = "value") %>%
    ggplot(aes(x = thresholds, y = value, color = metric)) +
    geom_point()
}

my_function(data = df)

In my Rmd notebook:

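tar_load(user_functions)  # loads the tracked file path, not the functions themselves
source(user_functions)    # runs the script so my_function() and df exist in the notebook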

my_function(data = df)

Created on 2021-01-06 by the reprex package (v0.3.0)

wlandau (Member) commented Jan 6, 2021

> One thought I had was to write these functions in a separate R file that I can track using format = "file" like below, and this seems to work, but I just wanted to make sure this is a reasonable thing to do and wouldn't lead to unintended consequences.

Sounds like a great workaround to me.

Another approach is to define a target for all the function objects:

tar_target(functions, list(my_function, fun2, fun3, ...))
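In a full pipeline, that one-liner might look roughly like the sketch below. It assumes the helper script from the question lives at funs/user-defined-functions.R and only defines functions. The target name plot_functions is made up for illustration.

```r
# _targets.R (sketch, not a definitive setup)
library(targets)
library(tarchetypes)

# Assumption: this script only defines functions such as my_function().
source("funs/user-defined-functions.R")

list(
  # Hypothetical target that bundles the helper functions into one object.
  # Because the command mentions my_function, a change to that function
  # invalidates this target.
  tar_target(plot_functions, list(my_function = my_function)),

  # The notebook depends on plot_functions only if its code chunks
  # mention it through tar_load() or tar_read().
  tar_render(analysis_notebook, "analysis-notebook.Rmd")
)
```

Inside the notebook, a call such as tar_load(plot_functions) gives tar_render() a dependency to detect, after which the helper is available as plot_functions$my_function.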

This technique is less trigger-happy than format = "file" because it does not invalidate the target if there are trivial changes to comments and whitespace. For a function in memory, targets cannot tell the difference between this:

my_function <- function(data){
  data %>%
    pivot_longer(cols = sensitivity:specificity, names_to = "metric", values_to = "value") %>%
    ggplot(aes(x = thresholds, y = value, color = metric)) +
    geom_point()
}

and this:

my_function <- function(data){
  data %>%

    pivot_longer(
      cols = sensitivity:specificity,
      names_to = "metric",
      values_to = "value"
    ) %>%

    ggplot(aes(x = thresholds, y = value, color = metric)) +

    geom_point()
}
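The difference between the two approaches can be illustrated in base R alone. The sketch below is only an approximation of the idea, not the internals of targets: it compares a checksum of the raw file text with the deparsed function object. The helper my_helper and the functions hash_text and deparse_function are hypothetical names introduced for this example.

```r
# Hypothetical helper written twice: compact, then with a comment and
# extra whitespace. The behavior is the same in both versions.
compact <- "my_helper <- function(x) {\n  x + 1\n}"
spaced <- "my_helper <- function(x) {\n\n  # add one\n  x +\n    1\n\n}"

# File-level view: checksum the raw text, as tracking a script file would.
hash_text <- function(code) {
  path <- tempfile(fileext = ".R")
  on.exit(unlink(path))
  writeLines(code, path)
  unname(tools::md5sum(path))
}

# Object-level view: parse without source references, evaluate the
# definition in a scratch environment, then deparse the function.
deparse_function <- function(code) {
  env <- new.env()
  eval(parse(text = code, keep.source = FALSE), envir = env)
  deparse(env$my_helper)
}

file_hashes_match <- identical(hash_text(compact), hash_text(spaced))
deparsed_match <- identical(deparse_function(compact), deparse_function(spaced))
```

Here file_hashes_match is FALSE while deparsed_match is TRUE: the file checksum reacts to the comment and blank lines, whereas the function object looks the same once source references are dropped.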

wlandau closed this as completed Jan 6, 2021