Add copy / paste to @render.data_frame #1560

Open
mdaeron opened this issue Jul 19, 2024 · 6 comments
Labels
data frame (Related to @render.data_frame), enhancement (New feature or request)

Comments

@mdaeron

mdaeron commented Jul 19, 2024

This is half a feature request, half a question about how shiny conceptualizes data input.

I'd like to use shiny to help my researcher colleagues process some analytical data. Because most of the potential users are not proficient coders, the most accessible/realistic way to input data should be copy-and-paste from a spreadsheet app into the shiny app. Pasting into an editable datagrid does not appear to be currently implemented, and the fact that DataGrid is explicitly defined as an output makes me wonder if my use case is at odds with some core philosophy of shiny, such as "input data resides on the server", or "input data must be uploaded from a file".

My two cents: pasting into a datagrid offers at least two benefits over simply pasting data into a text area: (a) because any problems in formatting/cell mismatches are immediately apparent, the user can easily perform some kind of visual data validation; (b) one thing end-users frequently mistype is column headers; this problem disappears if the headers are already specified by a datagrid input.

So, is it possible that datagrids will some day be treated as full-blown inputs, or would that break things and/or go against some core conceptualization of py-shiny?

schloerke added the enhancement (New feature or request) and data frame (Related to @render.data_frame) labels and removed the needs-triage label on Jul 22, 2024
@schloerke
Collaborator

Related - Add ability to update cells and data from the server: #1449


> Pasting into an editable datagrid does not appear to be currently implemented ...

Correct. As of v1.0.0, this is not implemented. (But it could be in the future! I will expand on this in a follow-up comment.)


> ... and the fact that DataGrid is explicitly defined as an output makes me wonder if my use case is at odds with some core philosophy of shiny, such as "input data resides on the server", or "input data must be uploaded from a file".

We already have the ability to edit a data frame within the browser via a single cell edit.

The current approach has been:

  • The data starts as the return value of the function decorated with @render.data_frame.
  • An edit is made in the browser and the resulting patch is sent to the server.
  • The server processes the patch (approving it by not throwing an error, and returning it along with any other updated patches).
  • The browser receives the approved patches and inserts them into the displayed table.

We can access the updated data via .data_view() and the original data via .data().
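The round trip above can be modelled in plain Python. This is only a rough sketch and does not use shiny itself: `EditableTable`, `edit()`, and the `numbers_only` callback are hypothetical stand-ins, and the patch keys (`row_index`, `column_index`, `value`) are an assumed shape for a cell patch.

```python
from copy import deepcopy
from typing import Any, Callable, Dict, List

Patch = Dict[str, Any]  # assumed shape: row_index, column_index, value


class EditableTable:
    """Toy model (not shiny's API) of an original table plus an edited view."""

    def __init__(self, rows: List[List[Any]]) -> None:
        self._original = deepcopy(rows)
        self._view = deepcopy(rows)

    def data(self) -> List[List[Any]]:
        return self._original  # never modified by edits

    def data_view(self) -> List[List[Any]]:
        return self._view  # reflects every approved patch

    def edit(
        self,
        patches: List[Patch],
        approve: Callable[[List[Patch]], List[Patch]],
    ) -> None:
        # If `approve` raises, nothing below runs and the view is unchanged.
        approved = approve(patches)
        for p in approved:
            self._view[p["row_index"]][p["column_index"]] = p["value"]


def numbers_only(patches: List[Patch]) -> List[Patch]:
    """Approve patches only if every value parses as a float."""
    return [{**p, "value": float(p["value"])} for p in patches]


table = EditableTable([[1.0, 2.0], [3.0, 4.0]])
table.edit([{"row_index": 0, "column_index": 1, "value": "9.5"}], numbers_only)
```

After the call, `table.data_view()` holds the edited value while `table.data()` still holds the original one. A patch such as `"abc"` makes `numbers_only` raise `ValueError`, so the whole edit is rejected.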


> is at odds with some core philosophy of shiny

It is definitely different, as updates happen to the rendered output after the initial rendering has occurred.

But if the decorated @render.data_frame method is executed again (due to reactivity), then (currently) the user's edits, sorting, and filtering are dropped. Everything is reset. This prevents us from entering a permanent dirty state that we cannot escape, where a previous interaction has compromised the current state.


> such as "input data resides on the server", or "input data must be uploaded from a file"

@render.data_frame needs the data to be returned from its decorated function. It does not care how the data gets there... from a file or from the server itself.

@schloerke
Collaborator

Questions I run into with pasting data:

  • Situation: Pasted data has new columns
    • ... Should they be string types on the server?
    • What if they look like numbers? Should they be converted then?
    • Can the server reject the new columns but keep the cells in the current columns? How can we accept and reject in a single request?
      • Currently it only returns an accepted set or the originally edited cells fail.
      • Maybe change the return patch structure to allow for error: str?
    • What new column names should be used?
      • I do not believe they're included in the pasted data
  • Situation: If the column type can be determined from the pasted data...
    • Should we auto reject if the types are a mismatch?
    • Should .set_patches_fn handle everything?
  • Situation: Pasted data has new rows
    • Add new rows! (This isn't contentious, but will need to be addressed)
    • What value do we use for the columns that did not receive data?
  • Situation: Remove columns
    • If you can add columns, it quickly extends to being able to remove columns
  • Situation: Remove rows
    • If the newly pasted data is smaller, should the row count be trimmed?
    • If not, should they be None values on the server? (What value is used as a placeholder value? Will it conflict with the column type (polars)?)
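To make the "new rows" and "new columns" situations concrete, here is a minimal sketch of turning clipboard text into cell patches. It assumes the clipboard holds tab-separated cells with one row per line (what spreadsheet apps typically produce). The function names and the patch keys are hypothetical, not part of shiny.

```python
import csv
import io
from typing import Any, Dict, List, Tuple

Patch = Dict[str, Any]  # assumed shape: row_index, column_index, value


def parse_clipboard(text: str) -> List[List[str]]:
    """Split tab-separated clipboard text into rows of string cells."""
    reader = csv.reader(io.StringIO(text), delimiter="\t")
    return [row for row in reader if row]  # skip empty lines


def paste_to_patches(
    cells: List[List[str]], top: int, left: int, n_rows: int, n_cols: int
) -> Tuple[List[Patch], List[Patch]]:
    """Anchor the pasted block at (top, left) of an n_rows x n_cols table.

    Returns (patches inside the table, cells that would need new rows/columns).
    """
    inside: List[Patch] = []
    overflow: List[Patch] = []
    for r, row in enumerate(cells):
        for c, value in enumerate(row):
            patch = {"row_index": top + r, "column_index": left + c, "value": value}
            fits = patch["row_index"] < n_rows and patch["column_index"] < n_cols
            (inside if fits else overflow).append(patch)
    return inside, overflow


cells = parse_clipboard("1\t2\t3\r\n4\t5\t6\r\n")
inside, overflow = paste_to_patches(cells, top=1, left=1, n_rows=2, n_cols=3)
```

Every cell arrives as a string, so the "should they be converted?" question still has to be answered somewhere. The `overflow` list is exactly the part of the paste that raises the add-rows and add-columns questions.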

On the server side, we can get pretty far with .update_cell_value() and .update_data() proposed in #1449.

  • By only updating cells, we can accept/reject immediately.
  • When updating data, it will come with a new set of columns, rows, and data types all at once. No questions around missing column names or missing cell values.
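As an illustration of per-cell accept/reject, here is a sketch that coerces each pasted string to its column's type and attaches an optional `error` string, along the lines of the `error: str` idea above. The `review_patches` helper, the `COERCERS` table, and the type names are assumptions made for this example. None of this is existing shiny behavior.

```python
from typing import Any, Callable, Dict, List

Patch = Dict[str, Any]  # assumed shape: row_index, column_index, value

# Hypothetical mapping from a column's type name to a string parser.
COERCERS: Dict[str, Callable[[str], Any]] = {"int": int, "float": float, "str": str}


def review_patches(patches: List[Patch], column_types: List[str]) -> List[Patch]:
    """Return one result per patch, with `error` set to None when accepted."""
    results: List[Patch] = []
    for p in patches:
        dtype = column_types[p["column_index"]]
        try:
            value = COERCERS[dtype](p["value"].strip())
            results.append({**p, "value": value, "error": None})
        except ValueError as exc:
            # Keep the raw value so the browser could show what was rejected.
            results.append({**p, "error": f"expected {dtype}: {exc}"})
    return results


results = review_patches(
    [
        {"row_index": 0, "column_index": 0, "value": "3"},
        {"row_index": 0, "column_index": 1, "value": " 2.5 "},
        {"row_index": 1, "column_index": 0, "value": "abc"},
    ],
    column_types=["int", "float"],
)
```

Here the first two cells are accepted and converted, while the third keeps its raw string and carries an error message, so a single response can mix accepted and rejected cells.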

Goal: Reduce this question set. Any thoughts or approaches are appreciated!

@kwa

kwa commented Sep 19, 2024

Hi, I asked on your Discord and was pointed to this issue about interacting with a datagrid/dataframe, and specifically about keeping its state after re-rendering. I think there is another issue on that subject; perhaps it is #1449.

I think two separate issues are being discussed here:

  1. How to paste (possibly copy as well) into an existing dataframe.
  2. How to manipulate the state of a dataframe.

I see the former (1) as a special case of (2).
The copy/paste feature, to me, seems like a somewhat special use case but state handling seems more of a general concern for all widgets.

I'll get to my proposal and discuss the implications of copying and pasting afterward.

Other frameworks have the concept of filters or interceptors. They operate similarly to a Python decorator in that they can intercept a function call and manipulate its response. In my case, I have external state in the form of form controls whose settings or filters should influence the dataframe's current data. I want to impose state changes like presorting, removing rows, or removing columns. I also want access to the options the user has currently selected in the rendered dataframe, such as the filtering and sort order for one or several columns, so that I can choose to include or disregard the user's current choices.

An interceptor that can manipulate the data before it is rendered, and that provides a hook for when the user's changes to the dataframe widget cause its UI to change, would be a powerful feature. Basically, you are dynamically changing the underlying dataframe before it is displayed. The API already has a notion of the original .data() and the current .data_view(), as well as @grid.set_patch_fn and so on. Being able to interact with those two states is what I am looking for.
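A minimal sketch of what such a pre-render interceptor could look like as a plain Python decorator. The `intercept` decorator and the example functions are hypothetical names for illustration, and rows are modelled as simple dicts rather than a real dataframe.

```python
import functools
from typing import Any, Callable, Dict, List

Row = Dict[str, Any]


def intercept(before_render: Callable[[List[Row]], List[Row]]):
    """Wrap a data-producing function so its result is transformed before display."""

    def decorator(render_fn: Callable[..., List[Row]]) -> Callable[..., List[Row]]:
        @functools.wraps(render_fn)
        def wrapper(*args: Any, **kwargs: Any) -> List[Row]:
            return before_render(render_fn(*args, **kwargs))

        return wrapper

    return decorator


def drop_negative_and_sort(rows: List[Row]) -> List[Row]:
    """Example interceptor: remove rows with negative values, then presort."""
    kept = [r for r in rows if r["value"] >= 0]
    return sorted(kept, key=lambda r: r["value"])


@intercept(drop_negative_and_sort)
def measurements() -> List[Row]:
    return [
        {"sample": "a", "value": 3.0},
        {"sample": "b", "value": -1.0},
        {"sample": "c", "value": 1.5},
    ]
```

Calling `measurements()` now yields the rows for samples "c" and "a", in that order. The interceptor could just as well read external form controls to decide what to drop or how to sort.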

Back to pasting. I do not think it is the framework's role to keep track of the validity of data. For one thing, you have to guess what the correct outcome should be and my experience is that one never guesses what the user actually wants. Furthermore, all your questions and concerns seem to reinforce my point as there are too many things to consider.

However, you can give that responsibility to the code author using... an interceptor :-)
The pasting interceptor could intercept the pasted content and attempt to rectify the data so that it matches what is currently rendered. The author probably knows exactly what is required and can make educated guesses, or may simply know what to do. If the paste fails, the author can provide a decent error message explaining why it did not work, with context-sensitive information for the user to act on.
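For example, an author-written paste interceptor might look roughly like the sketch below. Everything here is hypothetical (`paste_interceptor`, `PasteRejected`, and the rule of dropping a pasted header row that matches the table's headers). It only illustrates the author, rather than the framework, deciding what a valid paste is.

```python
from typing import List


class PasteRejected(ValueError):
    """Raised with a user-facing explanation when a paste cannot be used."""


def paste_interceptor(rows: List[List[str]], expected: List[str]) -> List[List[str]]:
    """Rectify pasted rows so they match the table's columns, or explain why not."""
    if not rows:
        raise PasteRejected("The clipboard is empty.")

    # Users often copy the header row too: drop it if it matches the
    # table's headers, ignoring case and surrounding whitespace.
    first = [cell.strip().lower() for cell in rows[0]]
    if first == [name.lower() for name in expected]:
        rows = rows[1:]

    for i, row in enumerate(rows, start=1):
        if len(row) != len(expected):
            raise PasteRejected(
                f"Row {i} has {len(row)} cells, but the table expects "
                f"{len(expected)} ({', '.join(expected)})."
            )
    return [[cell.strip() for cell in row] for row in rows]


cleaned = paste_interceptor(
    [[" Sample ", "Value"], ["A1", " -3.2 "]], expected=["Sample", "Value"]
)
```

A paste with the wrong number of columns raises `PasteRejected`, whose message names the expected headers so the user can fix the selection and try again.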

The same concept applies to validation, to take another example: the code intercepts a form submission and tells the user what to fix, instead of sending the form to the server and getting a validation error back (yes, you should always have server-side validation, but that is beside the point, I think).

Sure, it would be good if the framework could figure everything out, but I fear there will be an abundance of edge cases, especially as the widget would have to deal with anything that might be in your clipboard. Then again, perhaps I understand the original request poorly.

So having a way to hook into the rendering pipeline, especially for a datagrid/dataframe, gives a lot of power and responsibility. I think the interceptor concept could be applied to many widgets to control state.

Thanks for reading!

@schloerke
Collaborator

schloerke commented Sep 20, 2024

Thank you for the input! Agreed, there are too many things to consider.

Yes. #1449's .update_data() will be needed to add / remove columns.

My summary of your post...

When pasting, let .set_patches_fn() handle all the edge cases: it will either return new patches (or even call .update_data() directly?) or raise an error.

schloerke changed the title from "Please consider treating datagrids as inputs as well as outputs?" to "Add copy / paste to @render.data_frame" on Oct 10, 2024
@schloerke
Collaborator

Update: #1719 added support for .update_data(data) and .update_cell_value(value, *, row, col) for programmatic updating.

@mdaeron
Author

mdaeron commented Dec 19, 2024

> Update: #1719 added support for .update_data(data) and .update_cell_value(value, *, row, col) for programmatic updating

Do I understand correctly that this feature is one of the building blocks needed to implement copy/pasting into tables, but that the copy/pasting feature is still in the future? If that's the case, do you have any idea how close this feature is to fruition? Thanks.

3 participants