Impute: allow user-defined value for "no data" #7002

Open
wvdvegte opened this issue Jan 22, 2025 · 5 comments
Labels
meal (This will take a day or two), wish

Comments

wvdvegte commented Jan 22, 2025

What's your use case?
In Orange, cells with missing data are shown as "?" in Data Table. These are also the cells that are affected by impute. But in some standards, missing data is represented by some fixed value. For instance, in files produced by Garmin tracking devices, a missing heart rate is shown as "255" (apparently this is a heart rate never measured in real humans, also the max. value fitting into 8 bits).

What's your proposed solution?
Allow user specification of the value that corresponds to "no data"

Are there any alternative solutions?
Convert the "no data" value to "?" using Formula, e.g., `float(nan) if heart_rate == 255 else heart_rate`. However, this is not a very straightforward solution.
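
The Formula workaround amounts to replacing a sentinel value with NaN, which Orange then shows as "?". A minimal sketch of that idea in plain Python follows; it does not use Orange's API, and the function name `sentinel_to_nan` and the sample readings are hypothetical, chosen only for illustration.

```python
import math


def sentinel_to_nan(values, sentinel=255):
    """Return a new list in which every occurrence of `sentinel` becomes NaN."""
    return [math.nan if v == sentinel else float(v) for v in values]


# Illustrative heart-rate readings; 255 stands for "no data".
heart_rate = [72, 255, 80, 255, 65]
cleaned = sentinel_to_nan(heart_rate)
```

After this step, the entries that were 255 are NaN, so a later imputation step could treat them like any other missing value.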

@wvdvegte changed the title from 'Impute: allow user-defined vaalue for "no data"' to 'Impute: allow user-defined value for "no data"' on Jan 22, 2025
@markotoplak
Member

Did you try the Impute widget with its Fixed option?

@wvdvegte
Author

Yes, but that replaces the "?" values with a fixed value. It still needs an 'official' "?" on the input.

@markotoplak
Member

Oh, I misunderstood, OK, so you want the reverse of Impute. :)

@janezd
Contributor

janezd commented Jan 24, 2025

I understand your frustration; I know this situation.

In general we avoid implementing data editing that can be done in Excel. This one is on the border.

We discussed different approaches, from the one that you mentioned (Formula) to changing Impute or defining the representation of the missing value in the File and CSV Importer. In the end we decided that the best place could be the Domain Editor, because the representation of missing values is usually a property of a variable.

This is complicated to implement, though, so do not expect a quick solution...
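
To illustrate why the missing-value marker is naturally a per-variable property, here is a rough sketch in plain Python. It is not how Orange or the Domain Editor works; the function `mark_missing`, the `markers` mapping, and the sample rows are all hypothetical.

```python
import math


def mark_missing(rows, markers):
    """Replace each variable's own "no data" marker with NaN.

    `markers` maps a variable name to the value that means "no data" for
    that variable; variables not listed are passed through unchanged.
    """
    return [
        {
            name: math.nan if name in markers and value == markers[name] else value
            for name, value in row.items()
        }
        for row in rows
    ]


# 255 means "no data" for heart_rate only; a speed of 255 stays a real value.
rows = [
    {"heart_rate": 72, "speed": 255},
    {"heart_rate": 255, "speed": 3.1},
]
result = mark_missing(rows, {"heart_rate": 255})
```

Because the marker is attached to a variable rather than to the whole table, the same number can be valid data in one column and "no data" in another.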

@janezd added the labels wish and meal (This will take a day or two) on Jan 24, 2025
@wvdvegte
Author

I'm not in a hurry - for now, the Formula-based solution works fine and it is simple enough (though not obvious and therefore potentially user-unfriendly).
