[ENHANCEMENT] Multi-output regression #97

Open
vtaquet opened this issue Oct 7, 2021 · 10 comments
Labels
Needs decision The MAPIE team is deciding what to do next. Regression Related to regression (excluding time series)

Comments

@vtaquet
Member

vtaquet commented Oct 7, 2021

Following this paper: http://proceedings.mlr.press/v128/messoudi20a.html

@sdestercke

We have a better calibrated version (using copulas to better adjust target-wise confidence levels): https://arxiv.org/abs/2101.12002. Code available here: https://github.com/M-Soundouss/CopulaConformalMTR

@gmartinonQM
Collaborator

Interesting, thanks for sharing! We will definitely have a look when we start this development.

@gmartinonQM
Collaborator

gmartinonQM commented Jan 17, 2022

Hi @sdestercke, your work is definitely very interesting! What do you think about giving it more visibility by contributing to MAPIE?

Here are my suggestions:

  • Create a new multi_output_regression.py module
  • Create a class MapieMultiOutputRegressor in it, with at least __init__, fit and predict methods
  • There should be at least two options in the __init__: cv and method. Following your papers, cv="prefit" is the only valid option for now (split-conformal), but we could imagine an extension to cross-validation in the future. method could be "single", "multi" or "copula_empirical", for example (reusing the notation of your two papers). Start by picking the simplest one.
  • The predictions returned by predict should be numpy arrays of indicative shape (n_samples, 3, n_targets), where the 3 stands for the point prediction and the lower and upper bounds.
  • Create a small unit test illustrating the use of your class on a minimal toy dataset in mapie/tests/test_multi_output_regression.py
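
Editorial sketch: the suggestions above could translate into something like the following. This is only an illustration under stated assumptions, not MAPIE's API and not the method from the papers. The class name MultiOutputSplitConformalSketch and the helper run_toy_example are hypothetical, the method option is left out, and the intervals come from plain absolute residuals calibrated independently for each target, so any coverage guarantee is marginal per target rather than joint.

```python
import numpy as np
from sklearn.linear_model import LinearRegression


class MultiOutputSplitConformalSketch:
    """Hypothetical sketch (not MAPIE's API): per-target split-conformal intervals."""

    def __init__(self, estimator, cv="prefit"):
        # Only the split-conformal setting with an already fitted estimator is sketched.
        if cv != "prefit":
            raise ValueError("Only cv='prefit' is supported in this sketch.")
        self.estimator = estimator
        self.cv = cv

    def fit(self, X_calib, y_calib):
        # The estimator is assumed fitted; only calibration residuals are stored.
        y_calib = np.asarray(y_calib)
        self.residuals_ = np.abs(y_calib - self.estimator.predict(X_calib))
        return self

    def predict(self, X, alpha=0.1):
        y_pred = self.estimator.predict(X)  # shape (n_samples, n_targets)
        n_calib = self.residuals_.shape[0]
        # Rank of the conformal quantile with the (n + 1) correction.
        # It is capped at n_calib, which weakens the guarantee for tiny calibration sets.
        k = min(n_calib, int(np.ceil((n_calib + 1) * (1 - alpha)))) - 1
        q = np.sort(self.residuals_, axis=0)[k]  # shape (n_targets,)
        # Stack prediction, lower bound, upper bound: (n_samples, 3, n_targets).
        return np.stack([y_pred, y_pred - q, y_pred + q], axis=1)


def run_toy_example():
    """Minimal toy dataset with two targets, in the spirit of the suggested unit test."""
    rng = np.random.default_rng(0)
    X = rng.normal(size=(300, 4))
    W = rng.normal(size=(4, 2))
    y = X @ W + 0.1 * rng.normal(size=(300, 2))
    model = LinearRegression().fit(X[:100], y[:100])
    conformal = MultiOutputSplitConformalSketch(model).fit(X[100:200], y[100:200])
    return conformal.predict(X[200:], alpha=0.1)


predictions = run_toy_example()
print(predictions.shape)
```

Any estimator exposing a scikit-learn style predict that returns an array of shape (n_samples, n_targets) would work here, which keeps the sketch model-agnostic.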

Beware that MAPIE, as its name indicates, is model-agnostic, so its code base should not be tied to deep learning libraries like tensorflow/pytorch (even if any user can still use tensorflow models within MAPIE).

You can use the code of MapieRegressor as a template.

In case you have any doubts, I would be very pleased to answer your questions in this discussion!

What do you think?

@sdestercke

Hi @gmartinonQM,

Many thanks for the suggestion and for the helpful tips on how to add our method to the MAPIE library.

I may not have the time right now to do it, but I think that @M-Soundouss could be equally if not more interested, with maybe a bit more time on her hands right now! There are also, a priori, no big issues in making it model-agnostic. Two questions immediately come to mind:

  • Basically, what we extend is the normalized version of conformal regression scores, so my guess is that the normalizing value should also be model-agnostic in the code?

  • Would it be okay to use the copulae python library within the code?
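
Editorial sketch: for readers unfamiliar with normalized conformal scores, the first question can be illustrated as follows. The score is the absolute residual divided by a difficulty estimate sigma(x), and the interval is the prediction plus or minus q times sigma(x). This is a generic illustration, not the code from the papers: the helper names conformal_quantile and normalized_intervals are hypothetical, and fitting the normalizer on absolute training residuals is just one common choice assumed here. The point is that the normalizer, like the main model, can be any user-supplied regressor.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import LinearRegression


def conformal_quantile(scores, alpha):
    """Per-target empirical quantile of the scores with the (n + 1) correction."""
    n = scores.shape[0]
    k = min(n, int(np.ceil((n + 1) * (1 - alpha)))) - 1
    return np.sort(scores, axis=0)[k]


def normalized_intervals(model, normalizer, X_calib, y_calib, X_test,
                         alpha=0.1, eps=1e-8):
    """Intervals y_hat +/- q * sigma(x), with sigma from a user-chosen normalizer."""
    # eps keeps the normalizing value strictly positive.
    sigma_calib = np.maximum(normalizer.predict(X_calib), eps)
    scores = np.abs(y_calib - model.predict(X_calib)) / sigma_calib
    q = conformal_quantile(scores, alpha)  # shape (n_targets,)
    y_pred = model.predict(X_test)
    sigma_test = np.maximum(normalizer.predict(X_test), eps)
    return y_pred - q * sigma_test, y_pred + q * sigma_test


# Toy heteroscedastic data with two targets: noise grows with |x_0|.
rng = np.random.default_rng(0)
X = rng.uniform(-2, 2, size=(600, 3))
W = rng.normal(size=(3, 2))
y = X @ W + (0.1 + 0.5 * np.abs(X[:, :1])) * rng.normal(size=(600, 2))
X_train, y_train = X[:300], y[:300]
X_calib, y_calib = X[300:450], y[300:450]
X_test = X[450:]

model = LinearRegression().fit(X_train, y_train)
# Assumed choice: the normalizer learns the absolute training residuals.
normalizer = RandomForestRegressor(n_estimators=50, random_state=0)
normalizer.fit(X_train, np.abs(y_train - model.predict(X_train)))

lower, upper = normalized_intervals(model, normalizer, X_calib, y_calib, X_test)
print(lower.shape, upper.shape)
```

Because sigma(x) varies with the input, the resulting intervals are wider where the normalizer predicts larger errors, instead of having a constant width per target.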

@gmartinonQM
Collaborator

Hi @sdestercke, to answer your questions:

  • Yes, the normalizing value should be a user choice (but you can suggest a default model to use, preferably from sklearn).
  • Yes, feel free to add the copulae dependency: at least in setup.py (INSTALL_REQUIRES), so that it is resolved when the Python package is installed, and in the development environment file environment.dev.yml for developers.

@gmartinonQM
Collaborator

Any news about this, @sdestercke @M-Soundouss?

@sdestercke

@gmartinonQM Not so far, @M-Soundouss is writing her thesis, which is time-consuming. Helping to implement multi-output regression for MAPIE is still on our to-do list, though.

@vincentblot28
Collaborator

Hi @sdestercke and @M-Soundouss, do you have any news about this issue?

@M-Soundouss

Hello @vincentblot28! I have started working on it and will keep you updated. Thank you!

@Valentin-Laurent
Collaborator

Hello @M-Soundouss, long time no see 😃
I know this is a pretty old discussion, but we're in the process of triaging/closing issues.
I believe you didn't find time to work on that?
Thank you

Valentin-Laurent added the "Needs decision" label and removed the "Discussion in progress" and "Enhancement" labels on Jan 10, 2025
Projects
None yet
Development

No branches or pull requests

7 participants