Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Unwrap expliticly given selector before generating the name for the pipeline #28

Closed
kachayev opened this issue Dec 2, 2023 · 5 comments
Labels
domain-aware api enhancement New feature or request good first issue Good for newcomers

Comments

@kachayev
Copy link
Collaborator

kachayev commented Dec 2, 2023

make_da_pipeline function right now gives an option to provide as a step estimator already wrapped into one of the selectors. When we use _name_estimators, it generates rather weird name. Better approach would be to unwrap estimator from the selector before naming it.

@kachayev kachayev added enhancement New feature or request good first issue Good for newcomers domain-aware api labels Dec 2, 2023
@kachayev kachayev changed the title Unwrap expliticly given selector before generating name for the pipeline Unwrap expliticly given selector before generating the name for the pipeline Dec 2, 2023
@kachayev
Copy link
Collaborator Author

kachayev commented Dec 2, 2023

Make sure #29 is merged before working on this issue.

@YanisLalou
Copy link
Collaborator

Can you give examples of "weird names"?

In this example:

{'otmappingadapter': Shared(base_estimator=OTMappingAdapter(), max_iter=100000, metric='sqeuclidean',
        norm=None),
 'perdomain-1': PerDomain(base_estimator=PCA(), copy=True, iterated_power='auto',
           n_components=None, n_oversamples=10,
           power_iteration_normalizer='auto', random_state=None,
           svd_solver='auto', tol=0.0, whiten=False),
 'perdomain-2': PerDomain(C=1.0, base_estimator=LogisticRegression(), class_weight=None,
           dual=False, fit_intercept=True, intercept_scaling=1, l1_ratio=None,
           max_iter=100, multi_class='auto', n_jobs=None, penalty='l2',
           random_state=None, solver='lbfgs', tol=0.0001, verbose=0,
           warm_start=False)}
           

Do we want to have instead of perdomain-1 and perdomain-2, pca and log_reg respectively ?

@rflamary
Copy link
Collaborator

perdomain_pca maybe?

@kachayev
Copy link
Collaborator Author

Maybe we can do perdomain_* for PerDomain selector but remove shared_* for Shared? I guess, if you put PerDomain explicitly, you know what you are doing and what to expect. But Shared is added "automatically" (most likely), so you won't know how to refer to your estimator in the pipeline.

@kachayev
Copy link
Collaborator Author

Closed in #51

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
domain-aware api enhancement New feature or request good first issue Good for newcomers
Projects
None yet
Development

No branches or pull requests

3 participants