Commit

[FIX][ENH] Improve "DA methods" examples (#188)
* different section for pipeline and adapter

* same for mapping methods

* typo

* update how to use skada

* oversight

* update how to use

* update README and users guide

* Update "DA methods" examples

* update

* update

---------

Co-authored-by: apmellot <[email protected]>
Co-authored-by: Rémi Flamary <[email protected]>
Co-authored-by: Théo Gnassounou <[email protected]>
Co-authored-by: tgnassou <[email protected]>
5 people authored Jul 4, 2024
1 parent e2db517 commit 0d9c6b1
Showing 4 changed files with 23 additions and 19 deletions.
6 changes: 3 additions & 3 deletions examples/methods/plot_dasvm.py
@@ -39,8 +39,8 @@
# We generate our 2D dataset with 2 classes
# ------------------------------------------
#
-# We generate a simple 2D dataset from moon distribution, where source and target
-# are not taken from the same location in the moons. This dataset thus present
+# We generate a simple 2D dataset from a moon distribution, where source and target
+# are not taken from the same location in the moons. This dataset thus presents a
# covariate shift.

X, y, sample_domain = make_dataset_from_moons_distribution(
@@ -86,7 +86,7 @@
# The main problem here is that we only know the distribution of the points
# from the target dataset, our goal is to label it.
#
-# The algorithm of the DASVM consist in fitting multiple base_estimator (SVC) by:
+# The DASVM method consists in fitting multiple base_estimator (SVC) by:
# - Removing from the training dataset (if possible)
# `k` points from the source dataset for which the current
# estimator is doing well
10 changes: 5 additions & 5 deletions examples/methods/plot_reweighting.py
@@ -35,12 +35,12 @@

# %%
#
-# The reweighting methods
+# Reweighting Methods
# ------------------------------------------
-# The goal of reweighting methods is to estimate some weights for the
-# source dataset in order to then fit a estimator on the source dataset,
-# while taking those weights into account, so that the fitted estimator is
-# well suited to predicting labels from points drawn from the target distribution.
+# The purpose of reweighting methods is to estimate weights for the source dataset.
+# These weights are then used to fit an estimator on the source dataset, taking the
+# weights into account. The goal is to ensure that the fitted estimator is suitable
+# for predicting labels from the target distribution.
#
# Reweighting methods implemented and illustrated are the following:
# * :ref:`Density Reweighting<Illustration of the Density Reweighting method>`
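The reweighting idea described in the updated comment (estimate weights for the source samples, then fit on the source while taking those weights into account) can be sketched with the standard library alone. This is a minimal illustration, not skada's implementation: it assumes 1D data and models each domain with a single Gaussian, and `gaussian_density_ratio_weights` and `weighted_mean` are hypothetical helpers introduced here for the example.

```python
from statistics import NormalDist


def gaussian_density_ratio_weights(source, target):
    """Return w(x) = p_target(x) / p_source(x) for every source point.

    Assumption for this sketch: each domain is modelled by one 1D Gaussian
    fitted to its samples.
    """
    p_source = NormalDist.from_samples(source)
    p_target = NormalDist.from_samples(target)
    return [p_target.pdf(x) / p_source.pdf(x) for x in source]


def weighted_mean(values, weights):
    """Simplest possible 'estimator' that takes sample weights into account."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)


# Toy data (made up): the target is the source shifted to the right.
source = [-2.0, -1.0, 0.0, 1.0, 2.0]
target = [0.0, 1.0, 2.0, 3.0, 4.0]

weights = gaussian_density_ratio_weights(source, target)

# Source points that fall where the target is dense get the largest weights,
# so the weighted estimate moves from the source mean toward the target.
print("weights:", [round(w, 3) for w in weights])
print("unweighted mean:", sum(source) / len(source))
print("weighted mean:", round(weighted_mean(source, weights), 3))
```

Any estimator that accepts per-sample weights could take the place of `weighted_mean` here; the weights themselves only depend on the unlabeled source and target inputs.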
19 changes: 10 additions & 9 deletions examples/methods/plot_subspace.py
@@ -33,14 +33,15 @@
# The subspaces methods
# ------------------------------------------
#
-# Firstly we are in the case of unsupervised domain adaptation, it means that
-# we know the labels of the source data but not from the target data.
-# The goal of subspace is to project data from a d dimensional space
-# into a d' dimensional space with d'<d.
-# This kind of da method is especially good when we work with subspace
-# shift, meaning that there is a subspace on which, when projected on it,
-# the source and target data have the same distributions.
-#
+# Subspace methods are used in unsupervised domain adaptation.
+# In this case, we have labeled data for the source domain but not for the target
+# domain.
+# The goal of subspace methods is to project data from a d-dimensional space
+# into a lower-dimensional space with d' < d.
+# Subspace methods are particularly effective when dealing with subspace shift,
+# where the source and target data have the same distributions when projected onto a
+# subspace.
+
# The Subspace methods implemented and illustrated are the following:
# * :ref:`Subspace Alignment<Illustration of the Subspace Alignment method>`
# * :ref:`Transfer Component Analysis<Illustration of the Transfer Component
@@ -339,7 +340,7 @@ def plot_subspace_and_classifier(
#
# In most of the previous works, we explored two learning strategies independently for
# domain adaptation: feature matching and instance reweighting. Transfer Joint Matching
-# (TJM) aims to use both, by adding a constant to tradeoff between to two.
+# (TJM) aims to use both, by adding a constant to tradeoff between the two.
#
# See [26] for details:
#
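The notion of subspace shift from the first hunk of this file (source and target share the same distribution once projected onto a lower-dimensional subspace) can be illustrated with a small sketch going from d = 2 to d' = 1. It is not one of the methods listed in the example: the subspace is simply assumed to be the leading principal direction of the source data, the toy data are made up, and `principal_direction` and `project` are hypothetical helpers.

```python
import math


def principal_direction(points):
    """Return a unit vector along the largest-variance direction of 2D points."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    cov_xx = sum((x - mean_x) ** 2 for x, _ in points) / n
    cov_yy = sum((y - mean_y) ** 2 for _, y in points) / n
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in points) / n
    # Closed-form angle of the leading eigenvector of a 2x2 covariance matrix.
    theta = 0.5 * math.atan2(2.0 * cov_xy, cov_xx - cov_yy)
    return (math.cos(theta), math.sin(theta))


def project(points, direction):
    """Project 2D points onto a unit direction, giving 1D coordinates."""
    ux, uy = direction
    return [x * ux + y * uy for x, y in points]


# Toy data (made up): the target is the source shifted along the second axis only.
source = [(-2.0, 0.1), (-1.0, -0.1), (0.0, 0.0), (1.0, -0.1), (2.0, 0.1)]
target = [(x, y + 5.0) for x, y in source]

direction = principal_direction(source)
source_1d = project(source, direction)
target_1d = project(target, direction)

# In 2D the domains differ by the shift; after projection they coincide,
# so a classifier trained on source_1d would transfer to target_1d.
print("direction:", direction)
print("source 1D:", source_1d)
print("target 1D:", target_1d)
```

The methods in the example file learn the subspace from both domains rather than fixing it in advance; the sketch only shows why a well-chosen projection can remove the shift.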
7 changes: 5 additions & 2 deletions examples/plot_how_to_use_skada.py
@@ -97,9 +97,10 @@
# DA estimator in a pipeline
# -----------------------------
#
+
# SKADA estimators can be used as the final estimator of a scikit-learn pipeline.
-# Again, the only difference is that the :code:`sample_domain` array must be
-# passed by name during in fit.
+# Again, the only difference is that the :code:`sample_domain` array must be passed
+# by name during fit.


# create a DA pipeline
@@ -120,6 +121,7 @@
# Here is an example with the CORAL and GaussianReweight adapters.
#
# .. WARNING::
+
# Note that as illustrated below for reweighting adapters, one needs a
# subsequent estimator that takes :code:`sample_weight` as an input parameter.
# This can be done using the :code:`set_fit_request` method of the estimator
@@ -222,6 +224,7 @@
print("Accuracy on source:", pipe.score(Xs, ys, sample_domain=sample_domain_s))
print("Accuracy on target:", pipe.score(Xt, yt)) # target by default

+
# %%
# Similarly one can use the PerDomain selector to train a different estimator
# per domain. This allows to handle multiple source and target domains. In this