Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[sdba] Re-write of Extreme's adjustment #914

Merged
merged 11 commits into from
Nov 26, 2021
Merged

Conversation

aulemahal
Copy link
Collaborator

@aulemahal aulemahal commented Nov 11, 2021

Pull Request Checklist:

  • This PR addresses an already opened issue (for bug fixes / features)
  • Tests for the changes have been added (for bug fixes / features)
  • Documentation has been added / updated (for bug fixes / features)
  • HISTORY.rst has been updated (with summary of main changes)
    • Link to issue (:issue:number) and pull request (:pull:number) has been added.
  • bumpversion patch has been called on this branch
  • The relevant author information has been added to .zenodo.json

What kind of change does this PR introduce?

  • Rewrite of sdba.ExtremeValues, according to the Matlab version and comments from @RondeauG.

Does this PR introduce a breaking change?

No, the ExtremeValues was hidden in previous versions.

Other information:

I'm not sure what to test...

@aulemahal
Copy link
Collaborator Author

One thing that gosses me here is that with dask we must know the dimension sizes before the computation. In the trained ds, I have the px_hist and af along a "quantiles" dim (no coord). The only way I found for estimating it is (1 - q_thresh) * ref.time.size . This is the maximal theoritical size if cluster_thresh = 0. But that's usually a lot larger than what we need. In my tests it was 10 times too large, so ds.af jas 90% NaN... yish.

Secondly, I have implemented dist so it could be controlled if we wanted, but genpareto is hardcoded. Do we want to accept other distributions?

@coveralls
Copy link

coveralls commented Nov 11, 2021

Coverage Status

Coverage increased (+0.4%) to 90.7% when pulling 9bd9f25 on sdba-extremes-fix-789 into b3349dd on master.

@huard huard requested review from RondeauG and huard November 23, 2021 21:53
Copy link
Contributor

@RondeauG RondeauG left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  1. I've successfully compared the results from my Matlab code and this new PR. Extremes are the same. Differences exist for other values, but this is due to differences in the transfer function and the number of corrected values (different approach between this and the older code). In short: The fix seems good.
  2. My test fails when I have chunks. I'll send @aulemahal my code so we can further test it out and see if he has the same issues.

# sort them in Px order, and pad to have N values.
order = np.argsort(Px_hist)
px_hist = np.pad(Px_hist[order], ((0, N - af.size),), constant_values=np.NaN)
af = np.pad(af[order], ((0, N - af.size),), constant_values=np.NaN)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

constant_values=np.NaN causes NaNs to appear in the result of adjust if some values in sim are above max(Pcommon)

Suggested change
af = np.pad(af[order], ((0, N - af.size),), constant_values=np.NaN)
af = np.pad(af[order], ((0, N - af.size),), constant_values=af[order][-1])

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I got this fixed by tweaking extrapolate_qm and interp_on_quantiles to ignore NaNs.

Maybe @huard could have a look on these changes. I modified the output of extrapolate_qm in the case of method='nan' (not really related to this PR). And interp_on_quantiles now drops ANY NaNs in the inputs EXCEPT for those expected extrapolated bounds.

Copy link
Collaborator

@huard huard left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just a superficial review.
I suppose it will be useful to get #424 done.

Copy link
Contributor

@RondeauG RondeauG left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixes appear to be working as intended.

Copy link
Collaborator

@huard huard left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Happy with the changes. Could you just confirm that these changes are exercised by the test suite ?

@aulemahal
Copy link
Collaborator Author

The last commit reactivates the Extreme's tests and removes a bit that was not necessarily true. The tests pass, but they don't do much more than checking if the method runs... I'm not sure of what to test?

@RondeauG Are the algorithms close enough that we could compare an output from the matlab version to the one here? Not value by value, but something like a general change in the distributions...

@huard
Copy link
Collaborator

huard commented Nov 26, 2021

I'm happy to leave testing against a reference value to a future PR. I would maybe advertise the algo as experimental until then.

@aulemahal aulemahal merged commit d3d5db9 into master Nov 26, 2021
@aulemahal aulemahal deleted the sdba-extremes-fix-789 branch November 26, 2021 20:50
raquelalegre added a commit to UCL/xclim that referenced this pull request Nov 30, 2021
…s/heat_index

* 'indices/heat_index' of github.com:UCL/xclim: (42 commits)
  Added french metadata for heat index.
  Apply changes to fraction indicator
  [sdba] Re-write of Extreme's adjustment (Ouranosinc#914)
  update history
  Update history
  [Ouranosinc#931] Fix indicator for Rxxp index
  Add missing type for `threshold_count`
  Add dev notes - add indexer example - other small changes
  run all available tests in one call for slow builds
  update makefile target black, only install coveralls in tox if coveralls is needed
  Indexing within indicators (Ouranosinc#934)
  Update history
  finish needs doctests
  Remove unnecessary keep_attrs
  Adapt to CF conventions 1.9 - remove gregorian calendar (Ouranosinc#935)
  Bump version: 0.31.2-beta → 0.31.3-beta
  update HISTORY.rst
  suggested corrections
  Apply suggestions from code review
  Resampling indicator (Ouranosinc#927)
  ...
raquelalegre added a commit to UCL/xclim that referenced this pull request Nov 30, 2021
…es/wetday_prop

* 'indices/wetday_prop' of github.com:UCL/xclim: (46 commits)
  Add moi-meme as 0.32 contributor
  do not redefine builtin next
  use array_almost_equal in tests
  Update history
  Improve quantile function
  Apply changes to fraction indicator
  [sdba] Re-write of Extreme's adjustment (Ouranosinc#914)
  update history
  Update history
  [Ouranosinc#931] Fix indicator for Rxxp index
  Add missing type for `threshold_count`
  Add dev notes - add indexer example - other small changes
  run all available tests in one call for slow builds
  update makefile target black, only install coveralls in tox if coveralls is needed
  Indexing within indicators (Ouranosinc#934)
  Update history
  finish needs doctests
  Remove unnecessary keep_attrs
  Adapt to CF conventions 1.9 - remove gregorian calendar (Ouranosinc#935)
  Bump version: 0.31.2-beta → 0.31.3-beta
  ...
raquelalegre added a commit to UCL/xclim that referenced this pull request Dec 2, 2021
…/xclim into indices/potential_evapotranspiration

* 'indices/potential_evapotranspiration' of github.com:UCL/xclim: (52 commits)
  Fix usage notebook
  update history
  allow a few more special chars
  Fix a few issues on rxxp
  add PR number
  use xarray.apply_ufunc to simplify the vectorization of statistical functions
  Add moi-meme as 0.32 contributor
  do not redefine builtin next
  use array_almost_equal in tests
  Update history
  Improve quantile function
  Apply changes to fraction indicator
  [sdba] Re-write of Extreme's adjustment (Ouranosinc#914)
  update history
  Update history
  [Ouranosinc#931] Fix indicator for Rxxp index
  Add missing type for `threshold_count`
  Add dev notes - add indexer example - other small changes
  run all available tests in one call for slow builds
  update makefile target black, only install coveralls in tox if coveralls is needed
  ...
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

sdba.ExtremeValues sensitive to the threshold Faulty train()/adjust() behavior in sdba.ExtremeValues
4 participants