[Bugfix] Add LightningOptimizer parity test and resolve AMP bug #5191
Conversation
```diff
@@ -489,6 +489,10 @@ def optimizer_step(self, optimizer, opt_idx, batch_idx, train_step_and_backward_
                 'native PyTorch amp and lbfgs are not compatible.'
                 ' To request, please file a Github issue in PyTorch and tag @mcarilli')
+
+        if not isinstance(optimizer, LightningOptimizer):
+            # wrap into LightningOptimizer only for running the step
+            optimizer = LightningOptimizer.to_lightning_optimizer(optimizer, self.trainer)
```
@tchaton you can move the if statement inside the to_lightning_optimizer
function to achieve idempotence and avoid code duplication.
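A minimal sketch of that suggestion, assuming `to_lightning_optimizer` is a classmethod on `LightningOptimizer` (the constructor and trainer wiring below are illustrative, not the library's actual internals):

```python
class LightningOptimizer:
    def __init__(self, optimizer):
        self._optimizer = optimizer
        self._trainer = None

    @classmethod
    def to_lightning_optimizer(cls, optimizer, trainer):
        # Idempotent: if the optimizer is already wrapped, return it unchanged,
        # so call sites no longer need their own isinstance check.
        if isinstance(optimizer, LightningOptimizer):
            return optimizer
        lightning_optimizer = cls(optimizer)
        lightning_optimizer._trainer = trainer  # illustrative wiring, not the real API
        return lightning_optimizer
```

With the check inside, the call site in the diff above reduces to a single unconditional `optimizer = LightningOptimizer.to_lightning_optimizer(optimizer, self.trainer)`.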
The tests are a bit complex and I don't fully understand them, but I was able to verify that they fail on the master branch with the wrong loss values, so I suppose the scaling bug is fixed 👍
The tests perform a parity comparison against vanilla PyTorch training in several scenarios, making sure everything matches when using enable_pl_optimizer=True/False. We will need to add more parity tests, such as Apex, DDP modes, and multiple optimizers. Best,
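For readers unfamiliar with the pattern, a stripped-down parity test looks roughly like this. The model, dataset, and hyperparameters are illustrative; the real tests live in tests/trainer/optimization/test_parity_automatic_optimization.py, and the `enable_pl_optimizer` flag only exists in Lightning releases from this era:

```python
import copy

import torch
from torch.utils.data import DataLoader, Dataset
import pytorch_lightning as pl


class RandomDataset(Dataset):
    def __init__(self, n=32):
        self.data = torch.randn(n, 8)

    def __len__(self):
        return len(self.data)

    def __getitem__(self, idx):
        return self.data[idx]


class ParityModule(pl.LightningModule):
    def __init__(self):
        super().__init__()
        self.layer = torch.nn.Linear(8, 1)
        self.losses = []

    def training_step(self, batch, batch_idx):
        loss = self.layer(batch).sum()
        self.losses.append(loss.detach().clone())
        return loss

    def configure_optimizers(self):
        return torch.optim.SGD(self.parameters(), lr=0.1)


def run_vanilla(model, loader):
    # Plain PyTorch loop used as the reference.
    optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
    losses = []
    for batch in loader:
        optimizer.zero_grad()
        loss = model.layer(batch).sum()
        loss.backward()
        optimizer.step()
        losses.append(loss.detach().clone())
    return losses


def test_parity(tmpdir):
    torch.manual_seed(0)
    pl_model = ParityModule()
    vanilla_model = copy.deepcopy(pl_model)  # identical initial weights
    loader = DataLoader(RandomDataset(), batch_size=4, shuffle=False)

    trainer = pl.Trainer(
        default_root_dir=tmpdir,
        max_epochs=1,
        enable_pl_optimizer=True,  # flag under test in this PR
        progress_bar_refresh_rate=0,
    )
    trainer.fit(pl_model, loader)

    expected = run_vanilla(vanilla_model, loader)
    for got, exp in zip(pl_model.losses, expected):
        assert torch.allclose(got, exp)
```

The same idea extends to the scenarios mentioned above (AMP, Apex, DDP, multiple optimizers): run Lightning and a hand-written loop from identical weights and data, then assert the per-step losses agree.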
What does this PR do?
Fixes #5165
Fixes #5159
Before submitting
PR review
Anyone in the community is free to review the PR once the tests have passed.
Before you start reviewing, make sure you have read the Review guidelines.
Did you have fun?
Make sure you had fun coding 🙃