improve _optim_ckpt_wrapper so it is a drop in replacement of optimizer #2052

felipemello1 · 2024-11-22T15:04:24Z

Our recipes are cluttered with logic that checks "if optim_in_bwd".

With a bit of engineering, we can make the wrapper a drop-in replacement for the optimizer and avoid code like this:

if not self._optimizer_in_bwd:
    self._optimizer.zero_grad()
else:
    for opt in self._optim_ckpt_wrapper.optim_map.values():
        opt.zero_grad()

That can be replaced with:

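class MyOptWrapper:
    def __init__(self, optimizers):
        # `optimizers` is expected to expose `optim_map`,
        # like self._optim_ckpt_wrapper in the snippet above.
        self.optimizers = optimizers

    def zero_grad(self):
        for opt in self.optimizers.optim_map.values():
            opt.zero_grad()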

optimizer = MyOptWrapper(optimizers)
optimizer.zero_grad()
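A slightly fuller sketch of the same idea, runnable without torch, is below. The names FakeOptimizer, FakeOptimCkptWrapper and OptimizerInBackwardWrapper are hypothetical stand-ins, not existing classes. The no-op step() assumes that with optimizer-in-backward the parameter updates already happen during the backward pass, so the recipe has nothing left to do when it calls step().

```python
class FakeOptimizer:
    """Hypothetical stand-in for a per-parameter optimizer; it only counts calls."""

    def __init__(self):
        self.zero_grad_calls = 0

    def zero_grad(self, set_to_none=True):
        self.zero_grad_calls += 1


class FakeOptimCkptWrapper:
    """Hypothetical stand-in that exposes `optim_map`, as in the snippet above."""

    def __init__(self, optim_map):
        self.optim_map = optim_map


class OptimizerInBackwardWrapper:
    """Hypothetical wrapper exposing the optimizer methods a recipe calls."""

    def __init__(self, optim_ckpt_wrapper):
        self._wrapper = optim_ckpt_wrapper

    def zero_grad(self, set_to_none=True):
        # Fan the call out to every per-parameter optimizer.
        for opt in self._wrapper.optim_map.values():
            opt.zero_grad(set_to_none=set_to_none)

    def step(self):
        # Assumption: updates were already applied during backward,
        # so an explicit step has nothing to do.
        return None


optim_map = {"layer.weight": FakeOptimizer(), "layer.bias": FakeOptimizer()}
optimizer = OptimizerInBackwardWrapper(FakeOptimCkptWrapper(optim_map))
optimizer.step()
optimizer.zero_grad()
```

The recipe then calls step() and zero_grad() unconditionally, and the wrapper decides what those calls mean.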

This may occasionally break things, but good test coverage should keep errors from reaching prod. For genuinely complex cases, e.g. checkpointing, we can keep the if/else, but we definitely don't need every if/else we have today: there are 8 in total.


RdoubleA · 2024-11-22T16:49:40Z

I've thought about this approach and I do like that it cleans up the recipe. But it adds some indirection and forces users to have to learn what the wrapper even does, and wouldn't it require all optimizers to be wrapped in this regardless of if they're using optimizer in bwd or not?

felipemello1 · 2024-11-22T17:58:43Z

> wouldn't it require all optimizers to be wrapped in this regardless of if they're using optimizer in bwd or not?

I don't think so. The implementation would be something like this:

optimizer = config.instantiate(my_opt)
if opt_in_bwd:
    optimizer = MyOptWrapper(optimizer)

The idea is that when you call, for example, optimizer.zero_grad(), the calling code doesn't need to know whether it's the optimizer or the wrapper, because the wrapper behaves like the optimizer.
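To make the duck-typing point concrete, here is a minimal sketch in plain Python. Everything in it is hypothetical (OptimizerLike, PlainOptimizer, InBackwardWrapper, build_optimizer, finish_step), and the wrapper takes a plain list of optimizers purely for illustration. The only branch on opt_in_bwd lives in the builder; the training code is written once against the shared interface.

```python
from typing import List, Protocol


class OptimizerLike(Protocol):
    """The small interface the training code relies on."""

    def zero_grad(self) -> None: ...

    def step(self) -> None: ...


class PlainOptimizer:
    """Hypothetical stand-in for a regular optimizer; it logs its calls."""

    def __init__(self) -> None:
        self.log: List[str] = []

    def zero_grad(self) -> None:
        self.log.append("zero_grad")

    def step(self) -> None:
        self.log.append("step")


class InBackwardWrapper:
    """Hypothetical wrapper over several per-parameter optimizers."""

    def __init__(self, optimizers: List[PlainOptimizer]) -> None:
        self.optimizers = optimizers

    def zero_grad(self) -> None:
        for opt in self.optimizers:
            opt.zero_grad()

    def step(self) -> None:
        # Assumption: the updates already ran during backward.
        pass


def build_optimizer(opt_in_bwd: bool) -> OptimizerLike:
    # The single place that still checks the flag.
    if opt_in_bwd:
        return InBackwardWrapper([PlainOptimizer(), PlainOptimizer()])
    return PlainOptimizer()


def finish_step(optimizer: OptimizerLike) -> None:
    # No `if opt_in_bwd` here: both objects answer the same calls.
    optimizer.step()
    optimizer.zero_grad()


finish_step(build_optimizer(opt_in_bwd=False))
finish_step(build_optimizer(opt_in_bwd=True))
```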

krammnic · 2024-12-24T09:29:43Z

So the point is to avoid all the opt_in_bwd conditions except the checkpointing one?

felipemello1 · 2024-12-24T15:40:50Z

@krammnic, yes! It litters our recipe in many places. It would make the code cleaner and easier to maintain.

felipemello1 added the "best practice" (Things we should be doing but aren't) and "community help wanted" (We would love the community's help completing this issue) labels Nov 22, 2024

felipemello1 changed the title ~~improve _optim_ckpt_wrapper to its a drop in replacement of optimizer~~ improve _optim_ckpt_wrapper so it is a drop in replacement of optimizer Nov 22, 2024

krammnic mentioned this issue Feb 9, 2025

[RFC]: Get rid of optim_bwd checks via wrapper. #2370

Open

felipemello1 removed the "community help wanted" (We would love the community's help completing this issue) label Feb 13, 2025
