RecAdam

Re-implementation of Sanyuan-Chen/RecAdam.

Release Branches (READ CAREFULLY to determine which branch suits you)

  • (New!) NeurIPS_2024 branch (2024-12)

Features

  • Simpler interfaces with fewer tuning parameters.
  • Compatible with DeepSpeed.

Installation

pip install git+https://github.com/hitachi-nlp/rec-adam.git@NeurIPS_2024
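
To check that the package is importable (a quick sanity check; the module name rec_adam matches the imports used below):

python -c "import rec_adam; print('OK')"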

How to use

Initializing the optimizer using the factory method

from rec_adam import build_rec_adam_optimizer

model = (...)  # load your model, such as llama
optimizer = build_rec_adam_optimizer(
    model,
    learning_rate=1e-05,
    fisher_coef=2000,
)

The loss will become something like loss = loss_original + target_task_weight * (fisher_coef * l2_term). Note that target_task_weight works differently from the original implementation, where the loss is something like loss = (1 - target_task_weight) * loss_original + (...).

fisher_coef should be tuned for each model and task. Note that the default value of 2000 is good for training LLaMA-3.1-8B on FLD2.
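
For intuition, here is a minimal sketch of the objective described above. This is not the library's internal code; rec_adam_style_loss and pretrained_params are illustrative names only.

def rec_adam_style_loss(loss_original, model, pretrained_params,
                        fisher_coef=2000.0, target_task_weight=1.0):
    # Quadratic ("recall") penalty that pulls the fine-tuned weights back
    # toward the pretrained weights, scaled by fisher_coef and
    # target_task_weight, as in the loss formula above.
    l2_term = sum(
        ((p - p0.detach()) ** 2).sum()
        for p, p0 in zip(model.parameters(), pretrained_params)
    )
    return loss_original + target_task_weight * (fisher_coef * l2_term)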

Using the optimizer via huggingface's Trainer interface

from rec_adam import RecAdamTrainer  # subclasses transformers.Trainer

trainer = RecAdamTrainer(
    model=model,
    args=training_args,
    rec_adam_fisher_coef=2000,
)
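
Once constructed, RecAdamTrainer is used like a regular transformers.Trainer (assuming datasets and the remaining Trainer arguments are supplied as usual):

trainer.train()
trainer.save_model("./output")  # "./output" is just an example path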

(Not recommended) Initializing the optimizer directly via its constructor

We do not recommend initializing the optimizer directly via its constructor, as setting suitable arguments is complex.

Still, you can do it like this:

from rec_adam import RecAdam
optimizer = RecAdam(...)
