Shapes mismatch triggered at modeling_flax_utils #14215
Comments
cc @patil-suraj
It seems to work with
Do we know which commit in Flax is responsible for this bug?
Could you please elaborate? Thanks!
This looks like an issue with Flax (huggingface/transformers#14215)
Left a comment here, borisdayma/dalle-mini#99 (comment)
Good day,
while using the MiniDalle repo at:
borisdayma/dalle-mini#99
we are suddenly getting this error, which was not happening before:
"Trying to load the pretrained weight for ('decoder', 'mid', 'attn_1', 'norm', 'bias') failed: checkpoint has shape (1, 1, 1, 512) which is incompatible with the model shape (512,). Using `ignore_mismatched_sizes=True` if you really want to load this checkpoint inside this model."
This is being triggered here:
https://huggingface.co/transformers/_modules/transformers/modeling_flax_utils.html
in this area:
    # Mistmatched keys contains tuples key/shape1/shape2 of weights in the checkpoint that have a shape not
    # matching the weights in the model.
    mismatched_keys = []
    for key in state.keys():
        if key in random_state and state[key].shape != random_state[key].shape:
            if ignore_mismatched_sizes:
                mismatched_keys.append((key, state[key].shape, random_state[key].shape))
                state[key] = random_state[key]
            else:
                raise ValueError(
                    f"Trying to load the pretrained weight for {key} failed: checkpoint has shape "
                    f"{state[key].shape} which is incompatible with the model shape {random_state[key].shape}. "
                    "Using `ignore_mismatched_sizes=True` if you really want to load this checkpoint inside this "
                    "model."
                )
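To make the failure easier to reproduce without downloading the checkpoint, here is a minimal sketch of the strict branch of the check quoted above. It assumes flat dictionaries of NumPy arrays keyed by parameter path; `check_shapes` is a hypothetical helper written for this illustration, not a function in transformers, and the arrays are stand-ins for the real weights.

```python
import numpy as np


def check_shapes(state, random_state):
    """Hypothetical helper: mimic the strict branch of the quoted check on flat dicts."""
    for key in state.keys():
        if key in random_state and state[key].shape != random_state[key].shape:
            raise ValueError(
                f"Trying to load the pretrained weight for {key} failed: checkpoint has shape "
                f"{state[key].shape} which is incompatible with the model shape "
                f"{random_state[key].shape}."
            )


# Stand-in arrays; the real values come from the checkpoint and the model init.
key = ("decoder", "mid", "attn_1", "norm", "bias")
state = {key: np.ones((1, 1, 1, 512), dtype=np.float32)}
random_state = {key: np.zeros((512,), dtype=np.float32)}

try:
    check_shapes(state, random_state)
    error_message = None
except ValueError as err:
    error_message = str(err)

print(error_message)
```

With the shapes from the report, the check raises because `(1, 1, 1, 512)` and `(512,)` differ as tuples, even though both hold 512 values.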
There is a way to avoid halting the execution: edit the code and add "ignore_mismatched_sizes=True" to the call. However, this does not fix the problem. With that flag set, execution continues, but the results produced by the minidalle model are wrong: the images are washed out, with incorrect colors and contrast. This was not happening a few days ago, so something must have changed to cause it.
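In the quoted check, `ignore_mismatched_sizes=True` replaces each mismatched checkpoint weight with the model's freshly initialized one (`state[key] = random_state[key]`), which may explain the degraded images. Since `(1, 1, 1, 512)` and `(512,)` hold the same number of elements, one possible workaround is to reshape the checkpoint arrays before loading. The sketch below is untested against the real checkpoint: `reshape_matching_sizes` is a hypothetical helper, the flat-dictionary layout is an assumption, and whether a plain reshape preserves the intended meaning of the weights is also an assumption.

```python
import numpy as np


def reshape_matching_sizes(state, random_state):
    """Hypothetical helper: reshape checkpoint arrays to the model's shape
    when only the shape differs and the element count is identical."""
    fixed = dict(state)  # shallow copy so the original dict is left untouched
    reshaped_keys = []
    for key, value in state.items():
        if key not in random_state:
            continue
        target_shape = random_state[key].shape
        if value.shape != target_shape and value.size == random_state[key].size:
            fixed[key] = value.reshape(target_shape)
            reshaped_keys.append(key)
    return fixed, reshaped_keys


# Stand-in arrays; the real values come from the checkpoint and the model init.
key = ("decoder", "mid", "attn_1", "norm", "bias")
state = {key: np.arange(512, dtype=np.float32).reshape(1, 1, 1, 512)}
random_state = {key: np.zeros((512,), dtype=np.float32)}

fixed, reshaped_keys = reshape_matching_sizes(state, random_state)
print(fixed[key].shape, reshaped_keys)
```

Unlike the flag, this keeps the pretrained values and changes only their layout. Keys whose element counts differ are left alone, so the original check would still report them.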
So this seems to be a bug coming from this file. Any tips are super welcome, thank you :)