allow attention processors to have different signatures #6915

Merged 1 commit into main from unused-attn-kwargs on Feb 10, 2024

Conversation

@yiyixuxu (Collaborator) commented Feb 9, 2024

Fixes #6700. Also solves the issue discussed in #6814.

Allows attention processors with different signatures to be mixed and matched together.

@HuggingFaceDocBuilderDev commented:

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

@yiyixuxu yiyixuxu requested review from DN6 and sayakpaul February 9, 2024 00:27
logger.warning(
    f"cross_attention_kwargs {unused_kwargs} are not expected by {self.processor.__class__.__name__} and will be ignored."
)
cross_attention_kwargs = {k: w for k, w in cross_attention_kwargs.items() if k in attn_parameters}
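
For context, a hedged sketch of the two lines that would precede this excerpt, deriving attn_parameters and unused_kwargs by inspecting the active processor's signature; this is reconstructed from the discussion below, not copied from the diff:

```python
import inspect

# Names of the parameters that the active processor's __call__ accepts
# ("self" is excluded because the bound method is inspected).
attn_parameters = set(inspect.signature(self.processor.__call__).parameters.keys())
# Caller-supplied kwargs that the processor cannot accept.
unused_kwargs = [k for k, _ in cross_attention_kwargs.items() if k not in attn_parameters]
```
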
Member:

Could I get an explanation on how this ensures the scale parameter from cross_attention_kwargs gets passed appropriately?

@yiyixuxu (Collaborator, author):

It won't get passed if the attention processor does not accept a scale parameter; a warning is thrown instead.

For example, IPAdapterAttnProcessor has a scale parameter in its signature, but it does not use it. We only added it so that it could be used together with the default attention processors.

With this change, we don't need to do that anymore: attention processors will have fixed signatures that only include parameters they actually use.
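
To make the filtering concrete, here is a minimal self-contained sketch of the behavior described above (filter_kwargs, NoScaleProcessor, and ScaleProcessor are hypothetical names for illustration, not diffusers APIs):

```python
import inspect

def filter_kwargs(processor, cross_attention_kwargs):
    # Keep only the kwargs that this processor's __call__ actually declares;
    # in diffusers, a warning is logged for the dropped ones.
    accepted = set(inspect.signature(processor.__call__).parameters.keys())
    return {k: v for k, v in cross_attention_kwargs.items() if k in accepted}

class NoScaleProcessor:
    def __call__(self, attn, hidden_states):
        return hidden_states

class ScaleProcessor:
    def __call__(self, attn, hidden_states, scale=1.0):
        return hidden_states * scale

kwargs = {"scale": 0.5}
print(filter_kwargs(NoScaleProcessor(), kwargs))  # {} -> scale is dropped (with a warning)
print(filter_kwargs(ScaleProcessor(), kwargs))    # {'scale': 0.5} -> scale is passed through
```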

@yiyixuxu (Collaborator, author):

Another example is in this PR: https://github.com/huggingface/diffusers/pull/6847/files#diff-96aeb208ac8e69bdb368e0a481a0cd50173b3f4bf19e5da500c251614820c8cdR1201

It introduced a new parameter, ip_adapter_mask, for IPAdapterAttnProcessor, and the same parameter had to be added to AttnProcessor.__call__ just so these two attention processors could work together. We don't need to do that anymore either.
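
The same mechanism covers this case; reusing the hypothetical filter_kwargs helper from the sketch above, a processor that declares the new parameter receives it while a plain one does not:

```python
class PlainProcessor:
    def __call__(self, attn, hidden_states, encoder_hidden_states=None):
        return hidden_states

class MaskAwareProcessor:
    def __call__(self, attn, hidden_states, encoder_hidden_states=None, ip_adapter_mask=None):
        return hidden_states

kwargs = {"ip_adapter_mask": [1, 0, 1]}
print(filter_kwargs(PlainProcessor(), kwargs))      # {} -> the plain processor is unaffected
print(filter_kwargs(MaskAwareProcessor(), kwargs))  # {'ip_adapter_mask': [1, 0, 1]}
```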

Member:

Thanks for explaining. The PR is now a go.

@asomoza (Member) commented Feb 9, 2024

Thank you very much for your work, I really needed this. This should also close #6601.

@spezialspezial (Contributor) posted an image attachment (sd-0394123-4199712965).

@yiyixuxu merged commit 0071478 into main on Feb 10, 2024; 16 checks passed.
@yiyixuxu deleted the unused-attn-kwargs branch on February 10, 2024 at 07:56.
AmericanPresidentJimmyCarter pushed a commit to AmericanPresidentJimmyCarter/diffusers that referenced this pull request Apr 26, 2024
Successfully merging this pull request may close these issues:

Attention processors that do not expect a temb parameter still get passed them and cause TypeErrors