Question about LabelEmbedder. #12

Open
ChangeFWorld opened this issue Sep 10, 2024 · 3 comments

@ChangeFWorld

ChangeFWorld commented Sep 10, 2024

First of all, thanks for your awesome work.

I have recently been studying the code and noticed that the U-DiT model uses multiple LabelEmbedder modules. I am not sure this approach is right, because for classifier-free guidance each LabelEmbedder has its own class_dropout_prob. Since there are 3 LabelEmbedder modules in a forward pass and each one drops the label independently, the probability that all of them drop the label is 0.1**3 = 0.001, which means the label is very likely to leak to the model at some latent resolution. I'm afraid this will damage conditional generation quality. In fact, I tried the U-DiT-L model at 1000k steps and found that the visual quality at cfg=1.5 seems worse than DiT-XL/2 at 7M steps (I haven't measured FID because generating 50k samples requires a lot of compute).

Do you have any idea about this? Thanks for your attention!
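
The arithmetic in the question can be checked with a short sketch. It assumes that each of the three LabelEmbedder modules draws its drop decision independently with class_dropout_prob = 0.1; the function name `drop_pattern_probs` is hypothetical and is not part of the U-DiT code.

```python
def drop_pattern_probs(p_drop: float, n_embedders: int) -> dict:
    """Outcome probabilities when each embedder drops the label independently."""
    all_dropped = p_drop ** n_embedders          # fully unconditional sample
    all_kept = (1.0 - p_drop) ** n_embedders     # fully conditional sample
    mixed = 1.0 - all_dropped - all_kept         # label visible at some stages only
    return {"all_dropped": all_dropped, "all_kept": all_kept, "mixed": mixed}


# Assumed setting from the question: 3 embedders, class_dropout_prob = 0.1
probs = drop_pattern_probs(0.1, 3)
for name, value in probs.items():
    print(f"{name}: {value:.3f}")
```

Under these assumptions, roughly 27% of training samples would see the label at some resolutions but not at others, and only about 0.1% would be fully unconditional.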

@YuchuanTian
Owner

Thanks for your comments! I have inspected the code and I completely agree with your opinion. The inconsistency of the label embedding could hamper the performance of conditional generation. I will fix this architectural bug and report the fixed values ASAP.
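
One way to make the label dropout consistent, sketched here as an assumption and not as the fix the maintainer applied, is to sample the drop mask once per batch and reuse it for every embedder. The sketch uses plain Python lists in place of tensors. `NUM_CLASSES`, `sample_drop_mask`, and `apply_drop` are hypothetical names, and mapping a dropped label to the extra index `NUM_CLASSES` is assumed to follow the null-class convention of the original DiT LabelEmbedder.

```python
import random

NUM_CLASSES = 1000  # assumption: ImageNet-style labels; index NUM_CLASSES is the null class


def sample_drop_mask(batch_size: int, p_drop: float, rng: random.Random) -> list:
    """Draw one drop decision per sample, once per forward pass."""
    return [rng.random() < p_drop for _ in range(batch_size)]


def apply_drop(labels: list, drop_mask: list, null_class: int = NUM_CLASSES) -> list:
    """Replace dropped labels with the null class; keep the rest unchanged."""
    return [null_class if dropped else y for y, dropped in zip(labels, drop_mask)]


rng = random.Random(0)
labels = [rng.randrange(NUM_CLASSES) for _ in range(8)]

# The same mask is shared by all three (hypothetical) embedder stages,
# so a sample is either conditional everywhere or unconditional everywhere.
mask = sample_drop_mask(len(labels), p_drop=0.1, rng=rng)
per_stage_labels = [apply_drop(labels, mask) for _ in range(3)]
```

With a shared mask, the fraction of fully unconditional samples goes back to class_dropout_prob itself, and no sample sees the label at only some resolutions.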

@ChangeFWorld
Author

Thank you! I'm looking forward to your update. We have recently been trying to build more efficient diffusion backbones, and your research has inspired us a lot!

@wytcsuch

Perhaps setting class_dropout_prob=0.5 would be a simple and effective fix.
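
As a rough check of this suggestion, the sketch below computes what a per-embedder drop probability of 0.5 would give, and which per-embedder value would restore a 10% fully unconditional rate. It assumes three embedders that drop independently; all variable names are illustrative.

```python
n_embedders = 3      # assumption: three independent LabelEmbedder modules
target_rate = 0.1    # desired fraction of fully unconditional samples

# Per-embedder probability p such that p ** n_embedders == target_rate
p_per_embedder = target_rate ** (1.0 / n_embedders)

# Outcomes at the suggested value of 0.5
all_dropped_at_half = 0.5 ** n_embedders
mixed_at_half = 1.0 - 2 * 0.5 ** n_embedders  # neither all dropped nor all kept

print(f"p needed for a 10% unconditional rate: {p_per_embedder:.3f}")
print(f"all dropped at p=0.5: {all_dropped_at_half:.3f}")
print(f"mixed at p=0.5: {mixed_at_half:.3f}")
```

Under these assumptions, 0.5 is close to the value (about 0.464) that brings the fully unconditional rate back to 10%, but most samples would then see the label at only some of the resolutions.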
