Pull requests: pytorch/torchtitan
Enable CP tests
CLA Signed
This label is managed by the Meta Open Source bot.
#906
opened Feb 28, 2025 by
fegin
Enable FlexAttention for llama model
CLA Signed
fb-exported
#887
opened Feb 25, 2025 by
fegin
[Not for land] Show no_sync for microbatching/grad accum
CLA Signed
checkpoint folder by model name and flavor
CLA Signed
#879
opened Feb 24, 2025 by
K-H-Ismail
Configure arbitrary frozen modules via config
CLA Signed
#869
opened Feb 20, 2025 by
lkhphuc
[Not for landing] piggy back on titan for scale init test
CLA Signed
Add force_recompute_fp8_weight_in_bwd when FSDP
CLA Signed
#832
opened Feb 11, 2025 by
c0g
profile with modules and stack
CLA Signed
#829
opened Feb 10, 2025 by
carmocca
[cp] Add cudnn attention support to Context Parallel
CLA Signed
Make CheckpointManager friendlier to custom StorageWriter/StorageReader
CLA Signed
#789
opened Jan 12, 2025 by
dimdi-y
Register backward hook for the whole optim_dict to enable working at multi schedule pp
CLA Signed
[Not for land] Integrate float8nocompile, an experimental feature for high performance
CLA Signed
#778
opened Jan 7, 2025 by
danielvegamyhre
[PoC] Typed JobConfig
CLA Signed
#767
opened Jan 1, 2025 by
jaysonfrancis
[MoE][PoC] Expert Parallel: tp and tp2ep
CLA Signed
First draft Auto-SAC workflow
CLA Signed
#710
opened Dec 2, 2024 by
sanketpurandare
Draft
[WIP] Allow benchmark between multiple configs
CLA Signed
#703
opened Nov 26, 2024 by
H-Huang
[WIP] Adding OBELICS DataLoader
CLA Signed
#663
opened Oct 30, 2024 by
TJ-Solergibert
[not for land] torch.compile individual linears
CLA Signed
#661
opened Oct 29, 2024 by
vkuzo
Init weights only if not loading a checkpoint
CLA Signed
[DO NOT REVIEW] gaps to enable FSDP2 cpu offloading
CLA Signed
#622
opened Oct 16, 2024 by
weifengpy
[not for land] TE experiments, take 2
CLA Signed
#614
opened Oct 14, 2024 by
vkuzo