Contributing: Decrease the size of the source code needed to be downloaded #29008
Comments
This command lists files that are larger than 1MB (requires
Results are at https://pastebin.com/hQaYcHwE. Those include files in I suspect there's a long tail of files Removing files from the history would break the hashes, but it might be worth it in this case. |
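A minimal sketch of one way to produce such a list, not necessarily the command used above. It assumes it is run inside a clone of the repository; the function name `list_large_blobs` and the default 1 MiB threshold are illustrative. It walks every object reachable from any ref, asks git for each object's type and uncompressed size, and prints blobs above the threshold, largest first.

```shell
# list_large_blobs [MIN_BYTES]
# Prints "<size in bytes><TAB><path>" for every blob in the history that is
# larger than MIN_BYTES (default: 1048576, i.e. 1 MiB), largest first.
# The function name and default are illustrative, not taken from the thread.
list_large_blobs() {
    min_bytes=${1:-1048576}
    # rev-list prints "<sha> <path>"; %(rest) carries the path through cat-file.
    git rev-list --objects --all |
        git cat-file --batch-check='%(objecttype) %(objectsize) %(rest)' |
        awk -v min="$min_bytes" '
            $1 == "blob" && $2 > min {
                # Keep the path intact even if it contains spaces.
                print $2 "\t" substr($0, length($1) + length($2) + 3)
            }' |
        sort -rn
}
```

Sizes here are uncompressed blob sizes, so they overstate what is actually transferred, but they are usually good enough to spot bundles and binaries.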
@iandunn, that's very helpful. Thank you for doing a more in-depth investigation.
@hypest, I see a lot of mobile-related files with the highest impact that don't look like source code. Would it be possible to remove some of them from the git history so we could make the Gutenberg repository faster to download?
This one alone looks like a quick win if we erase the full history. |
Aha, I'm not familiar with what that file is but I'm sure there are savings to be made along those lines. @ceyhun , you think you can take a look when you get some chance, possibly after HACK week of March 2021? Thanks! |
I'm guessing someone could use that, then send a PR, and it wouldn't have any problems. I haven't tested that, though. The downside is that people would have to intentionally do it, since it's not the default behavior. Scripts and docs could be updated, though. If we do remove stuff from history, I'd recommend getting rid of everything all at once if possible. Changing the history will break lots of stuff, so we'd probably only want to do it once every few years, at most. |
I tested the |
@gziolo I found the commits for this file using @hypest I also checked the pastebin and saw a lot of mobile bundles and binaries. I think it's also fine to erase these ones as we're not using them and they can also be regenerated if needed:
|
@ceyhun, this is a great finding. How can we perform this cleanup? Is it something you could do yourself? |
+1 for removing the Android ones, the APK and the app.zip, no questions asked. For the iOS ones though, it will probably make trying out/debugging older WPiOS versions harder, right? Recreating them will probably be cumbersome. I actually don't like that we had to commit the JS bundles at all so, if you feel confident about iOS debugging without having the bundles readymade @ceyhun then I'm +1. |
@gziolo I'm not really sure what git magic is needed for this to happen 🪄 I also do not consider myself a git magician 😃 So any help would be appreciated.
@hypest I think WPiOS was always using the bundles on gutenberg-mobile repo and seems like that one goes back as far as 2018, so maybe that's enough? |
@gziolo @ceyhun based on this SO answer, |
We discussed options on WordPress Slack in the #meta channel (link requires registration at https://make.wordpress.org/chat/): @dd32 shared the following: Playing with
I started with the first step and rewrote the history of It looks like it mostly generates new bundle files for the Storybook instance available at https://wordpress.github.io/gutenberg/. Can you check if we can remove the mobile branches listed completely? |
Oh, right @ceyhun. I don't think WPiOS was ever using the bundle directly from Gutenberg's repo, only from gutenberg-mobile. I see what you mean now so yeah, no need for the native mobile (RN) bundle inside Gutenberg's repo 👍. |
@gziolo I went ahead and deleted the following mobile branches:
But I'm not sure about deleting this tag: Also on second thought, I think we can use gutenberg-mobile to view git history before monorepo as well. It would be harder to search and find a specific file from gutenberg repo in gutenberg-mobile back again just for its history, but I think it's possible. I also don't remember using |
Good point Ceyhun. The commit history is indeed available in gutenberg-mobile's repo, but I think it's quite hard to connect the dots as that repo has also moved on. All in all, I'd prefer if we keep the |
We're thinking of keeping a fork of Thanks @mchowning for coming up with that idea! |
We can wait a few more weeks, no worries. The smaller size of the download necessary to clone the repository is worth it 😄 Thank you for all the help so far 🙇🏻 |
@gziolo just created a fork wordpress-mobile/gutenberg-rnmobile-monorepo-commit-history to keep the history and deleted the |
This is great. The only remaining task would be to improve the GitHub workflow that uses |
@ockham, how much work would it be to run on
git checkout --orphan latest_branch
git add -A
git commit -am "Initial commit message" # commit the changes
git branch -D master # delete the master branch
git branch -m master # rename the current branch to master
git push -f origin master # force-push to the master branch
git gc --aggressive --prune=all # remove the old files
I don't remember what I used exactly before, but it was similar and it removed all git history for |
Looks like it shouldn't be too much work; basically, any workflow that uses For me, the bigger question seems to be if we really want to routinely rewrite the history of our
Wouldn't that maybe make more sense? If we've identified that:
... why not keep things nicely separated, create a dedicated |
Yes.
It was discussed as well. Whatever works best here 😄 |
I'm leaning towards the latter, TBH. Seems fairly straight-forward. The main questions are probably if creating a new |
Looks like we might even be able to continue using the same GH action we're using now: It supports both deploying to a different repo, and to a subdir (not entirely sure if those can be combined). For the different repo, we need a personal access token -- rather than Oh, I just noticed that if we wanna go ahead with pruning the history of the |
I see https://github.com/peaceiris/actions-gh-pages#%EF%B8%8F-force-orphan-force_orphan. This is exactly what we want and it makes it so much easier to approach this way. I will merge directly to |
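For anyone curious what a force-orphan style deployment amounts to in plain git, here is a rough sketch. It is not the action's actual implementation; `publish_orphan`, the `deploy-bot` identity and the commit message are made up for illustration, and the remote is assumed to be a URL or an absolute path. Each run pushes the build output as a single parentless commit, so the published branch never accumulates history.

```shell
# publish_orphan BUILD_DIR REMOTE BRANCH
# Pushes the contents of BUILD_DIR to BRANCH on REMOTE as one parentless
# commit, replacing whatever history BRANCH had before.
# REMOTE must be a URL or an absolute path, because the push runs from a
# scratch directory. All names here are hypothetical.
publish_orphan() {
    build_dir=$1
    remote=$2
    branch=$3
    work=$(mktemp -d) || return 1
    # Copy the build output (including dotfiles) into the scratch directory.
    cp -R "$build_dir"/. "$work"/ || return 1
    (
        cd "$work" || exit 1
        git init -q || exit 1
        git add -A || exit 1
        # Placeholder identity for the sketch; a real workflow would use its own.
        git -c user.name="deploy-bot" -c user.email="deploy-bot@example.invalid" \
            commit -q -m "Deploy build" || exit 1
        # Force-push the lone commit, discarding the branch's previous history.
        git push -q -f "$remote" "HEAD:refs/heads/$branch"
    )
    status=$?
    rm -rf "$work"
    return $status
}
```

Something like `publish_orphan "$PWD/storybook-build" "$remote_url" gh-pages` would then replace the branch on every deploy (both arguments are placeholders). The old objects only disappear from the remote once it garbage-collects them.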
It worked with d4bef28: We are now at 200-ish MB, which is 10% of the initial size: Many thanks to everyone involved. |
Similar to #26993.
What problem does this address?
It takes ages to finish:
At the moment the size of the repository is over 2GB!!!!
If you add to the mix that you need to run on every brand new repository:
It adds another 1GB of data that needs to be downloaded as reported in #26993.
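To put a number on the size of a local clone, a small sketch like the following can help. It assumes it is run inside the clone; `repo_object_kib` is an illustrative name. It adds up the loose-object and pack sizes that `git count-objects -v` reports, both in KiB.

```shell
# repo_object_kib
# Prints the disk space used by git objects in the current repository, in KiB,
# by summing the "size" (loose objects) and "size-pack" (packfiles) fields
# from `git count-objects -v`. The function name is illustrative.
repo_object_kib() {
    git count-objects -v |
        awk '$1 == "size:" || $1 == "size-pack:" { total += $2 }
             END { print total + 0 }'
}
```

Running it before and after a cleanup gives a quick comparison; it only covers the object database, not the checked-out files or installed dependencies.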
What is your proposed solution?
It makes me think that maybe the gh-pages branch is one of the reasons why the size of the repository has grown so much. We replace the content of gh-pages with the new build of Storybook on every commit to the main branch.
I don't know how this sort of issue is usually solved in git-based repositories, but the comment from WordPress Slack (link requires registration at https://make.wordpress.org/chat/) authored by @ocean90 should be a good start:
https://wordpress.slack.com/archives/C5UNMSU4R/p1609864617204200?thread_ts=1609770083.149700&cid=C5UNMSU4R
The link included:
https://medium.com/@sangeethkumar.tvm.kpm/cleaning-up-a-git-repo-for-reducing-the-repository-size-d11fa496ba48
This article contains some techniques that could help with gh-pages, where we don't care about history at all. There are also several interesting references to other similar articles that try to address similar issues.
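One way to test the suspicion about gh-pages is to add up the blobs that are reachable from that branch but not from the main branch. The sketch below is an approximation under a few assumptions: both branches exist locally, `branch_only_blob_bytes` is an illustrative name, and sizes are uncompressed, so the result is an upper bound on what the branch contributes to a download.

```shell
# branch_only_blob_bytes BRANCH BASE
# Prints the total uncompressed size, in bytes, of blobs reachable from BRANCH
# but not from BASE. Compression and deltas are ignored, so this overstates
# the on-disk cost. The function name is illustrative.
branch_only_blob_bytes() {
    branch=$1
    base=$2
    # %(rest) is needed so cat-file splits "<sha> <path>" lines from rev-list.
    git rev-list --objects "$branch" --not "$base" |
        git cat-file --batch-check='%(objecttype) %(objectsize) %(rest)' |
        awk '$1 == "blob" { total += $2 } END { print total + 0 }'
}
```

For example, `branch_only_blob_bytes gh-pages trunk` (substitute the actual branch names) would show roughly how much data exists only because of the published builds.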