How to remove old versions of media files from a git repository
See the 'Removing Objects' section of the Maintenance and Data Recovery chapter in the Pro Git book. It walks through the steps for removing objects from a git repo. Be warned, though: the process is destructive.
As mentioned already, you will be rewriting history here, so any collaborators will have to rebase their work onto the rewritten history with git rebase.
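Before removing anything, it helps to know which objects are actually taking up the space. Here is a minimal sketch of one way to list them; the function name `largest_blobs` and its default of 10 entries are illustrative choices, not something taken from the Pro Git chapter.

```shell
# List the largest blobs reachable from any ref, biggest first.
# Output columns: size in bytes, object id, path.
largest_blobs() {
    count="${1:-10}"   # number of entries to show (illustrative default)
    git rev-list --objects --all |
        git cat-file --batch-check='%(objecttype) %(objectsize) %(objectname) %(rest)' |
        awk '$1 == "blob" { print substr($0, 6) }' |   # drop the leading "blob "
        sort -n -r -k1,1 |
        head -n "$count"
}
```

Run `largest_blobs 20` from inside the repository; the paths it reports are the candidates to strip from history.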
As for stripping a particular file from history, GitHub has a nice walkthrough.
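For reference, stripping a single path with stock git can be sketched roughly as below. This uses `git filter-branch`, the same tool the Pro Git chapter uses; it is not necessarily what the GitHub walkthrough does today, and git's own documentation now points to git filter-repo as the preferred tool. The function name `purge_path` is made up for illustration, and the sketch assumes the path contains no single quotes.

```shell
# Remove one path (file or directory) from every commit on every branch and tag.
# This rewrites history: run it on a fresh clone and keep a backup.
purge_path() {
    path_to_purge="$1"   # e.g. media/intro.mp3 (hypothetical path)
    FILTER_BRANCH_SQUELCH_WARNING=1 git filter-branch --force \
        --index-filter "git rm -r --cached --ignore-unmatch -- '$path_to_purge'" \
        --prune-empty --tag-name-filter cat -- --all
}
```

The old objects stay on disk until the backup refs under refs/original are deleted, the reflog is expired and `git gc --prune=now` has run, so the repository does not shrink straight away.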
For a solution going forward, you should look at putting the binary files in a submodule.
Git's submodule support allows a repository to contain, as a subdirectory, a checkout of an external project. Submodules maintain their own identity; the submodule support just stores the submodule repository location and commit ID, so other developers who clone the containing project ("superproject") can easily clone all the submodules at the same revision. Partial checkouts of the superproject are possible: you can tell Git to clone none, some or all of the submodules.
https://git-scm.com/docs/git-submodule
https://git-scm.com/book/en/v2/Git-Tools-Submodules
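As a rough sketch of what that looks like in practice, assuming the media files already live in their own repository (the function name `add_media_submodule` and both arguments are placeholders, not part of git):

```shell
# Add an external media repository as a submodule of the current repository
# and commit the resulting .gitmodules file and submodule entry.
add_media_submodule() {
    media_url="$1"   # URL of the media repository (placeholder)
    media_dir="$2"   # subdirectory to check it out into, e.g. media
    git submodule add "$media_url" "$media_dir" &&
        git commit -m "Add $media_dir as a submodule"
}
```

Collaborators can then fetch everything with `git clone --recurse-submodules`, or run `git submodule update --init` in an existing clone. The superproject only records the submodule's URL and commit ID, so old media versions no longer bloat its history.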
Old thread but in case someone else stumbles along here…
GitHub & Bitbucket both recommend using BFG Repo-Cleaner.
See:
GitHub: Remove Sensitive Data
Bitbucket: Reduce Repository Size &
Bitbucket: Maintaining a Git Repository
Example to remove files over 1 Megabyte, as well as jpgs, pngs and mp3s that are not in HEAD:
# First get the latest bfg.jar, then:
$ git clone --mirror git://example.com/some-big-repo.git
$ java -jar bfg.jar --strip-blobs-bigger-than 1M --delete-files '*.{jpg,png,mp3}' some-big-repo.git
$ cd some-big-repo.git
$ git reflog expire --expire=now --all && git gc --prune=now --aggressive
$ git push
Note: now that you've pushed the updated revs, the remote repository should also run its git gc, otherwise you won't see the size reduction (see e.g. https://stackoverflow.com/a/28782154/3419541).
Finally, re-clone the repository to be sure that you don't accidentally re-commit the old media file blobs.
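To confirm that the cleanup actually shrank the repository, one option is to compare the packfile size before and after. A small sketch (the helper name `pack_size_kib` is made up for illustration):

```shell
# Print the total size of the repository's packfiles in KiB,
# as reported by the size-pack field of git count-objects.
pack_size_kib() {
    git count-objects -v | awk '$1 == "size-pack:" { print $2 }'
}
```

Run it inside the mirror clone before the BFG step and again after the `git gc`, or run it in the old clone and the fresh clone, and compare the two numbers.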
I have a script (github gist here) to remove a selection of unwanted folders from the entire history of a git repo, or to delete all but the latest version of a folder.
It's hard-coded to assume that all git repositories are in ~/repos, but that's easy to change. It should also be easy to adapt to work with individual files.