git tree contains duplicate file entries
I finally fixed the repo by doing the following:
- do a fresh clone from github, which only included commits before the problem occurred
- add my messed up repo from the filesystem as a remote on the new clone
- painstakingly check out commits from the bad repo into the working copy of the new clone:
git checkout fe3254FIRSTCOMMITAFTERORIGIN/MASTER/HEAD .
Note the dot at the end. Without the dot, you move your HEAD to the commit instead of checking the commit's files out into the working copy, and that seems to bring the corrupt object into your good clone.
- commit each in turn, manually copying the commit message from the other repo
- remove the corrupt repo from remotes
- garbage collect + prune:
git gc --aggressive --prune=now
- weep happily as git fsck shows no duplicate file entries
Check out a new branch just before the problematic commit. Now check out the files from the problematic commit, then add and commit them using the same message (use the -C option). Repeat for the rest of the commits. After you're done, reset the other branch to point to this corrected one. You can then push.
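The replay loop described in the two answers above could be sketched roughly as follows. This is only a sketch under assumptions: the function name replay_commits is made up for illustration, the history between the two commits is assumed to be linear, and it must be run from the top of the working copy on the new, clean branch. The git rm step is an addition that is not in the answers; it is there so that files deleted in a commit do not linger in the working copy.

```shell
# Hypothetical helper, not part of git.
# $1: last good commit (the current branch should point at it)
# $2: tip of the broken history
replay_commits() {
    # Walk the commits after $1 up to $2, oldest first.
    # Assumption: linear history, no merges.
    for c in $(git rev-list --reverse "$1".."$2"); do
        # Addition (not in the answers): clear tracked files first so
        # deletions made by the commit are carried over as well.
        git rm -rq --ignore-unmatch . || return 1
        # Note the trailing dot: copy the commit's files into the index
        # and working copy without moving HEAD.
        git checkout "$c" -- . || return 1
        # -C reuses the message and author information of the old commit.
        git commit -q -C "$c" || return 1
    done
}

# Usage (placeholders, run on the new branch):
#   replay_commits last_good_hash broken_tip_hash
```

Once the loop finishes, the old branch can be reset to the rebuilt one as described above. A commit that changes nothing would make git commit stop the loop, which is probably what you want while checking the result by hand.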
I used git-replace and git-mktree to fix this in the past. You essentially keep the broken tree object, but override all links and make them point to a new object.
First we grab the bad tree:
git ls-tree bad_tree_hash > tmpfile.txt
This writes out your bad tree. For example:
040000·tree·3cdcc756ee0ed636c44828927126911d0ab28a18 → xNotAlphabetic
040000·tree·4ad0d8ef014b8cc09c95694399254eff43217bfb → EXT
040000·tree·d65085e4a05ea9ac8b79e37b87202dd64d402c2e → duplicateFolder
040000·tree·d65085e4a05ea9ac8b79e37b87202dd64d402c2e → duplicateFolder
040000·tree·fd0661d698ace91135a8473b26707892b7c89c32 → ToolTester
040000·tree·d65085e4a05ea9ac8b79e37b87202dd64d402c2e → duplicateFolder
NB: · and → stand for whitespace, [space] and [tab] respectively.
Next, edit the text, removing the offending lines, and save with Unix-style line endings (i.e. only LF, not CRLF). With this example, we make this:
040000·tree·4ad0d8ef014b8cc09c95694399254eff43217bfb → EXT
040000·tree·d65085e4a05ea9ac8b79e37b87202dd64d402c2e → duplicateFolder
040000·tree·fd0661d698ace91135a8473b26707892b7c89c32 → ToolTester
040000·tree·3cdcc756ee0ed636c44828927126911d0ab28a18 → xNotAlphabetic
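If there are many duplicates, the hand edit can probably be replaced by a small filter. The sketch below is an assumption-laden illustration, not part of the original recipe: dedupe_tree_listing is a made-up name, it keeps the first entry seen for each file name, and it assumes names contain no tabs or characters that git ls-tree would quote.

```shell
# Hypothetical helper: drop repeated entries from `git ls-tree` output.
# Each input line looks like "<mode> <type> <hash><TAB><name>".
dedupe_tree_listing() {
    # Split on the tab so $2 is the file name; print a line only the
    # first time its name is seen.
    awk -F'\t' '!seen[$2]++'
}

# Usage (bad_tree_hash is the placeholder used in this answer):
#   git ls-tree bad_tree_hash | dedupe_tree_listing > tmpfile.txt
```

The filter does not sort the entries. According to the git mktree documentation, mktree normalizes the entry order itself, so pre-sorting the input should not be needed.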
Type
cat tmpfile.txt | git mktree
which will make a new, fixed tree object, save it, and return the new hash:
a55115e4a05ea9ac8b79e37b872024d64d4r2c2e
a.k.a. new_tree_hash for demo purposes.
Next, git replace will create a new reference, which forces everything that previously pointed at the bad tree to use the new, fixed object instead.
git replace bad_tree_hash new_tree_hash
This will solve your immediate problem. If you're interested, look at the overriding link in the .git/refs/replace folder.
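Instead of browsing .git/refs/replace by hand, the overriding link can also be queried through git. A minimal sketch, assuming you pass the full hash of the replaced object; the name inspect_replacement is made up for illustration:

```shell
# Hypothetical helper: show the replacement registered for an object.
# $1: full hash of the replaced (bad) object
inspect_replacement() {
    # Lists the replace ref if one exists (prints the replaced hash).
    git replace -l "$1"
    # Prints the hash that the replace ref points to, i.e. the new object.
    git rev-parse "refs/replace/$1"
}

# Usage (placeholder from this answer):
#   inspect_replacement bad_tree_hash
```

To compare against the untouched object, git --no-replace-objects ls-tree bad_tree_hash should show the original, broken listing.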
The bad tree object will continue to generate warnings whenever you check your repository with git fsck, but they can be ignored, and all your commits and other links will be consistent and working regardless.
8-year retrospective: There's probably a way to just delete the old, corrupt tree, since git replace should make it moot.
Further warning: This hack could also be rejected by a git service, e.g. BitBucket or GitHub, since they could view it as corruption.
I had a problem of this ilk, and all the solutions here and in other SO threads failed to fix it for me. In the end I used BFG Repo-Cleaner to destroy all the commits which reference the bad folder name, which was probably overkill but successfully repaired the repo.