What is the `git restore` command and what is the difference between `git restore` and `git reset`?
I have presented `git restore` (which is still marked as "experimental") in "How to reset all files from working directory but not from staging area?", with the recent Git 2.23 (August 2019).
It helps separate `git checkout` into two commands:
- one for files (`git restore`), which can cover `git reset` cases.
- one for branches (`git switch`, as seen in "Confused by git checkout"), which deals only with branches, not files.
As the reset, restore and revert documentation states:
There are three commands with similar names: `git reset`, `git restore` and `git revert`.
- `git-revert` is about making a new commit that reverts the changes made by other commits.
- `git-restore` is about restoring files in the working tree from either the index or another commit. This command does not update your branch. The command can also be used to restore files in the index from another commit.
- `git-reset` is about updating your branch, moving the tip in order to add or remove commits from the branch. This operation changes the commit history. `git reset` can also be used to restore the index, overlapping with `git restore`.
So:
To restore a file in the index to match the version in HEAD (this is the same as using `git-reset`):

    git restore --staged hello.c

or you can restore both the index and the working tree (this is the same as using `git-checkout`):

    git restore --source=HEAD --staged --worktree hello.c

or the short form, which is more practical but less readable:

    git restore -s@ -SW hello.c
With Git 2.25.1 (Feb. 2020), a bug has been corrected: "`git restore --staged`" did not correctly update the cache-tree structure, which resulted in bogus trees being written afterwards.
See discussion.
See commit e701bab (08 Jan 2020) by Jeff King (`peff`).
(Merged by Junio C Hamano -- `gitster` -- in commit 09e393d, 22 Jan 2020)
`restore`: invalidate cache-tree when removing entries with --staged
Reported-by: Torsten Krah
Signed-off-by: Jeff King
When "`git restore --staged`" removes a path that's in the index, it marks the entry with `CE_REMOVE`, but we don't do anything to invalidate the cache-tree.
In the non-staged case, we end up in `checkout_worktree()`, which calls `remove_marked_cache_entries()`. That actually drops the entries from the index, as well as invalidating the cache-tree and untracked-cache.
But with `--staged`, we never call `checkout_worktree()`, and the `CE_REMOVE` entries remain. Interestingly, they are dropped when we write out the index, but that means the resulting index is inconsistent: its cache-tree will not match the actual entries, and running "`git commit`" immediately after will create the wrong tree.
We can solve this by calling `remove_marked_cache_entries()` ourselves before writing out the index. Note that we can't just hoist it out of `checkout_worktree()`; that function needs to iterate over the `CE_REMOVE` entries (to drop their matching worktree files) before removing them.
One curiosity about the test: without this patch, it actually triggers a BUG() when running git-restore:

    BUG: cache-tree.c:810: new1 with flags 0x4420000 should not be in cache-tree

But in the original problem report, which used a similar recipe, `git restore` actually creates the bogus index (and the commit is created with the wrong tree). I'm not sure why the test here behaves differently than my out-of-suite reproduction, but what's here should catch either symptom (and the fix corrects both cases).
With Git 2.27 (Q2 2020), "`git restore --staged --worktree`" now defaults to taking the contents out of "HEAD", instead of erring out.
See commit 088018e (05 May 2020) by Eric Sunshine (`sunshineco`).
(Merged by Junio C Hamano -- `gitster` -- in commit 4c2941a, 08 May 2020)
`restore`: default to HEAD when combining --staged and --worktree
Signed-off-by: Eric Sunshine
Reviewed-by: Taylor Blau
By default, files are restored from the index for `--worktree`, and from HEAD for `--staged`.
When `--worktree` and `--staged` are combined, `--source` must be specified to disambiguate the restore source, thus making it cumbersome to restore a file in both the worktree and the index.
(Due to an oversight, the `--source` requirement, though documented, is not actually enforced.)
However, HEAD is also a reasonable default for `--worktree` when combined with `--staged`, so make it the default anytime `--staged` is used (whether combined with `--worktree` or not).
So now, this works (a path is still required, shown here as `<file>`):

    git restore --staged --worktree <file>
    git restore -SW <file>
For your 1st question "What is git-restore?":
git-restore is a tool to revert uncommitted changes. Uncommitted changes are: a) changes in your working copy, or b) content in your index (a.k.a. staging area).
This command was introduced in Git 2.23 (together with git-switch) to separate multiple concerns previously united in git-checkout.
git-restore can be used in three different modes, depending on whether you want to revert work in the working copy, in the index, or both.
`git restore [--worktree] <file>` overwrites <file> in your working copy with the contents in your index (*). In other words, it reverts your changes in the working copy. Whether you specify `--worktree` or not does not matter because it is implied if you don't say otherwise.
`git restore --staged <file>` overwrites <file> in your index with the current HEAD from the local repository. In other words, it unstages previously staged content. In this respect, it is indeed equivalent to the old `git reset HEAD <file>`.
To overwrite both the working copy and the index with the current HEAD, use `git restore --staged --worktree --source HEAD <file>`. This version does both: it reverts your working copy to HEAD and unstages previously staged work.
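The three modes can be tried out in a throwaway repository. This is only a minimal sketch, assuming a POSIX shell and Git 2.23 or later; the file name `hello.c`, its contents, and the user identity are made up for illustration.

```shell
#!/bin/sh
set -eu

# Throwaway repository; hello.c and its contents are illustrative only.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"

echo "committed" > hello.c
git add hello.c
git commit -q -m "initial commit"

# Mode 1: discard an unstaged edit (working tree <- index).
echo "edited" > hello.c
git restore hello.c                      # same as: git restore --worktree hello.c
mode1_file=$(cat hello.c)                # "committed"

# Mode 2: unstage, but keep the edit in the working tree (index <- HEAD).
echo "edited" > hello.c
git add hello.c
git restore --staged hello.c
mode2_status=$(git status --porcelain)   # " M hello.c": modified, not staged
mode2_file=$(cat hello.c)                # "edited"

# Mode 3: overwrite both the index and the working tree from HEAD.
git add hello.c
git restore --source=HEAD --staged --worktree hello.c
mode3_status=$(git status --porcelain)   # empty: nothing staged, nothing modified
mode3_file=$(cat hello.c)                # "committed"

printf '%s\n' "$mode1_file" "$mode2_status" "$mode2_file" "$mode3_status" "$mode3_file"
```

Note that none of the three calls moves the branch: the commit count stays at one throughout.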
For your 2nd question "What's the difference between git-restore and git-reset?":
There are overlaps between these two commands, and differences.
Both can be used to modify your working copy and/or the staging area. However, only git-reset can modify your repository, because only it can move a branch to a different commit. In this sense, git-restore seems the safer option if you only want to revert local work.
There are more differences, which I can't enumerate here.
(*) A file not `add`ed to the index is still regarded as being in the index, but in its "clean" state from the current HEAD revision.
To add to VonC's answer, and bring into the picture all the relevant commands, in alphabetical order:
- `git checkout`
- `git reset`
- `git restore`
- `git switch`
I'll throw in one more, the misnamed `git revert`, as well.
From an end-user perspective
All you need are `git checkout`, `git reset`, and `git revert`. These commands have been in Git all along.
But `git checkout` has, in effect, two modes of operation. One mode is "safe": it won't accidentally destroy any unsaved work. The other mode is "unsafe": if you use it, and it tells Git to wipe out some unsaved file, Git assumes that (a) you knew it meant that and (b) you really did mean to wipe out your unsaved file, so Git immediately wipes out your unsaved file.
This is not very friendly, so the Git folks finally (after years of users griping) split `git checkout` into two new commands. This leads us to:
From a historical perspective
`git restore` is new, having first come into existence in August 2019, in Git 2.23. `git reset` is very old, having been in Git more or less all along, dating back to 2005. Both commands have the ability to destroy unsaved work.
The `git switch` command is also new, introduced along with `git restore` in Git 2.23. It implements the "safe half" of `git checkout`; `git restore` implements the "unsafe half".
When would you use which command?
This is the most complicated part, and to really understand it, we need to know the following items:
- Git is really all about commits. Commits get stored in the Git repository. The `git push` and `git fetch` commands transfer commits (whole commits, as an all-or-nothing deal¹) to the other Git. You either have all of a commit, or you don't have it. Other commands, such as `git merge` or `git rebase`, all work with local commits. The `pull` command runs `fetch` (to get commits) followed by a second command to work with the commits once they're local.
- New commits add to the repository. You almost never remove a commit from the repository. Only one of the five commands listed here (checkout, reset, restore, revert, and switch) is capable of removing commits.²
- Each commit is numbered by its hash ID, which is unique to that one particular commit. It's actually computed from what's in the commit, which is how Git makes these numbers work across all Gits everywhere. This means that what is in the commit is frozen for all time: if you change anything, what you get is a new commit with a new number, and the old commit is still there, with its same old number.
- Each commit stores two things: a snapshot, and metadata. The metadata include the hash ID(s) of some previous commit(s). This makes commits form backwards-looking chains.
- A branch name holds the hash ID of one commit. This makes the branch name find that commit, which in turn means two things:
  - that particular commit is the tip commit of that branch; and
  - all commits leading up to and including that tip commit are on that branch.
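These relationships can be inspected directly. The following is a rough sketch, assuming a POSIX shell and a reasonably recent Git; the repository, file name, and commit messages are invented for the example.

```shell
#!/bin/sh
set -eu

# Throwaway repository with two commits; all names here are illustrative.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"

echo "one" > file.txt
git add file.txt
git commit -q -m "first"
first=$(git rev-parse HEAD)

echo "one and two" > file.txt
git add file.txt
git commit -q -m "second"
second=$(git rev-parse HEAD)

# The branch name stores the hash ID of the tip commit.
branch=$(git symbolic-ref --short HEAD)
tip=$(git rev-parse "$branch")

# The tip commit's metadata records the hash ID of its parent.
parent=$(git rev-parse "$second^")
git cat-file -p "$second"      # prints the tree, parent, author, committer, message

# Following the backwards-looking chain from the tip finds both commits.
count=$(git rev-list --count "$branch")
echo "branch=$branch tip=$tip parent=$parent count=$count"
```

Here `tip` equals `second`, `parent` equals `first`, and `count` is 2: the branch name finds the tip, and the tip finds everything before it.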
We're also going to talk about Git's index in a moment, and your working tree. They're separate from these but worth mentioning early, especially since the index has three names: Git sometimes calls it the index, sometimes calls it the staging area, and sometimes—rarely these days—calls it the cache. All three names refer to the same thing.
Everything up through the branch name is, I think, best understood via pictures (at least for most people). If we draw a series of commits, with newer commits towards the right, using `o` for each commit and omitting some commits for space or whatever, we get something like this:

            o--o---o   <-- feature-top
           /        \
    o--o--o--o--...--o---o--o   <-- main
        \               /
         o--o--...--o--o   <-- feature-hull

which, as you can see, is a boat repository. There are three branches. The mainline branch holds every commit, including all the commits on the top row and bottom (hull) row. The `feature-top` branch holds the top three commits and also the three commits along the main line to the left, but not any of the commits on the bottom row. All the connectors between commits are (well, should be, but I don't have a good enough font) one-way arrows, pointing left, or down-and-left, or up-and-left.
These "arrows", or one way connections from commit to commit, are technically arcs, or one-way edges, in a directed graph. This directed graph is one without cycles, making it a Directed Acyclic Graph or DAG, which has a bunch of properties that are useful to Git.
If you're just using Git to store files inside commits, all you really care about are the round `o` nodes or vertices (again two words for the same thing), each of which acts to store your files, but you should at least be vaguely aware of how they are arranged. It matters, especially because of merges. Merge commits are those with two outgoing arcs, pointing backwards to two of what Git calls parent commits. The child commit is the one "later": just as human parents are always older than their children, Git parent commits are older than their children.
We need one more thing, though: Where do new commits come from? We noted that what's in a commit—both the snapshot, holding all the files, and the metadata, holding the rest of the information Git keeps about a commit—is all read-only. Your files are not only frozen, they're also transformed, and the transformed data are then de-duplicated, so that even though every commit has a full snapshot of every file, the repository itself stays relatively slim. But this means that the files in a commit can only be read by Git, and nothing—not even Git itself—can write to them. They get saved once, and are de-duplicated from then on. The commits act as archives, almost like tar or rar or winzip or whatever.
To work with a Git repository, then, we have to have Git extract the files. This takes the files out of some commit, turning those special archive-formatted things into regular, usable files. Note that Git may well be able to store files that your computer literally can't store: a classic example is a file named `aux.h`, for some C program, on a Windows machine. We won't go into all the details, but it is theoretically possible to still get work done with this repository, which was probably built on a Linux system, even if you're on a Windows system where you can't work with the `aux.h` file directly.
Anyway, assuming there are no nasty little surprises like `aux.h`, you would just run `git checkout` or `git switch` to get some commit out of Git. This will fill in your working tree, populating it from the files stored in the tip commit of some branch. The tip commit is, again, the last commit on that branch, as found by the branch name. Your `git checkout` or `git switch` selected that commit to be the current commit, by selecting that branch name to be the current branch. You now have all the files from that commit, in an area where you can see them and work on them: your working tree.
Note that the files in your working tree are not actually in Git itself. They were just extracted from Git. This matters a lot, because when `git checkout` extracts the files from Git, it actually puts each file in two places. One of those places is the ordinary everyday file you see and work on / with. The other place Git puts each file is into Git's index.
As I mentioned a moment ago, the index has three names: index, staging area, and cache. All refer to the same thing: the place Git sticks these "copies" of each file. Each one is actually pre-de-duplicated, so the word "copy" is slightly wrong, but (unlike much of the rest of its innards) Git actually does a really good job of hiding the de-duplication aspect. Unless you start getting into internal commands like `git ls-files` and `git update-index`, you don't need to know about this part, and can just think of the index as holding a copy of the file, ready to go into the next commit.
What this all means for you as someone just using Git is that the index / staging-area acts as your proposed next commit. When you run `git commit`, Git is going to package up these copies of the file as the ones to be archived in the snapshot. The copies you have in your working tree are yours; the index / staging-area copies are Git's, ready to go. So, if you change your copies and want the changed copy to be what goes in the next snapshot, you must tell Git: Update the Git copy, in the Git index / staging-area. You do this with `git add`.³ The `git add` command means make the proposed-next-commit copy match the working-tree copy. It's the `add` command that does the updating: this is when Git compresses and de-duplicates the file and makes it ready for archiving, not at `git commit` time.⁴
Then, assuming you have some series of commits ending with the one with `hash-N`:

    [hash1] <-[hash2] ... <-[hashN]   <--branch

you run `git commit`, give it any metadata it needs (a commit log message), and you get an N+1'th commit:

    [hash1] <-[hash2] ... <-[hashN] <-[hashN+1]   <--branch

Git automatically updates the branch name to point to the new commit, which has therefore been added to the branch.
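To see the edit, add, commit cycle in action, here is a small sketch, assuming a POSIX shell and a reasonably recent Git. The file `notes.txt` and its contents are hypothetical; `git show :notes.txt` is used to print the index copy of the file.

```shell
#!/bin/sh
set -eu

# Throwaway repository; notes.txt and its contents are illustrative only.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"

echo "draft" > notes.txt
git add notes.txt
git commit -q -m "first"
old_tip=$(git rev-parse HEAD)

# Editing the working-tree copy does not touch the index copy.
echo "final version" > notes.txt
staged_before=$(git diff --cached --name-only)   # empty: proposed commit matches HEAD
index_before=$(git show :notes.txt)              # "draft"

# git add copies the working-tree version into the index.
git add notes.txt
staged_after=$(git diff --cached --name-only)    # "notes.txt"
index_after=$(git show :notes.txt)               # "final version"

# git commit snapshots the index and moves the branch name to the new commit.
git commit -q -m "second"
new_tip=$(git rev-parse HEAD)
new_parent=$(git rev-parse HEAD^)
echo "old tip: $old_tip, new tip: $new_tip, parent of new tip: $new_parent"
```

The parent of the new tip is the old tip, which is the `[hashN] <-[hashN+1]` link in the drawing above.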
Let's look at each of the various commands now:
`git checkout`: this is a large and complicated command.
We already saw this one, or at least, half of this one. We used it to pick out a branch name, and therefore a particular commit. This kind of checkout first looks at our current commit, index, and working tree. It makes sure that we have committed all our modified files, or (this part gets a bit complicated) that if we haven't committed all our modified files, switching to that other branch is "safe". If it's not safe, `git checkout` tells you that you can't switch due to having modified files. If it is safe, `git checkout` will switch; if you didn't mean to switch, you can just switch back. (See also "Checkout another branch when there are uncommitted changes on the current branch".)
But `git checkout` has an unsafe half. Suppose you have modified some file in your working tree, such as `README.md` or `aux.h` or whatever. You now look back at what you changed and think: No, that's a bad idea. I should get rid of this change. I'd like the file back exactly as it was before.
To get this (that is, to wipe out your changes to, say, `README.md`), you can run:

    git checkout -- README.md
The `--` part here is optional. It's a good idea to use it, because it tells Git that the part that comes after `--` is a file name, not a branch name.
Suppose you have a branch named `hello` and a file named `hello`. What does:

    git checkout hello

mean? Are we asking Git to clobber the file `hello` to remove the changes we made, or are we asking Git to check out the branch `hello`? To make this unambiguous, you have to write either:

    git checkout -- hello    (clobber the file)

or:

    git checkout hello --    (get the branch)
This case, where there are branches and files or directories with the same name, is a particularly insidious one. It has bitten real users. It's why `git switch` exists now. The `git switch` command never means clobber my files. It only means do the safe kind of `git checkout`.
(The `git checkout` command has been smartened up too, so that if you have the new commands and you run the "bad" kind of ambiguous `git checkout`, Git will just complain at you and do nothing at all. Either use the smarter split-up commands, or add the `--` at the right place to pick which kind of operation you want.)
More precisely, this kind of `git checkout`, ideally spelled `git checkout -- paths`, is a request for Git to copy files from Git's index to your working tree. This means clobber my files. You can also run `git checkout tree-ish -- paths`, where you add a commit hash ID⁵ to the command. This tells Git to copy the files from that commit, first to Git's index, and then on to your working tree. This, too, means clobber my files: the difference is where Git gets the copies of the files it's extracting.
If you ran `git add` on some file and thus copied it into Git's index, you need `git checkout HEAD -- file` to get it back from the current commit. The copy that's in Git's index is the one you `git add`-ed. So these two forms of `git checkout`, with a commit hash ID (or the name `HEAD`), the optional `--`, and the file name, are the unsafe clobber my files forms.
`git reset`: this is also a large and complicated command.
There are, depending on how you count, up to about five or six different forms of `git reset`. We'll concentrate on a smaller subset here.

    git reset [ --hard | --mixed | --soft ] [ commit ]
Here, we're asking Git to do several things. First, if we give a `commit` argument, such as `HEAD` or `HEAD~3` or some such, we've picked a particular commit that Git should reset to. This is the kind of command that will remove commits by ejecting them off the end of the branch. Of all the commands listed here, this is the only one that removes any commits. One other command, `git commit --amend`, has the effect of ejecting the last commit while putting on a new replacement, but that one is limited to ejecting one commit.
Let's show this as a drawing. Suppose we have:

    ...--E--F--G--H   <-- branch
That is, this branch, named `branch`, ends with four commits whose hash IDs we'll call `E`, `F`, `G`, and `H` in that order. The name `branch` currently stores the hash ID of the last of these commits, `H`. If we use `git reset --hard HEAD~3`, we're telling Git to eject the last three commits. The result is:

           F--G--H   ???
          /
    ...--E   <-- branch
The name `branch` now selects commit `E`, not commit `H`. If we did not write down (on paper, on a whiteboard, in a file) the hash IDs of the last three commits, they've just become somewhat hard to find. Git does give you a way to find them again, for a while, but mostly they just seem to be gone.
The `HEAD~3` part of this command is how we chose to drop the last three commits. It's part of a whole sub-topic in Git, documented in the gitrevisions manual, on ways to name specific commits. The reset command just needs the hash ID of an actual commit, or anything equivalent, and `HEAD~3` means go back three first-parent steps, which in this case gets us from commit `H` back to commit `E`.
The `--hard` part of the `git reset` is how we tell Git what to do with (a) its index and (b) our working tree files. We have three choices here:
- `--soft` tells Git: leave both alone. Git will move the branch name without touching the index or our working tree. If you run `git commit` now, whatever is (still) in the index is what goes into the new commit. If the index matches the snapshot in commit `H`, this gets you a new commit whose snapshot is `H`, but whose parent is `E`, as if commits `F` through `H` had all been collapsed into a single new commit. People usually call this squashing.
- `--mixed` tells Git: reset your index, but leave my working tree alone. Git will move the branch name, then replace every file that is in the index with the one from the newly selected commit. But Git will leave all your working tree files alone. This means that as far as Git is concerned, you can start `git add`ing files to make a new commit. Your new commit won't match `H` unless you `git add` everything, so this means you could, for instance, build a new intermediate commit, sort of like `E+F` or something, if you wanted.
- `--hard` tells Git: reset your index and my working tree. Git will move the branch name, replace all the files in its index, and replace all the files in your working tree, all as one big thing. It's now as if you never made those three commits at all. You no longer have the files from `F`, or `G`, or `H`: you have the files from commit `E`.
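The difference between the three modes shows up in `git status` right after the reset. Below is a minimal sketch, assuming a POSIX shell; the helper `make_repo` is hypothetical and simply builds a fresh repository whose commits `E`, `F`, `G`, `H` each add one file named after the commit.

```shell
#!/bin/sh
set -eu

# Hypothetical helper: fresh repository with commits E, F, G, H,
# each adding one file (E.txt, F.txt, ...). Leaves the shell inside the repo.
make_repo() {
    dir=$(mktemp -d)
    cd "$dir"
    git init -q
    git config user.name "Example User"
    git config user.email "user@example.com"
    for name in E F G H; do
        echo "$name" > "$name.txt"
        git add "$name.txt"
        git commit -q -m "$name"
    done
}

# --soft: only the branch name moves; index and working tree still hold H's files.
make_repo
git reset -q --soft HEAD~3
soft_tip=$(git log -1 --format=%s)       # "E"
soft_status=$(git status --porcelain)    # F, G, H are staged as additions ("A ")

# --mixed: branch name and index move; working tree files are left alone.
make_repo
git reset -q --mixed HEAD~3
mixed_status=$(git status --porcelain)   # F, G, H are now untracked ("??")

# --hard: branch name, index, and working tree all go back to E.
make_repo
git reset -q --hard HEAD~3
hard_status=$(git status --porcelain)    # empty
hard_files=$(ls)                         # only E.txt is left

printf '%s\n' "--soft:" "$soft_status" "--mixed:" "$mixed_status" "--hard:" "$hard_status"
```

Running `git commit` right after the `--soft` reset would produce the squashed commit described above, since the index still matches `H`.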
Note that if you leave out the `commit` part of this kind of (hard/soft/mixed) `reset`, Git will use `HEAD`. Since `HEAD` names the current commit (as selected by the current branch name), this leaves the branch name itself unchanged: it still selects the same commit as before. So this is only useful with `--mixed` or `--hard`, because `git reset --soft`, with no commit hash ID, means don't move the branch name, don't change Git's index, and don't touch my working tree. Those are the three things this kind of `git reset` can do (move the branch name, change what's in Git's index, and change what's in your working tree), and you just ruled all three out. Git is OK with doing nothing, but why bother?

    git reset [ tree-ish ] -- path
This is the other kind of `git reset` we'll care about here. It's a bit like a mixed reset, in that it means clobber some of the index copies of files, but here you specify which files to clobber. It's also a bit unlike a mixed reset, because this kind of `git reset` will never move the branch name.
Instead, you pick which files you want copied from somewhere. The somewhere is the `tree-ish` you give; if you don't give one, the somewhere is `HEAD`, i.e., the current commit. This can only restore files in the proposed next commit to the form they have in some existing commit. By defaulting to the current existing commit, this kind of `git reset -- path` has the effect of undoing a `git add -- path`.⁶
There are several other forms of `git reset`. To see what they all mean, consult the documentation.
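As a quick check that `git reset -- path` undoes a `git add -- path`, and that `git restore --staged` has the same effect, here is a small sketch. It assumes a POSIX shell and Git 2.23 or later; the file name and contents are illustrative.

```shell
#!/bin/sh
set -eu

# Throwaway repository; README.md and its contents are illustrative only.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"

echo "first draft" > README.md
git add -- README.md
git commit -q -m "initial commit"
tip_before=$(git rev-parse HEAD)

# Stage an edit, then undo the staging with the path form of git reset.
echo "second draft, a bit longer" > README.md
git add -- README.md
after_add=$(git status --porcelain)        # "M  README.md": staged
git reset -q -- README.md
after_reset=$(git status --porcelain)      # " M README.md": edit kept, not staged

# The same round trip using git restore --staged.
git add -- README.md
git restore --staged -- README.md
after_restore=$(git status --porcelain)    # " M README.md" again

tip_after=$(git rev-parse HEAD)            # unchanged: the branch never moved
content=$(cat README.md)                   # the edit is still in the working tree
printf '%s\n' "$after_add" "$after_reset" "$after_restore" "$content"
```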
`git restore`: this got split off from `git checkout`.
Basically, this does the same thing as the various forms of `git checkout` and `git reset` that clobber files (in your working tree and/or in Git's index). It's smarter than the old `git checkout`-and-clobber-my-work variant, in that you get to choose where the files come from and where they go, all in the one command line.
To do what you used to do with `git checkout -- file`, which copies from Git's index to your working tree, you just run `git restore --worktree -- file`, or even just `git restore -- file` since `--worktree` is the default here. (You can leave out the `--` part, as with `git checkout`, in most cases, but it's just generally wise to get in the habit of using it. Like `git add`, this command is designed such that only files named `-whatever` are actually problematic.)
To do what you used to do with `git reset -- file`, which copies from `HEAD` to Git's index, you just run `git restore --staged -- file`. To do what you used to do with `git checkout HEAD -- file`, which overwrites both copies, you run `git restore --source HEAD --staged --worktree -- file`.
Note that you can copy a file from some existing commit, to Git's index, without touching your working tree copy of that file: `git restore --source commit --staged -- file` does that. You can't do that at all with the old `git checkout` but you can do that with the old `git reset`, as `git reset commit -- file`. This overlap exists because `git restore` is new, and this kind of restore makes sense; probably, ideally, we should always use `git restore` here, instead of using the old `git reset` way of doing things, but Git tries to maintain backwards compatibility.
`git switch`: this just does the "safe half" of `git checkout`. That's really all you need to know. Using `git switch`, without `--force`, Git won't overwrite your unsaved work, even if you make a typo or whatever. The old `git checkout` command could overwrite unsaved work: if your typo turns a branch name into a file name, for instance, well, oops.
`git revert` (I added this for completeness): this makes a new commit. The point of the new commit is to back out what someone did in some existing commit. You therefore need to name the existing commit that revert should back out. This command probably should have been named `git backout`.
If you back out the most recent commit, this does revert to the second-most-recent snapshot:

    ...--G--H   <-- branch

becomes:

    ...--G--H--Ħ   <-- branch

where commit `Ħ` (H-bar) "undoes" commit `H` and therefore leaves us with the same files as commit `G`. But we don't have to undo the most recent commit. We could take:

    ...--E--F--G--H   <-- branch

and add a commit `Ǝ` that undoes `E` to get:

    ...--E--F--G--H--Ǝ   <-- branch

which may not match the source snapshot of any previous commit!
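Both cases can be reproduced in a scratch repository. This is a sketch under assumptions: a POSIX shell, and commits `D` through `H` that each add one file named after the commit, so that the reverts cannot conflict. It compares tree (snapshot) hash IDs to see which snapshots match.

```shell
#!/bin/sh
set -eu

# Throwaway repository: commits D, E, F, G, H each add one file (illustrative).
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"
for name in D E F G H; do
    echo "$name" > "$name.txt"
    git add "$name.txt"
    git commit -q -m "$name"
done

e_commit=$(git rev-parse HEAD~3)              # commit E
g_tree=$(git rev-parse 'HEAD~1^{tree}')       # snapshot of commit G

# Back out the most recent commit (H): the new snapshot equals G's.
git revert --no-edit HEAD >/dev/null
hbar_tree=$(git rev-parse 'HEAD^{tree}')

# Back out the older commit E: a new commit that only removes E.txt.
git revert --no-edit "$e_commit" >/dev/null
ebar_tree=$(git rev-parse 'HEAD^{tree}')
files=$(git ls-files)                         # D.txt, F.txt, G.txt

# Count earlier commits whose snapshot matches the newest one.
matches=0
for c in $(git rev-list HEAD~1); do
    if [ "$(git rev-parse "$c^{tree}")" = "$ebar_tree" ]; then
        matches=$((matches + 1))
    fi
done
echo "commits: $(git rev-list --count HEAD), matching snapshots: $matches"
```

Neither revert removes anything from history: the repository ends up with seven commits, two more than it started with.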
¹ Git is, slowly, growing a facility to "partly get" a commit so that you can deal with huge repositories with huge commits without having to wait for the entire commit all at once, for instance. Right now that's not something ordinary users will ever see, and when it does come to regular users, it's meant as an add-on to the basic "all or nothing" mode of a commit. It will turn this from "you either have a commit, or not" to "you have a commit (either all of it, or part of it with the promise to deliver the rest soon) or not; if you have part of a commit, you can work with the part, but that's all".
² Even then, a "removed" commit is not gone yet: you can get it back. This answer won't cover how to do that, though. Also, `git commit --amend` is a special case, which we will mention, but not really cover properly here.
³ To remove the file from both your working tree and Git's index, you can use `git rm`. If you remove the file from your working tree, then run `git add` on that file name, Git will "add" the removal, so that works too.
⁴ If you use `git commit -a`, Git will, at that time, run `git add` on all the files it already tracks. This is done in a tricky way that can break some poorly-written pre-commit hooks. I recommend learning the two step process, in part because of those poorly-written hooks (though I'd try to avoid or fix them if possible) and in part just because if you try to avoid learning about Git's index like the authors of those poorly-written hooks did, Git is going to give you more trouble later.
⁵ The reason this is a tree-ish and not a commit-ish is that you can use anything that specifies some existing internal Git tree object. Each commit has a saved snapshot, though, that is suitable for here, and is what you'd normally put here.
⁶ As with all these other Git commands, you can use the `--` between the `add` command and the paths to add. It's actually a good habit to get into, as this means that you can add a path named `-u`, if you have such a path: `git add -- -u` means add the file named `-u` but `git add -u` doesn't mean that at all. Of course, files whose names match option sequences are less common and less surprising than files whose names match branch names: it's really easy to have a `dev` branch and a set of files named `dev/whatever`. Since file paths will match using directories, for add, checkout, reset, and restore, these can get mixed up. The `add` command doesn't take a branch name though, so it's safer in that respect.