Is it possible to remotely count the objects and size of a git repository?
I think there are a couple of problems with this question: git count-objects
doesn't truly represent the size of a repository (even git count-objects -v
doesn't, really); if you're using anything other than the dumb HTTP transport, a
new pack will be created for your clone when you make it; and (as VonC pointed
out) anything you do to analyze a remote repo won't take into account the
working copy size.
That being said, if the remote uses the dumb HTTP transport (GitHub, for example, does not), you could write a shell script that uses curl to query the sizes of all the objects and packs. That might get you closer, but it means making extra HTTP requests that you'll just have to make again to actually do the clone.
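A sketch of what such a script could look like, assuming the dumb protocol's layout: objects/info/packs lists lines of the form "P pack-<sha>.pack", and a HEAD request on each pack returns a Content-Length header. Canned data stands in for the two curl calls (shown in comments) so the parsing logic is visible without hitting a server:

```shell
# Canned stand-ins for the network calls (pack names are hypothetical):
#   packs=$(curl -s "$url/objects/info/packs")
#   headers=$(curl -sI "$url/objects/pack/$name")
packs='P pack-aaaa.pack
P pack-bbbb.pack'
total=0
for name in $(printf '%s\n' "$packs" | awk '/^P/ {print $2}'); do
    headers='Content-Length: 1024'      # canned per-pack HEAD response
    len=$(printf '%s\n' "$headers" |
          awk 'tolower($1) == "content-length:" {print $2}')
    total=$(( total + len ))
done
echo "$total bytes in packs"
```

A real script would also have to walk loose objects under objects/xx/, which is where the request count really balloons.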
It is possible to figure out what git-fetch
would send across the wire (to a
smart HTTP transport), send it, and analyze the results, but it's not really
a nice thing to do. Essentially you're asking the target server to pack up
results that you're just going to download and throw away, so that you can
download them again to keep them.
Something like these steps can be used to this effect:
url=https://github.com/gitster/git.git
git ls-remote "$url" |
  grep '[[:space:]]\(HEAD\|refs/heads/master\|refs/tags\)' |
  grep -v '\^{}$' | awk '{print "0032want " $1}' > binarydata
echo 00000009done >> binarydata
curl -s -X POST --data-binary @binarydata \
  -H "Content-Type: application/x-git-upload-pack-request" \
  -H "Accept-Encoding: deflate, gzip" \
  -H "Accept: application/x-git-upload-pack-result" \
  -A "git/1.7.9" "$url/git-upload-pack" | wc -c
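The 0032 and 00000009 magic numbers are pkt-line length prefixes from the smart protocol: four hex digits giving the total length of the line, counting the prefix itself, with 0000 acting as a flush packet. A quick sketch of where 0032 comes from (the SHA-1 here is a placeholder):

```shell
sha=0123456789abcdef0123456789abcdef01234567   # placeholder 40-char SHA-1
line="want $sha"
# total = 4 (the prefix itself) + length of payload + 1 (trailing newline)
printf '%04x%s\n' $(( ${#line} + 5 )) "$line"
# "want " (5) + SHA-1 (40) + prefix (4) + newline (1) = 50 = 0x32
```

Similarly, 00000009done is a 0000 flush packet followed by a 9-byte "done" line (4 prefix + 4 payload + 1 newline).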
At the end of all of this, the remote server will have packed up master/HEAD and all the tags for you and you will have downloaded the entire pack file just to see how big it will be when you download it during your clone.
When you finally do a clone, the working copy will be created as well, so the entire directory will be larger than these commands report, but the pack file is generally the largest part of a clone of any repository with significant history.
[ update 21 Sep 2021 ]
It seems that the URL now redirects, so we need to add -L
to curl to follow the redirection.
curl -sL https://api.github.com/repos/Marijnh/CodeMirror | grep size
[ Old answer ]
For GitHub repositories, the API now offers a way to check the repository size. It works!
This link, see-the-size-of-a-github-repo-before-cloning-it, gave the answer.
Command (answer from @VMTrooper; $2 and $3 are the user and repo names):
curl https://api.github.com/repos/$2/$3 | grep size
Example:
curl https://api.github.com/repos/Marijnh/CodeMirror | grep size
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
100 5005 100 5005 0 0 2656 0 0:00:01 0:00:01 --:--:-- 2779
"size": 28589,
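One thing the grep output hides: the GitHub API reports the size field in kilobytes. A small POSIX-only sketch extracting the field and converting it (the JSON literal below is an illustrative sample, not a live response):

```shell
# For a live value:
#   json=$(curl -sL https://api.github.com/repos/Marijnh/CodeMirror)
json='{"name":"CodeMirror","size":28589}'   # illustrative sample payload
kb=$(printf '%s' "$json" | sed -n 's/.*"size": *\([0-9][0-9]*\).*/\1/p')
echo "$kb KB (~$(( kb / 1024 )) MiB)"
```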
This doesn't give the object count, but if you use the Google Chrome browser and install this extension, it adds the repo size to the repository's home page.
One little kludge you could use would be the following:
mkdir repo-name
cd repo-name
git init
git remote add origin <URL of remote>
git fetch origin
The git fetch step displays feedback along these lines:
remote: Counting objects: 95815, done.
remote: Compressing objects: 100% (25006/25006), done.
remote: Total 95815 (delta 69568), reused 95445 (delta 69317)
Receiving objects: 100% (95815/95815), 18.48 MiB | 16.84 MiB/s, done.
...
The steps on the remote end generally happen pretty fast; it's the receiving step that can be time-consuming. It doesn't actually show the total size, but you can certainly watch it for a second, and if you see "1% ... 23.75 GiB" you know you're in trouble, and you can cancel it.
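To avoid leaving a half-fetched directory behind when you cancel, the steps above can be wrapped in a temp directory; peek_repo_size is a hypothetical helper name, and you can still interrupt the fetch once you've seen enough of the progress output:

```shell
# Fetch into a throwaway repo just to watch the "Counting/Receiving
# objects" progress, then clean up regardless of the outcome.
peek_repo_size() {
    url=$1
    tmp=$(mktemp -d) || return 1
    git init -q "$tmp" &&
        git -C "$tmp" remote add origin "$url" &&
        git -C "$tmp" fetch origin      # progress is printed to stderr
    status=$?
    rm -rf "$tmp"
    return "$status"
}

# Usage: peek_repo_size https://github.com/gitster/git.git
```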