Projects within projects using Git

I've used git to stitch together my own GitHub-hosted project and an external UI library that I wanted to use. The library is hosted in a Subversion repository on SourceForge.

I used git-submodule and git-svn and it worked reasonably well. The downsides were:

  1. To keep up to date with the library repository, I had to make a new commit each time to update the submodule's git hash "pointer". This is because git submodules, unlike svn:externals, are pinned to a particular commit id. This may not be a downside if you actually want to pin a stable version, but I was working with code that was WIP.

  2. The initial pull of a git repo with submodules requires an additional step: "git submodule init" followed by "git submodule update". This is not an issue for you, but others using your code will have to remember, or be told, to perform this step before compiling, running or testing it.

  3. If you use the command line, it is easy to screw up your repository with git-add. You type git add subm<tab> expecting it to complete to git add submodule, but it auto-completes to git add submodule/ (note the trailing slash). If you execute the command with the trailing slash, it blitzes the submodule and adds all its contained files instead. This can be mitigated by using git-gui or git add ., or by training yourself to delete the slash (it happened to me enough times that I trained myself to remove it).

  4. Submodule commits can mess up git rebase -i. I forget the exact details, but it is especially bad if you have a "dirty" submodule and you run an interactive rebase. Normally you can't rebase with a dirty tree, but submodules are not checked. Having several submodule commits in a rebase group also causes problems: the last submodule hash gets committed to the first pick on your list, and this is pretty tricky to fix later. This can be worked around with a more careful workflow (i.e. carefully deciding when to do your submodule commits...) but can be a PITA.
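To make the first two points concrete, here is a minimal sketch that builds throwaway repositories in a temporary directory. The names library, main, fresh and subproject are made up for the demo, and protocol.file.allow is only set because the demo clones from local paths (newer git versions refuse that for submodules by default). Treat it as an illustration, not as the exact setup I used.

```shell
#!/bin/sh
# Sketch: a submodule is pinned to one commit, and a fresh clone needs an extra step.
# All repository names here (library, main, fresh, subproject) are hypothetical.
set -e
work=$(mktemp -d)
cd "$work"

# Pass identity and protocol settings per command so global config stays untouched.
g() { git -c user.name=demo -c user.email=demo@example.com -c protocol.file.allow=always "$@"; }

# A stand-in for the git mirror of the external library.
g init -q library
echo "v1" > library/lib.txt
g -C library add lib.txt
g -C library commit -q -m "library v1"

# The main project records the library as a submodule.
g init -q main
g -C main submodule --quiet add "$work/library" subproject
g -C main commit -q -m "add subproject"

# A plain clone leaves the submodule directory empty...
g clone -q "$work/main" fresh
before=$(ls -A fresh/subproject)

# ...until the submodule is initialised and checked out.
g -C fresh submodule --quiet update --init
after=$(cat fresh/subproject/lib.txt)

# The library moves on, but the main project still points at the old commit.
echo "v2" > library/lib.txt
g -C library commit -q -am "library v2"
pinned=$(g -C main rev-parse HEAD:subproject)
latest=$(g -C library rev-parse HEAD)
echo "main is pinned to $pinned, library is now at $latest"
```

The two hashes printed at the end differ until someone commits an updated submodule pointer in main, which is the extra commit described in point 1.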

The steps to set this up were something along the lines of:

  1. Run git svn clone https://project.svn.sourceforge.net/svnroot/project/project/trunk
  2. Push that as a "real" git project to e.g. github
  3. Now in your own git repository, run git submodule init
  4. git submodule add git://github.com/project subproject
  5. Push that out too, to your own repo this time.

That is it, more or less. You will have a new directory "subproject", which in your case would be the geomapping library.

Each time you need to update the geomapping code, you would run something like:

cd subproject
git svn rebase
git push  # this updates the git mirror of the subproject
cd ..
git add subproject # careful with the trailing slash!
git commit -m "update subproject"
git push # this pushes the commit that updates the subproject

I've not seen too many tutorials on a git submodule workflow, so I hope this helps you decide.


I haven't found submodules to be particularly useful on the (small) projects I've worked on. Once you've set them up, working on the whole project requires adding additional parameters to almost every command and the syntax isn't completely regular. I imagine if I worked on larger projects with more submodules, I'd see it as a more beneficial tradeoff.

There are two possibilities that keep the sub-projects as independent git repos that you pull from into your main (integration) repo:

  • Use a subtree merge to bring your external projects into separate subdirectories of the main repo that holds your core files. This makes it easy to update the main project from the external projects, but complicated to send changes back to the external projects. I think of this as a good way to include project dependencies, but it wouldn't work so well with shared files.

  • Set up each project as a remote branch in your main repo and merge from each of them into your master (integration) branch that also contains your core files. This requires some discipline: if you make any changes to the external projects in your main repo, they must be made in the branch and then merged into the master; and you never want to merge into the project branches. This makes it easy to send changes back to the external projects and is a perfectly acceptable use of branches in Git.

    Your shared scripts can be handled as another independent branch in your main directory which your external partners can pull from and push to as a remote branch.
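For the subtree merge option, a rough sketch follows, again using throwaway repositories in a temporary directory. The names library, main, libremote and the extlib/ prefix are assumptions made for the demo, and this is only one common way to do a subtree merge (read-tree with a prefix, then later merges with the subtree option), not necessarily the best fit for your layout.

```shell
#!/bin/sh
# Sketch of a subtree merge. Names (library, main, libremote, extlib) are hypothetical.
set -e
work=$(mktemp -d)
cd "$work"

# Pass a commit identity per command so global config stays untouched.
g() { git -c user.name=demo -c user.email=demo@example.com "$@"; }

# Stand-in for the external project.
g init -q library
echo "v1" > library/lib.txt
g -C library add lib.txt
g -C library commit -q -m "library v1"
lib_branch=$(g -C library symbolic-ref --short HEAD)

# The main (integration) repo with its own core files.
g init -q main
echo "core" > main/core.txt
g -C main add core.txt
g -C main commit -q -m "core files"

# Bring the library in under extlib/ while keeping its history.
g -C main remote add libremote "$work/library"
g -C main fetch -q libremote
g -C main merge -q -s ours --no-commit --allow-unrelated-histories "libremote/$lib_branch"
g -C main read-tree --prefix=extlib/ -u "libremote/$lib_branch"
g -C main commit -q -m "merge library as a subtree"
first=$(cat main/extlib/lib.txt)

# Later: pull upstream changes into the same subdirectory.
echo "v2" > library/lib.txt
g -C library commit -q -am "library v2"
g -C main fetch -q libremote
g -C main merge -q -X subtree=extlib -m "update library" "libremote/$lib_branch"
merged=$(cat main/extlib/lib.txt)
```

Updating from the external project is a single merge, which is the easy direction. Sending changes made under extlib/ back upstream is the part that takes more work.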

If you try to run SVN & Git in the same directory, you make it very hard to use branching in either system, because SVN does branching by copying file directories while Git tracks pointers. Neither system would automatically see branches you make in the other. I think that 'solution' is more trouble than it is worth.