LaTeX + Git: Bibliography

Sure. It's almost what I do.

In answer to sub Q1: this is an excellent idea (even if I do say so myself).

In answer to sub Q2: ...

Put the .bib in your local texmf tree

What you would like is a local texmf tree, which is in one git repo. A local texmf tree is for putting various package-like artifacts that are not proper packages managed through your package manager (e.g. MiKTeX on Windows).

The local texmf tree means that you never have to use absolute, or even relative, paths for your bibliography. It's also very useful because many conferences and journals distribute their own templates and styles not through CTAN, but just as .sty and .bst files.

How to set up a local texmf tree is beyond the scope of this answer, but it isn't hard. See the following questions:

  • Create a local texmf tree in MiKTeX (Windows)
  • Create a local texmf tree in Ubuntu (TexLive)

Once you have created that tree, add it to git. Put your master.bib file in localtexmf\bibtex\bib\master_bibliography\master.bib. (I also keep my JabRef XML settings in this directory, and a .tex file to make an annotated bibliography.)

Then when you go to work in a project you just add \bibliography{master} and it will find it. That is better than either an absolute or a relative path.

Then whenever you work on a new computer, check out the repo from git and tell your TeX distribution about the local texmf tree.
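As a rough sketch of the steps above (not my actual script): the function below creates the bibtex/bib/master_bibliography folder inside whatever tree root you pass in, drops in an empty master.bib if there is none yet, and puts the tree under git. The helper names and the example path are made up for illustration, and paths use forward slashes as on Linux. Registering the tree with your distribution is still done as described in the linked questions.

```shell
#!/bin/sh
# Sketch only: init_local_texmf and check_master_bib are hypothetical helper names.

# Create the bibliography folder inside the local texmf tree at $1
# and put the tree under git.
init_local_texmf() {
    root="$1"
    bibdir="$root/bibtex/bib/master_bibliography"
    mkdir -p "$bibdir" || return 1
    # Start with an empty master.bib unless one is already there.
    [ -f "$bibdir/master.bib" ] || : > "$bibdir/master.bib"
    # Re-running git init on an existing repo is harmless.
    git -C "$root" init -q
}

# Ask the TeX distribution where it finds master.bib, if kpsewhich is installed.
check_master_bib() {
    if command -v kpsewhich >/dev/null 2>&1; then
        kpsewhich master.bib
    else
        echo "kpsewhich not found; cannot check the lookup" >&2
        return 1
    fi
}

# Example (hypothetical path):
#   init_local_texmf "$HOME/localtexmf"
#   check_master_bib
```

If check_master_bib prints a path inside your tree, the distribution can see the file and \bibliography{master} should resolve without any path.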


Put the local texmf tree in Git

Now, where to put it?

One repo for all work, including the texmf tree

I just have one git repo for all my PhD work, with a bunch of folders. It's not a great system, but it is simple. So my texmf tree is in phd\Resources\tools\localtexmf\, whereas my papers are in phd\documents_prepared\Journal Papers\PaperName. My experimental prototypes are in phd\prototypes\ProjectName. Often I'll have the prototype dump out its data as a CSV into a data folder within the paper folder, so I can then produce plots with PGFPlots. (It's taken a while for my TeX-fu to reach that level, but now that it has, it is good.)

That's easy and simple, and since it is all in one repo, you don't need to worry about committing multiple times.

Keep a git repo for your current work, and a separate one for the texmf tree

This is not too complex. There is no need to nest one in the other, as you are not using relative paths or anything (There is also no need not to; see next section). So that is fine. You just need to remember to commit and pull both. You could script that.
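Scripting that could look something like the following. This is only a sketch: git_each is a made-up helper, and the two paths in REPOS are placeholders for wherever your work repo and your texmf repo actually live.

```shell
#!/bin/sh
# Sketch only: the two paths are hypothetical and must not contain spaces.
REPOS="${REPOS:-$HOME/phd-work $HOME/localtexmf}"

# Run the same git subcommand in every repo listed in REPOS,
# stopping at the first one that fails.
git_each() {
    for repo in $REPOS; do
        echo "== $repo"
        git -C "$repo" "$@" || return 1
    done
}

# Examples:
#   git_each pull --ff-only     # update both before starting work
#   git_each status --short     # see what still needs committing
```

Committing is left out on purpose, since the two repos will usually want different commit messages.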

Use Git submodules to nest them

You can have git repos inside git repos. This is a submodule (see the documentation). I don't particularly see the advantage, except that it does mean someone can just check out your paper repo and have it pull the texmf tree repo as a submodule.

The disadvantage is that you now have one texmf tree per paper repo locally, though still only one in git. Keeping them all synced when you are working on multiple projects seems unfun.

Git submodules don't work the way you might think they do, and for this reason I don't like them. They might not be the least intuitive thing in git, but that is a very competitive playing field (:-P). This post goes into some details. I suggest doing a bit of research before doing this.


P.S. You can use the BibTeX annote field to store annotations. If you are using a tool like JabRef, which helps you search, index and generally GUI your .bib file, you may need to enable annote on all BibTeX entry types.


I use the approach detailed in http://andrius.velykis.lt/2012/06/master-bibtex-file-git-submodules/ for my own work.

So I have one repository "bib" that contains my BibTeX files. For this repository, I also have a private remote at BitBucket for convenience and portability.

In each of my papers and my thesis, I use git submodule add <bitbucket-URL> biblio to make a submodule directory within my document repository. In your document you can then just specify to use the BIB files in that directory.
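Spelled out a little more, that step might look like the sketch below. The helper name is made up, and <bitbucket-URL> and <document-URL> are placeholders you have to fill in yourself.

```shell
#!/bin/sh
# Sketch only: add_biblio_submodule is a hypothetical helper name.
# Run it from the top level of the document's git repository.

# Add the bibliography repo at URL $1 as the submodule directory "biblio"
# and record that in the document repo. Anything else already staged
# ends up in the same commit.
add_biblio_submodule() {
    url="$1"
    git submodule add "$url" biblio &&
        git commit -m "Add bibliography submodule in biblio/"
}

# Example, with the placeholder from the text:
#   add_biblio_submodule <bitbucket-URL>
#
# A later clone of the document repo gets the submodule too with:
#   git clone --recurse-submodules <document-URL>
# or, inside a clone made without that option:
#   git submodule update --init
```

In the document itself you would then point at a file inside that directory, for example \bibliography{biblio/references} if the file were called references.bib (that file name is just an assumption).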

Typically, I try to only edit my main BIB repository (that's the only one that is loaded within JabRef for me), push those changes to bitbucket and then pull in the changes from within each document repository. However, in some cases I just want to change something for one paper, e.g. how the authors are formatted or whether URLs are shown in the bibliography. The proper way is to tinker with BST files and the like, but in the heat of the moment it's often a lot faster to just tweak the "biblio" module of a paper.
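The "pull in the changes from within each document repository" part could be sketched like this. The helper name is made up, and it assumes the biblio submodule has a branch checked out that tracks the remote. That is the case right after git submodule add, but not after a fresh git submodule update, which leaves the submodule on a detached HEAD.

```shell
#!/bin/sh
# Sketch only: update_biblio is a hypothetical helper name.
# Run it from the top level of the document repository.

update_biblio() {
    # Bring the submodule up to date; refuse anything but a fast-forward,
    # so local tweaks that diverged have to be merged by hand.
    git -C biblio pull --ff-only || return 1
    # Nothing new arrived: the document repo needs no commit.
    if git diff --quiet --ignore-submodules=dirty -- biblio; then
        echo "biblio already up to date"
        return 0
    fi
    # Record the new submodule commit in the document repo.
    git add biblio &&
        git commit -m "Update bibliography submodule"
}
```

The second commit is the part that is easy to forget: the document repo stores which commit of the bibliography it uses, so it only moves forward when you commit the updated submodule there as well.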

I think there are a few advantages to this general approach:

  • your bibliography is versioned, so you can easily track what exact version of a BIB file you used,
  • you can transfer changes from your main "bib" repository to each document, or the other way around,
  • you can make tweaks to the bibliography that are specific to one document and not push them back to the main repository,
  • everything can be done using relative paths, so, it is portable across computers, users, OSes, ...

But also some disadvantages:

  • When you add many references, you end up pushing and pulling quite often: push from the main one, pull in the document.
  • Git submodules make your document repository, and how to work with it, more complicated. So you should already feel comfortable with git before you use this workflow.

Remark about network access: While my approach normally relies on bitbucket, I also like to add another remote in the submodule. That extra remote points to the location of my main bibliography repository. This allows me to also push/pull to the main repository when I don't have network access. That extra remote might not migrate well across computers (due to different file paths), but since it's just there "in case of no network", I don't mind it too much.
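Adding such an extra remote is a one-liner. In this sketch the helper name, the remote name "local", the path and the branch name are all placeholders, not what I necessarily use.

```shell
#!/bin/sh
# Sketch only: the helper name and the remote name "local" are hypothetical.
# Run it from the top level of the document repository.

# Register the main bibliography repo at filesystem path $1 as an
# additional remote of the biblio submodule.
add_offline_remote() {
    git -C biblio remote add local "$1"
}

# Example (hypothetical path and branch name):
#   add_offline_remote "$HOME/bib"
#   git -C biblio pull local master
```

One thing to be aware of: by default git refuses a push into the branch that is currently checked out in a non-bare repository, so if the main repository is a normal working copy it may be easier to pull from the other side instead.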

Remark about submodules: In the git world there has been a long-standing discussion about whether submodules are a good thing or not. The alternative is to use git subtree. I don't use it, but I suppose the concept of this workflow can be adapted to use subtree instead of submodule.