[TeX/LaTeX] What are good working practices for VCS with LaTeX documents?

best practices, revision control, workflow

I am interested in using a VCS to track changes as I write short LaTeX documents in my daily research. Essentially, I use LaTeX as a whiteboard for developing ideas, and then I have documents to present at weekly meetings with my research advisor.

I have some experience with the basic DVCS operations, but I'm having trouble determining good working practices for my setting. Most of the online sources that I've read refer exclusively to longer-lived programming projects. Does anyone have suggestions of good working practices in my use case?

Some specific questions:

  • How often should I commit? When I'm developing an idea, it's not always clear where the various units of work are. For example, unlike in programming, it's not easy to divide the work into API vs. UI or to commit as functions are changed.

  • Should I use something like versioned Mercurial patch queues to track the evolution of an idea, and use standard commits when it's clear that an idea is here to stay?

  • I often have several files that reside in a single repository, but are more or less independent. For example, these may be two different approaches to the same problem. How should I handle this? Branches? Multiple patch queues? Subrepositories? Different repositories altogether?

Best Answer

First off, you may not want to use LaTeX directly if you want a whiteboard for developing ideas. You may be better off drafting your paper in Org-mode. You can embed LaTeX in Org-mode, and you will probably work more effectively than you would in LaTeX directly. The details of using Org-mode for drafting are not hard to learn, especially if you already use Emacs. Since Org-mode stores files as plain text, they can be tracked with version control.

I use git to track some of the documents I work on, mostly as a safety net. Knowing that every change I make can be recovered is great, because I can make big changes to a document without being afraid of losing anything.

When you use version control privately, i.e. when you are the only one who commits, you can think of it as a system for naming groups of changes to your documents. To use it effectively, make changes of one type, or to one part of the document, and describe them accordingly in the commit message. For example, commit after making a couple of changes to your introduction, and make a separate commit for changes to your conclusion, instead of one commit for everything. However, you should not be too pedantic about version control; it should not take over your primary work of writing the text. It is more important that you work effectively and write a good text than that you make the most appropriate commits.
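As a rough sketch of the "one commit per kind of change" habit described above: the script below works in a throwaway repository and records an edit to the introduction and an edit to the conclusion as two separate commits. The file names (intro.tex, conclusion.tex), the commit messages, and the author identity are placeholders made up for illustration.

```shell
# Throwaway repository, so the sketch is safe to run anywhere.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example Author"        # placeholder identity
git config user.email "author@example.com"

# Hypothetical layout: one file per section of the document.
printf 'Intro, first draft.\n' > intro.tex
printf 'Conclusion, first draft.\n' > conclusion.tex
git add intro.tex conclusion.tex
git commit -q -m "First draft of introduction and conclusion"

# Both sections get edited in the same writing session...
printf 'Intro, with clearer motivation.\n' > intro.tex
printf 'Conclusion, with open problems.\n' > conclusion.tex

# ...but each edit is staged and committed on its own.
git add intro.tex
git commit -q -m "Introduction: clarify motivation"

git add conclusion.tex
git commit -q -m "Conclusion: add open problems"

# One line per commit, newest first.
git --no-pager log --oneline
```

If unrelated changes end up in the same file, `git add -p` lets you stage them piece by piece, but as noted above this is only worth doing when it does not get in the way of writing.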

My typical workflow is to create a repository only once I have a document with some substantial content. Then I branch off master by creating a dev branch. I commit all new changes to the dev branch, and when I have read through the whole document or reached a stage at which it is more stable, e.g. when I turn it in to my supervisor, I merge the changes from dev into master. For any new changes, I branch into a dev branch again. I also use git's tag feature to mark particular stages, e.g. a version I sent to someone, which makes those stages easier to find later.
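The master/dev cycle described above might look roughly like this on the command line. This is only a sketch in a throwaway repository: the file name notes.tex, the tag name sent-to-supervisor, the commit messages, and the author identity are all assumptions made for the example.

```shell
# Throwaway repository, so the sketch is safe to run anywhere.
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master      # call the first branch master whatever the local default is
git config user.name "Example Author"        # placeholder identity
git config user.email "author@example.com"

# The repository starts once the document has some substance.
printf 'Draft with substantial content.\n' > notes.tex
git add notes.tex
git commit -q -m "Initial import of notes"

# Day-to-day changes are committed on dev.
git checkout -q -b dev
printf 'Second approach to the lemma.\n' >> notes.tex
git commit -q -a -m "Sketch second approach to the lemma"

# The draft is stable (say, ready for the supervisor): merge into master and tag it.
git checkout -q master
git merge -q dev
git tag -a sent-to-supervisor -m "Version handed to supervisor"

# Continue on dev. After this fast-forward merge dev and master point at the
# same commit, so this is equivalent to branching off master again.
git checkout -q dev

# Show the tagged stage in the history.
git --no-pager log --oneline --decorate
```

Later, `git checkout sent-to-supervisor` (or `git diff sent-to-supervisor`) gets you back to exactly the version that was handed in.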

To view changes I use several tools. I use giggle, tig, and git log to browse the history of a repository. To compare versions I mostly use meld.
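For the command-line side of this, a minimal sketch of browsing history and comparing a tagged version with the current one is shown below. It again uses a throwaway repository; notes.tex, the tag name sent-draft, and the author identity are made-up placeholders. The meld invocation is left as a comment because it opens a GUI and needs meld to be installed.

```shell
# Throwaway repository, so the sketch is safe to run anywhere.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example Author"        # placeholder identity
git config user.email "author@example.com"

printf 'Version sent out.\n' > notes.tex
git add notes.tex
git commit -q -m "Draft sent to a colleague"
git tag sent-draft                           # hypothetical tag marking that stage

printf 'Revised after feedback.\n' >> notes.tex
git commit -q -a -m "Revise after feedback"

# Compact history with branch and tag names shown next to the commits.
git --no-pager log --oneline --graph --decorate

# Textual comparison between the tagged version and the current one.
git --no-pager diff sent-draft HEAD -- notes.tex

# With meld installed, the same comparison can be opened in its GUI:
#   git difftool --tool=meld sent-draft HEAD -- notes.tex
```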

Use one type of project per repository. Grouping things that do not belong together is confusing. Branching is good for making different versions of a document, e.g. to split a project into an article and a presentation.

Finally, here is what inspired me to start using version control: http://www.charlietanksley.net/philtex/using-a-version-control-system/