[TeX/LaTeX] Configure git to work with LaTeX (i.e. use dots instead of lines as reference points)

git revision-control

Normally git is great, but if I use it for a written document (a report or a novel) and do a round of proofreading, it has to rewrite almost the entire document, because most paragraphs are likely to change, even though only 10% of the sentences actually get changed. Therefore I was thinking it would be great if one could make git reference sentences rather than lines. Is that possible?

edit: some more motivation:

Having git use newlines as reference works quite well with academic articles, because paragraphs tend to be relatively short (hence changes are still locally captured if git replaces a paragraph). For a more prosaic text, however, the correction (especially at a later stage) might be subtle but the paragraphs tend to be quite long. Therefore replacing an entire paragraph in git makes it difficult to spot the correction later on in the logs.

Regarding making a new line after each sentence, I think that destroys the optical structure of the text appearing in the LaTeX document. Especially if the editor has word wrap enabled.

Just have a look at the text above split into separate lines:

Having git use newlines as reference works quite well with academic articles,
because paragraphs tend to be relatively short 
(hence changes are still locally captured if git replaces a paragraph). 
For a more prosaic text, however, 
the correction (especially at a later stage) might be subtle but the paragraphs tend to be quite long.
Therefore replacing an entire paragraph in git makes it difficult to spot the correction later on in the logs.

Regarding making a new line after each sentence, 
I think that destroys the optical structure of the text appearing in the latex document.
Especially if the editor has wordwrap enabled.

Best Answer

This isn't an answer so much as a strong and popular recommendation.

Reconfiguring git might be the wrong way to go about it; the thing to change here would be how you write. Remember that a great many languages do not concern themselves with line breaks at all! (LISP, C/C++, Java; the list goes on and on.) Which is easier to read?

#include <stdio.h>
int main(int count, char** values) { for (int i = 0; i < count; i++) { puts(values[i]); } return 0;}

versus

#include <stdio.h>
int main(int count, char** values) {
    /* print every command-line argument on its own line */
    for (int i = 0; i < count; i++) {
        puts(values[i]);
    }
    return 0;
}

To C, it doesn't much matter how many lines the source is spread across; line breaks are something we add for the VCS and for readability.

TeX (as it is normally used) is nearly one of those languages, with only a few deviations for paragraph marking and grouping. So, as far as TeX is concerned, you could write your paragraphs like

Lorem ipsum dolor sit amet, consectetur adipiscing elit. Duis tempor commodo vestibulum. Mauris auctor leo sit amet enim posuere, vitae pretium mauris pharetra. Etiam mattis ante a dolor facilisis iaculis. Morbi placerat adipiscing pharetra. Suspendisse at posuere tortor, quis ornare nisi. Vivamus porttitor risus quis justo convallis dictum. Nulla id nisl ultrices, imperdiet enim eu, hendrerit nulla. In convallis, tortor id semper commodo, neque elit bibendum tortor, et interdum tortor libero a tortor. Aenean eget sodales lorem, at condimentum neque.

or like

Lorem ipsum dolor sit amet,
  consectetur adipiscing elit.
Duis tempor commodo vestibulum.
Mauris auctor leo sit amet enim posuere,
  vitae pretium mauris pharetra.
Etiam mattis ante a dolor facilisis iaculis.
Morbi placerat adipiscing pharetra.
Suspendisse at posuere tortor,
  quis ornare nisi.
Vivamus porttitor risus quis justo convallis dictum.
Nulla id nisl ultrices,
  imperdiet enim eu,
  hendrerit nulla.
In convallis,
  tortor id semper commodo,
  neque elit bibendum tortor,
  et interdum tortor libero a tortor.
Aenean eget sodales lorem,
  at condimentum neque.

and TeX would not care. Thus, by keeping each phrase (or sentence, as @Torbjørn suggests) on its own line, you are able to localize changes the way git would expect them.
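As a rough illustration of why this helps, here is a minimal Python sketch. It is not part of git or of any TeX tool; the function names `one_sentence_per_line` and `changed_lines` are made up for this example, and the sentence splitting is deliberately naive. It reflows a paragraph to one sentence per line and then uses the standard `difflib` module as a stand-in for a line-based diff, to compare how much text a one-word correction touches in each layout.

```python
import difflib
import re

# Split after '.', '!' or '?' when followed by whitespace. This is a naive rule
# (an assumption of this sketch): abbreviations such as "e.g." are split as well.
SENTENCE_END = re.compile(r"(?<=[.!?])\s+")


def one_sentence_per_line(paragraph: str) -> str:
    """Reflow a single paragraph so that each sentence sits on its own line."""
    flat = " ".join(paragraph.split())  # collapse line breaks and runs of spaces
    return "\n".join(SENTENCE_END.split(flat))


def changed_lines(old: str, new: str) -> list[str]:
    """Return the lines a line-based diff reports as removed ('- ') or added ('+ ')."""
    diff = difflib.ndiff(old.splitlines(), new.splitlines())
    return [line for line in diff if line.startswith(("- ", "+ "))]


before = (
    "Lorem ipsum dolor sit amet, consectetur adipiscing elit. "
    "Duis tempor commodo vestibulum. "
    "Morbi placerat adipiscing pharetra."
)
after = before.replace("commodo", "comodo")  # a one-word correction

# Whole paragraph on one line: the entire paragraph is reported as changed.
print(changed_lines(before, after))

# One sentence per line: only the sentence containing the correction is reported.
print(changed_lines(one_sentence_per_line(before), one_sentence_per_line(after)))
```

A real document would need a more careful splitter (abbreviations, TeX commands, comments), so treat this as a demonstration of the idea rather than a conversion tool.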

The only difference is, admittedly, readability. While it is a little different to read like this, I was able to get used to it pretty fast.

If you really wanted to read text from screen edge to edge, I would rather ask for an editor feature than a VCS customization. (Emacs' visual-line-mode comes to mind.)

(Hint: the markdown source of this post is a good example of the concept.)


Given how the TeX format treats line breaks, it's not surprising that two consecutive newlines (that is, a blank line) start a new paragraph.
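The paragraph rule can be approximated in a few lines of Python. This is only a sketch of the behaviour for plain running text, and `tex_paragraphs` is a hypothetical helper name: it assumes that a blank line ends a paragraph and that a single line break plus any indentation counts as one space, and it ignores `%` comments and TeX commands entirely.

```python
import re


def tex_paragraphs(source: str) -> list[str]:
    """Approximate how TeX reads running text.

    A blank line ends a paragraph; a single line break and any leading
    indentation are treated as one space. Simplification (assumption):
    '%' comments and TeX commands get no special handling.
    """
    blocks = re.split(r"\n\s*\n", source.strip())
    return [" ".join(block.split()) for block in blocks]


one_line = (
    "Lorem ipsum dolor sit amet, consectetur adipiscing elit. "
    "Duis tempor commodo vestibulum.\n\nMauris auctor leo."
)
per_phrase = """Lorem ipsum dolor sit amet,
  consectetur adipiscing elit.
Duis tempor commodo vestibulum.

Mauris auctor leo.
"""

# Both layouts yield the same two paragraphs under this approximation.
print(tex_paragraphs(one_line) == tex_paragraphs(per_phrase))
```

Under these assumptions the long-line layout and the phrase-per-line layout are indistinguishable, which is why the choice between them can be made purely for the benefit of the VCS and the reader of the source.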

In order to take advantage of TeX's ability to be managed by a VCS, you are going to have to make some apparent sacrifices. It does take a little getting used to (around a day or so of good writing), but you will find that it is actually easier to read your work when you do this. I obviously cannot know if this is the case for everyone, but when I read, I tend to 'speak the words in my mind', as it were. In my honest opinion, having a line break at these pauses is actually ideal for two reasons:

  • you can have your 'pause' as your eye takes the time to track to the next phrase, and
  • you can ensure, at least for others who will actually be reading your paper, that no sentence is so long that it cannot be understood as a whole.

    (Have you ever tried to read a sentence that was just so long and had no apparent breaks in it that could give your mind a chance to rest and prepare for the next thought that its chances for actual and meaningful reader comprehension suffered dramatically? Well, now you have.)

Thus, for your example, I would actually edit it like so:

Having git using newlines as reference works well with academic articles, 
  because paragraphs tend to be relatively short
%
% I would say overly-short paragraphs are bad style regardless of your audience 
%
  (hence changes are still locally captured if git replaces a paragraph). 
For more prosaic text, however, the correction
  (especially at later stages in the editing process)
  might be subtle, but
  the paragraphs tend to be quite long. 
Therefore replacing an entire paragraph in git
  makes it difficult to spot the correction later on in the logs.

Regarding making a new line after each sentence, 
  I think that it destroys the optical structure of the text
  appearing in the LaTeX document.
Especially if the editor has wordwrap enabled.
%
% Well, then, I would disable it. :-)
%