[TeX/LaTeX] Random unwanted space between paragraphs

paragraphs spacing

I generate LaTeX code from XML for a book I'm writing, using pdflatex (Version 3.1415926-1.40.10 (Web2C 2009)) and memoir. Very infrequently, there appears to be an extra blank line between two paragraphs. I have not been able to spot any pattern among these occurrences.

All attempts to isolate the problem in a small example result in the problem disappearing (which is possibly a clue, though I haven't been able to make anything of it). Another clue is that the unwanted space seems to appear after all the paragraphs on the page when it happens (but it happens so infrequently, that might be a fluke). I suppose it's even possible LaTeX/memoir is deliberately adding the vertical space, though I can't see any benefit to the current or next page from doing so.

Lacking an MWE, I've made an example that doesn't show the problem on its own but does in the context of the full document (mainly in the hope that there's something funky in there that I've done but just can't spot). It includes the paragraphs before and after the one that has unwanted vertical space appended. The odd conventions are due to being machine-generated from XML. The problem seems to occur on perhaps 2% of the hundreds of pages in the book. Note that in this example in context, there was no danger of an orphan if LaTeX had squeezed a couple more lines onto the page: 9 more lines of the paragraph were pushed onto the next page.

Any ideas for how I might track this down appreciated.

\documentclass{memoir}% http://ctan.org/pkg/memoir
\usepackage[strict=true]{csquotes}
\usepackage{refcount}
\begin{document}
\par{This problem permeates the U.S. today, now aided and abetted by all
    the wonderful inventions of computer programmers.
    In the past, your doctor was the gatekeeper for information on medical research;
    now, you can read a nearly hourly spew of breaking medical research results,
    which so often contains contradictions that many people throw up their
    hands and conclude that everything known to man is unhealthy.
    In the past, peer-reviewed journals were the gatekeeper for information
    on a science as specialized as climatology;
    now, oil companies can
    (\label{id155638}in a modernization of the techniques tobacco companies used
    to refute the connection of cancer to cigarettes\setcounter{pagenote}{0}\pagenote[\pageref{id155638}]{{\itshape in a modernization of the techniques tobacco companies used
    to refute the connection of cancer to cigarettes}: \citep[]{b:ClimateCoverUp}})
    fund and publicize anyone with the slightest scientific credentials
    who happens to support the view that preserves their profits.
    The ever-increasing specialization of science and technology
    long ago overwhelmed the ability of television/radio/print
    journalism to act as information gatekeepers,
    so it's even possible by financing alone to successfully
    project the view that there is little or no consensus on
    anything in climatology!
    Whether you would like to believe that men have never walked on the moon,
    that aliens are routinely abducting us for humiliating probes,
    or that the government's High Frequency Active Auroral Research Program
    (HAARP) is actually controlling both the weather and people's minds,\footnote{\label{FN:id153808}Google reports 288,000 results for the search phrase \enquote{foil hat,} a popular device for defeating mind controlling radiation
        allegedly broadcast by both aliens and the government.
        Featured was \label{FN:id153819}the provocative MIT study showing that foil hats may actually \textit{amplify} RF signals
        in specific frequency ranges controlled by the government and multi-national corporations.
        Google quite fairly also features \label{FN:id153830}the author of a book on tin foil hat construction who disputes the MIT finding.
        Thus, Google does act as a kind of gatekeeper for information,
        but only in a narrow technical sense;
        coherence and meaning-making are not part of their business model,
        and would likely only produce sub-optimal profits anyway.}
    you are just a few mouse clicks away from vast amounts of information
    (and forums of like-minded believers) supporting your inclinations. \setcounterref{pagenote}{FN:id153808}\pagenote[\pageref{FN:id153819}]{{\itshape the provocative MIT study showing that foil hats}: \citep[]{w:EffectivenessFoilHelmets}}\setcounterref{pagenote}{FN:id153808}\pagenote[\pageref{FN:id153830}]{{\itshape the author of a book on tin foil hat construction}: \citep[]{w:AFDBEffectiveness}} More disturbing,
    you can now find on the Internet social support for your
    darkest impulses; for example, \label{id153840}your teenager can learn that anorexia
    is not a psychological disorder, but a healthy lifestyle choice.\setcounter{pagenote}{0}\pagenote[\pageref{id153840}]{{\itshape your teenager can learn that anorexia
    is not a psychological disorder, but a healthy lifestyle choice.}: \citep[pp. 199-201]{b:VirtuallyYou}}}%
\par{But the picture of less-educated masses being flummoxed by
    too much uncontrolled information from on high is patronizing and incomplete.
    The information tsunami disorients all levels of education and expertise.
    For example, \label{id153853}the oncogene theory
    (that particular cancers are caused by mutations in particular genes)
    is a theory in crisis\setcounter{pagenote}{0}\pagenote[\pageref{id153853}]{{\itshape the oncogene theory
    (that particular cancers are caused by mutations in particular genes)
    is a theory in crisis}: \citep[pp. 45-104]{b:WhatIsLifePertinence}}, being overwhelmed by a flood of new
    molecular information.
    Alleged oncogenes are now found everywhere, and vary not just
    from one cancer type to another, but from one patient to another
    (which helps explain why the genetic revolution has been a dismal
    failure at reducing cancer mortality).
    There is a growing sense, amidst the relentless stream of new information,
    that researchers have lost sight of the forest for the trees.
    Many modern problems, like cancer,
    likely involve complex networks that contain feedback systems,
    and information floods make us prone to a neverending
    twitchy response to each individual reductionist
    piece of the problem as it is uncovered,
    unable to stop and see how it all fits together
    (partly because it fits together into a complex, messy,
    difficult-to-control system.)}%
\par{\label{id153868}Physicist Robert Laughlin
    points out that the economics of information is changing,\setcounter{pagenote}{0}\pagenote[\pageref{id153868}]{{\itshape Physicist Robert Laughlin
    points out that the economics of information is changing,}: \citep[]{b:CrimeOfReason}} a crescendo that has become much more audible in the years since Postman's death.
    When information, once acquired, can be transferred so easily
    (often without paying the creator of that information)
    then there is more incentive to create less valuable, disposable information.
    If the profits from my hard-won, published deep thoughts about programming
    decay more quickly due to file sharing,
    then the sensible thing to do is publish much
    more easily obtained shallow knowledge much more often,
    a learned response clearly visible at any local bookstore,
    and amplified in the electronic book arena,
    where prices are often driven down to the logical extreme of \$0.00.
    Evidence of disposable information from the more than
    one million titles in Amazon's \enquote{Kindle Store} includes e-books about murderers whose trials are still ongoing.
    Software, of course, is also a form of knowledge,
    and economics is also driving increasingly disposable information there.
    Why spend years trying to create software that could aid
    medical research when you can spend a few weeks creating a game
    for an \enquote{app store} that might turn out to be the next \enquote{Angry Birds}?
    Although some developers chafe at the degree of information control
    Apple exerts on their \enquote{app store} (a game that used the built-in accelerometer to let you play \enquote{How High Can I Throw My iPhone?} was rejected),
    the fact that Apple approved 14 iPhone fart applications
    (with more than one being named \enquote{Pull My Finger})
    in a single day reveals the real state of affairs.}%
\end{document}


Best Answer

I ran your code. (By the way, your not-so-minimal example doesn't compile unless the natbib and pagenote packages are loaded as well.) At least in the example you provide, there's indeed a problem of "overstretched inter-paragraph glue", confirming @Marco's guess. I can think of three solutions to this problem:

  • You could issue the following command in your document's preamble:

    \raggedbottom

    This will let the height of the textblock vary from page to page. If your document is going to be typeset two-sided (aka bookstyle) with facing pages, the result may look poor from a typographic perspective.

  • A more elegant solution (from a typographic perspective) consists of inserting the command

    \usepackage[bottom]{footmisc}

    (also in the preamble). Doing so will insert additional whitespace between the main text and the footnote rule rather than over-stretching the inter-paragraph glue. Of course, if a given page doesn't have any footnotes and features a single paragraph break, this approach won't help.

  • A third and, in my view at least, even better solution would be to break up the text into more but shorter paragraphs. Doing so would increase the text's intelligibility and lessen the risk of an occurrence of the problem you describe.

    The example code you provide has an entire page consisting of only two extremely long paragraphs, plus a rather long footnote. If you were to break up each of these two long paragraphs into two (or, better yet, three) shorter paragraphs, LaTeX would have many more degrees of freedom to play with while building up the page, thereby much reducing the likelihood that LaTeX will have to resort to over-stretching the inter-paragraph glue.
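
For reference, the first two options might look roughly like this in the preamble. This is only a sketch: it assumes you enable one option at a time, and that footmisc cooperates with memoir and the other packages in your full document, which I have not checked beyond the example.

    % Option 1: let the height of the text block vary from page to page,
    % so no glue has to stretch to fill the page.
    %\raggedbottom

    % Option 2: keep flush bottoms, but push footnotes to the bottom of
    % the page so that any slack ends up above the footnote rule instead
    % of between paragraphs.
    \usepackage[bottom]{footmisc}

To find the affected pages without leafing through the whole book, it may help to search the .log file for warnings of the form "Underfull \vbox (badness ...) has occurred while \output is active". TeX issues this warning when the vertical glue on a page had to stretch beyond the \vbadness threshold, so the warnings should, in principle, line up with the pages that show the extra space.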