Split Table to Next Page – LuaLaTeX

lua, luacode, luatex

I need to split a table across pages without using the longtable package, because longtable does not directly support floats.

My MWE is:

\documentclass{article}
\usepackage{luacode}


\begin{document}

\begin{luacode*}
local domobject = require "luaxml-domobject"
sample = [[
<?xml version="1.0" encoding="utf-8"?>
<art>
<title>Scattering of flexural waves an electric current</title>
<p>From these observations, another important result is that the individual masses of such observed black hole–black hole (BBHs) binaries can be much larger than what were expected previously both theoretically and observationally [14], and various scenarios have been proposed [<xref ref-type="bibr" rid="cqgab7bbabib2">2</xref>, <xref ref-type="bibr" rid="cqgab7bbabib26">26</xref>]. In particular, observations of the same signal in two different detectors provides an efficient independent way to cross check and validate the instruments, which is particularly valuable for a space-based detector behavior of these parameters is presented in the table <xref ref-type="table" rid="cqgab7bbat1">1</xref>.</p>
<table-wrap id="tab1" position="float" tab-row-break="5"><label>Table 1.</label><caption id="tab1"><p>The large <italic>x</italic> behavior for different <italic>w</italic>, where <italic>F</italic> means finite.</p></caption><table><colgroup><col align="left"/><col align="center"/><col align="center"/><col align="center"/><col align="center"/><col align="center"/><col align="center"/></colgroup><thead><tr><th>Parameters:</th><th>e<sup>2<italic>γ</italic></sup></th><th><italic>r</italic><sup>2</sup></th><th><italic>ρ</italic></th><th><italic>L</italic></th><th><italic>V</italic></th><th><italic>E</italic></th></tr></thead><tbody><tr><td>A1</td><td>∞</td><td>∞</td><td>0</td><td>∞</td><td>∞</td><td><italic>F</italic></td></tr><tr><td>B2</td><td>0</td><td>∞</td><td><italic>F</italic></td><td><italic>F</italic></td><td><italic>F</italic></td></tr><tr><td>C3</td><td>∞</td><td>0</td><td><italic>F</italic></td><td><italic>F</italic></td><td><italic>F</italic></td></tr><tr><td>D4</td><td>0</td><td>∞</td><td><italic>F</italic></td><td><italic>F</italic></td><td><italic>F</italic></td><td><italic>F</italic></td></tr><tr><td>E5</td><td>0</td><td>∞</td><td>∞</td><td><italic>F</italic></td><td><italic>F</italic></td><td><italic>F</italic></td></tr><tr><td>F6</td><td>∞</td><td>∞</td><td>0</td><td>∞</td><td>∞</td><td><italic>F</italic></td></tr><tr><td>G7</td><td>∞</td><td>∞</td><td>0</td><td>∞</td><td>∞</td><td><italic>F</italic></td></tr><tr><td>H8</td><td>∞</td><td>∞</td><td>0</td><td>∞</td><td>∞</td><td><italic>F</italic></td></tr><tr><td>I9</td><td>∞</td><td>∞</td><td>0</td><td>∞</td><td>∞</td><td><italic>F</italic></td></tr><tr><td>J10</td><td>∞</td><td>∞</td><td>0</td><td>∞</td><td>∞</td><td><italic>F</italic></td></tr></tbody></table></table-wrap>

</art>
]]

local dom = domobject.parse(sample)
\end{luacode*}
\end{document}

If the <table-wrap> element has the attribute tab-row-break="5", I need to count the table rows in tbody (not in thead) and, after the fifth row, close the tabular and open a new one, like \end{tabular}\pagebreak\begin{tabular}{...}.

How can I achieve this?

Best Answer

You can transform your XML using the luaxml-transform library, but this task is made more difficult by the request to split the table into two floats, based on the tab-row-break attribute. We can use the luaxml-domobject library to preprocess XML, split the table and then we can easily use the transform library.

This is the full code; I will describe it below:

\documentclass{article}
\usepackage{fontspec}
\setmainfont{Linux Libertine O}
\usepackage{luacode}

% this is a modified version of \@makecaption from LaTeX classes
% it doesn't print : between table number and caption
\makeatletter
\newcommand\tablecaption[2]{%
  \vskip\abovecaptionskip
  \sbox\@tempboxa{\textbf{#1} #2}%
  \ifdim \wd\@tempboxa >\hsize
    \textbf{#1} #2\par
  \else
    \global \@minipagefalse
    \hb@xt@\hsize{\hfil\box\@tempboxa\hfil}%
  \fi
  \vskip\belowcaptionskip
}
\makeatother

\begin{document}

\begin{luacode*}
local domobject = require "luaxml-domobject"
local transform = require "luaxml-transform"
sample = [[
<?xml version="1.0" encoding="utf-8"?>
<art>
<title>Scattering of flexural waves an electric current</title>
<p>From these observations, another important result is that the individual masses of such observed black hole&#x2013;black hole (BBHs) binaries can be much larger than what were expected previously both theoretically and observationally [14], and various scenarios have been proposed [<xref ref-type="bibr" rid="cqgab7bbabib2">2</xref>, <xref ref-type="bibr" rid="cqgab7bbabib26">26</xref>]. In particular, observations of the same signal in two different detectors provides an efficient independent way to cross check and validate the instruments, which is particularly valuable for a space-based detector behavior of these parameters is presented in the table <xref ref-type="table" rid="cqgab7bbat1">1</xref>.</p>
<table-wrap id="tab1" position="float" tab-row-break="5"><label>Table 1.</label><caption id="tab1"><p>The large <italic>x</italic> behavior for different <italic>w</italic>, where <italic>F</italic> means finite.</p></caption><table><colgroup><col align="left"/><col align="center"/><col align="center"/><col align="center"/><col align="center"/><col align="center"/><col align="center"/></colgroup><thead><tr><th>Parameters:</th><th>e<sup>2<italic>&#x3b3;</italic></sup></th><th><italic>r</italic><sup>2</sup></th><th><italic>&#x3c1;</italic></th><th><italic>L</italic></th><th><italic>V</italic></th><th><italic>E</italic></th></tr></thead><tbody><tr><td>A1</td><td>&#x221e;</td><td>&#x221e;</td><td>0</td><td>&#x221e;</td><td>&#x221e;</td><td><italic>F</italic></td></tr><tr><td>B2</td><td>0</td><td>&#x221e;</td><td><italic>F</italic></td><td><italic>F</italic></td><td><italic>F</italic></td></tr><tr><td>C3</td><td>&#x221e;</td><td>0</td><td><italic>F</italic></td><td><italic>F</italic></td><td><italic>F</italic></td></tr><tr><td>D4</td><td>0</td><td>&#x221e;</td><td><italic>F</italic></td><td><italic>F</italic></td><td><italic>F</italic></td><td><italic>F</italic></td></tr><tr><td>E5</td><td>0</td><td>&#x221e;</td><td>&#x221e;</td><td><italic>F</italic></td><td><italic>F</italic></td><td><italic>F</italic></td></tr><tr><td>F6</td><td>&#x221e;</td><td>&#x221e;</td><td>0</td><td>&#x221e;</td><td>&#x221e;</td><td><italic>F</italic></td></tr><tr><td>G7</td><td>&#x221e;</td><td>&#x221e;</td><td>0</td><td>&#x221e;</td><td>&#x221e;</td><td><italic>F</italic></td></tr><tr><td>H8</td><td>&#x221e;</td><td>&#x221e;</td><td>0</td><td>&#x221e;</td><td>&#x221e;</td><td><italic>F</italic></td></tr><tr><td>I9</td><td>&#x221e;</td><td>&#x221e;</td><td>0</td><td>&#x221e;</td><td>&#x221e;</td><td><italic>F</italic></td></tr><tr><td>J10</td><td>&#x221e;</td><td>&#x221e;</td><td>0</td><td>&#x221e;</td><td>&#x221e;</td><td><italic>F</italic></td></tr></tbody></table></table-wrap>

</art>
]]

local dom = domobject.parse(sample)

-- prepare tables
for _, wrap in ipairs(dom:query_selector("table-wrap[tab-row-break]")) do
  -- attribute values are strings, so convert to a number explicitly
  local row_break = tonumber(wrap:get_attribute("tab-row-break"))
  local tables = wrap:query_selector("table") or {}
  -- we assume that there is just one table as <table-wrap> children
  local tbl = tables[1] 
  if tbl then
    -- convert <col> elements to latex specification and save it as an attribute
    local align = {}
    local align_convert = {left = "l", right = "r", center = "c"}
    for _, col in ipairs(tbl:query_selector("col")) do 
      local al = col:get_attribute("align") or "left"
      align[#align+1] = align_convert[al]
    end
    tbl:set_attribute("align", table.concat(align, " "))
  end
  -- create floats
  -- first save children
  local children = wrap:get_children()
  -- it will contain floats
  wrap._children = {}
  -- this is needed to fix a bug in LuaXML
  local function fix_parents(el)
    for k,v in ipairs(el._children or {}) do
      if v:is_element() then
        v._parent = el
        fix_parents(v)
      end
    end
  end
  for i = 1, 2 do
    local float = wrap:create_element("float")
    wrap:add_child_node(float)
    -- add saved children
    for _, child in ipairs(children) do
      -- save copy of the original child
      float:add_child_node(child:copy_node())
    end
    fix_parents(float)
  end
  -- now split tables
  local tbl = wrap:query_selector("table") 
  -- there should be two tables now
  if #tbl == 2 then
    local tbody = tbl[1]:query_selector("tbody tr")
    -- remove spurious rows from the first table
    for i = row_break + 1, #tbody do
      tbody[i]:remove_node()
    end
    -- remove spurious lines from the second table
    local tbody = tbl[2]:query_selector("tbody tr")
    for i = 1, row_break do
      tbody[i]:remove_node()
    end
  end
  -- place (continued...) text to second caption
  local captions = wrap:query_selector("caption")
  local caption = captions[2]
  if caption then
    -- remove original text
    caption._children = {}
    local par = caption:create_element("p")
    local text = par:create_text_node("(continued...)")
    par:add_child_node(text)
    caption:add_child_node(par)
  end
end

transformer = transform.new()
transformer:add_action("title", "\\section{@<.>}")
transformer:add_action("p", "@<.>\n\n")
transformer:add_action("table-wrap float", "\\begin{table}\n@<.>\n\\end{table}\n")
-- you need to define this command in your TeX file
transformer:add_action("table-wrap label", "\\tablecaption{@<.>}")
-- this is a second argument to \tablecaption
transformer:add_action("table-wrap caption", "{@<.>}\n\n")
transformer:add_action("italic", "\\textit{@<.>}")
transformer:add_action("sub", "\\textsubscript{@<.>}")
transformer:add_action("sup", "\\textsuperscript{@<.>}")
transformer:add_action("table", "\\begin{tabular}{@{align}}\n@<.>\\end{tabular}\n")
transformer:add_action("tr", "@<.>\\\\\n")
transformer:add_action("th", "@<.> &")
transformer:add_action("th:last-of-type", "@<.>")
transformer:add_action("tbody td", "@<.> &")
transformer:add_action("tbody td:last-of-type", "@<.>")

local content = transformer:process_dom(dom)
-- print(content)
-- print the transformed XML to LaTeX
transform.print_tex(content)
-- print(dom:serialize())
 
\end{luacode*}
\end{document}

This is the part that deals with tables:

local dom = domobject.parse(sample)

-- prepare tables
for _, wrap in ipairs(dom:query_selector("table-wrap[tab-row-break]")) do
  -- attribute values are strings, so convert to a number explicitly
  local row_break = tonumber(wrap:get_attribute("tab-row-break"))
  local tables = wrap:query_selector("table") or {}
  -- we assume that there is just one table as <table-wrap> children
  local tbl = tables[1] 
  if tbl then
    -- convert <col> elements to latex specification and save it as an attribute
    local align = {}
    local align_convert = {left = "l", right = "r", center = "c"}
    for _, col in ipairs(tbl:query_selector("col")) do 
      local al = col:get_attribute("align") or "left"
      align[#align+1] = align_convert[al]
    end
    tbl:set_attribute("align", table.concat(align, " "))
  end
  -- create floats
  -- first save children
  local children = wrap:get_children()
  -- it will contain floats
  wrap._children = {}
  -- this is needed to fix a bug in LuaXML
  local function fix_parents(el)
    for k,v in ipairs(el._children or {}) do
      if v:is_element() then
        v._parent = el
        fix_parents(v)
      end
    end
  end
  for i = 1, 2 do
    local float = wrap:create_element("float")
    wrap:add_child_node(float)
    -- add saved children
    for _, child in ipairs(children) do
      -- save copy of the original child
      float:add_child_node(child:copy_node())
    end
    fix_parents(float)
  end
  -- now split tables
  local tbl = wrap:query_selector("table") 
  -- there should be two tables now
  if #tbl == 2 then
    local tbody = tbl[1]:query_selector("tbody tr")
    -- remove spurious rows from the first table
    for i = row_break + 1, #tbody do
      tbody[i]:remove_node()
    end
    -- remove spurious lines from the second table
    local tbody = tbl[2]:query_selector("tbody tr")
    for i = 1, row_break do
      tbody[i]:remove_node()
    end
  end
  -- place (continued...) text to second caption
  local captions = wrap:query_selector("caption")
  local caption = captions[2]
  if caption then
    -- remove original text
    caption._children = {}
    local par = caption:create_element("p")
    local text = par:create_text_node("(continued...)")
    par:add_child_node(text)
    caption:add_child_node(par)
  end
end

First, the code converts the column alignment information into a LaTeX tabular specification and saves it as an attribute of the <table> element, which makes the later transformation easier. It then creates two <float> elements and copies the original content of <table-wrap> into each of them. Finally, it removes the rows after the break from the first copy and the rows up to the break from the second copy, and sets (continued...) as the table caption for the second float.
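The row-splitting step can be illustrated without LuaXML. The following is only a sketch on plain Lua arrays: split_rows is a hypothetical helper, not part of the library, and the row labels stand in for <tr> elements. It mirrors the two remove_node() loops by keeping rows 1 to row_break in the first list and the rest in the second.

```lua
-- Hypothetical helper, not part of LuaXML: split a list of rows into two
-- lists at position row_break, mirroring the two remove_node() loops.
local function split_rows(rows, row_break)
  -- attribute values arrive as strings, so convert explicitly
  local n = tonumber(row_break)
  local first, second = {}, {}
  for i, row in ipairs(rows) do
    if i <= n then
      first[#first + 1] = row    -- stays in the first float
    else
      second[#second + 1] = row  -- moves to the continued float
    end
  end
  return first, second
end

-- the row labels stand in for the <tr> elements of the sample table
local rows = {"A1", "B2", "C3", "D4", "E5", "F6", "G7", "H8", "I9", "J10"}
local first, second = split_rows(rows, "5")
print(table.concat(first, " "))   -- A1 B2 C3 D4 E5
print(table.concat(second, " "))  -- F6 G7 H8 I9 J10
```

Only the body rows are split; the header row is copied into both floats, which is why the DOM code selects "tbody tr" rather than every tr.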

This DOM object can then be transformed using the following rules:

transformer = transform.new()
transformer:add_action("title", "\\section{@<.>}")
transformer:add_action("p", "@<.>\n\n")
transformer:add_action("table-wrap float", "\\begin{table}\n@<.>\n\\end{table}\n")
-- you need to define this command in your TeX file
transformer:add_action("table-wrap label", "\\tablecaption{@<.>}")
-- this is a second argument to \tablecaption
transformer:add_action("table-wrap caption", "{@<.>}\n\n")
transformer:add_action("italic", "\\textit{@<.>}")
transformer:add_action("sub", "\\textsubscript{@<.>}")
transformer:add_action("sup", "\\textsuperscript{@<.>}")
transformer:add_action("table", "\\begin{tabular}{@{align}}\n@<.>\\end{tabular}\n")
transformer:add_action("tr", "@<.>\\\\\n")
transformer:add_action("th", "@<.> &")
transformer:add_action("th:last-of-type", "@<.>")
transformer:add_action("tbody td", "@<.> &")
transformer:add_action("tbody td:last-of-type", "@<.>")

There is nothing too special here. Just note that the last cell in each row needs its own rule, so that no extra & character is inserted after it; a trailing & would cause a compilation error. This is why we need the th:last-of-type and tbody td:last-of-type rules.
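To see why the last cell is treated differently, here is a small standalone sketch in plain Lua. It does not use luaxml-transform; render_row and its with_last_rule flag are hypothetical names that only imitate the effect of the "@<.> &" and "@<.>" templates on one row.

```lua
-- Hypothetical helper: emit one tabular row from a list of cell strings.
-- with_last_rule = true imitates having the "td:last-of-type" rule.
local function render_row(cells, with_last_rule)
  local out = {}
  for i, cell in ipairs(cells) do
    if with_last_rule and i == #cells then
      out[#out + 1] = cell          -- last cell: no column separator
    else
      out[#out + 1] = cell .. " &"  -- any other cell: like "@<.> &"
    end
  end
  -- the row ends with the LaTeX row terminator \\
  return table.concat(out) .. "\\\\"
end

local cells = {"A1", "0", "\\textit{F}"}
print(render_row(cells, true))   -- A1 &0 &\textit{F}\\
print(render_row(cells, false))  -- A1 &0 &\textit{F} &\\
```

In a row that already fills every column, the second form opens one column more than the tabular declares, which is where LaTeX stops with an error.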

Also note that we use the align attribute that we defined by the DOM processing function earlier, to set the correct tabular declaration.
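The conversion from <col> alignments to the tabular specification can be tried in isolation. This is a sketch in plain Lua: colspec is a hypothetical helper, and the fallback to "l" for unknown values is an assumption added here. In the code above, an align value outside left, right and center maps to nil, so that column would be dropped from the specification.

```lua
local align_convert = {left = "l", right = "r", center = "c"}

-- Hypothetical helper: build a tabular column specification from a list
-- of align values; unknown values fall back to "l" (an assumption).
local function colspec(aligns)
  local spec = {}
  for i = 1, #aligns do
    spec[i] = align_convert[aligns[i]] or "l"
  end
  return table.concat(spec, " ")
end

-- the seven <col> elements of the sample table
print(colspec({"left", "center", "center", "center", "center", "center", "center"}))
-- l c c c c c c

-- an unknown value keeps its column instead of vanishing
print(colspec({"left", "justify", "right"}))
-- l l r
```

The resulting string is what the @{align} placeholder inserts into \begin{tabular}{...}.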

We also expect that <label> and <caption> elements are next to each other, because they produce the \tablecaption command, and it would break if they weren't at their expected places.

Lastly, note that you need to use a font that supports all the special characters, like Linux Libertine in my example.

This is the result (shown as an image in the original answer).
