[Tex/LaTex] R Markdown/pandoc: Change how code blocks are converted to LaTeX

knitr, markdown, pandoc, r

I use rmarkdown to generate a PDF file (rmarkdown uses pandoc, which in turn uses pdflatex). After enabling the keep_tex option, I noticed that code blocks without a language always become verbatim environments, e.g.:

```
my code block
```

becomes:

\begin{verbatim}
my code block
\end{verbatim}

Is it possible to change the conversion rules, e.g. to change the background color of those blocks?

Best Answer

Thanks to @AlanMunn, I found at least two solutions that work for code chunks with no language specified. Both are pandoc-specific and don't actually require rmarkdown (examples for both can be found here):

1. Call pandoc with the --listings option

In rmarkdown, it can be passed like this:

---
output:
  pdf_document:
    pandoc_args: [ "--listings" ]
---

With this option, all code blocks (including ones without a defined language) are wrapped in lstlisting instead:

\begin{lstlisting}
my code block
\end{lstlisting}

You can then customize the appearance of those blocks in either of two ways:

  1. Define your own language with \lstdefinelanguage and select it in the default style
  2. Define a default style with \lstset to change the appearance of chunks with no language specified

2. Use a pandoc filter

In rmarkdown, it can be passed like this:

---
output:
  pdf_document:
    pandoc_args: [ "--filter", "./my-cool-filter" ]
---

A filter is an executable, written in any language that can process JSON. pandoc converts the entire document into a JSON representation, pipes it through the filter, and then continues with the result, so a filter can modify the output in arbitrary ways.

In this particular case, you need to find all elements of type "CodeBlock" and convert them into "RawBlock" elements with the "tex" format, wrapping the text value in any LaTeX macro or environment of your liking.
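
As a rough illustration of the JSON route, here is a minimal sketch in Python that uses only the standard library. It assumes pandoc's JSON layout, in which a code block looks like {"t": "CodeBlock", "c": [[id, classes, attributes], text]} and a raw block like {"t": "RawBlock", "c": [format, text]}. The environment name my-cool-macro is just a placeholder; you would have to define a suitable (verbatim-like) environment yourself in the preamble.

```python
#!/usr/bin/env python3
"""Sketch of a pandoc JSON filter: every CodeBlock becomes a raw TeX block."""
import json
import sys

ENV_NAME = "my-cool-macro"  # placeholder name; the environment must be defined by you


def convert(node):
    """Walk the JSON tree and replace each CodeBlock node with a RawBlock."""
    if isinstance(node, list):
        return [convert(child) for child in node]
    if isinstance(node, dict):
        if node.get("t") == "CodeBlock":
            # Assumed layout: c = [[id, classes, key-value pairs], text]
            _attr, text = node["c"]
            body = "\\begin{%s}\n%s\n\\end{%s}" % (ENV_NAME, text, ENV_NAME)
            return {"t": "RawBlock", "c": ["tex", body]}
        return {key: convert(value) for key, value in node.items()}
    return node  # strings, numbers, etc. are passed through unchanged


def main():
    # pandoc pipes the document in on stdin and reads the result from stdout.
    if sys.stdin.isatty():
        return  # nothing was piped in
    raw = sys.stdin.read()
    if raw.strip():
        json.dump(convert(json.loads(raw)), sys.stdout)


if __name__ == "__main__":
    main()
```

Saved as ./my-cool-filter and made executable, a script like this could be passed to --filter as shown above. To touch only blocks without a language, one could additionally check that the classes list in the block's attributes is empty.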

UPD: It's even easier (and, according to the documentation, faster) to use the --lua-filter option. The whole filter is just this:

function CodeBlock(elem)
  return pandoc.RawBlock('tex', '\\begin{my-cool-macro}\n' .. elem.text .. '\n\\end{my-cool-macro}')
end