[TeX/LaTeX] Advanced string highlighting in listings

listings lstdefinestyle

I'm using listings to display Ruby code with syntax highlighting. I have the following test document:

\documentclass{article}

\usepackage{xcolor}
\usepackage{listings}

\definecolor{dkgreen}{rgb}{0,0.6,0}
\definecolor{mauve}{rgb}{0.58,0,0.82}

\lstdefinestyle{Ruby} {
    aboveskip=3mm,
    belowskip=3mm,
    showstringspaces=false,
    columns=flexible,
    basicstyle={\footnotesize\ttfamily},
    numberstyle={\tiny},
    numbers=left,
    keywordstyle=\color{blue},
    commentstyle=\color{dkgreen},
    stringstyle=\color{mauve},
    breaklines=true,
    breakatwhitespace=true,
    tabsize=2, 
    sensitive = true
}

\lstset{language=Ruby}

\begin{document}

\begin{lstlisting}[style=Ruby,float=ht,caption={...},label={lst:sourcecode},captionpos=b]
def some_function
  File.open(filename, 'w+') do |f|
    [...]
    # a comment
    f.puts "whatever #{some_variable} another string part"
    f.puts 'this string contains apostrophes: \'random word\''
    [...]
  end
end
\end{lstlisting}

\end{document}

Which looks like this:

example output

Of course, #{some_variable} is highlighted in mauve because I set that as the stringstyle, but that's not really correct: the #{} syntax evaluates its content as code instead of treating it as part of the string. (This only happens inside " ", not inside ' ', but I'd be willing to ignore this subtlety.)

My question is, is there a way to configure the highlighting to correctly represent this, so that #{some_variable} has the default color?

EDIT: with the answer presented by SDF, it now looks like this:

slightly wrong solution

If you compare the two pictures, you'll see that now the escaped apostrophes around random word aren't getting ignored like before (which was the correct behavior).

EDIT 2: While I was able to solve this problem by omitting string=[d]{'}, I noticed two more problems. The example now looks like this:

\documentclass{article}

\usepackage{xcolor}
\usepackage[procnames]{listings}

\definecolor{dkgreen}{rgb}{0,0.6,0}
\definecolor{gray}{rgb}{0.5,0.5,0.5}
\definecolor{mauve}{rgb}{0.58,0,0.82}
\definecolor{light-gray}{gray}{0.25}

\lstdefinestyle{Ruby} {
    aboveskip=3mm,
    belowskip=3mm,
    showstringspaces=false,
    columns=flexible,
    basicstyle={\footnotesize\ttfamily},
    numberstyle={\tiny},
    numbers=left,
    keywordstyle=\color{blue},
    commentstyle=\color{dkgreen},
    stringstyle=\color{mauve},
    breaklines=true,
    breakatwhitespace=true,
    tabsize=2, 
    sensitive = true,
    morestring=*[d]{"},
    morestring=[s][]{\#\{}{\}},
    procnamekeys={def},
    procnamestyle=\color{red},
}

\lstset{language=Ruby}

\begin{document}

\begin{lstlisting}[style=Ruby,float=ht,caption={...},label={lst:sourcecode},captionpos=b]
def some_function
  File.open(filename, 'w+') do |f|
    [...]
    # a comment
    f.puts "whatever #{some_variable} another string part"
    f.puts 'this string contains apostrophes: \'random word\''
    f.puts 'i do love keywords like class'
    f.puts "i do love keywords like class"
    f.puts "now single quotes 'inside #{double quotes}'"
    [...]
  end
end
\end{lstlisting}

\end{document}

wrong keyword highlights and nested quote problem

Keywords inside double quotes are now getting highlighted, and also single quotes inside double quotes cause the original problem to resurface.

This is slowly getting out of hand… Maybe I should really switch to minted.

Best Answer

Note: I updated the whole answer to take the two edits into account. There are a lot of little hacks, but I'm afraid the more precise we want to be using listings, the more hacks we'll need to add. See at the end of the answer for an alternative solution using minted.

Solving the initial issue using listings

You can allow listings to detect delimiters inside another delimiter by adding a * in its definition:

morestring=*[d]{"}

Then we define #{ and } as special delimiters. We give them their own style by adding a second pair of square brackets:

morestring=[s][]{\#\{}{\}}

Here, we add empty brackets, which means the default style will be used. Also, don't forget to escape special characters such as #, {, etc. For more detailed explanations, have a look at the listings documentation, section 3.3.

Remark: the s option means that the opening and closing delimiters are different, d that they are the same. To enable backslash escaping, use b instead of d; I made that mistake in my original answer. It's also worth noting that Ruby, like most languages, already has a basic definition in listings that covers most strings, so there's no need to redefine it all (unless we want to override it, and we will).
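To make the remark above more concrete, here is a small sketch of how the three delimiter types might be written side by side. The style name EscapeDemo is made up for illustration, and the comments describe my understanding of the listings documentation rather than tested output:

```latex
% Illustrative only: the style name "EscapeDemo" is hypothetical.
\lstdefinestyle{EscapeDemo}{
    % [b]: a backslash escapes the delimiter, so 'it\'s' stays one string
    morestring=[b]{'},
    % [d]: a doubled delimiter escapes it, so "say ""hi""" stays one string;
    %      a backslash has no special meaning here
    morestring=[d]{"},
    % [s]: the opening and closing delimiters are different
    morestring=[s]{\#\{}{\}},
}
```

For the Ruby example in the question, this suggests [b] is the type to use for single quotes, since the test string relies on \' to escape apostrophes.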

This is the \lstdefinestyle that produces the output as seen in the OP's first edit:

\lstdefinestyle{Ruby} {
    aboveskip=3mm,
    belowskip=3mm,
    showstringspaces=false,
    columns=flexible,
    basicstyle={\footnotesize\ttfamily},
    numberstyle={\tiny},
    numbers=left,
    keywordstyle=\color{blue},
    commentstyle=\color{dkgreen},
    stringstyle=\color{mauve},
    breaklines=true,
    breakatwhitespace=true,
    tabsize=2, 
    morestring=[d]{'}, % wrong: should be [b]
    morestring=*[d]{"},
    morestring=[s][]{\#\{}{\}},
}

Solving additional issues

Keywords inside strings are getting highlighted

As Daniel said in a comment, the star in morestring=*[d]{"} tells listings to keep looking for further strings and keywords inside the string. That's what we want for "#{} strings", but it also applies to keywords. listings doesn't let us specify what exactly to look for inside strings, so we'll have to find another workaround.

Now, listings offers a ** option so that the styles of the string and its special content can be cumulated. For example, when we do this:

morestring=**[d][\color{mauve}]{"},
keywordstyle=\bfseries,

listings will make keywords inside double quotes both bold and mauve. The thing is, in our case we need to "cumulate" two colors:

morestring=**[d][\color{mauve}]{"},
keywordstyle=\color{blue},

In this case, keywords inside strings are processed with \color{mauve} \color{blue}, and that's bad: the keyword style overrides the string style. My hack was to replace the keyword style with a new command that checks the current color and sets it to blue if it's not already mauve:

\def\bluecolorifnotalreadymauve{%
    \extractcolorspec{.}\currentcolor
    \extractcolorspec{mauve}\stringcolor
    \ifx\currentcolor\stringcolor\else
        \color{blue}%
    \fi
}

(Thanks to this answer for the solution.)
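If more than one style needs this kind of guard, the same idea could be wrapped in a macro that takes the colors as arguments. This is only a sketch: the names \colorunless, \currentcolorspec and \guardcolorspec are my own, and it assumes that both colors end up in the same color model so that the \ifx comparison can succeed:

```latex
% Hypothetical generalisation: \colorunless{<guard>}{<new>}
% switches to <new> unless the current colour already equals <guard>.
\newcommand{\colorunless}[2]{%
    \extractcolorspec{.}\currentcolorspec  % model and spec of the current colour
    \extractcolorspec{#1}\guardcolorspec   % model and spec of the guard colour
    \ifx\currentcolorspec\guardcolorspec\else
        \color{#2}%
    \fi
}
% Intended use inside a listings style:
%   keywordstyle=\colorunless{mauve}{blue},
```

With these arguments it should behave like \bluecolorifnotalreadymauve above.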

Now we also lose our original #{} fix, because its (empty) style is "cumulated" with the \color{mauve} from "". Let's cumulate it back:

morestring=[s][\color{black}]{\#\{}{\}},

Single quotes cause the #{} problem to resurface

Just like keywords, single-quoted strings are re-processed inside double-quoted strings. And listings hasn't been told to look inside single-quoted strings, so we'll have to change them the same way:

morestring=**[d]{'},

And now we lose backslash escaping. For an unknown reason, the b option doesn't work well with **. Well, while we're at it…

morestring=[d]{\\'},

Full updated MWE

\documentclass{article}

\usepackage{xcolor}
\usepackage[procnames]{listings}

\definecolor{dkgreen}{rgb}{0,0.6,0}
\definecolor{gray}{rgb}{0.5,0.5,0.5}
\definecolor{mauve}{rgb}{0.58,0,0.82}
\definecolor{light-gray}{gray}{0.25}

\def\bluecolorifnotalreadymauve{%
    \extractcolorspec{.}\currentcolor
    \extractcolorspec{mauve}\stringcolor
    \ifx\currentcolor\stringcolor\else
        \color{blue}%
    \fi
}

\lstdefinestyle{Ruby} {
    aboveskip=3mm,
    belowskip=3mm,
    showstringspaces=false,
    columns=flexible,
    basicstyle=\footnotesize\ttfamily,
    numberstyle=\tiny,
    numbers=left,
    keywordstyle=\bluecolorifnotalreadymauve,
    commentstyle=\color{dkgreen},
    stringstyle=\color{mauve},
    breaklines=true,
    breakatwhitespace=true,
    tabsize=2,
    moredelim=[s][\color{black}]{\#\{}{\}}, % same as morestring in this case
    morestring=**[d]{'},
    morestring=[d]{\\'},
    morestring=**[d]{"},
    procnamekeys={def}, % bonus, for function names
    procnamestyle=\color{red},
}

\lstset{language=Ruby}

\begin{document}

\begin{lstlisting}[style=Ruby,float=ht,caption={...},label={lst:sourcecode},captionpos=b]
def some_function
  File.open(filename, 'w+') do |f|
    [...]
    # a comment
    f.puts "whatever #{some_variable} another string part"
    f.puts 'this string contains apostrophes: \'random word\''
    f.puts 'i do love keywords like class'
    f.puts "i do love keywords like class"
    f.puts "now single quotes 'inside #{double quotes}'"
    [...]
  end
end
\end{lstlisting}

\end{document}

Output:

updated Ruby code with listings

Alternate approach: using minted

minted already does everything you want, and so much more! Here is an MWE:

\documentclass{article}

\usepackage{minted}

\begin{document}

\begin{listing}[H]
  \begin{minted}[fontsize=\footnotesize, linenos]{Ruby}
  def some_function
    File.open(filename, 'w+') do |f|
      [...]
      # a comment
      f.puts "whatever #{some_variable} another string part"
      f.puts 'this string contains apostrophes: \'random word\''
      f.puts 'i do love keywords like class'
      f.puts "i do love keywords like class"
      f.puts "now single quotes 'inside #{double quotes}'"
      [...]
    end
  end
  \end{minted}
  \caption{...}
\end{listing}

\end{document}

This is the output with the default style:

updated Ruby code with minted

The main downside of minted is that it relies on Pygments to do the processing, which means:

  1. It can be a bit tricky to install.

  2. It's harder to customize. (But once we know how to, it can be very powerful.)
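As a rough illustration of both points, a preamble along these lines is a common starting place. It assumes Pygments is installed and that the document is compiled with shell escape enabled; newer minted releases may handle this differently. The style name friendly is just one of the styles that ship with Pygments:

```latex
% Sketch only. Compile with shell escape so minted can call Pygments, e.g.
%   pdflatex -shell-escape document.tex
\usepackage{minted}
\usemintedstyle{friendly}  % any style listed by: pygmentize -L styles
\setminted{fontsize=\footnotesize, linenos, breaklines}  % defaults for all minted blocks
```

Setting defaults once with \setminted means the individual minted environments no longer need to repeat fontsize and linenos.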
