[TeX/LaTeX] Using an exercises package to build lots of Math/Calculus exercise lists and tests

exercises, packages, workflow

I have been using the exercise.sty package for a while now, mostly to build exercise lists for my students. A few months after I started, the code became very complicated. Today, it is a living nightmare.

But first, let me explain what I am using it for:

I am building Math, Precalculus and Calculus exercise lists and tests for my students. The topics range from Second Degree Equations to Integration Techniques and Volumes by Disks, you name it. Each type of class I teach needs its own material, "filtered" to fit the class: level of difficulty, topic, number of questions on the exercise list and so on.

The problem is: I have close to 500 exercises and I'm not sure where to start. I would like to start the right way, so that I don't fall into a hole (like the one I am in today).

My questions:

  • How can I type these exercises, the most efficient way possible, so that I can easily build these pdfs filled with the exercises that correspond to the given topic? Should I write these 500 exercises in a single .tex file?
  • Is there a package that allows two different types of solution for the same question? Let me explain this one: sometimes I just want to show the final solution, and sometimes the complete (worked-out) solution.
  • Some of my students are from the US, so is there a way to create, for the same exercise, two (or more) different descriptions and answers (one in Portuguese and one in English)?
  • Let's say at one point I reach 10,000 exercises. Can I use a single .tex file with all of them to build, for example, an exercise list with 40 "quadratic equations" exercises? In other words: can I use the "filters" or "labels" from these packages to do this filtering?

I understand that these packages' capabilities are HUGE, so I would love it if someone could shed some light on a very efficient approach.

Thank you.

PS: If I wasn't very clear in my explanations, please let me know.

Best Answer

Since your question is about storing, maintaining and referencing a large set of exercises (potentially on the order of 10,000), I'm going to concentrate on that, so the style here is very basic.

It's possible to define conditionals using \newif (or through commands provided by packages such as etoolbox). For example:

\newif\ifsolutions
\newif\ifcomplete

These default to false, but can be switched on:

\solutionstrue
\completetrue
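
For comparison, here is a minimal sketch of the same two switches written with etoolbox toggles instead of \newif. The toggle names simply mirror the conditionals above; nothing else in this answer depends on this variant.

\usepackage{etoolbox}

% Toggles also default to false.
\newtoggle{solutions}
\newtoggle{complete}

% Switch them on or off by name.
\toggletrue{solutions}
\togglefalse{complete}

% Test a toggle: the first branch is used when it is true, the second when it is false.
\iftoggle{solutions}{Solutions are shown.}{Solutions are hidden.}

The toggle form takes both branches as arguments, so there is no \else or \fi to balance.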

It's also useful to provide syntactic commands to mark the solution. For example:

\newcommand{\solutionname}{Solution}
\newcommand{\solution}{\par\textbf{\solutionname}:\par}

As has been mentioned in one of the other answers, it's also possible to use environments and the comment package. For multilingual support, the caption hooks can be used to redefine \solutionname as appropriate. For example:

\usepackage[USenglish]{babel}

\addto\captionsUSenglish{%
  \renewcommand\solutionname{Solution}%
}
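
As a sketch of how this could extend to a second language, the following small document also loads Portuguese and gives it its own heading. It assumes the babel option portuguese is the one wanted (brazilian is a separate option), and the sample text after each \solution is only a placeholder.

\documentclass{article}
% The last language listed is the main one.
\usepackage[portuguese,USenglish]{babel}

\newcommand{\solutionname}{Solution}
\newcommand{\solution}{\par\textbf{\solutionname}:\par}

% Each caption hook sets the heading for its own language.
\addto\captionsUSenglish{%
  \renewcommand\solutionname{Solution}%
}
\addto\captionsportuguese{%
  \renewcommand\solutionname{Solu\c{c}\~ao}%
}

\begin{document}
\solution Placeholder answer in English.

\selectlanguage{portuguese}
\solution Resposta de exemplo em portugu\^es.
\end{document}

Switching with \selectlanguage runs the matching caption hook, so the heading follows the current language without touching the exercises themselves.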

Now an exercise can be written using these commands. For example:

$y = \sin(2x)$
\ifsolutions
 \solution
 \ifcomplete
  Intermediate steps, further details etc.
 \fi
 $y' = 2\cos(2x)$
\fi
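
For comparison, the same exercise could be marked up with environments from the comment package. This is only a sketch: the environment names shortsolution and fullsolution are made up for illustration, and the comment package needs each \begin and \end line to stand on a line of its own.

\documentclass{article}
\usepackage{comment}

% \includecomment keeps an environment's content, \excludecomment discards it.
\includecomment{shortsolution}
\excludecomment{fullsolution}

\begin{document}
$y = \sin(2x)$

\begin{fullsolution}
Intermediate steps, further details etc.
\end{fullsolution}
\begin{shortsolution}
$y' = 2\cos(2x)$
\end{shortsolution}
\end{document}

Changing \excludecomment{fullsolution} to \includecomment{fullsolution} brings the worked steps back without editing the exercise.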

Environments provide a more LaTeXy feel, but let's concentrate on storing and accessing the questions.

The simple method, which has already been suggested, is to put each question in a separate file and load it with \input. For example, if this exercise is in the file exercises/calculus/easy/dsin.tex then the following MWE works:

\documentclass{article}

\newif\ifsolutions
\newif\ifcomplete

\solutionstrue
\completetrue

\newcommand{\solutionname}{Solution}
\newcommand{\solution}{\par\textbf{\solutionname}:\par}

\begin{document}
\begin{enumerate}
\item \input{exercises/calculus/easy/dsin}

\end{enumerate}
\end{document}

This is a relatively generic method, which can easily be translated to other TeX formats. For example, the Plain TeX equivalent is:

\newif\ifsolutions
\newif\ifcomplete

\solutionstrue
\completetrue

\def\solutionname{Solution}
\long\def\solution{\par{\bf\solutionname}:\par}

\newcount\questionnum

\long\def\question{%
 \par
 \advance\questionnum by 1\relax
 \number\questionnum.
}

\question \input exercises/calculus/easy/dsin

\bye

The problem is that, although this structure is fine for a small number of questions, it can become unmanageable for 10,000. I mentioned datatooltk in the comments, which can read and write .dbtex files (datatool's internal format), but I don't recommend using this format directly. These files just contain LaTeX code that defines the internal registers and control sequences used by datatool to store the required data. There's no compression and it takes up a huge amount of resources. The datatooltk application works better as an intermediary that can pull filtered, shuffled or sorted data from external sources in a way that can easily be input in the document. (See the datatool performance page that compares build times for large databases.)

There are switches, such as --shuffle or --sort, which instruct datatooltk to shuffle or sort the data after it's been pulled from the data source. This uses Java, which is more efficient than TeX, but if the data is stored in a SQL database, it's even more efficient to include these steps in the query passed to the --sql switch. (Currently, datatooltk is only configured for MySQL, but it may be possible to use something else if the necessary .jar file can be added to the class path.)

SQL databases can be optimized to improve performance. Suppose you want to randomly select 20 questions from 500. How do you perform that selection in LaTeX? First you'd need to use the shell to find out all the available files (or have an index file that can be parsed). Then you need to shuffle the list. That will take a while to do with TeX. It's more efficient to do this with SQL. (See, for example, MySQL select 10 random rows from 600K rows fast.)

If you decide to use SQL, the next thing to consider is the table structure.

  • You'll need a unique id field. With this you'll be able to specifically select certain questions rather than have a random selection. (An auto increment primary key is best.)
  • A field containing the question. (Let's call it Question.)
  • A field containing the brief answer. (Let's call it Answer.)
  • A field containing the extended answer. (Let's call it ExtendedAnswer.)
  • A field identifying the difficulty level. (Let's call it Level.) This could be an integer (1 = easy) or an enumeration (easy, medium, hard).
  • A field identifying the topic. (Let's call it Topic.) An enumeration is probably the simplest type (for example, calculus, settheory).

I'm not quite sure about the language. There are two approaches that I can think of: have fields for the other language (for example, QuestionPortuges, AnswerPortuges and ExtendedAnswerPortuges) or have a separate entry for the question in a different language with an extra field for the language.

So the exercise example above could have:

  • Question => $y = \sin(2x)$
  • Answer => $y' = 2\cos(2x)$
  • ExtendedAnswer => Intermediate steps, further details etc. \[y' = 2\cos(2x)\]
  • Level => 1
  • Topic => calculus
  • Language => english or ExtendedAnswerPortuges => Passos intermédios, etc. \[y' = 2\cos(2x)\]

Note that this doesn't include the syntactic command \solution or the conditionals \ifsolutions and \ifcomplete, which makes it easier to arrange the various parts of the question and answer.
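
To try the field layout out on a small scale before setting up SQL, the same record can be built directly in the document with datatool. This is a sketch for testing only: the database name exercises is an arbitrary choice, and it will not scale to thousands of rows, for the reasons given earlier.

\documentclass{article}
\usepackage{datatool}

% One row per exercise, one entry per field.
\DTLnewdb{exercises}
\DTLnewrow{exercises}
\DTLnewdbentry{exercises}{Question}{$y = \sin(2x)$}
\DTLnewdbentry{exercises}{Answer}{$y' = 2\cos(2x)$}
\DTLnewdbentry{exercises}{ExtendedAnswer}{Intermediate steps, further details etc. \[y' = 2\cos(2x)\]}
\DTLnewdbentry{exercises}{Level}{1}
\DTLnewdbentry{exercises}{Topic}{calculus}

\begin{document}
\begin{enumerate}
% The optional argument filters rows: only calculus questions are listed.
\DTLforeach*[\DTLiseq{\Topic}{calculus}]{exercises}%
{\Question=Question,\Topic=Topic}%
{\item \Question}
\end{enumerate}
\end{document}

Filtering inside \DTLforeach is done by TeX on every run, so for a large collection it is still better to filter in the query that produces the data.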

It may be that some exercises require a particular package (such as amsmath or graphicx), so perhaps there could also be a field for the required packages. For example Packages => graphicx,amsmath.

Any images or verbatim text must be stored outside the database somewhere on the file system. They could be on TeX's path or the database table could have a field with a list of external resources or the question/answer could simply use the full path.

The datatooltk call can be done before the LaTeX run or using the shell escape. There's also a datatooltk rule for arara users. Suppose I use datatooltk to pull a random selection of questions and save the results in a file called exercises.dbtex. This can then be loaded in the document using:

\DTLloaddbtex{\exercisedb}{exercises.dbtex}
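
If the shell escape route is used, the call could look roughly like the sketch below. Several things here are assumptions: the database name exercisebank, the user teacher and the table name exercises are invented for illustration, and the --output, --sqldb and --sqluser switches are written as I recall them from the datatooltk manual, so check them against datatooltk --help. How the password is supplied is left out.

% Requires shell escape, e.g. pdflatex --shell-escape.
% Hypothetical names: exercisebank (database), teacher (user), exercises (table).
\immediate\write18{datatooltk --output exercises.dbtex
  --sqldb exercisebank --sqluser teacher
  --sql "SELECT * FROM exercises WHERE Topic = 'calculus'
         ORDER BY RAND() LIMIT 20"}

% Load the file that the call above has just written.
\DTLloaddbtex{\exercisedb}{exercises.dbtex}

Here the filtering, shuffling and row limit are all done by the query, so TeX only ever sees the 20 selected questions.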

If the data includes the Packages field, you can make sure all the required packages are loaded by adding the following to the preamble:

\DTLforeach*{\exercisedb}{\Packages=Packages}
{\DTLifnullorempty{\Packages}{}{\usepackage{\Packages}}}

In the main part of the document:

\begin{enumerate}
\DTLforeach*{\exercisedb}% data base
{\Question=Question,\Answer=Answer,\ExtendedAnswer=ExtendedAnswer}% assignment list
{%
  \item \Question
  \ifsolutions
   \solution
   \ifcomplete
    \ExtendedAnswer
   \else
     \Answer
   \fi
  \fi
}
\end{enumerate}
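
If the extra-fields approach is used for the second language, the loop can pick the right column with one more conditional. The sketch below is self-contained so that it can be tested without a .dbtex file. The \ifportuguese switch and the sample Portuguese wording are my own additions, and the field name QuestionPortuges follows the spelling used above.

\documentclass{article}
\usepackage{datatool}

\newif\ifportuguese
\portuguesetrue

% A one-row stand-in for the data that would normally come from exercises.dbtex.
\DTLnewdb{exercises}
\DTLnewrow{exercises}
\DTLnewdbentry{exercises}{Question}{Differentiate $y = \sin(2x)$.}
\DTLnewdbentry{exercises}{QuestionPortuges}{Derive $y = \sin(2x)$.}

\begin{document}
\begin{enumerate}
\DTLforeach*{exercises}%
{\Question=Question,\QuestionPT=QuestionPortuges}%
{%
  \item
  \ifportuguese
    % Fall back to the English text if no translation is stored.
    \DTLifnullorempty{\QuestionPT}{\Question}{\QuestionPT}%
  \else
    \Question
  \fi
}
\end{enumerate}
\end{document}

The answer fields can be handled the same way, nested inside the \ifsolutions and \ifcomplete tests.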

Further reading: Using the datatool Package for Exams or Assignment Sheets