[Math] How to distribute the source of programs used in a paper

mathematical-writing

I have written a paper, which includes an appendix discussing how to obtain numerical evidence for the result of the paper. Now the computation essentially works as follows:

  • Create a large tridiagonal matrix.
  • Compute its eigenvalues.
  • Compute the difference of consecutive eigenvalues, and output it.
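The three steps above can be sketched as follows. This is a minimal Python sketch rather than the code used for the paper. It assumes a real symmetric tridiagonal matrix, since the question does not say which matrix is used, and it swaps in a textbook Sturm-sequence bisection for the LAPACK or Matlab eigenvalue routines. The function names and the small test matrix are illustrative only.

```python
def count_below(diag, off, x):
    """Count eigenvalues strictly below x using the Sturm sequence."""
    count = 0
    q = 1.0
    for i, d in enumerate(diag):
        coupling = off[i - 1] ** 2 if i > 0 else 0.0
        q = d - x - coupling / q
        if q == 0.0:
            q = 1e-30  # nudge off an exact zero so the next division is defined
        if q < 0.0:
            count += 1
    return count


def eigenvalues(diag, off, tol=1e-12):
    """Eigenvalues in ascending order, found by bisection on count_below.

    Assumption: the matrix is real symmetric tridiagonal, with main
    diagonal `diag` (length n) and off-diagonal `off` (length n - 1).
    """
    n = len(diag)
    # Gershgorin discs give an interval containing the whole spectrum.
    radius = [
        (abs(off[i - 1]) if i > 0 else 0.0) + (abs(off[i]) if i < n - 1 else 0.0)
        for i in range(n)
    ]
    lower = min(d - r for d, r in zip(diag, radius))
    upper = max(d + r for d, r in zip(diag, radius))
    result = []
    for k in range(n):
        lo, hi = lower, upper
        while hi - lo > tol * max(1.0, abs(lo), abs(hi)):
            mid = 0.5 * (lo + hi)
            if mid == lo or mid == hi:
                break  # interval cannot shrink further in floating point
            if count_below(diag, off, mid) > k:
                hi = mid
            else:
                lo = mid
        result.append(0.5 * (lo + hi))
    return result


def consecutive_gaps(values):
    """Differences between consecutive entries of a sorted list."""
    return [b - a for a, b in zip(values, values[1:])]


if __name__ == "__main__":
    n = 5  # illustrative size; the real computation uses far larger matrices
    diag = [2.0] * n
    off = [-1.0] * (n - 1)
    for gap in consecutive_gaps(eigenvalues(diag, off)):
        print(gap)
```

Bisection costs roughly n² operations per full spectrum here, so this only documents the idea. For large matrices a tuned library routine remains the practical choice.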

The implementation of such an algorithm is rather straightforward, but in order to look at large matrices, I started using algorithms from a package called LAPACK, which turned out to be faster than the regular algorithms provided by Matlab. (I'm no specialist, so I'm not exactly sure what happens.)

I am curious whether one should provide the source code for such a computation, and if so, in what form. I came up with the following options:

  • Pseudocode (as above)
  • Simplified Matlab code that works with any installation of Matlab, but is too slow to actually do the computations.
  • The real code, which most people will not be able to get to run without some effort.

I am also curious whether one should include some sort of source code in the paper itself, and if so, in what form. What have people done in such a case?

The simplified code is available at:
http://math.rice.edu/~hk7/ftp/matlab_code/SkewSpecDense.m

I have not put the real code online, because it requires external packages, and I am not sure how easy it is to install them…

Best Answer

My preference is detailed pseudocode, at a high-enough level of abstraction to allow understanding the algorithm.

Of course, as pointed out in Ryan Budney's comment, it depends strongly on which journal you publish in and what its requirements are. However, I feel strongly that the complete code you used should be available from some resource: through the journal article's publisher, your own website, your academic website, or arXiv.

If the pseudocode is detailed enough to allow another mathematician to reimplement the algorithm straightforwardly, then that should be sufficient.

If the pseudocode has to leave out certain details which are germane to the computation, then the interpreted code which implements the algorithm in a numerical computational package should be provided. Such packages include Maple, Matlab, and Sage, as well as Octave and Scilab, which are free open-source packages capable of running code similar or equivalent to Matlab code.

Why not provide both? If you can provide a link to your own webpage for the paper, or for its supplemental materials, I don't see why you couldn't provide both the interpreted code and the compilable C or C++ code there, unless there are copyright issues, for example if you did not write all of the code yourself and do not have the right to release all of the source. I am a supporter of free open-source software and the GNU project's GPL licensing, which would allow others to benefit from your code and to contribute back to it via incremental improvements.

I suggest that you specify which version of each software package, operating system, compiler, and library you used in running your program or in building the binary from your code. This is necessary because different versions of Octave (2.3 vs. 3.0), Matlab (R10, R13, etc.), or any other software package may implement or include different routines, and may not run your program correctly.
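As one way to record this information automatically, here is a minimal sketch in Python using only the standard library. It is an illustration of the idea, not a replacement for noting the Matlab or Octave release (which the `version` command reports in both). The function names and the choice of fields are assumptions made for this example.

```python
import platform


def describe_environment():
    """Collect interpreter and operating-system details worth recording."""
    return {
        "python": platform.python_version(),
        "implementation": platform.python_implementation(),
        "os": platform.system(),
        "os_release": platform.release(),
        "machine": platform.machine(),
    }


def format_environment(info):
    """Render the details as 'key: value' lines, sorted by key."""
    return "\n".join(f"{key}: {value}" for key, value in sorted(info.items()))


if __name__ == "__main__":
    # Save this output next to the results it belongs to.
    print(format_environment(describe_environment()))
```

Storing the output alongside the numerical results lets a reader see which environment produced them.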

I would recommend that, if particular packages are necessary to run the interpreted code in Octave or Matlab, you list them. In the same vein, if your C or C++ code requires particular libraries such as LAPACK or BLAS, make sure to list them in a text file or in a header file. If you know how to use the make program, you can create a makefile to help others compile your software.
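A listed dependency can also be checked by a small script shipped with the code. The sketch below, in Python, uses the standard `ctypes.util.find_library` to look for shared libraries by name. The names "lapack" and "blas" are taken from the examples above, and the helper names are hypothetical. Note that `find_library` only searches the system's usual locations, so a missing result means "not found there", not "not installed".

```python
from ctypes.util import find_library

# Assumption: the code needs these shared libraries; adjust to your own list.
REQUIRED_LIBRARIES = ["lapack", "blas"]


def locate_libraries(names):
    """Map each library name to what find_library reports, or None."""
    return {name: find_library(name) for name in names}


def missing_libraries(names):
    """Names for which no shared library was found."""
    return [name for name, path in locate_libraries(names).items() if path is None]


if __name__ == "__main__":
    for name, path in locate_libraries(REQUIRED_LIBRARIES).items():
        print(f"{name}: {path if path is not None else 'not found'}")
```

Running it before the main computation gives a reader an early, readable message instead of a link error.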

The make program, the GNU Compiler Collection, and many other development tools are all standard parts of GNU/Linux distributions, such as Debian.
