[Tex/LaTex] How to publish a package that includes scripts and/or executables

ctan, package-writing, python, texlive

I'm preparing to publish my first package, and am not sure about the best way to proceed. I understand the standard CTAN publishing approach, but I'm dealing with a somewhat atypical case since my package involves Python scripts and (for Windows) batch files. I'm not sure what special treatment these may require.

Background

My package allows Python code to be typeset within a document and/or saved to an external file from which it is executed, after which the output can be pulled back into LaTeX. It is basically a combination of most features of python.sty, SageTeX, and minted, with a lot of emphasis on speed (everything is hashed, only run when changed, and run in parallel) and usability (all Python errors can be reported in the LaTeX editor, with correct line numbers).

Getting this to work involves running latex file.tex, pythontex.bat file (or pythontex.py file, depending on operating system; the .bat basically just calls the .py), and then latex file.tex. The problem is that the package needs to be installed such that pythontex.bat (or pythontex.py) can be run without specifying the full path (that is, be somewhere that is in the system path). They need to behave as if they were binaries in TeX Live's bin/ folder. This is necessary so that they can easily be invoked by a LaTeX editor or from the command line.

There is one additional complication. The package requires two additional Python files, pythontex_utils.py and pythontex_types.py. These need to be able to be imported by any Python script. That means that either they need to be installed as a Python package (so Python can automatically find them), or I need a way to automatically determine the full path to their location (so they can be imported using the path).

Possible approaches

  • SageTeX bundles all the Sage/Python stuff within Sage. It is up to the user to make the .sty file accessible. I could take a similar approach: put my code in a Python package, and hopefully come up with some kind of automatic installer for the .sty file. That would make it easier to deal with the Python side of things (version 2.x versus 3.x, etc.), but would also mean that the package isn't on CTAN.
  • I could put everything on CTAN except for the Python stuff that must be importable, and publish that separately as a Python package. The main disadvantage would be that users would have to install two things.
  • I could put everything on CTAN, if there are satisfactory ways of dealing with all of my questions. That approach is preferable.

Summary

  • I would like a way to install a .py or .bat file as part of a package installation, so that it can be launched just as if it were an executable in TeX Live's bin/ folder (since that folder is added to the system path).
  • This may be a restatement of the first point, but any elucidation of how TeX Live handles scripts would be very useful. For example, sty2dtx is a Perl script, but at least in the Windows installation of TeX Live, there is also a sty2dtx.exe under bin/, which launches the Perl script. It would be nice to know if these binaries are coming from package authors or TeX Live. I suspect TeX Live (at least for sty2dtx, there is no .exe on CTAN).
  • Unless I also publish a separate Python package, I need a way to get the full path to where a LaTeX package is installed, based just on knowing the package name, so my programs can find the Python files that must be imported.

Best Answer

Based on the functionality of your package, it seems to me that it belongs more to the TeX realm than to the Python one, so I agree with you that CTAN is the best place to put it. Your question has several different aspects; I will try to address them below as thoroughly and generally as I can.

Scripts vs. binaries

Firstly, the good news for you is that scripts are much easier to integrate into TeX distributions than compiled programs. To get the latter into TeX Live or MiKTeX it is best to get in touch with the distribution maintainer(s) through the appropriate mailing list.

Submission to CTAN

You need to decide whether to submit your package to CTAN as a flat .zip archive or as a .tds.zip archive in TDS format (see also the TDS submission guidelines). TDS avoids ambiguity in file layout, but be sure to adhere to the specification: a flat .zip with no subdirectories is preferred over a messed-up .tds.zip. Test your .tds.zip before submitting, e.g., install it into TEXMFHOME and see if everything works (see below on how to test the executable scripts). Here's the layout I would suggest for your package pythontex:

doc/
   +-- latex/ 
            +-- pythontex/
                         +-- pythontex.pdf
                         +-- README
scripts/
       +-- pythontex/
                    +-- pythontex.bat
                    +-- pythontex.py
                    +-- pythontex_types.py
                    +-- pythontex_utils.py
source/
      +-- latex/ 
               +-- pythontex/
                            +-- pythontex.dtx
tex/
   +-- latex/ 
            +-- pythontex/
                         +-- pythontex.sty

You should include all package files in the archive, not just the source files (e.g., not only the .dtx, but also the .sty file derived from it). You should also include a short and clear README file specifying the purpose of the package, its license (it needs to be free for the package to be included in TL), its contents (files and their purpose), and any other requirements for installing and using the package (e.g., external dependencies like Python).
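As an illustration of the layout suggested above, here is a minimal sketch of how such an archive could be assembled with Python 3's standard zipfile module. It assumes that all package files sit in one flat directory; the names TDS_LAYOUT and build_tds_zip are hypothetical helpers made up for this example, not part of any CTAN or TeX Live tooling.

```python
import os
import zipfile

# Target directory inside the TDS tree for each file (mirrors the layout above).
TDS_LAYOUT = {
    "pythontex.pdf": "doc/latex/pythontex",
    "README": "doc/latex/pythontex",
    "pythontex.bat": "scripts/pythontex",
    "pythontex.py": "scripts/pythontex",
    "pythontex_types.py": "scripts/pythontex",
    "pythontex_utils.py": "scripts/pythontex",
    "pythontex.dtx": "source/latex/pythontex",
    "pythontex.sty": "tex/latex/pythontex",
}


def build_tds_zip(source_dir, archive_path, layout=TDS_LAYOUT):
    """Pack files from a flat source_dir into archive_path using TDS paths.

    Hypothetical helper: refuses to build an incomplete archive.
    """
    missing = [name for name in layout
               if not os.path.isfile(os.path.join(source_dir, name))]
    if missing:
        raise FileNotFoundError("missing files: " + ", ".join(sorted(missing)))
    with zipfile.ZipFile(archive_path, "w", zipfile.ZIP_DEFLATED) as archive:
        for name, tds_dir in sorted(layout.items()):
            # Zip member names always use forward slashes.
            archive.write(os.path.join(source_dir, name),
                          arcname=tds_dir + "/" + name)
    return archive_path
```

Calling build_tds_zip(".", "pythontex.tds.zip") from the directory that holds the files would produce the archive. Unpacking it into the directory reported by kpsewhich -var-value TEXMFHOME is one way to try it out before submitting.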

Executable scripts

Directories with executables (i.e., those added to PATH) are not covered by the TDS specification, but as a package author you don't need to worry about that. Just put your scripts under scripts/<package name> and make clear in the package README which script is the main program to be executed. TeX distributions will then add a symlink (TL on Unix) or a launching wrapper (win32, both TL and MiKTeX) in the bin directory.

Whether to include a wrapper for launching the script is up to you. In principle, this is not needed nowadays for TeX Live and MiKTeX, since both have their own specialized wrappers for this purpose. However, some users may need to install your package directly from CTAN (e.g., to use it with an older TL version), so adding at least a .bat wrapper for Windows (see the example below) may be nice. For Unix, just start your main script with #!/usr/bin/env python (for portability across systems, /usr/bin/env is recommended over hardcoding the interpreter's absolute path).
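To make the Unix side concrete, here is a minimal sketch of how the top of a main script such as pythontex.py might look. It assumes the helper modules are installed in the same scripts directory as the main script; the function names script_directory and enable_sibling_imports are hypothetical and only serve this illustration. Because TL on Unix launches the script through a symlink in the bin directory, the sketch resolves symlinks before computing the directory.

```python
#!/usr/bin/env python
"""Hypothetical top of a main script such as pythontex.py."""
import os
import sys


def script_directory(script_path):
    """Return the directory that really contains script_path.

    os.path.realpath follows symlinks, so a link in a bin directory
    resolves to the script's actual location under scripts/.
    """
    return os.path.dirname(os.path.realpath(script_path))


def enable_sibling_imports(script_path):
    """Put the script's real directory on sys.path and return it.

    Assumption: helper modules live next to the main script.
    """
    directory = script_directory(script_path)
    if directory not in sys.path:
        sys.path.insert(0, directory)
    return directory
```

A main script would call enable_sibling_imports(__file__) before importing its helper modules. This is only one possible approach, not something TeX Live prescribes.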

For Windows, I can suggest the following wrapper (if saved as pythontex.bat, it will execute the pythontex.py script):

@echo off
setlocal enableextensions
rem assuming the main script is in the same directory
if not exist "%~dpn0.py" (
  echo %~nx0: main script "%~dpn0.py" not found>&2
  exit /b 1
)
rem check if interpreter is on the PATH
for %%I in (python.exe) do set "PYTHONEXE=%%~$PATH:I"
if not defined PYTHONEXE (
  echo %~nx0: Python interpreter not installed or not on the PATH>&2
  exit /b 1
)
"%PYTHONEXE%" "%~dpn0.py" %*

As I mentioned, TeX Live and MiKTeX use their own methods of launching scripts, though I'm only familiar with TL's side of things. TeX Live uses the runscript.tlu utility for this, and users can also use it for their own custom or manually installed scripts. Package authors can use it for testing as well, e.g., to check whether a .tds.zip works correctly. For details, see the output of runscript -h (add the -v switch to learn all the gory details of the actual implementation). Here's an excerpt from it:

The following script types and their file extensions are currently
supported and searched in that order:

  Lua      (.tlu;.texlua;.lua) --  included
  Perl     (.pl)               --  included
  Ruby     (.rb)               --  requires installation
  Python   (.py)               --  requires installation
  Tcl      (.tcl)              --  requires installation
  Java     (.jar)              --  requires installation
  VBScript (.vbs)              --  part of Windows
  JScript  (.js)               --  part of Windows
  Batch    (.bat;.cmd)         --  part of Windows

Finally, Unix-style extensionless scripts are searched as last and
the interpreter program is established based on the she-bang (#!)
specification on the very first line of the script.  This can be
an arbitrary program but it must be present on the search path.

It is recommended to write new utilities in Lua if at all possible, since a Lua interpreter is now available out of the box on all platforms thanks to LuaTeX. A close second is Perl, which is shipped with TL on win32. Anything else has to be installed separately on Windows.

Finding script/package resources

Complex scripts might be spread over multiple files, and there is no silver-bullet solution for locating such files. The standard way of finding files in TeX Live (which now works in MiKTeX too) is to use Kpathsea and its kpsewhich utility, e.g., kpsewhich -format texmfscripts pythontex_utils.py will output the full path to pythontex_utils.py if it finds the file in the scripts subdirectory of one of the TEXMF trees. In LuaTeX, the Kpathsea library is built in and can be accessed directly. There might be other, perhaps better ways that are specific to Python, Perl, etc., but those should be asked about elsewhere.
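For a Python script, the kpsewhich lookup described above could be wrapped roughly as follows. This is a minimal sketch assuming Python 3.5 or later and a kpsewhich executable on the PATH; the function names find_texmf_script and add_script_dir_to_sys_path are hypothetical and chosen only for this example.

```python
import os
import shutil
import subprocess
import sys


def find_texmf_script(filename, kpsewhich="kpsewhich"):
    """Return the full path of filename in a TEXMF scripts tree, or None."""
    if shutil.which(kpsewhich) is None:
        return None  # Kpathsea tools are not installed or not on the PATH
    result = subprocess.run(
        [kpsewhich, "-format", "texmfscripts", filename],
        stdout=subprocess.PIPE,
        stderr=subprocess.DEVNULL,
        universal_newlines=True,
    )
    path = result.stdout.strip()
    # kpsewhich prints nothing and exits non-zero when the file is not found
    return path if result.returncode == 0 and path else None


def add_script_dir_to_sys_path(filename):
    """Make the directory holding filename importable; return True on success."""
    path = find_texmf_script(filename)
    if path is None:
        return False
    directory = os.path.dirname(path)
    if directory not in sys.path:
        sys.path.insert(0, directory)
    return True
```

With this in place, a generated script could call add_script_dir_to_sys_path("pythontex_utils.py") and, if that returns True, import pythontex_utils as usual. Spawning kpsewhich costs a process launch per lookup, so caching the result may be worthwhile when speed matters.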
