[Tex/LaTex] Specify -job-name option to pdflatex with special characters

input-encodingsjobnamepdftex

I'm working on a python script that will compile a series of latex documents according to a work list. Basically, the script reads the file name and title of a document, adds a couple of things that should always be included and sends a pdflatex call to command line. The command has the following form:

>pdflatex -job-name="The title" "\somestuff\input{thefile}"

(I'm using a couple of options for output directories as well, but they have been ommitted here for brevity…)

Now, I have a couple of documents in the list that need to have special characters in the title. When that happens, the file names are corrupted. If I loop through the file names with python, reading them as utf8 shows the same thing as in my Windows Explorer window, but reading them as latin1 gives the correct names.

Currently ÅÄÖåäö are the only characters in question, but I'd like support for the entire utf8 charset. Is there any way I can make pdflatex encode these file names correctly?

I am moving the entire system to a server system with another OS when I'm done with it, so it's possible I'll need to encode these things in some other encoding then. Therefore, a solution that is encoding generic gets bonus points =)

Best Answer

What you want is currently not possible. Microsoft's implementation of the C standard library (which is what pdfTeX ultimately uses) never uses UTF-8 for the standard C functions. Do as Aditya suggests and use only ASCII characters for all file names when calling TeX.

Reading the file names in Python works because the Python runtime circumvents the C standard library on Windows. Working with Unicode file names on Windows always requires either a compatibility layer such as Cygwin or Windows-specific code, and I'm not aware that any of those is implemented by any engine (but it could be a valuable suggestion for LuaTeX).