[TeX/LaTeX] Calculate the hash (MD5 or otherwise) of a string

macros programming

I'm trying to cache the results of a macro, similar to that question, but the macro's argument can contain arbitrary characters, so it is not suitable for use inside \csname ... \endcsname.

So, I'm wondering: surely the TeX core or some package contains functionality to calculate a hash, MD5 or otherwise (I don't really care which). But… I can't find it. The only result I got from grepping my TeX Live tree is \pdfmdfivesum, but it only works on files, not strings.

So: are there ready-made hash calculation macros/packages available somewhere?

Best Answer

\pdfmdfivesum also works on arbitrary strings:

\pdfmdfivesum{Hello World}

Result:

B10A8DB164E0754105B7A99BE72E3FE5

The hex string can be decoded to save space:

\pdfunescapehex{\pdfmdfivesum{Hello World}}

\pdfmdfivesum is expandable and can be used inside \edef.
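Because the digest consists only of hex digits, it can serve as a \csname-safe key for the caching problem in the question. The following is a minimal sketch, assuming pdfTeX (pdflatex); the names \cached, \expensive and \cache@key are hypothetical and only illustrate the idea:

```latex
\documentclass{article}
\makeatletter
% \expensive is a hypothetical stand-in for the macro whose result should be cached.
\newcommand\expensive[1]{\texttt{\detokenize{#1}}}
\newcommand\cached[1]{%
  % Hash the detokenized argument; the hex digits are safe inside \csname ... \endcsname.
  \edef\cache@key{cache@\pdfmdfivesum{\detokenize{#1}}}%
  \ifcsname\cache@key\endcsname
  \else
    % First use with this argument: expand \expensive once and store the result.
    \expandafter\protected@xdef\csname\cache@key\endcsname{\expensive{#1}}%
  \fi
  \csname\cache@key\endcsname
}
\makeatother
\begin{document}
\cached{$a_b^c$ & \undefined ~}  % computed and stored
\cached{$a_b^c$ & \undefined ~}  % reused from the cache
\end{document}
```

\detokenize keeps the argument from being expanded while it is hashed, so undefined or fragile macros in the argument do no harm. Arguments containing # would need extra care in this sketch.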

\pdfmdfivesum operates on a file only when the keyword file is given:

\pdfmdfivesum file {<filename>}
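For instance, the checksum of the main source file could be stored and written to the log. This is only a sketch, assuming pdfTeX and a main file named \jobname.tex without spaces in its name; \mainfilehash is a hypothetical macro name:

```latex
% Store the MD5 sum of the main file (expanded once, at definition time).
\edef\mainfilehash{\pdfmdfivesum file {\jobname.tex}}
% Report it in the log and on the terminal.
\message{MD5 of \jobname.tex: \mainfilehash}
```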

Package pdftexcmds

\pdfmdfivesum is available in pdfTeX in both DVI and PDF mode. Package pdftexcmds defines the pdfTeX primitives that are missing in LuaTeX. The package also works in plain TeX (\input pdftexcmds.sty). The command names use the prefix \pdf@ instead of \pdf:

\pdf@mdfivesum{Hello World}
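A small complete document using the package might look as follows. It is a sketch that assumes an engine for which pdftexcmds provides \pdf@mdfivesum; \hellohash is a hypothetical macro name:

```latex
\documentclass{article}
\usepackage{pdftexcmds}
\makeatletter
% \pdf@mdfivesum is used inside \edef, so the digest is computed only once.
\edef\hellohash{\pdf@mdfivesum{Hello World}}
\makeatother
\begin{document}
\texttt{\hellohash}
\end{document}
```

The \makeatletter ... \makeatother pair is needed in a document preamble because the command name contains @.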

XeTeX

  • Older versions, e.g., XeTeX (3.14159265-2.6-0.99992), do not support MD5 sums.
  • \pdfmdfivesum was added from pdfTeX around version 0.99993 and later renamed to \mdfivesum. Thus, the current version (3.14159265-2.6-0.99996) calculates MD5 sums via \mdfivesum. The keyword file is also supported, as in pdfTeX.
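Without pdftexcmds, the two primitive names could be unified by hand. This is a rough sketch, assuming e-TeX's \ifdefined is available; \MDfive is a hypothetical name and stays undefined on engines that provide neither primitive (for example LuaTeX):

```latex
% Pick whichever MD5 primitive the running engine provides.
\ifdefined\pdfmdfivesum
  \let\MDfive\pdfmdfivesum   % pdfTeX
\else\ifdefined\mdfivesum
  \let\MDfive\mdfivesum      % newer XeTeX
\fi\fi
```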