The function unicode.utf8.char lets you insert a Unicode character directly from Lua code:
% !TEX TS-program = lualatex
\documentclass{article}
\usepackage{fontspec}
\setmainfont[Ligatures=NoCommon]{Latin Modern Roman}
\usepackage{luacode,luatexbase}
\begin{luacode}
local uchar = unicode.utf8.char
function dosub ( s )
s = string.gsub ( s , 'ff', uchar(64256))
return ( s )
end
\end{luacode}
\AtBeginDocument{%
\luaexec{luatexbase.add_to_callback ( "process_input_buffer", dosub, "dosub" )}%
}
\begin{document}
off \directlua{ tex.sprint ( dosub ( \luastring{off} ) ) } off
\end{document}

But the main issue in your code is that the callback is installed too early, so it probably replaces ff in some of the macros that are executed at \AtBeginDocument. Another solution is therefore to install the callback in \AtBeginDocument as well, which reduces the risk of such a collision (you should do that even with the first method):
% !TEX TS-program = lualatex
\documentclass{article}
\usepackage{fontspec}
\setmainfont[Ligatures=NoCommon]{Latin Modern Roman}
\usepackage{luacode,luatexbase}
\begin{luacode}
function dosub ( s )
s = string.gsub ( s , 'ff', '\\char64256{}')
return ( s )
end
\end{luacode}
\AtBeginDocument{%
\luaexec{luatexbase.add_to_callback ( "process_input_buffer", dosub, "dosub" )}%
}
\begin{document}
off \directlua{ tex.sprint ( dosub ( \luastring{off} ) ) } off
\end{document}
Edit:
There is another catch: what if your document body includes a macro with ff in its name? To fix that, we can use a function like this:
\begin{luacode}
local uchar = unicode.utf8.char
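function dosub ( s )
  -- match each run of letters (or @), with an optional leading backslash
  local x = s:gsub('(\\?)([%a%@]+)', function(back,text)
    if back~="" then
      -- a control sequence: return it unchanged
      return back .. text
    end
    -- the parentheses keep only the string, dropping the match count
    return (text:gsub ( 'ff', uchar(64256)))
  end)
  return x
end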
luatexbase.add_to_callback ( "process_input_buffer", dosub, "dosub" )
\end{luacode}
With s:gsub('(\\?)([%a%@]+)', function(back,text) we match every run of letters (and @), together with an optional leading backslash, so macro names are matched too. If the variable back is not an empty string, the current word is a macro name and we return it unprocessed. Otherwise, we can apply the ff replacement to it.
Note that in this case add_to_callback is used without \AtBeginDocument. If the callback were only installed at \AtBeginDocument, the text of a macro such as \offer defined in the preamble would not be replaced. Because we now skip macro names, installing the callback early should no longer cause problems.
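The macro-skipping substitution can be tried outside LuaTeX. The following is only a minimal sketch in plain Lua: the name dosub_sketch is hypothetical, and the ligature is written as raw UTF-8 bytes because unicode.utf8.char is not available in a stand-alone interpreter.

```lua
-- Plain-Lua sketch; it runs outside LuaTeX, so the ligature is given as raw bytes.
local FF = "\239\172\128"  -- UTF-8 encoding of U+FB00 (decimal 64256)

-- Hypothetical stand-alone copy of the filter described above.
function dosub_sketch(line)
  -- The outer parentheses drop gsub's second return value (the match count).
  return (line:gsub("(\\?)([%a%@]+)", function(back, text)
    if back ~= "" then
      return back .. text          -- control sequence: leave it untouched
    end
    return (text:gsub("ff", FF))   -- ordinary word: substitute the ligature
  end))
end

print(dosub_sketch("off \\offer off"))
```

In the printed line, both stand-alone words off carry the ligature, while \offer keeps its two separate f characters.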
As a closing remark, I would add that node-processing callbacks are much better suited to this kind of hack, precisely because of these problems with macros.
For instance, the following code:
local fchar = string.byte("f")
local glyph_id = node.id("glyph")
local glue_id = node.id("glue")

-- Look ahead from n for the next f glyph, collecting the nodes passed on the way.
-- Returns true and the collected nodes (ending with the f glyph) on success.
local function next_status(n, node_table)
  local node_table = node_table or {}
  if not n then return false end
  table.insert(node_table, n)
  if n.id == glyph_id and n.char == fchar then
    return true, node_table
  elseif n.id == glyph_id or n.id == glue_id then
    -- any other glyph, or a space, ends the search
    return false
  else
    return next_status(n.next, node_table)
  end
end

local function node_dosub(nodes)
  for n in node.traverse(nodes) do
    if n.id == glyph_id and n.char == fchar then
      local found, node_table = next_status(n.next)
      if found then
        n.char = 64256 -- U+FB00, the ff ligature
        -- remove the second f and everything skipped before it
        for _, x in ipairs(node_table) do
          node.remove(nodes, x)
        end
      end
    end
  end
  return nodes
end
luatexbase.add_to_callback ( "pre_linebreak_filter", node_dosub, "node_dosub" )
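It is more complicated, because we can no longer operate at the string level, only on individual nodes. Many node types exist; glyph nodes are the important ones for us. Their numeric id was 37 in the LuaTeX version used here, but it differs between versions, which is why the code looks it up with node.id("glyph"). Every glyph node has a char field holding the character code. When a glyph with the f character is found, we peek at the following nodes to see whether another f glyph comes next. If one is found, we replace the current character with the code for the ff ligature and delete the following f glyph, together with any nodes skipped on the way to it.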
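The peek-and-remove step can be illustrated without LuaTeX. The following is only a toy model, not the real node library: plain Lua tables stand in for nodes, the helper names from_string, merge_ff and codes are hypothetical, and only directly adjacent f glyphs are merged (the callback above also skips over some intermediate nodes).

```lua
-- Toy model only: plain Lua tables stand in for LuaTeX nodes.
-- The fields id, char and next mirror the real ones; the helpers are hypothetical.
local F, FF = string.byte("f"), 64256  -- U+0066 and U+FB00

-- Build a singly linked list of "glyph" nodes from an ASCII string.
function from_string(s)
  local head, tail
  for i = 1, #s do
    local n = { id = "glyph", char = s:byte(i) }
    if tail then tail.next = n else head = n end
    tail = n
  end
  return head
end

-- Replace each pair of directly adjacent f glyphs by one ff-ligature glyph.
function merge_ff(head)
  local n = head
  while n do
    local nxt = n.next
    if n.id == "glyph" and n.char == F
       and nxt and nxt.id == "glyph" and nxt.char == F then
      n.char = FF        -- reuse the first f node for the ligature
      n.next = nxt.next  -- unlink the second f node
    end
    n = n.next
  end
  return head
end

-- Collect the character codes of a list, for inspection.
function codes(head)
  local t = {}
  while head do
    t[#t + 1] = head.char
    head = head.next
  end
  return t
end

local result = codes(merge_ff(from_string("off")))
print(table.concat(result, " "))
```

For the input off, the list shrinks from three nodes to two: the o glyph followed by a single glyph with code 64256.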
Best Answer
Until xypic is updated, you can use the luatex85 compatibility package:
with log: