[TeX/LaTeX] How to generate compounded diacritical fonts for Sanskrit with XeTeX and LuaTeX

fonts fontspec luatex xetex

I'm trying to typeset Sanskrit diacritics with a font that does not natively support all of them (Minion Pro), so XeTeX has to compose the missing characters from a base letter and an accent. This almost works with the following:

\documentclass{scrartcl}
\usepackage{xunicode,xltxtra}
\usepackage{fontspec,newunicodechar}
\defaultfontfeatures{Ligatures=TeX}
\setmainfont{Minion Pro}

\newunicodechar{Ṛ}{\d{R}}
\newunicodechar{ṛ}{\d{r}}
\newunicodechar{Ṝ}{\={\d{R}}}
\newunicodechar{ṝ}{\={\d{r}}}
\newunicodechar{Ḷ}{\d{L}}
\newunicodechar{ḷ}{\d{l}}
\newunicodechar{Ḹ}{\={\d{L}}}
\newunicodechar{ḹ}{\={\d{l}}}
\newunicodechar{ṃ}{\d{m}}
\newunicodechar{ḥ}{\d{h}}
\newunicodechar{Ṭ}{\d{T}}
\newunicodechar{ṭ}{\d{t}}
\newunicodechar{Ḍ}{\d{D}}
\newunicodechar{ḍ}{\d{d}}
\newunicodechar{Ṅ}{\.{N}}
\newunicodechar{ṅ}{\.{n}}
\newunicodechar{Ṇ}{\d{N}}
\newunicodechar{ṇ}{\d{n}}
\newunicodechar{Ṣ}{\d{S}}
\newunicodechar{ṣ}{\d{s}}

\begin{document}
a  A

ā  Ā

i  I

ī  Ī

u  U

ū  Ū

ṛ  Ṛ

ṝ  Ṝ

ḷ  Ḷ

ḹ  Ḹ

e  E

ai  Ai

o  O

au  Au

ṃ ḥ

k  K

c  C

ṭ  Ṭ

t  T

p  P

kh  Kh

ch  Ch

ṭh  Ṭh

th  Th

ph  Ph

g  G

j  J

ḍ  Ḍ

d  D

b  B

gh  Gh

jh  Jh

ḍh  Ḍh

dh  Dh

bh  Bh

ṅ  Ṅ

ñ  Ñ

ṇ  Ṇ

n  N

m  M

y  Y

r  R

l  L

v  V

ś  Ś

ṣ  Ṣ

s  S

h  H

\end{document}

But the macrons above Ḹ, ḹ, Ṝ, and ṝ are misplaced and thinner than the ones above ā, ī, and so on. What can I do about that?

Second problem: this fails completely in LuaTeX. xltxtra seems to be essential for it to work, but that package is not supported in LuaTeX. I tried the code from https://tex.stackexchange.com/a/20791/19458 instead.

But with that code the dots below are bigger than the i-dot, which is not the case with the XeTeX solution above. Is there a better way to do this in LuaTeX? Is it possible at all?

Best Answer

Here is something that seems to work:

\documentclass{scrartcl}
\usepackage{fontspec,newunicodechar}
\defaultfontfeatures{Ligatures=TeX}
\setmainfont{Minion Pro}

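% Remove the composites declared by xunicode so that the accent
% commands redefined below are actually used for these characters.
\UndeclareUTFcomposite[\UTFencname]{x1E0C}{\d}{D}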
\UndeclareUTFcomposite[\UTFencname]{x1E0D}{\d}{d}
\UndeclareUTFcomposite[\UTFencname]{x1E25}{\d}{h}
\UndeclareUTFcomposite[\UTFencname]{x1E36}{\d}{L}
\UndeclareUTFcomposite[\UTFencname]{x1E37}{\d}{l}
\UndeclareUTFcomposite[\UTFencname]{x1E43}{\d}{m}
\UndeclareUTFcomposite[\UTFencname]{x1E46}{\d}{N}
\UndeclareUTFcomposite[\UTFencname]{x1E47}{\d}{n}
\UndeclareUTFcomposite[\UTFencname]{x1E5A}{\d}{R}
\UndeclareUTFcomposite[\UTFencname]{x1E5B}{\d}{r}
\UndeclareUTFcomposite[\UTFencname]{x1E62}{\d}{S}
\UndeclareUTFcomposite[\UTFencname]{x1E63}{\d}{s}
\UndeclareUTFcomposite[\UTFencname]{x1E6C}{\d}{T}
\UndeclareUTFcomposite[\UTFencname]{x1E6D}{\d}{t}

\UndeclareUTFcomposite[\UTFencname]{x1E44}{\.}{N}
\UndeclareUTFcomposite[\UTFencname]{x1E45}{\.}{n}

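\makeatletter
% Redefine \d so that the dot below is the period of the current font.
\let\d\relax
\DeclareRobustCommand{\d}[1]
   {\hmode@bgroup
    \o@lign{\relax#1\crcr\hidewidth\ltx@sh@ft{-1ex}.\hidewidth}\egroup}
% Dot above (U+02D9) and macron (U+00AF), both taken from the current font.
% The space after each hex number ends it, so an argument A-F is not
% read as a further hex digit.
\let\.\relax
\DeclareRobustCommand{\.}[1]{\accent"02D9 #1}
\DeclareRobustCommand{\MACRON}[1]{\accent"AF #1}
\makeatother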

\newunicodechar{Ḍ}{\d{D}}
\newunicodechar{ḍ}{\d{d}}
\newunicodechar{ḥ}{\d{h}}
\newunicodechar{Ḷ}{\d{L}}
\newunicodechar{ḷ}{\d{l}}
\newunicodechar{ṃ}{\d{m}}
\newunicodechar{Ṇ}{\d{N}}
\newunicodechar{ṇ}{\d{n}}
\newunicodechar{Ṛ}{\d{R}}
\newunicodechar{ṛ}{\d{r}}
\newunicodechar{Ṣ}{\d{S}}
\newunicodechar{ṣ}{\d{s}}
\newunicodechar{Ṭ}{\d{T}}
\newunicodechar{ṭ}{\d{t}}

\newunicodechar{Ṅ}{\.{N}}
\newunicodechar{ṅ}{\.{n}}

\newunicodechar{Ḹ}{\d{\MACRON{L}}}
\newunicodechar{ḹ}{\d{\MACRON{l}}}
\newunicodechar{Ṝ}{\d{\MACRON{R}}}
\newunicodechar{ṝ}{\d{\MACRON{r}}}

\begin{document}
\parbox{.5\textwidth}{
a  A
ā  Ā
i  I
ī  Ī
u  U
ū  Ū
ṛ  Ṛ
ṝ  Ṝ
ḷ  Ḷ
ḹ  Ḹ
e  E
ai  Ai
o  O
au  Au
ṃ ḥ
k  K
c  C
ṭ  Ṭ
t  T
p  P
kh  Kh
ch  Ch
ṭh  Ṭh
th  Th
ph  Ph
g  G
j  J
ḍ  Ḍ
d  D
b  B
gh  Gh
jh  Jh
ḍh  Ḍh
dh  Dh
bh  Bh
ṅ  Ṅ
ñ  Ñ
ṇ  Ṇ
n  N
m  M
y  Y
r  R
l  L
v  V
ś  Ś
ṣ  Ṣ
s  S
h  H
}
\end{document}


As you can see, it's necessary to undo some of the work done by xunicode (which is loaded automatically by fontspec and need not be loaded explicitly). Some of the standard accents must also be redefined, or they wouldn't use the main document font.
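
If the same preamble is reused with a font that already contains some of these characters, the composed fallback can be made conditional. The following is only a sketch and not part of the tested code above: it assumes the e-TeX primitive \iffontchar (available in both XeTeX and LuaTeX) and the \d and \MACRON definitions from the preamble, and it shows just one character, U+1E5D, as an illustration.

% Assumption: placed after the definitions of \d and \MACRON.
% The test is made against the font that is current where the character is used.
\newunicodechar{ṝ}{%
  \iffontchar\font"1E5D
    \char"1E5D
  \else
    \d{\MACRON{r}}%
  \fi}

With this variant the font's own glyph is preferred whenever it exists, and the composed form is used only as a fallback.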

Update 2017

The macros above work provided fontspec is loaded with the euenc option. The new default TU encoding, on the other hand, doesn't declare composites with \d, or for \.N and \.n, so the \UndeclareUTFcomposite lines can be dropped and the code is simpler:

\documentclass{scrartcl}
\usepackage{fontspec}
\usepackage{newunicodechar}
\setmainfont{Minion Pro}

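\makeatletter
% Redefine \d so that the dot below is the period of the current font.
\let\d\relax
\DeclareRobustCommand{\d}[1]
   {\hmode@bgroup
    \o@lign{\relax#1\crcr\hidewidth\ltx@sh@ft{-1ex}.\hidewidth}\egroup}
% Dot above (U+02D9) and macron (U+00AF), both taken from the current font.
% The space after each hex number ends it, so an argument A-F is not
% read as a further hex digit.
\let\.\relax
\DeclareRobustCommand{\.}[1]{\accent"02D9 #1}
\DeclareRobustCommand{\MACRON}[1]{\accent"AF #1}
\makeatother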

\newunicodechar{Ḍ}{\d{D}}
\newunicodechar{ḍ}{\d{d}}
\newunicodechar{ḥ}{\d{h}}
\newunicodechar{Ḷ}{\d{L}}
\newunicodechar{ḷ}{\d{l}}
\newunicodechar{ṃ}{\d{m}}
\newunicodechar{Ṇ}{\d{N}}
\newunicodechar{ṇ}{\d{n}}
\newunicodechar{Ṛ}{\d{R}}
\newunicodechar{ṛ}{\d{r}}
\newunicodechar{Ṣ}{\d{S}}
\newunicodechar{ṣ}{\d{s}}
\newunicodechar{Ṭ}{\d{T}}
\newunicodechar{ṭ}{\d{t}}

\newunicodechar{Ṅ}{\.{N}}
\newunicodechar{ṅ}{\.{n}}

\newunicodechar{Ḹ}{\d{\MACRON{L}}}
\newunicodechar{ḹ}{\d{\MACRON{l}}}
\newunicodechar{Ṝ}{\d{\MACRON{R}}}
\newunicodechar{ṝ}{\d{\MACRON{r}}}

\begin{document}
\parbox{.5\textwidth}{
a  A
ā  Ā
i  I
ī  Ī
u  U
ū  Ū
ṛ  Ṛ
ṝ  Ṝ
ḷ  Ḷ
ḹ  Ḹ
e  E
ai  Ai
o  O
au  Au
ṃ ḥ
k  K
c  C
ṭ  Ṭ
t  T
p  P
kh  Kh
ch  Ch
ṭh  Ṭh
th  Th
ph  Ph
g  G
j  J
ḍ  Ḍ
d  D
b  B
gh  Gh
jh  Jh
ḍh  Ḍh
dh  Dh
bh  Bh
ṅ  Ṅ
ñ  Ñ
ṇ  Ṇ
n  N
m  M
y  Y
r  R
l  L
v  V
ś  Ś
ṣ  Ṣ
s  S
h  H
Ḹ  ḹ
Ṝ  ṝ
}
\end{document}
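
For a document that has to keep the first version of the code, the older encoding needs to be requested explicitly when fontspec is loaded. A minimal sketch of such a preamble follows, assuming the installed fontspec still accepts the euenc option named in the update; the rest of the preamble is the first version, unchanged.

\documentclass{scrartcl}
% Assumption: euenc selects the older encoding the first version was written for.
\usepackage[euenc]{fontspec}
\usepackage{newunicodechar}
\defaultfontfeatures{Ligatures=TeX}
\setmainfont{Minion Pro}
% ... \UndeclareUTFcomposite lines, accent redefinitions and
% \newunicodechar declarations exactly as in the first version ...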

