To reconcile both, note that $\bf x$ is obtained by stacking the columns of $\bf \Gamma$. For the cost, $\bf c$ is obtained by stacking the columns of the cost matrix $\bf C$.
Also, let $\mathbb I_n$ be the $n\times n$ identity matrix. Then set:
$$
{\bf A} = \begin{bmatrix}
{\bf 1}_n^\top \otimes \mathbb I_m\\
\mathbb I_n \otimes {\bf 1}_m^\top
\end{bmatrix}
$$
Finally, the constraint ${\bf x}\geq 0$ is the same as requiring the transport plan $\bf \Gamma$ to be nonnegative. This can be found on page 38 of the book Computational Optimal Transport, by Peyré and Cuturi. The PDF of the book is freely available.
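For a concrete check, here is a minimal NumPy sketch (the sizes and the plan are made up) verifying that this $\bf A$, applied to the stacked $\bf x$, recovers both marginals of $\bf \Gamma$:
```python
import numpy as np

# Made-up sizes and transport plan Gamma of shape (m, n).
m, n = 3, 4
Gamma = np.random.rand(m, n)
x = Gamma.flatten(order='F')  # stack the columns of Gamma

A = np.vstack([
    np.kron(np.ones((1, n)), np.eye(m)),  # picks out Gamma @ 1_n (row sums)
    np.kron(np.eye(n), np.ones((1, m))),  # picks out Gamma.T @ 1_m (column sums)
])

marginals = A @ x
assert np.allclose(marginals[:m], Gamma.sum(axis=1))  # first marginal
assert np.allclose(marginals[m:], Gamma.sum(axis=0))  # second marginal
```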
Just to support Giacomo's answer: you can use a generic Optimal Transport solver for the cost $c(x, y) = \langle A(x-y), x-y \rangle$. For instance, suppose empirical measures $\hat{\mu}$ and $\hat{\nu}$, with supports $X = [x_{1}, x_{2}, \cdots, x_{n}]$ (resp. $Y = [y_{1}, y_{2}, \cdots, y_{m}]$) and sample weights $\{a_{i}\}_{i=1}^{n}$ (resp. $\{b_{j}\}_{j=1}^{m}$). These are defined as
$$\hat{\mu}(x) = \sum_{i=1}^{n}a_{i}\delta(x-x_{i}),$$
$$\hat{\nu}(y) = \sum_{j=1}^{m}b_{j}\delta(y-y_{j}).$$
You can check that, using these definitions, the set $\Pi(\hat{\mu}, \hat{\nu})$ is a compact polytope in finite dimensions (so the inf is attained and may be substituted by a min), and the optimization problem becomes:
$$\pi^{\star} = \underset{\pi \in \Pi(\hat{\mu}, \hat{\nu})}{\text{min}}\sum_{i=1}^{n}\sum_{j=1}^{m}C_{ij}\pi_{ij}$$
Now, as you may be aware, this is a finite-dimensional linear program in the entries $\pi_{ij}$, for which there are well-known algorithms (e.g. Dantzig's simplex algorithm). Sinkhorn's algorithm is a clever way of approximating the solution $\pi^{\star}$ by solving an entropy-regularized version of the problem. As Giacomo mentioned, "Computational Optimal Transport" presents a broad discussion of the subject.
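For a quick illustration, both the exact solver and the Sinkhorn approximation are one-liners in the Python Optimal Transport (POT) toolbox, recommended below; the weights and cost matrix here are made up:
```python
import numpy as np
import ot  # Python Optimal Transport (POT)

# Made-up discrete problem: uniform sample weights, random cost matrix.
n, m = 5, 6
a = np.full(n, 1.0 / n)   # weights of mu-hat
b = np.full(m, 1.0 / m)   # weights of nu-hat
C = np.random.rand(n, m)  # cost matrix C_ij

pi_exact = ot.emd(a, b, C)                    # exact LP solution
pi_sinkhorn = ot.sinkhorn(a, b, C, reg=1e-1)  # entropic (Sinkhorn) approximation
```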
Now, answering your question concerning the cost function: you can easily implement the cost in a programming language of your choice. All you need to note is that
\begin{align*}
C_{ij} &= c(x_{i},y_{j})\\
&= \bigl(A(x_{i}-y_{j})\bigr)^{T}(x_{i}-y_{j})\\
&= (x_{i}-y_{j})^{T}A^{T}(x_{i}-y_{j})\\
&= (x_{i}-y_{j})^{T}A(x_{i}-y_{j}),
\end{align*}
where the last equality holds because $A$, being positive-definite, is symmetric.
Personally, I recommend using Python, as there are good toolboxes for Optimal Transport, such as Python Optimal Transport (POT).
Side-topic: Python implementation
You may be wondering why I wrote $C_{ij} = (x_{i}-y_{j})^{T}A(x_{i}-y_{j})$. Well, Python's SciPy library has a high-performance function for computing pairwise distances, called cdist. Reading the documentation of this function, note that the expression for $C_{ij}$ is exactly the squared Mahalanobis distance: $A$ plays the role of the VI argument (the inverse covariance matrix). One caveat: cdist returns the distance itself, not its square, so you must square its output to recover $C_{ij}$. Thus, you can program your cost function as,
```python
import ot  # Python Optimal Transport (POT)
from scipy.spatial.distance import cdist

A = ...  # your symmetric positive-definite matrix, shape (p, p)
a = ...  # sample weights of \hat{\mu}, shape (n,)
b = ...  # sample weights of \hat{\nu}, shape (m,)
X = ...  # support of \hat{\mu}, shape (n, p)
Y = ...  # support of \hat{\nu}, shape (m, p)

# cdist returns the Mahalanobis distance, i.e. the square root of
# (x_i - y_j)^T A (x_i - y_j); square it to obtain the quadratic cost.
C = cdist(X, Y, metric='mahalanobis', VI=A) ** 2  # cost matrix, shape (n, m)

pi = ot.emd(a, b, C)  # optimal transport plan, shape (n, m)
```
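As a sanity check on the cdist trick, you can compare it against the direct formula on made-up data:
```python
import numpy as np
from scipy.spatial.distance import cdist

# Made-up data to check that squared cdist matches (x_i - y_j)^T A (x_i - y_j).
p, n, m = 2, 4, 5
A = np.array([[1.1, 0.1],
              [0.1, 1.1]])  # symmetric positive-definite
X = np.random.rand(n, p)
Y = np.random.rand(m, p)

D = X[:, None, :] - Y[None, :, :]                # (n, m, p): all x_i - y_j
C_direct = np.einsum('ijk,kl,ijl->ij', D, A, D)  # quadratic form per pair
C_cdist = cdist(X, Y, metric='mahalanobis', VI=A) ** 2
assert np.allclose(C_direct, C_cdist)
```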
Best Answer
These two are actually equivalent up to a constant when $\pi$ is a coupling of $\alpha$ and $\beta$. I'll assume that $\pi,\alpha, \beta$ all have densities. We can then write:
$$ H(\pi||\alpha\otimes \beta) = \int\ln\left(\frac{d\pi}{d(\alpha\otimes\beta)} \right)d\pi = \int \pi(x,y) \ln\left(\frac{\pi(x,y)}{\alpha(x)\beta(y)} \right) dx\, dy $$
Note that $\pi(x,y)$ is the density with respect to the Lebesgue measure, and the same can be said for $\alpha(x)$ and $\beta(y)$. Since $\pi$ is a coupling, its marginals satisfy $\int\pi(x,y)\,dy = \alpha(x)$ and $\int\pi(x,y)\,dx = \beta(y)$. Therefore:
$$
\begin{aligned}
H(\pi||\alpha\otimes \beta) &= \int\pi(x,y)\ln \pi(x,y)\, dx\, dy - \int\pi(x,y)\ln\alpha(x)\, dx\, dy - \int\pi(x,y)\ln\beta(y)\, dx\, dy \\
&= \int \pi(x,y) \ln\pi(x,y)\, dx\, dy - \int\alpha(x)\ln\alpha(x)\, dx - \int \beta(y) \ln \beta(y)\, dy \\
&= H(\pi) - H(\alpha) - H(\beta)
\end{aligned}
$$
Since $\alpha$ and $\beta$ are fixed, this is $H(\pi) + C$, where $C = -H(\alpha) - H(\beta)$ is a constant (here $H(p) = \int p \ln p$).
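A quick discrete sanity check of this identity, with a made-up coupling whose marginals match and the sign convention $H(p) = \sum_i p_i \ln p_i$ used above:
```python
import numpy as np

alpha = np.array([0.5, 0.5])
beta = np.array([0.5, 0.5])
pi = np.array([[0.3, 0.2],
               [0.2, 0.3]])  # rows sum to alpha, columns sum to beta

H = lambda p: np.sum(p * np.log(p))  # negative entropy, as above
kl = np.sum(pi * np.log(pi / np.outer(alpha, beta)))
assert np.isclose(kl, H(pi) - H(alpha) - H(beta))
```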