For understanding this, I always prefer the Cholesky decomposition of the correlation matrix.
Assume the correlation matrix $R$ of the three variables $X, Y, Z$ is
$$ R = \left[ \begin{array}{rrr}
1.00 & -0.29 & -0.45 \\
-0.29 & 1.00 & 0.93 \\
-0.45 & 0.93 & 1.00
\end{array} \right]
$$
Then the Cholesky factor $L$ is
$$ L = \left[ \begin{array}{rrr}
1.00 & 0.00 & 0.00 \\
-0.29 & 0.96 & 0.00 \\
-0.45 & 0.83 & 0.32
\end{array} \right],
$$
where the rows correspond to $X$, $Y$, and $Z$, in that order.
The matrix $L$ gives, in a sense, the coordinates of the three variables in a Euclidean space, when the variables are seen as vectors from the origin: the x-axis is identified with the variable/vector $X$, and so on.
Then the correlation of $X$ and $Y$ is $\newcommand{\corr}{\operatorname{corr}} \corr(X,Y) = x_1 y_1 + x_2 y_2 + x_3 y_3$, and we see immediately that $\corr(X,Y) = -0.29$ because of the zeros and the unit factor. We also see immediately that $\corr(X,Z) = -0.45$, again because of the zeros and the unit factor. The correlation between $Y$ and $Z$, however, is $\corr(Y,Z) = (-0.29) \cdot (-0.45) + 0.96 \cdot 0.83$.

The partial correlation (after $X$ is removed) is the part for which no variance in the $X$-variable is present, so $\corr(Y,Z)_{\cdot X} \propto 0.96 \cdot 0.83$. (Strictly, this product is the residual covariance; dividing by the lengths of the residual vectors, $0.96$ and $\sqrt{0.83^2 + 0.32^2} \approx 0.89$, gives the partial correlation $\approx 0.94$.)

Now imagine the value $0.83$ were $-0.83$ instead. Then the partial correlation would be negative, and the correlation between $Y$ and $Z$ would be $0.29 \cdot 0.45 - 0.96 \cdot 0.83$.
What we see is that the partial correlation is partly independent of the overall correlations (though within certain bounds).
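To check these numbers, here is a minimal R sketch using base R's chol(); the rescaling in the last step is my addition, turning the residual product into the proper partial correlation:

```r
# Correlation matrix R of X, Y, Z from above
R <- matrix(c( 1.00, -0.29, -0.45,
              -0.29,  1.00,  0.93,
              -0.45,  0.93,  1.00), nrow = 3, byrow = TRUE)

# chol() returns the upper-triangular factor, so transpose it
# to get the lower-triangular L shown above
L <- t(chol(R))
round(L, 2)

# Residual covariance of Y and Z after removing X (the 0.96 * 0.83 term),
# divided by the lengths of the residual vectors = partial correlation
(L[2, 2] * L[3, 2]) / (L[2, 2] * sqrt(L[3, 2]^2 + L[3, 3]^2))  # about 0.94
```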
It seems to me that the only unanswered part of your question is the part cited below:
> Also, is there any robust version of partial correlation (like Kendall's $\tau$/Spearman's rank correlation to Pearson's correlation)?
Just as you can have a partial Pearson correlation coefficient, you can have a partial Spearman and a partial Kendall correlation coefficient. See the R code below using the package ppcor, which computes partial correlations.
```r
# Partial correlation with the ppcor package
library(ppcor)

set.seed(2021)
N <- 1000
X <- rnorm(N)
Y <- rnorm(N)
Z <- rnorm(N)

# Partial Pearson correlation of X and Y, controlling for Z
pcor.test(X, Y, Z, method = 'pearson')
```
This gives an estimate of $-0.01175714$. If you rank the variables first, that is equivalent to the Spearman correlation.
```r
# Ranking first turns Pearson into Spearman
pcor.test(rank(X), rank(Y), rank(Z), method = 'pearson')
```
This way you get a partial Spearman correlation of $0.008965395$. But you don't have to rank by hand; you can simply change the method parameter of the function to 'spearman'.
```r
# Same result directly via the method argument
pcor.test(X, Y, Z, method = 'spearman')
```
And here we go, $0.008965395$ again. If you want the partial Kendall correlation, just change the method parameter again (note the lowercase spelling).
```r
# Method names are lowercase in pcor.test
pcor.test(X, Y, Z, method = 'kendall')
```
This time we get a partial Kendall correlation of $0.006344739$.
If by robust you mean, among other things, not depending on the distribution of the random variables, and most importantly serving as a measure of independence, I recommend reading about mutual information.
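For instance, here is a minimal sketch of that idea; the infotheo package, the toy data, and the default binning are my choices for illustration:

```r
# Mutual information as a distribution-free dependence measure
library(infotheo)

set.seed(2021)
x <- rnorm(1000)
y <- x^2 + rnorm(1000, sd = 0.1)  # strong but nonmonotonic dependence

cor(x, y)                        # Pearson correlation is near zero
cor(x, y, method = 'spearman')   # rank correlation also misses it

# Empirical mutual information on discretized data is clearly positive
mutinformation(discretize(x), discretize(y))
```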
Best Answer
Note that the correlation conditional on $Z$ is a quantity that depends on $Z$, whereas the partial correlation is a single number.
Furthermore, partial correlation is defined based on the residuals from linear regression. Thus, if the actual relationship is nonlinear, the partial correlation may take a different value than the conditional correlation, even if the correlation conditional on $Z$ is a constant independent of $Z$. On the other hand, if $X, Y, Z$ are multivariate Gaussian, the partial correlation equals the conditional correlation.
For an example where a constant conditional correlation $\neq$ the partial correlation: $$Z\sim U(-1,1),\quad X=Z^2+e,\quad Y=Z^2-e,\quad e\sim N(0,1),\ e\perp Z.$$ No matter which value $Z$ takes, the conditional correlation will be $-1$. However, the linear regressions of $X$ and $Y$ on $Z$ are constants (the slopes are zero, since $\operatorname{Cov}(Z^2,Z)=0$ by symmetry), and thus the residuals are just the centered values of $X$ and $Y$ themselves. Hence the partial correlation equals the correlation between $X$ and $Y$, which does not equal $-1$: clearly the variables are not perfectly correlated if $Z$ is not known. In fact, $\operatorname{corr}(X,Y)=\frac{\operatorname{Var}(Z^2)-1}{\operatorname{Var}(Z^2)+1}=\frac{4/45-1}{4/45+1}=-\frac{41}{49}\approx -0.84$.
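A quick simulation of this construction (sample size, seed, and slice width are my choices) reproduces both values:

```r
set.seed(2021)
n <- 1e5
Z <- runif(n, -1, 1)
e <- rnorm(n)
X <- Z^2 + e
Y <- Z^2 - e

# Conditional correlation: within a thin slice of Z, X and Y move
# in exact opposition through e
idx <- abs(Z - 0.5) < 0.01
cor(X[idx], Y[idx])  # close to -1

# Partial correlation: correlate the residuals of X and Y after
# linearly regressing each on Z (the fitted slopes are near zero)
cor(resid(lm(X ~ Z)), resid(lm(Y ~ Z)))  # about -0.84 = -41/49
```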
Apparently, Baba and Sibuya (2005) show the equivalence of partial correlation and conditional correlation for some other distributions besides the multivariate Gaussian, but I have not read it.
The answer to your question 2 seems to be in the Wikipedia article on partial correlation, in the second equation under "Using recursive formula".
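If I recall that section correctly, for a single controlling variable $Z$ the formula reads
$$\rho_{XY\cdot Z} = \frac{\rho_{XY} - \rho_{XZ}\,\rho_{YZ}}{\sqrt{1-\rho_{XZ}^2}\,\sqrt{1-\rho_{YZ}^2}},$$
which, applied to the correlation matrix at the top of this page, gives $\bigl(0.93 - (-0.29)(-0.45)\bigr)/\sqrt{(1-0.29^2)(1-0.45^2)} \approx 0.94$, matching the Cholesky computation above.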