Caveat: I'm not particularly well-versed in Granger causality, but I am generally statistically competent and I have read and mostly understood Judea Pearl's Causality, which I recommend for more info.
Is my interpretation directionally correct?
Yes. The fact that the first hypothesis was rejected and the second was not means that you can use $X$ to forecast $Y$.
What key insights have I overlooked
The key thing to know is that Granger-causation is only equivalent to causation (in the more common use of the term) under a fairly restrictive assumption, viz., that there are no other potential causes. If this assumption is not satisfied, then Granger-causality is really Granger-usefulness-for-forecasting. For example, if there is a variable $Z$ that causally influences both $X$ and $Y$, then the conclusion that $Y$ Granger-causes $X$ can be explained as the influence of $Z$ being felt in $Y$ before it's felt in $X$.
The p-value of .76 allows me to accept the null for X = f(Y)
Warning: esoteric bullshtatistical blathering follows. Technically, in the test of $X = f(Y)$ you can't "accept the null". You can "fail to reject the null" -- that is, you didn't find evidence that would warrant rejecting the null. This is the Fisherian view. Alternatively, you can take the Neymanian view: you don't assert the truth of the null; you just choose to act as if the null were true. (Personally I'm a Jaynesian, but let's not get into that.)
I'm a little rusty on my F-test
The point of the F-test is that it checks that the lagged values of $X$ jointly improve the forecast of $Y$ (or vice versa). One can imagine predicting $Y$ with two predictors $X_1$ and $X_2$ where $X_2$ is just $X_1$ with a bit of added noise. The F-test would compare a model with just $X_1$ (or just $X_2$) with the model containing both and find no evidence of improved prediction in the larger model.
I'm also not sure how to interpret the CCF graph
The plots of the auto-correlation and cross-correlation functions provide a rough graphical equivalent to the t-tests used in the testing procedure. In order to understand what is being plotted, it's first necessary to understand correlation as a measure of the linear relationship between two random variables. The cross-correlation function is just the correlation of one time series versus a lagged version of the other, and the auto-correlation is just the cross-correlation of a function and itself. Thus these plots show the time structure of the strength of the linear relationships both internally (auto) and from one to the other (cross). I can see from the autocorrelation plots, for example, that $Y$ is reasonably smooth but has no other particularly strong internal structure, whereas $X$ has an oscillation with a peak-to-peak period of about 120 time steps (because it is negatively correlated with itself at about 60 time steps).
One method would be to look at correlations, and there are two cases to consider.
Which case applies depends on whether or not we can assume independence. The worst-case scenario is correlation of independent processes: for example, A & B could be dollar values on the US Dow Jones market and the Canadian exchange in Toronto, while C & D represent temperatures in Osaka and Tokyo, Japan. The second case is correlated correlations, where we might compare values on the US and Canadian exchanges to values on the Osaka and Tokyo exchanges.
In the first case, we could compare independent R-values, obtaining the correlation in each regression from $R^2=1-\dfrac{SSE}{SST}$, and then test the significance of the difference of the R-values. In the second case, correlated correlations, smaller differences in R-values are easier to detect.
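For the first (independent) case, one standard recipe is a Fisher z-test for the difference of two correlations from unrelated samples. A minimal self-contained sketch (the sample sizes and r-values below are made up for illustration):

```python
# Fisher z-test: are two independent correlations significantly different?
from math import atanh, sqrt, erf

def fisher_z_test(r1, n1, r2, n2):
    """Two-sided p-value for H0: the two population correlations are equal."""
    # Fisher's z-transform makes the sampling distribution ~normal
    z = (atanh(r1) - atanh(r2)) / sqrt(1 / (n1 - 3) + 1 / (n2 - 3))
    # two-sided p-value: 2 * (1 - Phi(|z|)), via the error function
    p = 2 * (1 - 0.5 * (1 + erf(abs(z) / sqrt(2))))
    return z, p

z, p = fisher_z_test(0.7, 100, 0.4, 100)
```

The correlated-correlations case needs a different test (the correlation between the correlations enters the standard error), which is why smaller differences become easier to detect there.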
Discussion

So what if the correlation from Granger causality is better for the North American markets than for the temperatures in Japan? There are two problems with this. First, we are left with no valid conclusions because we are comparing "apples to oranges" in the figurative sense. Second, the prevailing winds in Osaka are sometimes from the direction of Tokyo and sometimes toward Tokyo, so Granger causality could underestimate the actual relationship if seasonal variations in temperature and prevailing winds are ignored. In the second case, correlated correlations, one could find that changes in the larger markets (New York and Tokyo) precede changes in the smaller markets (Toronto and Osaka), but why would we compare correlations anyway? Why wouldn't we, for example, do a round robin and compare the New York and Tokyo markets directly with Granger causality, if that is what we want to know?
Best Answer
I am not sure there is a standard terminology regarding the strength of Granger causality. We could refer to "effect size" from regression modelling and define strength of Granger causality as the relevant effect size. Then to learn about the strength of Granger causality from $X$ to $Y$ you would look at the point estimates of the coefficients on lags of $X$ in the equation of $Y$. Large (in absolute value) point estimates signify a pronounced Granger-causal relationship.
As Henry said in the comments, $p$-values do not directly measure the strength of a relationship. They account for measurement precision in addition to strength. A relationship with a small point estimate can still have a small $p$-value if the estimation precision is high.