The ACF at a given lag is the ratio of the autocovariance at that lag to the variance. If your series contains pulses, level shifts, seasonal pulses and/or local time trends (my guess is YES!), then both the autocovariance and the variance are affected. The net result is a downward bias in the calculated ACF. I suggest that you post your time series and I will put it under a microscope. The problem with the DW test is that it only tests for lag-1 autocorrelation and, following my discussion above, it is seriously biased downwards when anomalies are present. Yes Virginia, there is no Santa Claus! In closing, I believe that more advanced (correct) tools are required.
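To make the bias concrete, here is a minimal sketch in plain Python. It is not a substitute for a proper intervention analysis. It simulates an AR(1) series, adds a single pulse, and compares the lag-1 sample ACF before and after. The AR coefficient 0.7, the length 300, the seed and the pulse size 30 are arbitrary assumptions, and the helper names `sample_acf` and `simulate_ar1` are made up for this illustration. Only the pulse case is shown; the other anomaly types are not simulated here.

```python
import random


def sample_acf(x, lag):
    """Sample autocorrelation: autocovariance at `lag` divided by the variance."""
    n = len(x)
    mean = sum(x) / n
    d = [v - mean for v in x]
    variance_sum = sum(v * v for v in d)
    autocov_sum = sum(d[t] * d[t - lag] for t in range(lag, n))
    return autocov_sum / variance_sum


def simulate_ar1(n, phi, seed):
    """Simulate x_t = phi * x_{t-1} + noise with standard normal noise."""
    rng = random.Random(seed)
    x = [0.0]
    for _ in range(n - 1):
        x.append(phi * x[-1] + rng.gauss(0.0, 1.0))
    return x


clean = simulate_ar1(n=300, phi=0.7, seed=1)

# Hypothetical anomaly: one large pulse in the middle of the series.
contaminated = list(clean)
contaminated[150] += 30.0

r_clean = sample_acf(clean, 1)
r_pulse = sample_acf(contaminated, 1)

print(f"lag-1 ACF without pulse: {r_clean:.3f}")
print(f"lag-1 ACF with pulse:    {r_pulse:.3f}")
```

The pulse adds a large squared term to the variance but only two modest cross-products to the lag-1 autocovariance, so the ratio shrinks towards zero.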
1.
A regression implies that $Y$ is actually a function of $X$ (that is, $Y(X)$), but not the other way around ($X(Y)$), right? (since $X$ is assumed to be exogenous to the model, and $Y$ is the endogenous variable.)
Yes, a regression treats $Y$ as a function of $X$ and not the other way around. There is an additive random noise component there, too, which you forgot to include on the right-hand side of the equation in the first line of your post.
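For concreteness, and assuming a simple linear specification (the equation in the original post may have had a different form), the model with the noise term written out would look like this:

$$Y = \beta_0 + \beta_1 X + \varepsilon$$

Here $\varepsilon$ is the additive random noise, and $\beta_0$ and $\beta_1$ are illustrative coefficient names, not symbols taken from the question.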
2.
I do not think the Durbin-Watson test is used for assessing exogeneity. You probably mixed it up with the Durbin-Wu-Hausman test (or just the Hausman test).
Regarding testing exogeneity in time series, the Hausman test is among the better-known ones; here is a thread explaining how it works. There is another short thread here explaining why it is difficult to test for exogeneity; essentially, you have to examine all possible sources of endogeneity and reject all of them to establish exogeneity, but this is not possible in practice. I may add that this is similar to testing independence: you can never empirically prove that two variables are independent, you can just reject a particular form of dependence between them.
Besides the Hausman test, you may also look at Granger's block exogeneity test mentioned here.
Since the Durbin-Watson test was mentioned, let me add something, even though it is unrelated to testing exogeneity. The Durbin-Watson test might be too specific, as it tests for autocorrelation only at lag 1 and not at higher-order lags; more general tests such as Breusch-Godfrey or, in some instances, Ljung-Box could be used instead; here is a good overview and comparison of the two tests. But this is all about testing for autocorrelation, not exogeneity.
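The lag-1 limitation can be illustrated with a small sketch in plain Python. It simulates errors that depend only on the value two steps back, then computes the Durbin-Watson statistic by hand alongside the lag-1 and lag-2 sample autocorrelations. The coefficient 0.6, the length 2000, the seed and the function names are assumptions chosen for this illustration; it does not reproduce the Breusch-Godfrey or Ljung-Box tests themselves.

```python
import random


def simulate_lag2_errors(n, phi2, seed):
    """Errors with dependence at lag 2 only: e_t = phi2 * e_{t-2} + noise."""
    rng = random.Random(seed)
    e = [rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)]
    for t in range(2, n):
        e.append(phi2 * e[t - 2] + rng.gauss(0.0, 1.0))
    mean = sum(e) / n
    # Centre the series, as OLS residuals from a model with an intercept would be.
    return [v - mean for v in e]


def durbin_watson(e):
    """DW = sum of squared first differences / sum of squares."""
    num = sum((e[t] - e[t - 1]) ** 2 for t in range(1, len(e)))
    den = sum(v * v for v in e)
    return num / den


def autocorr(e, lag):
    """Sample autocorrelation of an already centred series."""
    den = sum(v * v for v in e)
    return sum(e[t] * e[t - lag] for t in range(lag, len(e))) / den


resid = simulate_lag2_errors(n=2000, phi2=0.6, seed=7)
dw = durbin_watson(resid)
r1 = autocorr(resid, 1)
r2 = autocorr(resid, 2)

print(f"Durbin-Watson statistic: {dw:.3f}")
print(f"lag-1 autocorrelation:   {r1:.3f}")
print(f"lag-2 autocorrelation:   {r2:.3f}")
```

Because the statistic is approximately $2(1 - r_1)$, it stays near 2 here even though the lag-2 autocorrelation is substantial, which is the case a test covering several lags is meant to catch.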
Best Answer
In general, with cross-sectional data, random sampling guarantees that different error terms are mutually independent, so autocorrelation is not an issue. However, when the data are collected at different hierarchical levels, e.g. students within schools, or patients within hospitals, the error terms within higher-level groups may be correlated.
I'd guess that the cognitive scale depends on some factors that could be viewed as grouping factors, e.g. schooling.
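As a rough illustration of how grouping induces correlated errors, the sketch below simulates error terms made up of a component shared within each group plus an individual component, and estimates the intraclass correlation with the usual one-way ANOVA formula. The group counts, standard deviations, seeds and function names are assumptions for illustration only; nothing here is estimated from the data in the question.

```python
import random


def simulate_grouped_errors(n_groups, group_size, sd_group, sd_unit, seed):
    """Each error = effect shared by the whole group + individual noise."""
    rng = random.Random(seed)
    groups = []
    for _ in range(n_groups):
        shared = rng.gauss(0.0, sd_group)
        groups.append([shared + rng.gauss(0.0, sd_unit) for _ in range(group_size)])
    return groups


def intraclass_correlation(groups):
    """One-way ANOVA estimate of the ICC for equally sized groups."""
    g = len(groups)
    k = len(groups[0])
    grand_mean = sum(sum(grp) for grp in groups) / (g * k)
    means = [sum(grp) / k for grp in groups]
    # Between-group and within-group mean squares.
    msb = k * sum((m - grand_mean) ** 2 for m in means) / (g - 1)
    msw = sum((v - m) ** 2 for grp, m in zip(groups, means) for v in grp) / (g * (k - 1))
    return (msb - msw) / (msb + (k - 1) * msw)


# Hypothetical setting: 200 "schools" of 10 "students", equal variance components.
clustered = simulate_grouped_errors(200, 10, sd_group=1.0, sd_unit=1.0, seed=3)
# Same layout, but with no shared group component at all.
independent = simulate_grouped_errors(200, 10, sd_group=0.0, sd_unit=1.0, seed=3)

icc_clustered = intraclass_correlation(clustered)
icc_independent = intraclass_correlation(independent)

print(f"ICC with shared group effect: {icc_clustered:.3f}")
print(f"ICC without group effect:     {icc_independent:.3f}")
```

With equal variance components the population intraclass correlation is 0.5, so two errors from the same group are far from independent, whereas the estimate should sit near zero when there is no shared component.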