Firstly, your supervisor should explain factor analysis to you. That's why he gets paid the big bucks.
But I guess it's up to old CV to plug the gaps of the educational system.
It would be nice if you could get AMOS with your SPSS system, or possibly use sem or lavaan in R, since I think your research question should probably be addressed through confirmatory factor analysis. What SPSS offers is just an exploratory analysis. So far, that seems to have worked well, since it looks as if the analysis produced the three categories that you believe are operative. Note that Varimax is an orthogonal rotation, so it will always produce uncorrelated factors. That's what it does.
So what is factor analysis doing? You have a questionnaire with items, but what really interests you are certain underlying characteristics or "categories" that you can't measure directly. You measure these indirectly through the items of the questionnaire. You want the questionnaire to detect those categories. So perhaps questions 1-3 target the first category, 4-6 the second, and 7-9 the third.
If this model is correct, then the covariance matrix of the 9 items will have a particular structure, reflective of the underlying categories. Confirmatory factor analysis lets you test that hypothesis.
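To make that implied structure concrete, here is a small NumPy sketch of a hypothetical 9-item, 3-factor model; the loading and uniqueness values are made up purely for illustration, not taken from your data.

```python
import numpy as np

# Hypothetical structure: items 1-3 load on factor 1,
# items 4-6 on factor 2, items 7-9 on factor 3.
# All numeric values are illustrative, not estimated.
loadings = np.zeros((9, 3))
loadings[0:3, 0] = 0.8
loadings[3:6, 1] = 0.7
loadings[6:9, 2] = 0.6
uniquenesses = np.diag(np.full(9, 0.3))  # item-specific noise variance

# With uncorrelated factors scaled to variance 1 (the convention
# mentioned below for SPSS loadings), the model-implied covariance
# matrix of the items is Lambda @ Lambda.T + Psi.
sigma = loadings @ loadings.T + uniquenesses

# Items sharing a factor covary; items on different factors do not.
print(sigma[0, 1])  # 0.8 * 0.8 = 0.64 (both on factor 1)
print(sigma[0, 3])  # 0.0 (different factors)
```

A CFA essentially asks whether the observed covariance matrix is consistent with a matrix of this block-patterned form.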
Alternative hypotheses could be that all 9 items reflect only 1 category ... or at the other extreme, that there are no underlying categories that can simplify the variance structure. Confirmatory factor analysis would then check that these categories are relevant to the demographic you have.
Factor loadings are, roughly, the regression coefficients of the items on the underlying factors or categories (if, in fact, you could measure those underlying factors). What you get from SPSS, I believe, assumes that the factors are scaled to have variance 1.
I'm not sure that high loadings from a category mean that the category is "important" to your demographic. It does suggest that the factor is present and well manifested by the questions. It also implies that people's responses are very much governed by the factor, and less by randomness. It might help if you specified what these categories are.
The short answer to your question is...
... in a multivariate context, there is much less of a distinction between hypothesis test statistics (e.g. F in a univariate context) and effect size estimates (e.g. eta-squared in a univariate context), so it often works out that they reflect each other, as you note. This is like noticing that R-squared and eta-squared agree; it's because they're both using the same information. In a MANOVA, Pillai's trace and multivariate partial eta-squared (depending on how it's calculated) may just reflect the same information.
The longer answer:
As a review, here's a nice summary of some measures of effect size for univariate ANOVAs, including both eta-squared and partial eta-squared: http://www.theanalysisfactor.com/effect-size/ The most important take-home for this particular question is that eta-squared is (roughly) the variance explained divided by the total variance, while partial eta-squared is the variance explained by a given predictor divided by the variance explained by that predictor plus the unexplained variance. Another way to articulate it is the variance explained by a predictor divided by the total variance minus the variance explained by any of the other predictors. Jumping the gun a bit and answering your second question first, you're correct that "partial eta-squared" is the same as just "eta-squared" when the only predictor in your model is a single categorical variable with 2 levels. But interpreting that statistic is still a little tricky, since (partial) eta-squared in a multivariate context is a little different from what you're used to thinking about with univariate ANOVAs.
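The two formulas are easy to mix up, so here is a minimal sketch with made-up sums of squares for a hypothetical two-predictor ANOVA, just to show the denominators side by side.

```python
# Toy sums of squares; the numbers are invented for illustration.
ss_A = 20.0      # variance explained by predictor A
ss_B = 30.0      # variance explained by predictor B
ss_error = 50.0  # unexplained (error) variance
ss_total = ss_A + ss_B + ss_error  # 100.0

# Eta-squared: explained by A over total variance.
eta_sq_A = ss_A / ss_total                    # 20 / 100 = 0.20

# Partial eta-squared: explained by A over (A + error).
partial_eta_sq_A = ss_A / (ss_A + ss_error)   # 20 / 70 ~ 0.286

# Equivalent phrasing: total variance minus the other predictors.
partial_alt = ss_A / (ss_total - ss_B)        # also 20 / 70

print(eta_sq_A, partial_eta_sq_A, partial_alt)
```

Note that if predictor B were absent (ss_B = 0), as with a single two-level factor, the two statistics would coincide, which is exactly the point made above.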
There are several common statistics that summarize a MANOVA, one of which is the one you mention, Pillai's trace (which is itself based on Roy's test statistic). Two other common ones are Wilks' lambda and the Hotelling-Lawley trace. They are all designed to roughly fill the role of an F test --- i.e. they should yield extreme values when the null hypothesis is violated --- but extended to matrices. It is not at all straightforward to calculate the multivariate sums of squares for the model and the error, though, so what we end up with isn't really an F test in any of these cases (although there are transformations to convert them to more or less fit F distributions, to allow the calculation of a p value). It just so happens that one way to reasonably substitute for an F test is with a measure of effect size, like eta-squared, so several of these measures are actually closer to eta-squared than F in terms of interpretation (e.g. Wilks' lambda can be interpreted as the proportion of variance attributable to error, i.e. 1 - eta-squared).
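All four statistics are functions of the eigenvalues of the hypothesis-times-inverse-error SSCP matrix, which is why they track each other so closely. Here is a sketch using made-up eigenvalues (in practice you would compute them from H and E for your data):

```python
import numpy as np

# Eigenvalues of H @ inv(E), where H is the hypothesis (between)
# SSCP matrix and E the error (within) SSCP matrix.
# These values are invented for illustration.
eigvals = np.array([1.5, 0.4, 0.1])

pillai = np.sum(eigvals / (1 + eigvals))     # Pillai's trace
wilks = np.prod(1 / (1 + eigvals))           # Wilks' lambda:
                                             # proportion attributable to error
hotelling = np.sum(eigvals)                  # Hotelling-Lawley trace
roy = eigvals.max() / (1 + eigvals.max())    # Roy's statistic
                                             # (one common convention;
                                             # some software reports the
                                             # raw largest eigenvalue)

print(pillai, wilks, hotelling, roy)
```

Each term lambda / (1 + lambda) is itself an eta-squared-like "proportion explained" along one eigen-direction, which is why these test statistics read so much like effect sizes.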
Just as there is no way to calculate an actual F test for a MANOVA (just several options for statistics that approximate it), there's no obvious way to calculate (partial) eta-squared. You can calculate something like an eta-squared for a MANOVA several different ways (see this article for several examples, as well as details in Rencher 2002, Kline 2004, Huberty & Olejnik 2006, and Cohen 1988). Here is a list of effect size statistics SPSS calculates for a MANOVA (although the equations as presented there are not at all easy to parse, since the notation is incomplete).
Since IBM doesn't release the SPSS code, it can be difficult to tell exactly what calculations are being used under the hood; in this case, I'm not sure from the documentation how the statistic you see as partial eta-squared is being calculated, but I'm not surprised it turns out the same as Pillai's trace, given how closely related all of these measures are. In fact, one common multivariate measure of effect size is just Roy's statistic itself. Pillai's trace is Roy's calculated on each eigenvalue (instead of just the largest one), so if one eigenvalue can capture close to all of the variance in the data, Pillai's and Roy's will be redundant.
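In your case the redundancy is exact: with a single two-level predictor, the hypothesis SSCP matrix has rank 1, so only one eigenvalue is nonzero and Pillai's trace reduces to Roy's statistic. A quick sketch with an invented eigenvalue:

```python
import numpy as np

# Rank-1 hypothesis (one two-level factor): one nonzero eigenvalue.
# The value 2.0 is invented for illustration.
eigvals = np.array([2.0, 0.0, 0.0])

pillai = np.sum(eigvals / (1 + eigvals))
roy = eigvals.max() / (1 + eigvals.max())

print(pillai == roy)  # the zero eigenvalues contribute nothing
```

This is one plausible reason the "partial eta-squared" SPSS reports matches Pillai's trace exactly in a two-group design.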
Additional reading:
Cohen, J. (1988). Statistical power analysis for the behavioral sciences (2nd ed.). Hillsdale, NJ: Erlbaum.
Huberty, C. J., & Olejnik, S. (2006). Applied MANOVA and discriminant analysis. New York: Wiley.
Rencher, A. C. (2002). Methods of multivariate analysis (2nd ed.). New York: Wiley.
The primary test statistics are the four values given: Pillai's trace, Wilks' lambda, Hotelling's trace and Roy's largest root. The distribution of each of these is, in general, complicated, and so workers derived from each of them an approximation using the $F$ statistic. Under some circumstances the approximation is exact, and you will note these are flagged in the output. The $F$ is not an extra statistic but only a transformation for convenience. Compare this with other situations like, for example, Mann-Whitney $U$, where it is transformed to a $z$ to enable use of the normal. That $z$ is not a second statistic but the first transformed. In the good old days before electronic computers all we had were tables of the normal, $t$, $\chi^2$ and $F$, so this sort of transformation was essential.
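The Mann-Whitney case makes the idea concrete: under the null, $U$ has known mean and standard deviation, so standardizing it gives a $z$ you can look up in a normal table. A minimal sketch, with made-up sample sizes and $U$ value (and no tie correction):

```python
import math

# Invented example: n1, n2 group sizes and an observed U statistic.
n1, n2, U = 12, 15, 40.0

# Null-hypothesis mean and standard deviation of U.
mean_U = n1 * n2 / 2.0
sd_U = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12.0)

# The transformed z is the same evidence as U, just on the
# normal scale so old tables (or scipy.stats.norm) can be used.
z = (U - mean_U) / sd_U

print(z)
```

The $F$ approximations to Pillai's trace and company play exactly this role: a change of scale onto a tabulated distribution, not a new piece of information.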