What is the rationale for applying an exploratory/unsupervised method (PCA or FA with varimax rotation) after having tested a confirmatory model, especially if this is done on the same sample?
In your CFA model, you impose constraints on the pattern matrix: some items are supposed to load on one factor but not on the others. A large modification index indicates that freeing a parameter or removing an equality constraint would result in better model fit. Item loadings are already available from your fitted model.
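As a minimal sketch of such a constrained CFA in Python, assuming the third-party `semopy` package (which uses lavaan-style syntax); the data are simulated here purely for illustration:

```python
import numpy as np
import pandas as pd
import semopy

rng = np.random.default_rng(0)
F = rng.normal(size=(300, 2))                      # two latent factors
# Each item loads on exactly one factor; cross-loadings are zero by design.
X = np.hstack([F[:, [0]] * [0.8, 0.7, 0.6] + rng.normal(size=(300, 3)) * 0.5,
               F[:, [1]] * [0.8, 0.7, 0.6] + rng.normal(size=(300, 3)) * 0.5])
data = pd.DataFrame(X, columns=[f"x{i}" for i in range(1, 7)])

# The constraints are imposed by omission: x1-x3 load only on F1,
# x4-x6 only on F2, and all cross-loadings are fixed to zero.
desc = """
F1 =~ x1 + x2 + x3
F2 =~ x4 + x5 + x6
"""
model = semopy.Model(desc)
model.fit(data)
print(model.inspect())           # parameter estimates, incl. the loadings
print(semopy.calc_stats(model))  # fit indices (chi-square, CFI, RMSEA, ...)
```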
By contrast, in PCA or FA there is no such constraint, even after an orthogonal rotation (whose purpose is just to make the factors more interpretable, in that each item will generally tend to load heavily on one factor rather than on several). It is worth noting, though, that these models are conceptually and mathematically different: the FA model is a measurement model, where we assume that there is some unique error attached to each item; this is not the case under the PCA framework. It is thus not surprising that you failed to replicate your factor structure; this may indicate item cross-loadings, low item reliability, instability of the factor structure, or the existence of a higher-order factor structure, all of which are exacerbated by your small sample size.
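To make the distinction concrete, here is a rough side-by-side of the two frameworks on the same simulated data, assuming the third-party `factor_analyzer` package and scikit-learn:

```python
import numpy as np
from sklearn.decomposition import PCA
from factor_analyzer import FactorAnalyzer

rng = np.random.default_rng(1)
F = rng.normal(size=(300, 2))
X = np.hstack([F[:, [0]] * [0.8, 0.7, 0.6] + rng.normal(size=(300, 3)) * 0.5,
               F[:, [1]] * [0.8, 0.7, 0.6] + rng.normal(size=(300, 3)) * 0.5])

# FA: a measurement model with a unique error term per item.
fa = FactorAnalyzer(n_factors=2, rotation="varimax").fit(X)
print("FA loadings:\n", fa.loadings_)
print("Uniquenesses (per-item error variance):\n", fa.get_uniquenesses())

# PCA: no unique-error term; the components absorb all of the variance.
pca = PCA(n_components=2).fit(X)
print("PCA loadings:\n", pca.components_.T * np.sqrt(pca.explained_variance_))
```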
In both cases, but especially for CFA, $N=96$ is a very limited sample size. Although some authors have suggested a ratio of 5 to 10 individuals per item, it is really the number of dimensions (and hence of free parameters) to estimate that matters. In your case, the parameter estimates will be noisy, and with PCA you can expect fluctuations in the estimated loadings (try bootstrapping to get an idea of the 95% CIs, as sketched below).
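A minimal bootstrap sketch for those 95% CIs; the data and settings are illustrative stand-ins for the $N=96$ case:

```python
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(42)
X = rng.normal(size=(96, 8))                    # stand-in for the real data
n = X.shape[0]

ref = PCA(n_components=2).fit(X).components_    # reference solution

boot = []
for _ in range(2000):
    Xb = X[rng.integers(0, n, size=n)]          # resample rows with replacement
    comp = PCA(n_components=2).fit(Xb).components_
    # Components are identified only up to sign; align each to the reference.
    # (A fuller treatment would also match component order, e.g. via Procrustes.)
    signs = np.sign(np.sum(comp * ref, axis=1, keepdims=True))
    signs[signs == 0] = 1.0
    boot.append(comp * signs)

lo, hi = np.percentile(np.array(boot), [2.5, 97.5], axis=0)
print("95% CI width per loading:\n", hi - lo)   # wide intervals = noisy loadings
```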
Unlike factor analysis, you cannot just put eight variables into a "regression test" and treat them all equally. One variable has to be the response variable and the others explanatory variables.
Your eight factors have been specifically designed to be orthogonal to each other. I suspect you have put seven of the factors into a regression as explanatory variables (sometimes called "independent variables") with the eighth as the response variable (sometimes called the "dependent variable"). In that case you would certainly find a low $R^2$ value and coefficient estimates close to zero.
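A quick simulation of exactly this situation, regressing one of eight (sample-)orthogonal factor scores on the other seven:

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(2)
# PC scores are uncorrelated in-sample, mimicking orthogonal factors.
scores = PCA(n_components=8).fit_transform(rng.normal(size=(500, 20)))

y, Z = scores[:, 7], scores[:, :7]       # eighth factor vs the other seven
fit = LinearRegression().fit(Z, y)
print("R^2:", fit.score(Z, y))           # essentially 0
print("coefficients:", fit.coef_)        # all essentially 0
```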
Using factors as explanatory variables in a regression is sometimes justifiable (although I have some qualms myself - see @whuber's answer here for one reason why it might be questionable). However, the response variable for the regression needs to have been kept out of the original factor analysis. So you can only use your eight factors in a regression if the intent is to explain a ninth variable, one that was not in the original factor analysis.
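Here is a sketch of that legitimate use: factor scores from the eight analysed variables predicting a ninth variable that was kept out of the factor analysis. The `factor_analyzer` package and the simulated data are assumptions for illustration:

```python
import numpy as np
from factor_analyzer import FactorAnalyzer
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(3)
F = rng.normal(size=(200, 2))
# The eight variables that go into the factor analysis.
X = np.hstack([F[:, [0]] * [0.8, 0.7, 0.6, 0.5] + rng.normal(size=(200, 4)) * 0.5,
               F[:, [1]] * [0.8, 0.7, 0.6, 0.5] + rng.normal(size=(200, 4)) * 0.5])
y = F @ [1.0, -0.5] + rng.normal(size=200)      # the external ninth variable

fa = FactorAnalyzer(n_factors=2, rotation="varimax").fit(X)
scores = fa.transform(X)                        # per-person factor scores

reg = LinearRegression().fit(scores, y)
print("R^2 for the external outcome:", reg.score(scores, y))
```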
Why are sets of items from different constructs loading on the same factor?
In my experience, many psychological tests have multiple scales where the correlations between some scales can be quite high (e.g., .6 to .8). In such a case, there may not be a huge difference between a model where all these items load on one factor and a model where the items load on different factors. This is further compounded by several other issues: (1) measurement noise means that the sample factor structure is an imperfect representation of the population factor structure, especially with small sample sizes (e.g., N < 100 or 200); (2) imposing a varimax rotation forces the factors to be orthogonal and may hide these intercorrelations; (3) influences beyond the constructs of interest may be shaping the factor structure (e.g., item stems, whether an item is reverse-scored, etc.).
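One way to probe point (2) is to refit with an oblique rotation and inspect the factor correlation matrix, assuming `factor_analyzer` is available; the data here are simulated so that two "scales" share a common influence:

```python
import numpy as np
from factor_analyzer import FactorAnalyzer

rng = np.random.default_rng(4)
g = rng.normal(size=(300, 1))                            # shared influence
X = np.hstack([g + rng.normal(size=(300, 3)) * 0.7,      # scale A items
               g + rng.normal(size=(300, 3)) * 0.7])     # scale B items

fa = FactorAnalyzer(n_factors=2, rotation="oblimin").fit(X)
print("Pattern loadings:\n", fa.loadings_)
print("Factor correlations:\n", fa.phi_)  # large off-diagonal = correlated factors
```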
Alternatively, you may simply be wrong about the factor structure of your test, and what you thought were separate constructs may be essentially interchangeable at the item level of measurement.
Is this a problem?
The first step is to understand why it is occurring. Is it due to writing items that don't sufficiently discriminate the two constructs? Are the two constructs inherently the same at the measurement level? Are they just highly intercorrelated factors which only separate when you allow for more factors? Are there particular items that might be preventing the factors from splitting?
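For the last question, a simple diagnostic is to flag items whose secondary loading is close to their primary loading; the thresholds below are arbitrary illustrative choices, not established cutoffs:

```python
import numpy as np

def flag_cross_loaders(loadings, min_primary=0.40, max_gap=0.20):
    """Return row indices of items that load appreciably on more than one factor."""
    L = np.abs(np.asarray(loadings))
    flagged = []
    for i, row in enumerate(L):
        top2 = np.sort(row)[::-1][:2]            # two largest |loadings|
        if top2[0] >= min_primary and top2[0] - top2[1] < max_gap:
            flagged.append(i)
    return flagged

# Example with a hypothetical 4-item, 2-factor pattern matrix:
L = [[0.70, 0.10], [0.65, 0.55], [0.05, 0.80], [0.50, 0.45]]
print(flag_cross_loaders(L))  # -> [1, 3]: those two items cross-load
```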
What can you do?
More generally, use this as an opportunity to learn about the factor structure of the test. There's probably a story to be discerned.