Factor analysis is essentially a (constrained) linear regression model. In this model, each analyzed variable is the dependent variable, the common factors are the IVs, and the implied unique factor serves as the error term. (The constant term is set to zero because of the centering or standardizing that is implied in the computation of covariances or correlations.) So, exactly as in linear regression, there can be a "strong" assumption of normality - the IVs (common factors) are multivariate normal and the errors (unique factors) are normal, which automatically implies that the DV is normal - and a "weak" assumption of normality - only the errors (unique factors) are normal, so the DV need not be normal. Both in regression and in FA we usually adopt the "weak" assumption because it is more realistic.
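To make the regression analogy concrete, the common factor model for a (centered) variable $X_j$ can be written as

$$X_j = \lambda_{j1} F_1 + \lambda_{j2} F_2 + \dots + \lambda_{jm} F_m + U_j,$$

where the common factors $F_1, \dots, F_m$ play the role of the IVs, the loadings $\lambda_{jk}$ are the regression coefficients, and the unique factor $U_j$ is the error term (the notation here is mine, chosen for illustration).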
Among the classic FA extraction methods, only the maximum likelihood method - because it makes inferences about population characteristics - requires that the analyzed variables be multivariate normal. Methods like principal axis or minimal residuals do not require this "strong" assumption (although you can make it anyway).
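If you happen to work in Python rather than SPSS, here is a minimal sketch contrasting the two kinds of extraction; it assumes the third-party factor_analyzer package, and the toy data are my own fabrication:

```python
import numpy as np
import pandas as pd
from factor_analyzer import FactorAnalyzer

# Fabricate 200 observations on 6 items driven by 2 common factors
rng = np.random.default_rng(0)
factors = rng.normal(size=(200, 2))
loadings = rng.uniform(0.4, 0.8, size=(2, 6))
X = pd.DataFrame(factors @ loadings + rng.normal(scale=0.5, size=(200, 6)),
                 columns=[f"v{i}" for i in range(1, 7)])

# Minimal residuals: no multivariate normality required
fa_minres = FactorAnalyzer(n_factors=2, method="minres", rotation=None)
fa_minres.fit(X)

# Maximum likelihood: assumes the variables are multivariate normal
fa_ml = FactorAnalyzer(n_factors=2, method="ml", rotation=None)
fa_ml.fit(X)

print(fa_minres.loadings_)
print(fa_ml.loadings_)
```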
Please remember that even if your variables are normal individually, this does not guarantee that your data are multivariate normal.
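A quick way to see this is a toy construction of my own: take a normal variable and flip its sign at random to build a second variable. Each margin is exactly standard normal, yet the joint distribution is concentrated on two lines, which no bivariate normal distribution is.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
x = rng.normal(size=10_000)
signs = rng.choice([-1.0, 1.0], size=10_000)
y = signs * x                    # y is also standard normal marginally

# Univariate normality tests on each margin (both are exactly N(0,1))
print(stats.normaltest(x).pvalue, stats.normaltest(y).pvalue)

# ...but the pair (x, y) sits on the lines y = x and y = -x,
# so it cannot be bivariate normal.
```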
Let us accept the "weak" assumption of normality. What, then, is the potential threat coming from strongly skewed data like yours? It is outliers. If the distribution of a variable is strongly asymmetric, the longer tail becomes extra influential in computing correlations or covariances, and at the same time it raises the worry of whether that tail still measures the same psychological construct (the factor) as the shorter tail does. It might be prudent to compare whether correlation matrices built on the lower half and the upper half of the rating scale are similar or not. If they are similar enough, you may conclude that both tails measure the same thing, and not transform your variables. Otherwise you should consider transforming, or some other action to neutralize the effect of the "outlier" long tail.
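One way to carry out that check is sketched below; the midpoint split, the pairwise-complete correlations, and the fake Likert-type data are all my assumptions for illustration:

```python
import numpy as np
import pandas as pd

def split_half_correlations(df: pd.DataFrame, midpoint: float):
    """Correlation matrices from responses at or below vs. above the
    scale midpoint (masked cells become NaN; pandas then uses
    pairwise-complete observations)."""
    lower = df.where(df <= midpoint)
    upper = df.where(df > midpoint)
    return lower.corr(), upper.corr()

# Fake 1-7 ratings on four items
rng = np.random.default_rng(2)
df = pd.DataFrame(rng.integers(1, 8, size=(300, 4)),
                  columns=["q1", "q2", "q3", "q4"])

r_low, r_high = split_half_correlations(df, midpoint=4)
print((r_low - r_high).abs().max().max())  # crude index of dissimilarity
```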
There are plenty of transformations. For example, raising to a power > 1 or exponentiation is used for left-skewed data, and a power < 1 or the logarithm for right-skewed data. My own experience says that the so-called optimal transformation via Categorical PCA performed prior to FA is almost always beneficial, for it usually leads to clearer, more interpretable factors in FA; under the assumption that the number of factors is known, it transforms your data nonlinearly so as to maximize the overall variance accounted for by that number of factors.
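As a quick illustration of those power/log rules of thumb (toy distributions of my own choosing, not your data):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(3)

# Right-skewed data: the log pulls in the long right tail
right = rng.lognormal(size=1_000)
print(stats.skew(right), stats.skew(np.log(right)))    # e.g. ~5 -> ~0

# Left-skewed data on (0, 1): a power > 1 moves the skew toward 0
left = rng.beta(5, 1, size=1_000)
print(stats.skew(left), stats.skew(left ** 2))
```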
There are several things in your description that are a bit confusing. For example, you state that taking the log transform reverses the direction of the coding, but the log by itself does not reverse coding.
Your main question seems to be that when you look at individual pairwise correlations the signs of the correlations are as expected, but some of the signs of the slopes in a multiple regression are the opposite of what you expect. This is not uncommon, since the interpretation of slopes is much more complex in multiple regression models.
Consider this example (I read it recently; I don't get the credit for thinking of it): collect data on the change in various people's pockets, where the variables are the total value of the change (y), the total number of coins (x1), and the total number of coins that are not quarters (x2) - or, if using non-US coins, the number of coins that are not the highest-value coin commonly carried. Generally x1 and x2 will both be positively correlated with y, but if you do a multiple regression using both x1 and x2, the slope on x2 will be negative, because to increase the number of non-quarters without changing the total number of coins we need to trade quarters for other coins of lesser value, which decreases y. You could have something similar happening with your data: does it really make sense to increase the religious variable without the others changing? What is often more meaningful is to compare predicted outcomes for what would be considered common combinations of your predictor variables.
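That coin story is easy to verify by simulation; the sketch below uses made-up counts and coin values of my own:

```python
import numpy as np

rng = np.random.default_rng(4)
n = 1_000

quarters = rng.poisson(3, size=n)                 # quarters per pocket
others = rng.poisson(5, size=n)                   # non-quarter coins
avg_other_value = rng.uniform(1, 24, size=n)      # cents, all below 25

y = 25 * quarters + avg_other_value * others      # total value (cents)
x1 = quarters + others                            # total number of coins
x2 = others                                       # number of non-quarters

# Pairwise correlations: both positive
print(np.corrcoef(x1, y)[0, 1], np.corrcoef(x2, y)[0, 1])

# Multiple regression y ~ x1 + x2 by least squares:
# holding x1 fixed, more non-quarters means fewer quarters, so the
# slope on x2 comes out negative (about the average other-coin value minus 25)
X = np.column_stack([np.ones(n), x1, x2])
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
print(beta)
```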
Regarding 1): Factor analysis is based on correlations/covariances. When a highly skewed variable is part of a correlation, the correlation can be affected by the extreme points. This will affect the factor analysis, although I do not know of literature on the extent of the effect (it has probably been studied, though).
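A tiny demonstration of how a few long-tail points can move a correlation (fabricated data, mine rather than anything from the question):

```python
import numpy as np

rng = np.random.default_rng(5)
n = 200

x = rng.normal(size=n)
y = 0.3 * x + rng.normal(size=n)          # modest true relationship
print(np.corrcoef(x, y)[0, 1])            # roughly 0.3

# Push a handful of points out into a long right tail on both variables
x_sk, y_sk = x.copy(), y.copy()
x_sk[:5] += rng.exponential(10, size=5)
y_sk[:5] += rng.exponential(10, size=5)
print(np.corrcoef(x_sk, y_sk)[0, 1])      # typically inflated well above 0.3
```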
Regarding 2) You do not need to use the same transformation on each variable. But transforming variables in different ways and then doing factor analysis can lead to factors that are somewhat hard to interpret.
Regarding 3) I don't know SPSS, sorry.
More generally, what is the nature of these questions? Are they Likert-type scales? Physical measurements? Or what? Ideally, you could tell us what they actually mean.