To clarify Peter Flom's point: if you have normal residuals in a regression model and the model is adequate, the DV (response variable) y will be normally distributed, but with mean equal to the regression function ax + b, where x is your IV. How x is distributed depends on your design. A histogram of the ys doesn't tell you anything useful, because it is just a mixture of normal distributions with different means. Histograms of the estimated residuals, and QQ plots of the residuals, can help you determine whether the normality assumption is violated badly enough that you need to do something about it. Transformations that make the residuals look more nearly normal are one way to deal with the problem if you have it. But there are alternatives that I think are better: robust regression and the bootstrap are two that I prefer.
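To illustrate the point about checking residuals rather than y, here is a minimal sketch in Python (not your data; the dataset, coefficients, and seed are all made up for the example): fit a line, then inspect the residuals, e.g. with a Shapiro-Wilk test alongside the histogram/QQ plot.

```python
import numpy as np
from scipy import stats

# Hypothetical simulated data: normal errors by construction
rng = np.random.default_rng(0)
x = rng.normal(10, 3, 500)
y = 2.0 * x + 5.0 + rng.normal(0, 4, 500)

# Fit y = a*x + b by least squares, then look at the residuals, not at y
a, b = np.polyfit(x, y, 1)
resid = y - (a * x + b)

# A formal companion to the histogram/QQ plot: with normal errors,
# the Shapiro-Wilk test should typically not reject normality
stat, p = stats.shapiro(resid)
print(p)
```

A histogram of y itself would mix normals with different means (one per x value), which is why the residuals are the right thing to look at.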
Now Peter is right. Your residual histogram looks reasonably normal, so there is probably no need for a transformation or any other change in the model or the fitting procedure.
You seem to have the right intuition in your last paragraph.
It is possible for variables x and z in a regression to appear non-significant even though they have some effect on the dependent variable y. The following small reproducible example illustrates this.
set.seed(890)
x <- rnorm(1000, mean=10, sd=3)
z <- rnorm(1000, mean=25, sd=6)
# y depends on x only when z > 30, plus a lot of noise
y <- ifelse(z > 30, sqrt(x), 0) + rnorm(1000, mean=12, sd=10)
m1 <- lm(y ~ x + z)   # main effects only
m2 <- lm(y ~ x*z)     # main effects plus the x:z interaction
summary(m1)
summary(m2)
This produces the following output (truncated for readability):
summary(m1) coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept) 10.61151 1.79312 5.918 4.48e-09 ***
x -0.00765 0.11085 -0.069 0.945
z 0.08651 0.05514 1.569 0.117
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: 10.34 on 997 degrees of freedom
Multiple R-squared: 0.002464, Adjusted R-squared: 0.000463
F-statistic: 1.231 on 2 and 997 DF, p-value: 0.2923
summary(m2) coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept) 18.59305 5.11233 3.637 0.00029 ***
x -0.79087 0.48273 -1.638 0.10167
z -0.22747 0.19625 -1.159 0.24669
x:z 0.03077 0.01846 1.667 0.09584 .
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: 10.33 on 996 degrees of freedom
Multiple R-squared: 0.005239, Adjusted R-squared: 0.002243
F-statistic: 1.749 on 3 and 996 DF, p-value: 0.1554
As you can see, y depends on x only for some levels of z (this is your interaction). In m1, which includes only the main effects, neither x nor z appears to have a significant effect on y. In m2, the interaction term becomes significant, albeit barely (p ≈ 0.096, so only at the 10% level). Note that neither m1 nor m2 is a very good model for these data.
In terms of interpretation, you would probably say that x has a significant effect on y for some values of z. There are several ways of testing this. The one you mention in your last paragraph, excluding part of your sample based on observations' scores on a certain variable, is usually referred to as "split-sample" analysis in the social sciences. Other approaches involve calculating the marginal effect implied by the interaction, as a function of one of the two interacted variables.
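To make the marginal-effect idea concrete, here is a sketch in Python (NumPy only). The data-generating step mimics the R example above, but the random draws differ, so the numbers will not match the R output; the helper `me_x` is a made-up name for the example. The marginal effect of x in the interaction model is b_x + b_{x:z} * z, with a delta-method standard error.

```python
import numpy as np

# Hypothetical re-creation of the R example (different RNG, so numbers differ)
rng = np.random.default_rng(890)
n = 1000
x = rng.normal(10, 3, n)
z = rng.normal(25, 6, n)
y = np.where(z > 30, np.sqrt(x), 0.0) + rng.normal(12, 10, n)

# Fit y ~ x * z by OLS: columns are intercept, x, z, x:z
X = np.column_stack([np.ones(n), x, z, x * z])
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
resid = y - X @ beta
sigma2 = resid @ resid / (n - X.shape[1])
cov = sigma2 * np.linalg.inv(X.T @ X)

def me_x(z0):
    # Marginal effect of x at z = z0:  dy/dx = b_x + b_xz * z0,
    # with the delta-method standard error from the coefficient covariance
    eff = beta[1] + beta[3] * z0
    se = np.sqrt(cov[1, 1] + z0**2 * cov[3, 3] + 2 * z0 * cov[1, 3])
    return eff, se

for z0 in (15, 25, 35):
    eff, se = me_x(z0)
    print(f"z = {z0}: marginal effect of x = {eff:.3f} (SE {se:.3f})")
```

Plotting the effect and its confidence band over the range of z is the usual way to present this.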
Best Answer
The intuitive idea behind the bootstrap is this: if your original dataset was a random draw from the full population, then a subsample drawn from that sample (with replacement) also represents a draw from the full population. You can then estimate your model on each of these bootstrapped datasets. This gives you a large number of estimates, so you can, for example, look at the standard deviation of the estimates; it turns out that this is often a good approximation of the standard error. In fact, the standard error of an estimate can be thought of in exactly this way, if you imagine drawing many datasets from the true population.
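A minimal sketch of this in Python (simulated data, not your SPSS dataset): resample rows with replacement, refit the model each time, and take the standard deviation of the estimates.

```python
import numpy as np

# Hypothetical simulated data: true slope 3, error sd 2
rng = np.random.default_rng(1)
n = 200
x = rng.normal(0, 1, n)
y = 3.0 * x + rng.normal(0, 2, n)

def slope(xs, ys):
    # OLS slope for a one-predictor model with intercept
    return np.polyfit(xs, ys, 1)[0]

boot_slopes = []
for _ in range(2000):
    idx = rng.integers(0, n, n)          # draw n rows with replacement
    boot_slopes.append(slope(x[idx], y[idx]))
boot_slopes = np.array(boot_slopes)

# The SD of the bootstrap estimates approximates the standard error;
# the analytic OLS SE here is roughly 2 / (sqrt(200) * 1) ≈ 0.14
print(boot_slopes.std(ddof=1))
```

The bootstrap SD should land close to the analytic standard error, which is the point of the exercise.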
Suppose, for example, that there is one outlier in your dataset: then in many of your bootstrapped datasets that observation is not included, and for those datasets you will see the estimated coefficients change by a lot.
Similarly, you can look at the F statistic for each of the bootstrap datasets and, for example, count how many times the model was rejected. But I am not sufficiently familiar with SPSS to know what it reports as the F statistic: is it the average over the bootstrap samples?
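One way to make the "how many times was the model rejected" idea concrete is the following sketch (simulated data, not SPSS output; `overall_F` is a made-up helper computing the usual overall F test of the regression):

```python
import numpy as np
from scipy import stats

# Hypothetical data: only the first of two predictors actually matters
rng = np.random.default_rng(2)
n, k = 150, 2
X = rng.normal(size=(n, k))
y = 0.5 * X[:, 0] + rng.normal(size=n)

def overall_F(Xm, ym):
    # Overall F statistic for ym ~ Xm (model with intercept):
    # ((SS_tot - SS_res) / k) / (SS_res / (n - k - 1))
    A = np.column_stack([np.ones(len(ym)), Xm])
    beta, *_ = np.linalg.lstsq(A, ym, rcond=None)
    resid = ym - A @ beta
    ss_res = resid @ resid
    ss_tot = ((ym - ym.mean()) ** 2).sum()
    return ((ss_tot - ss_res) / k) / (ss_res / (n - k - 1))

# Count how often the overall test rejects at the 5% level
# across bootstrap resamples of the rows
crit = stats.f.ppf(0.95, k, n - k - 1)
B = 500
rejects = 0
for _ in range(B):
    idx = rng.integers(0, n, n)
    if overall_F(X[idx], y[idx]) > crit:
        rejects += 1
print(rejects / B)   # share of bootstrap fits in which the model is rejected
```

With a real effect present, this share should be high; with pure noise it would hover near the nominal 5%.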