I will take these out of order. If it is possible to establish a correspondence between the measurements in the first set and the measurements in the second set (for example, Bob's score at time 1 and Bob's score at time 2 correspond because they both came from Bob), then you should do a paired t-test. That is, you should not calculate a mean for each time, but take the difference for each student, and calculate the mean and standard deviation of those differences. The standard error of the mean difference (i.e., the denominator of the t-statistic) is that standard deviation divided by $\sqrt{n}$, where $n$ is the number of pairs. If some students did not participate at one of the occasions, they have no difference score, so their scores should be set aside. Furthermore, you do not care if a raw score is more than 2 s.d.'s higher than the mean of the raw scores, although you may care if one of your differences is more than 2 s.d.'s above the mean of the differences.
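As a rough illustration of that calculation, here is a minimal Python sketch. The student names and scores are made up, `paired_t` is a hypothetical helper name, and `None` is just my way of marking a missed occasion. It only computes the t-statistic; you would still need to look it up against a t distribution with $n - 1$ degrees of freedom.

```python
import math
import statistics

# Hypothetical scores for the same students at two occasions.
# None marks a student who missed an occasion (an assumption for this sketch).
time1 = {"Bob": 12.0, "Ann": 15.0, "Cy": 11.0, "Di": 14.0, "Ed": 13.0, "Flo": 10.0}
time2 = {"Bob": 14.0, "Ann": 18.0, "Cy": 12.0, "Di": 17.0, "Ed": None, "Flo": 11.0}

def paired_t(before, after):
    """Return (n, mean_diff, sd_diff, se, t) using complete pairs only."""
    diffs = [after[k] - before[k] for k in before
             if before[k] is not None and after.get(k) is not None]
    n = len(diffs)
    mean_diff = statistics.mean(diffs)
    sd_diff = statistics.stdev(diffs)   # sample s.d. (n - 1 denominator)
    se = sd_diff / math.sqrt(n)         # standard error of the mean difference
    t = mean_diff / se
    return n, mean_diff, sd_diff, se, t

n, mean_diff, sd_diff, se, t = paired_t(time1, time2)
print(f"n = {n}, mean difference = {mean_diff:.3f}, s.d. = {sd_diff:.3f}")
print(f"standard error = {se:.3f}, t = {t:.3f} on {n - 1} degrees of freedom")
```

Note that Ed, who missed the second occasion, simply drops out of the list of differences.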
The definition of an outlier is a data point that came from a different population than the one you want to be studying. The definition is not a data point that is far away from the rest of your data. However, we almost never know whether or not a data point came from a different distribution than the rest of our data, except that it looks really different. If you should ever spend much time conducting simulations, you will come to notice that every so often a data point that you know comes from the same distribution (because you wrote the simulation code) looks quite a bit different from the rest. This is an uncomfortable fact, but it is nonetheless true. Ultimately, you need to decide whether you believe that data point belongs there or not. There are some (potentially) helpful guidelines:
- With ~20 data points, a z-score with an absolute value well above 2 is pretty unlikely (under normality you expect about one point in 20 beyond |z| = 2, but four or five in 100);
- You can look at a plot of your data (e.g., a histogram) to see if the larger value is contiguous with the rest of your data, or if there is a large break between it and the rest;
- It can help to run your analysis both with the potential outlier and without it (often, you will get the same answer both ways, and that's reassuring);
- A final possibility is to use 'trimmed samples', that is, exclude the top and bottom 2 data points (given that you have ~20, this would be a 10% trimmed sample). Note that this lessens your power, but many people think it's more even-handed.
In the end, I'm afraid, you will still have to make a decision.
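To make a few of these guidelines concrete, here is a small Python sketch that flags differences with |z| > 2, compares the t-statistic with and without the flagged values, and computes a trimmed mean. The data are invented and the helper names (`t_stat`, `flag_extremes`, `trim`) are my own, so treat it as an illustration rather than a recipe.

```python
import math
import statistics

# Hypothetical paired differences (time 2 minus time 1); the last one looks unusual.
diffs = [1.0, 2.0, 0.0, 3.0, 1.0, 2.0, 1.0, 2.0, 0.0, 1.0,
         2.0, 3.0, 1.0, 2.0, 1.0, 0.0, 2.0, 1.0, 2.0, 12.0]

def t_stat(values):
    """One-sample t-statistic for H0: mean difference = 0."""
    se = statistics.stdev(values) / math.sqrt(len(values))
    return statistics.mean(values) / se

def flag_extremes(values, cutoff=2.0):
    """Return the values whose z-score exceeds the cutoff in absolute value."""
    m, s = statistics.mean(values), statistics.stdev(values)
    return [v for v in values if abs(v - m) / s > cutoff]

def trim(values, k):
    """Drop the k smallest and k largest values."""
    ordered = sorted(values)
    return ordered[k:len(ordered) - k]

flagged = flag_extremes(diffs)
without = [v for v in diffs if v not in flagged]
trimmed = trim(diffs, 2)  # 2 from each end of 20 values

print("flagged:", flagged)
print(f"t with all data:   {t_stat(diffs):.2f}")
print(f"t without flagged: {t_stat(without):.2f}")
print(f"trimmed mean:      {statistics.mean(trimmed):.2f} (n = {len(trimmed)})")
```

I only report the trimmed mean here, because the ordinary t formula applied to a trimmed sample understates the variability; a formal test on trimmed data needs a method designed for it.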
Lastly, you should know that the question of 1- vs. 2-tailed t-tests has long been a contentious topic. It is probably not as important as people have made it out to be, but that is the nature of these things. Personally, I'm against 1-tailed tests, but my opinion is really unimportant. A question you could ask yourself is:
What if I find that the mean decreased by a very large amount? Would I say, 'Nope, there was no change', or would I say there was a change?
If it would be possible for you to believe a negative change if the data supported it, then you really should be using a 2-tailed test, but if there is no way you would ever believe the mean went down, then a 1-tailed test is probably fine, and you just let the old grumps (like me) harrumph about it. What you should not do is run the test both ways and pick the one that gives you the result you like best (or run a 1-tailed test, notice that the mean went down by a lot, then run a 2-tailed test and call it 'significant').
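To see what is at stake, here is a rough Python sketch of how the two kinds of p-value relate. It uses the standard normal in place of the t distribution, which is my simplifying assumption (the numbers are only approximate for small samples), and `p_values` is a hypothetical helper name.

```python
from statistics import NormalDist

def p_values(t):
    """Approximate p-values for a test statistic t, using the standard normal
    in place of the t distribution (an assumption; reasonable for large n)."""
    upper = 1.0 - NormalDist().cdf(t)          # 1-tailed, H1: mean increased
    two_sided = 2.0 * min(upper, 1.0 - upper)  # 2-tailed, H1: mean changed
    return upper, two_sided

# A modest increase, then a very large decrease.
for t in (2.0, -4.0):
    upper, two_sided = p_values(t)
    print(f"t = {t:+.1f}: 1-tailed (increase) p = {upper:.4f}, "
          f"2-tailed p = {two_sided:.4f}")
```

When the statistic lands in the predicted direction, the 1-tailed p-value is half the 2-tailed one. When the mean drops by a lot, the 1-tailed test for an increase gives a p-value close to 1, so it can never register that decrease, which is exactly the situation the question above asks you to think about.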
Although a full answer could cover every aspect of a repeated measures design, I will keep it short and answer the question directly. I do think, however, that it is wise to study the topic a bit more on your own to be able to comprehend the answer completely.
You have to check whether the interaction between the group variable (control vs. intervention) and the factor variable (the measurements over time) is significant. Although you can proceed to the next step either way, it is advised that you only do so if this interaction is significant.
In the SPSS repeated measures menu you click options >> select interaction term >> drag to the right column >> click continue >> click ok. You now get descriptive statistics for each group at each time point. Using the confidence intervals you can determine whether or not the two groups differ at a given time point.
The syntax you have to add in the second step is (replace Group and factor1 with your own variable names):
/EMMEANS=TABLES(Group*factor1)
Best Answer
It seems like you want to run a between-within (mixed) repeated measures analysis of variance. In this case, you have two categorical variables: group and time. If you indicate that the time variable is the repeated measures variable, and you put group as a between-subjects factor, you will get the test you seek. You would interpret the group*time interaction to test whether the INCREASE/DECREASE (i.e. change) in brand attitude/purchase intention differs between groups. This is likely of most interest to you.
You can also examine whether the means differ over time, averaged across both groups (this is the main effect of time), and whether they differ between groups, collapsed across time (this is the main effect of group). These could be of interest, but I expect you are more interested in seeing how group membership is associated with change in the dependent variable(s).
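If you have exactly two time points (an assumption on my part; with more time points this shortcut does not apply), the group*time interaction amounts to asking whether the mean change score differs between groups. In that case the interaction F from the mixed ANOVA equals the square of a pooled-variance independent-samples t-statistic on the change scores. Here is a minimal Python sketch with invented data; `changes` and `pooled_t` are hypothetical helper names.

```python
import math
import statistics

# Hypothetical (pre, post) scores per participant in each group.
control = [(4.0, 4.5), (5.0, 5.0), (3.5, 4.0), (4.5, 4.5), (5.0, 5.5)]
intervention = [(4.0, 5.5), (4.5, 6.0), (3.5, 5.5), (5.0, 6.0), (4.0, 6.0)]

def changes(pairs):
    """Change score (post minus pre) for each participant."""
    return [post - pre for pre, post in pairs]

def pooled_t(a, b):
    """Independent-samples t-statistic with pooled variance; returns (t, df)."""
    na, nb = len(a), len(b)
    df = na + nb - 2
    pooled_var = ((na - 1) * statistics.variance(a)
                  + (nb - 1) * statistics.variance(b)) / df
    se = math.sqrt(pooled_var * (1 / na + 1 / nb))
    return (statistics.mean(a) - statistics.mean(b)) / se, df

t, df = pooled_t(changes(intervention), changes(control))
print(f"mean change, control:      {statistics.mean(changes(control)):.2f}")
print(f"mean change, intervention: {statistics.mean(changes(intervention)):.2f}")
print(f"t = {t:.3f} on {df} df (t squared = {t * t:.3f})")
```

A large t here means the change differs between the groups, which is the same conclusion the interaction term would give you in the two-time-point case.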