It shouldn't matter much which package you use, since the test statistic will always be the difference in means (or something equivalent). Small differences can come from the implementation of the Monte-Carlo methods. Trying the three packages with your data, with a one-sided test for two independent samples:
DV <- c(x1, y1)
IV <- factor(rep(c("A", "B"), c(length(x1), length(y1))))
library(coin) # for oneway_test(), pvalue()
pvalue(oneway_test(DV ~ IV, alternative="greater",
distribution=approximate(B=9999)))
[1] 0.00330033
library(perm) # for permTS()
permTS(DV ~ IV, alternative="greater", method="exact.mc",
control=permControl(nmc=10^4-1))$p.value
[1] 0.003
library(exactRankTests) # for perm.test()
perm.test(DV ~ IV, paired=FALSE, alternative="greater", exact=TRUE)$p.value
[1] 0.003171822
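To see where the small differences between the Monte-Carlo results come from, here is a minimal sketch of such a test in base R. The data are made up (x1 and y1 are just placeholders for your real data), so the numbers won't match the ones above; the point is the Monte-Carlo standard error of the estimated p-value.

```r
set.seed(1)
x1 <- rnorm(20, mean=1)                 # hypothetical group A data
y1 <- rnorm(20, mean=0)                 # hypothetical group B data
DV <- c(x1, y1)
nA <- length(x1)
diffM <- mean(x1) - mean(y1)            # observed difference in means

B <- 9999                               # number of random relabelings
resDM <- replicate(B, {
    idxA <- sample(seq_along(DV), nA)   # random assignment to group A
    mean(DV[idxA]) - mean(DV[-idxA])    # difference in means under relabeling
})

# one-sided Monte-Carlo p-value, counting the observed statistic itself
(pMC <- (sum(resDM >= diffM) + 1) / (B + 1))

# approximate Monte-Carlo standard error of that p-value:
# different runs (and different packages) will differ by about this much
sqrt(pMC * (1 - pMC) / B)
```

With B=9999 relabelings the standard error is small but nonzero, which is why coin, perm, and exactRankTests report slightly different approximate p-values above.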
To check the exact p-value with a manual calculation of all permutations, I'll restrict the data to the first 9 values.
x1 <- x1[1:9]
y1 <- y1[1:9]
DV <- c(x1, y1)
IV <- factor(rep(c("A", "B"), c(length(x1), length(y1))))
pvalue(oneway_test(DV ~ IV, alternative="greater", distribution="exact"))
[1] 0.0945907
permTS(DV ~ IV, alternative="greater", exact=TRUE)$p.value
[1] 0.0945907
# perm.test() gives different result due to rounding of input values
perm.test(DV ~ IV, paired=FALSE, alternative="greater", exact=TRUE)$p.value
[1] 0.1029412
# manual exact permutation test
idx <- seq(along=DV) # indices to permute
idxA <- combn(idx, length(x1)) # all possibilities for different groups
# function to calculate difference in group means given index vector for group A
getDiffM <- function(x) { mean(DV[x]) - mean(DV[!(idx %in% x)]) }
resDM <- apply(idxA, 2, getDiffM) # difference in means for all permutations
diffM <- mean(x1) - mean(y1) # empirical difference in group means
# p-value: proportion of group means at least as extreme as observed one
(pVal <- sum(resDM >= diffM) / length(resDM))
[1] 0.0945907
coin and exactRankTests are both from the same author, but coin seems to be more general and extensive, also in terms of documentation. exactRankTests is no longer actively developed. I'd therefore choose coin (also because of informative functions like support()), unless you prefer not to deal with S4 objects.
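As an illustration of what support() gives you (again with made-up data, since the original values aren't shown here): for an exact test it returns all attainable values of the test statistic, and dperm() gives the exact null distribution over that support.

```r
library(coin)                           # for oneway_test(), support(), dperm()
set.seed(2)
x1 <- rnorm(6)                          # hypothetical small samples
y1 <- rnorm(6)
DV <- c(DV1 <- x1, y1)
IV <- factor(rep(c("A", "B"), each=6))
ot <- oneway_test(DV ~ IV, distribution="exact")
supp <- support(ot)                     # attainable values of the statistic
head(supp)
sum(dperm(ot, supp))                    # exact null probabilities sum to 1
```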
EDIT: for two dependent samples (paired data), the syntax is
id <- factor(rep(1:length(x1), 2)) # factor for participant
pvalue(oneway_test(DV ~ IV | id, alternative="greater",
distribution=approximate(B=9999)))
[1] 0.00810081
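The paired case can also be checked with a manual exact test: under the null hypothesis, the sign of each within-pair difference is exchangeable, so all 2^n sign assignments are enumerated. A sketch with made-up paired data (x1, y1 stand in for the real measurements):

```r
set.seed(3)
x1 <- rnorm(10, mean=1)                 # hypothetical first measurement
y1 <- rnorm(10, mean=0)                 # hypothetical second measurement
d  <- x1 - y1                           # within-pair differences
n  <- length(d)

signs <- expand.grid(rep(list(c(1, -1)), n))   # all 2^n sign assignments
resMD <- apply(signs, 1, function(s) mean(s * d))  # mean difference each time

# p-value: proportion of mean differences at least as extreme as observed
(pVal <- sum(resMD >= mean(d)) / length(resMD))
```

This enumerates 2^10 = 1024 assignments, so it is only feasible for small n; for larger samples you would fall back on the Monte-Carlo approximation shown above.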
Best Answer
Some points:
Each variable that defines a category or type ("young vs. old", "male vs. female") is a factor that may influence the outcome (the score on the questionnaire). You are doing a factorial experiment. You need to list all the relevant factors and their possible values, and set up the analysis in some software. The first thing to look into is interaction between factors: for example, age usually lowers math scores, except among Asians, so the factor Race interacts with the factor Age. Then check homoscedasticity, then check normality. If all goes well, you can proceed to ANOVA and/or t-tests.
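As a sketch of such a factorial analysis in R (the data and factor names here are simulated and purely illustrative, not your actual study):

```r
set.seed(4)
n <- 20
dat <- data.frame(
    age   = factor(rep(c("young", "old"), each=2*n)),   # hypothetical factor 1
    sex   = factor(rep(c("male", "female"), times=2*n)), # hypothetical factor 2
    score = rnorm(4*n, mean=50, sd=10)                   # simulated outcome
)

fit <- aov(score ~ age * sex, data=dat)   # '*' includes the interaction term
summary(fit)                              # look at the age:sex row first

# assumption checks mentioned above:
bartlett.test(score ~ interaction(age, sex), data=dat)  # homoscedasticity
shapiro.test(residuals(fit))                            # normality of residuals
```

If the interaction is significant, the main effects should be interpreted separately within each level of the other factor rather than overall.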
Giving information to the subjects makes it a paired-sample controlled trial or something like it :) The results of this t-test still are affected by the factors.
Due to the sample size (and some probable bias), I would go with presenting the information graphically and using non-parametric tests, like the Kolmogorov-Smirnov test, to compare the two distributions.
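In R that comparison is a one-liner, plus an ECDF plot for the graphical presentation (again with simulated data standing in for the two groups):

```r
set.seed(5)
x1 <- rnorm(15, mean=0)                 # hypothetical group 1 scores
y1 <- rnorm(15, mean=0.5)               # hypothetical group 2 scores

ks.test(x1, y1)                         # two-sample Kolmogorov-Smirnov test

# graphical comparison of the two empirical distribution functions
plot(ecdf(x1), main="ECDFs of both groups")
lines(ecdf(y1), col="red")
```

Note that ks.test() assumes continuous data; with many tied values (common for questionnaire scores) its p-values are only approximate.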