First, unrelated to the question you asked, I’m not sure why you are running an ANOVA on your linear regression model. Some clarification about your variables and your goals for running this analysis might be helpful to those trying to help with your issue.
Second, to me personally it is not surprising that you are getting different results after the transformation. Remember the purpose of doing the transformation in the first place: both ANOVA and linear regression are parametric tests, and one of the several assumptions that must be met for parametric tests is that the data distribution is normal. The goal of transformation is to normalize the distribution. Sometimes normalization will kick out significant results, sometimes it will not change the results, and, as in your case, sometimes it will bring results into significance. I have had all of these happen in my own transformations. Field (Discovering Statistics Using R, 4th ed., 2012, p. 193) has a good review of the purpose of, and debate about, transforming data. He points out that transforming data can be problematic, especially in the case of ANOVA, and recommends carefully understanding why and how you are transforming the variables, as well as exploring robust methods (see below) instead of transforming the data in some circumstances. I highly recommend his very brief review of this issue, in which he refers to several articles on the topic, such as Games (1984), Psychological Bulletin 95(2): 345–47, "Data transformations, power, and skew: A rebuttal to Levine and Dunlap."
Third, if you are actually comparing different groups (a presumption for ANOVA), then you should really be making sure that each group is normally distributed, not just the dataset as a whole. You may be doing this already—but if so, the information you provide doesn't make that clear. (Field, p. 412: "In terms of normality, what matters is that distribution within groups are normally distributed".) This of course assumes that you are also satisfying all of the other across-group and within-group assumptions, such as homogeneity of variance, which were not shown above.
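To make the within-group check concrete, here is a minimal sketch of testing normality per group rather than on the pooled data. The group names and values are invented for illustration; substitute your own groups.

```python
# Sketch: Shapiro-Wilk normality test run within each group separately,
# rather than on the pooled dataset. Data below are synthetic.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
groups = {
    "low_load": rng.normal(10, 2, size=30),   # hypothetical group 1
    "high_load": rng.normal(12, 2, size=30),  # hypothetical group 2
}

for name, values in groups.items():
    w, p = stats.shapiro(values)
    print(f"{name}: Shapiro-Wilk W = {w:.3f}, p = {p:.3f}")
```

A small p-value for any one group flags a within-group departure from normality even if the pooled data happen to look normal.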
Fourth, you might consider robust tests (Field, p. 441+) instead of transformations: for example, the R package "robust", or the rlm() function from the MASS package.
The data are highly skewed & take just a few discrete values: the within-pair differences must consist predominantly of noughts & ones; no transformation will make them look much like normal variates. This is typical of count data where the counts are fairly low.
If you assume that counts for each individual $j$ follow a different Poisson distribution, & that the change from low to high load condition has the same multiplicative effect on the rate parameter of each, you can extend the idea in significance of difference between two counts to a matched-pair design by conditioning on the total count for each pair, $n_j$:
$ \sum_{j=1}^m X_{1j} \sim \mathrm{Bin} \left(\sum_{j=1}^m n_j,\ \theta\right)$
where $m$ is the number of pairs. So the analysis reduces to inference about the Bernoulli parameter in a binomial experiment: 7 "successes" out of 24 trials, if I read your graphs right.
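That reduction can be tested directly with an exact binomial test; a minimal sketch, taking the 7-out-of-24 reading of the graphs at face value and testing $\theta = 0.5$ (no load effect):

```python
# Sketch: exact binomial inference for 7 "successes" out of 24 trials,
# as in the conditional reduction described above.
from scipy.stats import binomtest

result = binomtest(k=7, n=24, p=0.5)
print(result.pvalue)               # exact two-sided p-value
print(result.proportion_ci(0.95))  # Clopper-Pearson interval for theta
```

The point estimate $7/24 \approx 0.29$ and its exact interval correspond to the log-odds calculation in the footnote below.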
Check the homogeneity of proportions across pairs, & note that if they're too homogeneous it might indicate underdispersion (relative to a Poisson) of the original count variables.
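A rough way to run that check is a chi-squared test of homogeneity on the $2 \times m$ table of condition-1 counts versus the remainder of each pair's total. The per-pair counts below are invented; the real $x_{1j}$ and $n_j$ come from your data.

```python
# Sketch: chi-squared homogeneity check of the per-pair proportions
# x_1j / n_j. The counts here are hypothetical placeholders.
import numpy as np
from scipy.stats import chi2_contingency

x1 = np.array([1, 0, 2, 1, 0, 1, 2])  # hypothetical condition-1 counts
n = np.array([3, 2, 4, 3, 2, 4, 6])   # hypothetical pair totals
table = np.vstack([x1, n - x1])       # 2 x m contingency table

chi2, p, dof, expected = chi2_contingency(table)
print(chi2, p, dof)  # dof = m - 1
```

A chi-squared statistic much smaller than its degrees of freedom would be the underdispersion signal mentioned above; much larger would suggest overdispersion.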
Note that this approach is equivalent to the generalized linear model suggested for Poisson Repeated Measures ANOVA†: while it tells you nothing about the nuisance parameters, point & interval estimates for the parameter of interest can be worked out on the back of a fag packet (so you don't need to worry about software requirements).
† Parametrize your model with the log odds $\zeta=\log_\mathrm{e} \frac{\theta}{1-\theta}$: then the maximum-likelihood estimator is $$\hat\zeta=\log_\mathrm{e}\frac{\sum x_{1j}}{\sum n_j - \sum x_{1j}}=\log_\mathrm{e}\frac{7}{24-7}\approx -0.887$$ with standard error $$\sqrt\frac{\sum n_j}{\sum x_{1j}(\sum n_j-\sum x_{1j})}=\sqrt\frac{24}{7\cdot(24-7)}\approx 0.449$$ for Wald tests & confidence intervals. If you want to adjust for over-/under-dispersion (i.e. use "quasi-Poisson" regression), estimate the dispersion parameter as Pearson's chi-squared statistic (for association) divided by its degrees of freedom (22) & multiply the standard error by its square root.
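The back-of-the-envelope arithmetic in the footnote can be reproduced in a few lines, and a Wald interval on the log-odds scale mapped back to $\theta$:

```python
# Sketch: the Wald calculation from the footnote, with
# x = sum of x_1j = 7 and n = sum of n_j = 24.
import math

x, n = 7, 24
zeta_hat = math.log(x / (n - x))      # MLE of the log odds, ~ -0.887
se = math.sqrt(n / (x * (n - x)))     # standard error, ~ 0.449

lo, hi = zeta_hat - 1.96 * se, zeta_hat + 1.96 * se
theta_lo, theta_hi = (math.exp(z) / (1 + math.exp(z)) for z in (lo, hi))
print(f"zeta = {zeta_hat:.3f}, SE = {se:.3f}")
print(f"95% Wald CI for theta: ({theta_lo:.3f}, {theta_hi:.3f})")
```

Back-transforming the endpoints keeps the interval for $\theta$ inside $(0,1)$, which a Wald interval computed directly on the proportion scale would not guarantee.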
If you're interested in comparing means, once you transform you end up with a comparison of things that are not means. If the right assumptions hold, you can still test for a difference, but the alternative won't be a location shift.
On the other, and more important, hand: if you omit essential details, you'll be more likely to end up with less useful, or even potentially misleading, answers that you won't even realize aren't the answers you need.

By leaving out the fact that you were dealing with count data, you were risking exactly that. While leaving out unnecessary detail is often useful, knowing that these are count data is pretty much central to the problem.
There are techniques for comparing means that are suitable for count data. With some more information about the kind of analysis or information you were after (even just what you would have done if the data were normal), we may be able to guide you better.
Transformation is generally less useful than an analysis suited to your actual data.