@Roland is correct that it's hard to say much without knowing what you're doing, substantively speaking. However, there are a few remarks we can still make. They fall into three categories: discovering why the model is no good, making it better, and demonstrating that it has improved.
Diagnostics
R has good linear model diagnostics. Apply them, and read up enough to know what they are telling you. To see all the available ones:
model <- lm(formula = y ~ x_1 + x_2 + x_3 + x_4 + x_5 + x_6)
plot(model, which = 1:6)  ## all of them
Each plot addresses a possible failing of the model. You might check for linearity and interactions first, because you have enough data to do something about them.
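If you want a more targeted look at linearity, one option (an optional extra beyond the plots above, not something the original plots give you) is component-plus-residual plots from the car package. A minimal sketch, assuming car is installed:

library(car)    ## install.packages("car") if needed
crPlots(model)  ## one panel per predictor; curvature suggests non-linearity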
Making it better
You have lots of data. This means that if there is non-linearity you can potentially learn its form from the data. A generalised additive model (GAM) would be a good start and will probably work better than some random set of polynomials. If you don't want or can't do that, then at least some splines might be helpful.
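To make the GAM suggestion concrete, here is a minimal sketch using the mgcv package; the data frame name dat is a placeholder, and you should of course only smooth the terms that make sense for your problem:

library(mgcv)  ## install.packages("mgcv") if needed
gam_model <- gam(y ~ s(x_1) + s(x_2) + s(x_3) + s(x_4) + s(x_5) + s(x_6),
                 data = dat)
summary(gam_model)          ## approximate significance of each smooth term
plot(gam_model, pages = 1)  ## look at the estimated smooths

If a smooth turns out essentially linear you can drop its s(); if you would rather stay with lm, natural splines via splines::ns() achieve something similar.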
Also, work your way through the interactions that make sense. These will generate apparent non-linearity and spoil predictions if not modeled. Read up about R's formula interface to see how to specify them (a small sketch follows the next paragraph).
Polynomials can work, but without knowing what your data actually are it's hard to say whether they'd be a good idea. For the same reason it's hard to say whether your predictor variables might be usefully transformed (logged, etc.).
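To make the formula syntax concrete, here is a purely illustrative sketch; the particular interaction, polynomial, and log terms are made up, so substitute whatever makes sense for your variables:

## x_1 * x_2 expands to x_1 + x_2 + x_1:x_2 (main effects plus their interaction)
## poly(x_3, 2) adds an orthogonal quadratic term; log(x_4) assumes x_4 > 0
model2 <- lm(y ~ x_1 * x_2 + poly(x_3, 2) + log(x_4) + x_5 + x_6)
summary(model2)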
Confirming it's better
Since your only task is to make the model better, the only quantity worth working with is held-out prediction error. Do whatever you do on a subset of the data, then try it out on the held-out set. (Done iteratively, this is cross-validation.) You have to decide what counts as 'better' prediction in the context of your problem, but a common choice is root mean squared error. Here again I'm assuming that you actually do have data that are potentially conditionally normal, as your choice of lm implies.
Practically this would involve writing a function to compute that quantity (or one suitably like it) from a set of predictions and a set of held-out data points. Then do your fiddling around and optimizing of the model on the other part of the data, use predict to get predictions on the held-out set, and apply the function.
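As a rough sketch, assuming your data sit in a data frame called dat (the 70/30 split and all names here are placeholders):

rmse <- function(predicted, observed) sqrt(mean((observed - predicted)^2))

set.seed(1)
train_idx <- sample(nrow(dat), size = round(0.7 * nrow(dat)))  ## e.g. 70% for fitting
train <- dat[train_idx, ]
test  <- dat[-train_idx, ]

fit <- lm(y ~ x_1 + x_2 + x_3 + x_4 + x_5 + x_6, data = train)
rmse(predict(fit, newdata = test), test$y)  ## held-out prediction error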
Note that performance on held-out data is not any of the quantities you are wondering about. Those are all in-sample measures and will typically overestimate prediction performance on new data.
Caveats
Finally, note that prediction may just be hard. You may not have the right variables: most likely some important ones are missing, and you can do nothing more about that without knowing what they are.
And that's about as much generic advice as can be given for a bunch of variables called $y, x_1, \ldots, x_6$...
If you set up the data in one long column with A and B as a new column, you can then run your regression model as a GLM with a continuous time variable and a nominal "experiment" variable (A, B). The ANOVA output will give you the significance of the differences between the parameters. The "intercept" is the common intercept, the "experiment" factor reflects the difference between the intercepts (actually overall means) of the two experiments, the "time" factor is the common slope, and the interaction is the difference between the experiments with respect to the slope.
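As a small sketch of what this looks like in R (the names long, y, time, and experiment are illustrative, not from your data):

## long has one row per observation: y, time, and experiment ("A" or "B")
fit <- lm(y ~ time * experiment, data = long)  ## same fit as a gaussian glm()
summary(fit)  ## the time:experiment row tests the difference in slopes
anova(fit)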
I have to admit I cheat (?) and run the models separately first to get the two sets of parameters and their errors and then run the combined model to acquire the differences between the treatments (in your case A and B)...
Best Answer
As you correctly notice, the $R^2$ associated with each model is very similar ($0.235$ vs $0.231$). In general, neither of the two is "bad": whether an $R^2$ is good or bad depends entirely on the actual application. As mentioned in my comment, unless you have good reasons to believe that your explanatory variables have strong linear relations with your dependent variable, and you have not omitted any other variables with strong influence, an $R^2 \approx 0.23$ is far from catastrophic.
The obvious thing to suggest is to look at some kind of information criterion (the Akaike Information Criterion or the Bayesian Information Criterion) to see whether either of the two models is clearly better. Let me point out that these are not silver bullets; they make their own assumptions that have to be met (e.g. for the BIC you need your models to be nested).
The factor "next to $R^2$" is the adjusted $R^2$: this is essentially the coefficient of determination, but penalised to account for the number of explanatory variables in your model. Short of cross-validating or bootstrapping your model, I would suggest using the AIC with a correction for finite sample sizes, dubbed AICc. You can do that with the AICc function available in the package AICcmodavg. Cross Validated has some excellent threads on the perils of automatic model selection (e.g. here and here); I highly recommend reading them. To paraphrase the late George E. P. Box: your model is certainly wrong, you just want to see whether it is at all useful. :)
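As a small sketch of how that might look, with fit1 and fit2 standing in for your two fitted lm objects:

library(AICcmodavg)  ## install.packages("AICcmodavg") if needed
AICc(fit1)           ## small-sample corrected AIC for each model
AICc(fit2)
## or compare them in a single table
aictab(cand.set = list(fit1, fit2), modnames = c("model 1", "model 2"))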