Residual plots are excellent, but the first and most basic step is to plot the original data where possible.
You should look at and show us the raw time series. It seems that you have three large negative residuals for 330, 331, 332. You don't tell us what the labels mean, but perhaps they are observation numbers.
A plot of observed and fitted values versus time of day might be as useful as a plot versus time sequence, or even more so.
As you report that you used logarithms, it is a puzzle how values can be, say, 5 lower than typical on your logarithmic scale. You don't tell us which base you used. Even for base e, those points are a long way below the fitted values.
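To put a number on that puzzle: a minimal arithmetic sketch, assuming natural logarithms (your software may use a different base), of what a residual of -5 on the log scale means on the original scale:

```python
import math

# A residual of -5 on a natural-log scale is a multiplicative factor
# exp(-5) on the original scale -- roughly 1/148 of the fitted value.
factor = math.exp(-5)
print(round(factor, 4))  # 0.0067
```

That is, such an observation is less than 1% of its fitted value, which is why those points stand out so sharply.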
It is also far from obvious that the logarithmic transformation was a good idea anyway: the distribution of your fitted values is very left-skewed.
Assuming that each vertical stripe corresponds to a separate hour, the pattern seems to be less activity for about 8 hours (night?) and more for about 16 hours (day?). Your high $R^2$ is probably higher than deserved because the transformation spreads out the lower values. An observed versus fitted plot would show that more dramatically.
EDIT: Thanks for showing the plot. The very large negative residuals now appear to be a side effect of using an inappropriate logarithmic transformation. Plot log response versus response for the range of your data to see how the values are stretched out at the lower end.
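A tiny numeric sketch (synthetic values, not your data) of that stretching: the same additive difference of 0.01 is enormous near zero and negligible near 1 on the log scale.

```python
import numpy as np

# The same raw difference of 0.01 at two parts of the scale:
lo_gap = np.log(0.02) - np.log(0.01)  # near zero: log(2) ~= 0.693
hi_gap = np.log(1.01) - np.log(1.00)  # near one:  ~= 0.00995
print(lo_gap / hi_gap)  # the gap near zero is ~70x larger in log units
```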
I'd repeat the suggestion to plot observed versus time of day. That's what the regression "sees". There is no time series analysis here, but just time series data treated with regression.
The plots you have reproduced are generally used to detect model misspecifications, usually curvature and heteroscedasticity.
What people do, and R does automatically, is superimpose a non-parametric lowess curve on this figure. This is a local regression technique that offers an informal assessment of whether higher-order terms need to be included in the model, among other things. You would see that if the local regression line is curved. Then you might want to examine whether your model improves upon introducing them.
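A sketch of that idea in Python, using statsmodels' `lowess` on synthetic data with deliberate curvature (R's `plot.lm` overlays the equivalent curve for you):

```python
import numpy as np
from statsmodels.nonparametric.smoothers_lowess import lowess

# Synthetic residuals with built-in curvature (illustration, not real data)
rng = np.random.default_rng(0)
fitted = np.sort(rng.uniform(0.0, 10.0, 200))
resid = 0.1 * (fitted - 5.0) ** 2 - 2.0 + rng.normal(0.0, 0.5, 200)

# lowess returns an (n, 2) array: sorted x values and smoothed y values.
smooth = lowess(resid, fitted, frac=0.6)

# Overlay smooth[:, 1] against smooth[:, 0] on the residual-vs-fitted
# scatter; a clearly curved line suggests omitted higher-order terms.
```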
Heteroscedasticity means unequal variances and as such is a violation of the OLS assumptions. Although the estimator is still consistent and unbiased, it is no longer minimum-variance (BLUE), so that's something you need to guard against. The way to judge whether your errors are heteroscedastic is by observing the scatter in your figure. Unequal scatter across the horizontal zero line can indicate different variances and might be worth investigating further. There are formal tests for heteroscedasticity, depending on how serious you are.
When it comes to outliers, these plots do not tell us much. Sure, there seem to be some points further away from the others, but we need to remember that the residuals do not have equal variances. A better way of detecting outliers is therefore to plot standardized residuals against fitted values, where values above three or below minus three would suggest the presence of an outlier.
Conclusions from plots can be quite subjective though and they cannot take the place of formal tests. You can find a lot more on the internet, so take a look.
Best Answer
What precisely is the difference between the top and bottom rows? I guess that the bottom row is after the square root transformation. If so, then residuals and fitted values are on different scales in the two rows.
What's most obviously missing here are plots of observed values versus the factor(s) in your model. I can't work out how many there might be. What would help also would be (1) showing the data if possible (2) giving precise model statement (R syntax would probably be transparent enough here, or other syntax if it's not R).
One criterion of a good model (and very far from the only such) is that the residuals appear to lack structure. It's more important that the fitted part appears to show the important structure; the two often go together, but neither quite implies the other.
So, the marked grouping at bottom left is certainly a puzzle, but I don't see that any of the information you give us allows me to venture an explanation, unless it's a side effect of the transformation, e.g. zeros and small values in the data being pulled apart. But you do have a handle, in so far as you can group your points, say into left and right groups, and then see how that grouping is echoed in other plots. A separation rule might be, say, fitted = -0.04. (A scatter plot of square root of observed versus observed may seem trivial, but it would bring home exactly what the transformation is doing.)
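A minimal sketch (synthetic values, not your data) of what that square-root-versus-observed plot would show: zeros and small values get pulled apart while large values get compressed.

```python
import numpy as np

# Synthetic values only
x = np.array([0.0, 0.01, 0.04, 1.0, 4.0, 9.0])
s = np.sqrt(x)

print(s[1] - s[0])  # gap 0 -> 0.01 becomes 0.1: stretched 10x
print(s[5] - s[4])  # gap 4 -> 9 (raw gap 5) becomes 1.0: compressed 5x
```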
In the bottom row, the range of your fitted values is about 0.04 and the range of your residuals is about 0.35, almost an order of magnitude higher. That is consistent with the statement of small treatment effects and implies that the grouping is more subtle than it appears here. In practice, however, there is usually a real story behind a grouping as distinct as you have.