I am experimenting with developing a linear model (lm), model1, which predicts a temperature response variable given three independent variables: sea.distance, altitude and latitude. model1 is calculated as follows:
model1 <- lm(temperature~altitude+sea.distance+latitude, data=meteodata1)
Below you can see the results of model2 = predict(model1).
Question 1: Can someone please try to explain in plain English the result table below?
Question 2: I am trying to plot this new model with plot(model2), but the output plot is rather uninteresting. Below is the above-mentioned "poor" plot.
fit             lwr               upr
Min.   :12.61   Min.   :-17.001   Min.   :33.81
1st Qu.:12.61   1st Qu.:-17.001   1st Qu.:33.81
Median :14.14   Median : -9.723   Median :38.01
Mean   :14.14   Mean   : -9.723   Mean   :38.01
3rd Qu.:15.68   3rd Qu.: -2.445   3rd Qu.:42.22
Max.   :15.68   Max.   : -2.445   Max.   :42.22
I would like to know what arguments I should pass to plot(model2) so that the plot reflects the predictions. Which values should I use to compare my initial y values with the predicted ones?
I understand that so far the above plot represents only my dataset(?) and nothing more. ?predict and help(predict) don't help much when it comes to interpreting the results. The same goes for similar questions.
Edit: I am actually getting, or at least expecting, two values as my prediction output. They belong to the same variable but differ in value, since I feed them differently in my dataset. To make things clearer, here is an example of my data:
temperature  station.id  latitude  longtitude  sea.distance  altitude
1            S5          2         1           1             500
2            S5          2         1           1             500
3            S5          2         1           1             500
4            S6          1         2           1             300
5            S6          1         2           1             300
6            S6          1         2           1             300
Hope this helps.
Edit2: I am afraid I can't provide a reproducible example of this problem, for various reasons. As requested, I have changed the variable names to make it clearer what I am trying to accomplish.
Best Answer
The table you want explained in plain English is just a table of summary statistics. Minimum, maximum, mean, median and quartiles are explained in most introductory statistics texts.
The minimum and lower quartile are the same, as are the maximum and the upper quartile. The mean and median appear in each case to be exactly halfway between those two distinct values. So, at a guess, you are somehow or other feeding just two distinct values to the modelling commands, or getting two distinct values out. (I don't use R except very occasionally, but I guess one is true, or both are.)
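The pattern described above is easy to reproduce. The following is only a minimal sketch in R, using a made-up vector `x` that holds two distinct values repeated equally often (the two numbers are borrowed from the fit column of the table; the repetition count of three is an assumption). It is meant to show how such a summary can arise, not to reconstruct the asker's actual data.

```r
# Made-up vector: two distinct values, each repeated three times.
# The values are borrowed from the "fit" column of the summary table;
# the number of repetitions is an assumption.
x <- rep(c(12.61, 15.68), each = 3)

# Same kind of summary as in the question's table.
s <- summary(x)
print(s)

# Lower quartile, median and upper quartile for comparison with min/max/mean.
q <- unname(quantile(x, c(0.25, 0.5, 0.75)))
print(q)
print(c(min = min(x), mean = mean(x), max = max(x)))
```

With equally many copies of each value, the lower quartile falls among the copies of the smaller value and the upper quartile among the copies of the larger one, while the mean and median both land midway between them.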
Similarly, your graph is a bit fuzzy, but I see just two distinct points.
The short story is that somehow you are showing something that doesn't seem compatible with a sensible linear model with one response and three predictors.
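To illustrate why the fit cannot behave like a sensible three-predictor model, here is a hedged sketch in R. The data frame mirrors the structure of the example rows in the question (two stations, predictors constant within each station); the temperature values are purely illustrative and the real `meteodata1` may well differ. Under those assumptions, `lm` can estimate only an intercept and one slope, and the fitted values take just two distinct values.

```r
# Made-up data mirroring the example rows in the question: two stations,
# three rows each, with predictors constant within a station.
meteodata1 <- data.frame(
  temperature  = c(1, 2, 3, 4, 5, 6),            # illustrative values only
  station.id   = rep(c("S5", "S6"), each = 3),
  latitude     = rep(c(2, 1), each = 3),
  sea.distance = rep(1, 6),
  altitude     = rep(c(500, 300), each = 3)
)

model1 <- lm(temperature ~ altitude + sea.distance + latitude,
             data = meteodata1)

# Coefficients that cannot be estimated from these data are reported as NA.
print(coef(model1))

# Only two distinct predictor combinations, so only two distinct fitted values.
fit <- unname(fitted(model1))
print(data.frame(observed = meteodata1$temperature, fitted = fit))

# Observed against fitted, with the line y = x for reference.
plot(fit, meteodata1$temperature, xlab = "Fitted", ylab = "Observed")
abline(0, 1)
```

Plotting observed against fitted values, rather than calling plot() on the prediction object directly, is one common way to compare the original y values with the predicted ones.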
If you showed (or said much more about) your data, you might get much better advice, although this is perilously close to a "how do I use R" question, which would be off-topic here.