I am a beginner in plotting/graphing. Kindly explain how to interpret the pairwise scatter plots generated using the pairs() function in R.
The data contains 323 columns of different indicators of a disease, although I see that many columns are the mean, std, slope, min, max and so on of a single parameter. For example, for an attribute like 'walking', there are other attributes like sum.slope.walking, meansquares.slope.walking, sd.slope.walking and so on.
Is it okay to select any one parameter in such a case (such as meansquares.slope..)?
What are the patterns to look out for to identify relationships between attributes?
Best Answer
If you have a number of different measurements in your data.frame, then `pairs` will show scatterplots of all pairs of these measures.

As example data, take a data.frame with four different measures called `a`, `b`, `c` and `d` on 100 individuals.

In the plot that `pairs` draws for this data, the first row shows a scatter plot of `a` and `b`, then one of `a` and `c`, and then one of `a` and `d`. The second row shows `b` and `a` (symmetric to the first), `b` and `c`, and `b` and `d`, and so on.

`pairs` does not compute sums or mean squares or anything of that kind. If you see such quantities in your `pairs` plot, then they are columns in your data.frame.

What patterns to look for? In my example you find no pattern between `a` and `b`, a linear pattern between `a` and `c`, and a curved, non-linear pattern between `a` and `d`. Look for patterns that might be of interest to your statistical questions.

Please note that whilst asking for the interpretation of a plot is a statistical question, questions on how to use `R` alone are not on topic on Cross Validated. You should ask questions on `R` programming on Stack Overflow.
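
The example described in this answer can be reproduced with a minimal sketch like the one below. The seed, coefficients and noise levels are assumptions chosen only to produce the three patterns mentioned (none, linear, curved); they are not the values of the original example.

```r
# Assumption: the seed and all coefficients below are illustrative only.
set.seed(1)
n <- 100                                   # 100 individuals, as in the answer

dat <- data.frame(a = rnorm(n),            # reference measure
                  b = rnorm(n))            # independent of a: no pattern
dat$c <- 2 * dat$a + rnorm(n, sd = 0.5)    # linear in a, plus noise
dat$d <- dat$a^2 + rnorm(n, sd = 0.5)      # curved (quadratic) in a, plus noise

# Scatterplot matrix: the panel in row i, column j plots
# column i of dat (y-axis) against column j (x-axis).
pairs(dat)

# Pearson correlations as a numeric companion to the plot.
round(cor(dat), 2)
```

The correlation matrix is only a rough companion to the plot: Pearson correlation measures linear association, so it picks up the `a`/`c` relationship but can be close to zero for a curved relationship such as `a`/`d`, which is one reason to look at the scatterplots themselves.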