Solved – How to cluster longitudinal variables

clustering

I have a set of variables containing longitudinal data measured from day 0 to day 7. I am looking for an appropriate approach that clusters these longitudinal variables (not cases) into groups. I tried analyzing the data set separately at each time point, but the results were difficult to interpret.

I looked into the SAS procedure PROC SIMILARITY because there is an example on its website; however, I do not think it is the right approach. Some previous studies used exploratory factor analysis at each time point, but that is not an option in my study either, because it gave unreasonable results.

I hope someone can suggest some ideas, ideally with an implementation available in software such as SAS or R. Any suggestion is appreciated!


Here is a short example:

id time   V1   V2   V3   V4   V5   V6   V7   V8   V9  V10
 2    0    8    7    3    7    6    6    0    0    5    2
 2    1    3    5    2    6    5    5    1    1    4    2
 2    2    2    3    2    4    4    2    0    0    2    2
 2    3    6    4    2    5    3    2    1    2    3    3
 2    4    5    3    4    4    3    3    4    3    3    3
 2    5    6    4    5    5    6    3    3    2    2    2
 2    6    7    5    2    4    4    3    3    4    4    5
 2    7    7    7    2    6    4    4    0    0    4    3
 4    0   10    7    0    2    2    6    7    7    0    9
 4    1    8    7    0    0    0    9    3    3    7    8
 4    2    8    7    0    0    0    9    3    3    7    8
 4    3    8    7    0    0    0    9    3    3    7    8
 4    4    5    7    0    0    0    9    3    3    7    8
 4    5    5    7    0    0    0    9    3    3    7    8
 4    6    5    7    0    0    0    9    3    3    7    8
 4    7    5    7    0    0    0    9    3    3    7    8
 5    0    9    6    1    3    2    2    2    3    3    5
 5    1    7    3    1    3    1    3    2    2    1    3
 5    2    6    4    0    4    2    4    2    1    2    4
 5    3    6    3    2    3    2    3    3    1    3    4
 5    4    8    6    0    5    3    3    2    2    3    4
 5    5    9    6    0    4    3    3    2    3    2    5
 5    6    8    6    0    4    3    3    2    3    2    5
 5    7    8    6    0    4    3    3    2    3    2    5

Best Answer

So, you have p variables, each measured at t time points on the same n individuals. One way to proceed is to compute t p×p (dis)similarity matrices among the variables, one per time point, and apply INDSCAL-model Multidimensional Scaling to them. It will give you two low-dimensional maps (say, of 2 dimensions). The first map shows the coordinates of the p variables in the space of the dimensions and reflects groupings among them, if there are any. The second map shows the weights (i.e. importance, or salience) of the dimensions for each of the t matrices.
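The first step, building one p×p dissimilarity matrix per time point, might look roughly like the sketch below. It uses a small excerpt of the example data (V1 to V3 at times 0 and 1 only) and plain Euclidean distance between variables across individuals. The choice of Euclidean distance and the helper name `dissimilarity_by_time` are assumptions made for illustration; the answer does not prescribe a particular (dis)similarity measure. The resulting matrices would then be passed to whatever INDSCAL routine your software provides; that step is not shown here.

```python
from collections import defaultdict
from math import sqrt

# Excerpt of the example data: (id, time, V1, V2, V3), times 0 and 1 only.
rows = [
    (2, 0, 8, 7, 3),
    (2, 1, 3, 5, 2),
    (4, 0, 10, 7, 0),
    (4, 1, 8, 7, 0),
    (5, 0, 9, 6, 1),
    (5, 1, 7, 3, 1),
]
var_names = ["V1", "V2", "V3"]


def dissimilarity_by_time(rows, n_vars):
    """Return {time: p x p matrix} of distances between variables.

    Assumption: Euclidean distance between two variables' value
    profiles across individuals; any other measure could be swapped in.
    """
    by_time = defaultdict(list)
    # Sorting by id keeps individuals in the same order at every time point.
    for _id, time, *values in sorted(rows):
        by_time[time].append(values)

    result = {}
    for time, profiles in by_time.items():
        # columns[j] holds variable j's values across individuals.
        columns = list(zip(*profiles))
        result[time] = [
            [
                sqrt(sum((a - b) ** 2 for a, b in zip(columns[i], columns[j])))
                for j in range(n_vars)
            ]
            for i in range(n_vars)
        ]
    return result


matrices = dissimilarity_by_time(rows, len(var_names))
for time, matrix in sorted(matrices.items()):
    print("time", time)
    for name, row in zip(var_names, matrix):
        print(name, [round(d, 2) for d in row])
```

With only three individuals the distances are not very informative; on real data it may be worth standardizing the variables first, or using a correlation-based dissimilarity, so that the differing scales of the variables do not dominate.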
