MATLAB: K-means for stock market timeseries

k-means - timeseries

Hi
I am doing research to test the accuracy of different volatility models in forecasting stock market volatility, using index time series. I need to cluster the data with k-means into two groups. I already have the time series from different stock markets, and they all have the same length. I just need to split each of them into two subsets: the first will be used to train the models and the second to test them and evaluate their forecasts. Could you give me the code directly, or at least show me how to get started with k-means in MATLAB?
I seriously look forward to hearing from you very soon.
Regards, Abdelrazzaq.

Best Answer

The function kmeans is part of the Statistics and Machine Learning Toolbox (formerly the Statistics Toolbox) in MATLAB. The following code demonstrates how to use k-means to cluster data into two groups and pull out the individual groups.
% Generate random data
nSamples = 100;
sampleWidth = 5;
X = rand(nSamples,sampleWidth);
trainingSetSize = 20;
% Separate into two groups using squared Euclidean distance
% IDX will be size nSamples x 1, where each element is the cluster label of
% the corresponding row of X
IDX = kmeans( X , 2 , 'distance' , 'sqEuclidean');
% separate the data into two groups
G1 = X(IDX == 1 , : );
G2 = X(IDX == 2 , : );
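Because kmeans starts from randomly chosen centroids, the labels can change from run to run. The following is a minimal sketch of one way to make the result repeatable and to inspect the cluster centroids. The seed value, the number of replicates, and the random data are illustrative assumptions, not part of the original answer.

```matlab
% Fix the random seed so the clustering is repeatable (seed value is arbitrary)
rng(0);

% Random stand-in data, as in the example above (assumption)
nSamples = 100;
sampleWidth = 5;
X = rand(nSamples, sampleWidth);

% Run k-means 5 times from different starting centroids and keep the best
% solution; C holds one centroid per row
[IDX, C] = kmeans(X, 2, 'Distance', 'sqeuclidean', 'Replicates', 5);

% Number of samples assigned to each cluster
clusterSizes = [sum(IDX == 1), sum(IDX == 2)];
```

Here `C` is a 2-by-sampleWidth matrix, and comparing its rows gives a quick sense of how the two groups differ.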
As a result of the k-means clustering, the samples within each group will be similar to one another, so the two groups would likely make very poor training and test data for an ML algorithm. A much more suitable function for generating training and test sets is randsample, which is in the same toolbox. Sampling the population uniformly at random gives your ML algorithm more diverse training data and helps improve its robustness.
% Randomly select trainingSetSize samples without replacement
rsIDX = randsample( size(X,1) , trainingSetSize );
% Create a logical mask for the selected values
tsMASK = false( nSamples , 1 );
tsMASK( rsIDX ) = true;
% Separate the data into training and test samples.
GTraining = X( tsMASK , : );
GTest = X( ~ tsMASK , : ) ;
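If the models depend on the order of the observations, as volatility models fitted to a return series usually do, a random split breaks that ordering. A simple alternative is to split chronologically, training on the earlier part of the series and testing on the later part. The sketch below assumes a single series stored as a column vector; the variable names, the simulated data, and the 80/20 ratio are hypothetical choices for illustration.

```matlab
% Simulated return series standing in for one stock index (assumption)
rng(1);
returns = 0.01 * randn(500, 1);

% Fraction of the series used for training (assumed value)
trainFraction = 0.8;
nTrain = floor(trainFraction * numel(returns));

% Earlier observations train the model; later ones test its forecasts
trainSet = returns(1:nTrain);
testSet  = returns(nTrain+1:end);
```

With this split the test set always lies after the training set in time, so the forecasts are evaluated on data the model has not seen.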