MATLAB: How to divide a dataset into train and test sets (train/test split)

data splitting, machine learning

Hello,
I'm trying to split my dataset into X_train, X_test, y_train and y_test, in a similar fashion to Python's train_test_split (scikit-learn), but I'm struggling to find a method to do so. Is this possible in MATLAB?
I've tried doing the following:
seed = 42;
rng(seed);
cv = cvpartition(size(dataset,1), "HoldOut", 0.2);
idx = cv.test;
X_train = subsample(~idx,:);
y_test = subsample(idx,:);
but I'm not entirely sure how to go about deriving X_test and y_train.
Does anybody have a good solution to this? Apologies, as I'm fairly new to MATLAB!
Thank you!

Best Answer

Does the variable subsample contain both the 'X' and 'y' values? If yes, then you don't need to create separate variables for 'X' and 'y'. Just use
subsample_train = subsample(cv.training, :)
subsample_test = subsample(cv.test, :)
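If the features and labels still need to be separated afterwards, they can be sliced out of the two halves. The following is only a sketch: it assumes subsample is a numeric matrix whose last column holds the 'y' values, which the question does not confirm, and it uses synthetic data so that it runs on its own. The double-quoted strings assume R2017a or later.

```matlab
% Assumption: subsample is a numeric matrix whose LAST column holds y.
rng(42);                                          % reproducible split
subsample = [rand(50, 2), randi([0 1], 50, 1)];   % synthetic: 2 features + 1 label column

% Hold out 20% of the rows for testing
cv = cvpartition(size(subsample, 1), "HoldOut", 0.2);
subsample_train = subsample(training(cv), :);
subsample_test  = subsample(test(cv), :);

% Separate features and labels after the split
X_train = subsample_train(:, 1:end-1);
y_train = subsample_train(:, end);
X_test  = subsample_test(:, 1:end-1);
y_test  = subsample_test(:, end);
```

If the labels live in a different column, or subsample is a table, the column indexing would need to change accordingly.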
However, if subsample contains only the 'X' values and another variable (say, 'y') contains the y values, then you can do something like this:
X_train = subsample(cv.training, :);
y_train = y(cv.training, :);
X_test = subsample(cv.test, :);
y_test = y(cv.test, :);
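Putting the pieces together, here is a minimal, self-contained sketch of this second case. The data are synthetic: the 100-by-3 matrix subsample and the label vector y are made up for illustration, and the index names trainIdx and testIdx are not from the question. The double-quoted strings assume R2017a or later.

```matlab
% Synthetic stand-ins for the real data (assumed shapes, for illustration only)
rng(42);                              % fix the seed so the split is reproducible
subsample = rand(100, 3);             % 100 observations, 3 features
y = randi([0 1], 100, 1);             % one label per observation

% Hold out 20% of the rows for testing
cv = cvpartition(size(subsample, 1), "HoldOut", 0.2);
trainIdx = training(cv);              % logical vector, true for training rows
testIdx  = test(cv);                  % logical vector, true for test rows

% Index the features and the labels with the SAME logical vectors
X_train = subsample(trainIdx, :);
y_train = y(trainIdx, :);
X_test  = subsample(testIdx, :);
y_test  = y(testIdx, :);
```

Because X and y are indexed with the same logical vectors, each row of X_train stays paired with its label in y_train. For a classification problem, passing the label vector itself, as in cvpartition(y, "HoldOut", 0.2), should give a split stratified by class instead of a purely random one.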