MATLAB: Split information into two columns

delimiterexpressionregexregexperssionxlsreadxlsx

Hello, I have an excel file and it has 2 columns and a lot of lines. The first column has numbers and the second column has words like this "House_red" or "House_x_red" or "House". I used this to extract the second column
[~,houses] = xlsread('houses.xlsx', 'B:B');
And right now I have all my information separated numbers from strings, but in the strings, I want to have a column with House and the other with Red or x_red, where house will be in the first column and red or x_red in the second column. Is there a way to split a string into two columns knowing that one line can be "House_Red" and another line is "House_X_Red"?

Best Answer

Try this:
strs = {"House_red"; "House_x_red"; "House"; "Apartment"}; % Data
chrs = cellfun(@char, strs, 'Uni',0); % Convert ‘string’ To ‘char’
L1 = cellfun(@isempty, strfind(chrs, '_')); % Find Elements Without Underscores ‘_’
chrs(L1) = cellfun(@(x)sprintf('%s_', x), chrs(L1), 'Uni',0); % Add Terminating Underscores To Them
sep = regexp(chrs, '_', 'split', 'once'); % Split On First Underscore
H = cellfun(@(x)x(:,1), sep);
C = cellfun(@(x)x(:,2), sep);
HC = [H, C]
HC =
4×2 cell array
{'House' } {'red' }
{'House' } {'x_red' }
{'House' } {0×0 char}
{'Apartment'} {0×0 char}
You indicated that your original data was a string array, so I included a step that converts it to a char array. The code resolves the problem of "House" not having an underscore (_) by adding one before splitting the char array ‘chrs’ with by the first underscore. This is the easiest way I can devise to create an appropriate output for ‘sep’, so that each row vector is a (1x2) cell array. After that, the rest is straightforward.