MATLAB: Sorting and storing character and numeric values

cell arrays, character, numeric, sorting

Hi everyone, I'm still trying to grasp Matlab so bear with me.
I have a dataset that I import from a csv file that has one text column and the rest are float columns. What I want to do is sort the rows by the numeric values but still retain the text column. My problem is that I am not sure how to go about this. I've tried cell arrays and have had little luck. What is the ideal Matlab approach for going about this?
Thanks, mattg555

Best Answer

One approach may be to do something like the following: separate the text data and numeric data into two separate arrays. Both would have the same number of rows, say M, with the text (cell) array having only the one column and the numeric data matrix having N columns, one for each of the N different columns of numbers. Call these arrays textData and numericData. (Note that textData is most likely a cell array because the strings in different rows may have different lengths.)
Now append a column to the numericData matrix, with values from 1 to M (so row i is assigned the value i). This column acts as a set of unique identifiers for the numeric rows, and these identifiers are in fact indices into the textData array, so no matter how we sort the rows of numericData, the identifier can always be used to refer back to the matching text string in textData. (So now numericData is an Mx(N+1) matrix.) The code for doing this appending is simply:
[M,N] = size(numericData);
numericData = [numericData [1:M]'];
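As a minimal sketch of the split described above, assuming the imported data is already in a cell array (the variable raw and its contents here are hypothetical stand-ins, not the poster's actual csv), the two arrays and the id column could be built like this:

```matlab
% Hypothetical stand-in for the imported csv: one text column, then floats.
raw = {'gamma', 95, 45.5;
       'alpha', 76, 61.0;
       'beta',  95,  7.25};

textData    = raw(:, 1);                % M-by-1 cell array of strings
numericData = cell2mat(raw(:, 2:end));  % M-by-N double matrix

[M, N] = size(numericData);
numericData = [numericData (1:M)'];     % append row ids as column N+1
```

How raw is obtained depends on the import function used, so that step is left out here.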
So now you want to sort the rows by the numeric values. Do you mean to sort on the first column, second column, etc.? If that is the case, then you can use the sortrows command, and indicate the column order of the sort. For example, if your numericData is:
numericData =
95 45 92 41 13 1 84
95 7 73 89 20 74 52
95 7 73 5 19 44 20
95 7 40 35 60 93 67
76 61 93 81 27 46 83
76 79 91 0 19 41 1
then we append the id column as:
[M,N] = size(numericData);
numericData = [numericData [1:M]'];
numericData =
95 45 92 41 13 1 84 1
95 7 73 89 20 74 52 2
95 7 73 5 19 44 20 3
95 7 40 35 60 93 67 4
76 61 93 81 27 46 83 5
76 79 91 0 19 41 1 6
Then we sort on the first column, second, etc. as
numericData = sortrows(numericData,[1:N])
numericData =
76 61 93 81 27 46 83 5
76 79 91 0 19 41 1 6
95 7 40 35 60 93 67 4
95 7 73 5 19 44 20 3
95 7 73 89 20 74 52 2
95 45 92 41 13 1 84 1
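The rows are now sorted by the first column, then the second column, and so on, and each row still has its unique identifier (in the last column). Since that identifier is the row's original index, textData{numericData(i,end)} gives the text string that belongs to sorted row i.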
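To close the loop, here is a short sketch that uses the id column to reorder the text column alongside the numbers. The labels in textData are made up for illustration; the numeric values are the ones from the example above. It also shows an alternative that skips the id column, since sortrows can return the row permutation as a second output.

```matlab
numericData = [95 45 92 41 13  1 84;
               95  7 73 89 20 74 52;
               95  7 73  5 19 44 20;
               95  7 40 35 60 93 67;
               76 61 93 81 27 46 83;
               76 79 91  0 19 41  1];
% Hypothetical text column, one label per row.
textData = {'row1'; 'row2'; 'row3'; 'row4'; 'row5'; 'row6'};

[M, N] = size(numericData);

% Approach from the answer: append ids, sort, then look the text back up.
withIds    = [numericData (1:M)'];
sorted     = sortrows(withIds, 1:N);
ids        = sorted(:, end);      % original row numbers, in sorted order
sortedText = textData(ids);       % text column in the same order

% Alternative: let sortrows return the permutation directly.
[sortedNumeric, idx] = sortrows(numericData, 1:N);
sortedText2 = textData(idx);
```

Both approaches give the same ordering for this data (original rows 5, 6, 4, 3, 2, 1). The second avoids storing an extra column, while the first keeps the identifier attached to each row through any later sorts.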