MATLAB: Converting cell array to double

cell array, performance

I have a cell array of size 100000 x 20 and need to convert it to a numeric array. I used the function
str2double(StringData);
but it takes a long time to convert. I have the same issue when converting numeric values to strings using arrayfun:
arrayfun(@(x) num2str(x,4),Data,'Uni',0);
Is there a way to improve the performance?

Best Answer

Here are a few ways that you could try to improve performance:
  • Avoid converting to string. This conversion is always going to be a slow process, and the only time that you really need to convert large amounts of data to or from strings is when reading or writing text files (in which case you should be using file-handling functions anyway). So if your code converts massive arrays to strings, then it could probably do with a major revision.
  • Avoid repeating the conversion. Review your code: could the conversion be performed once, and the resulting string values reused?
  • Use low-level functions. Have a look at str2double: you will see that it is basically just a wrapper for sscanf, and num2str is a wrapper for sprintf. So you could save a few precious milliseconds by calling the low-level functions directly. Keep in mind that if you do this, you will need to manage the array sizes and classes very precisely.
  • Do not store each string value separately; rather, keep multiple values in one char array or string. Many string-parsing functions will operate on the whole array: e.g. sscanf will keep applying the given format until it reaches a non-matching value. Some file-reading functions also accept strings as input and can parse them in one go, e.g. textscan.
  • Cheat. If you are dealing with single-digit integers (zero to nine inclusive), then you can do a very direct and fast conversion by subtracting the character '0', e.g. '9'-'0' -> 9.
  • Consider vectorization. Yes, vectorization even helps with string conversion. For example:
sscanf(sprintf('%s\v',CellOfStrings{:}),'%f\v')
is much faster than
cellfun(@(s)sscanf(s,'%f'),CellOfStrings)
although it only works for data of class double, and it returns the values as a single column vector.
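
To tie the vectorized approach back to the 100000 x 20 case in the question, here is a minimal sketch of both directions. The sample values in StringData and Data are made up for illustration, and the '%.4g' format is only an assumed stand-in for num2str(x,4), so check it against your own data. Note that sscanf stops at the first cell that does not parse as a number, so (unlike str2double, which returns NaN for such cells) the reshape will fail if any cell is empty or non-numeric.

```matlab
% Hypothetical sample data standing in for the large cell array.
StringData = {'1.5', '2', '-3.25'; '4e2', '0.001', '7'};

% Cell of strings -> double: join every cell with a delimiter and parse once.
% StringData{:} expands in column-major order, so reshape restores the layout.
joinedIn = sprintf('%s\v', StringData{:});
Numbers  = reshape(sscanf(joinedIn, '%f\v'), size(StringData));

% Double -> cell of strings: print the whole matrix in one sprintf call,
% then split on the delimiter (the trailing comma is dropped first).
Data      = [1.23456 20; 0.5 1234.567];
joinedOut = sprintf('%.4g,', Data);
Strings   = reshape(strsplit(joinedOut(1:end-1), ','), size(Data));

% Single-digit shortcut: subtracting '0' maps digit characters to their values.
Digits = '0123456789' - '0';
```

Both conversions call the low-level function once for the whole array instead of once per element, which is where the speed-up comes from. Time it on your real data (e.g. with timeit) before committing to it.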