MATLAB: Cellfun vs. varfun applied to column of table

Tags: cellfun, str2double, str2num, type conversion, varfun

Hello,

I have a table (only showing one column here for demo purposes):

T = table({'147956, 154414'; '1, 7439'; '93053, 101815'; '50151, 54585; 827532, 828570; 5846728, 5848716'; '1063488, 1079019'},'VariableNames',{'indices'})

What I need to do is convert the character vector in each cell into a numeric (double) array.

First I tried to use varfun to apply the str2double function to the indices variable:

T.indices = varfun(@str2double,T,'InputVariables','indices')

That ran without error, but converted all of my 1×1 char arrays into 1×1 double arrays with NaN in each.

I tried the same thing with str2num:

T.indices = varfun(@str2num,T,'InputVariables','indices')

But it gave me the error:

Applying the function 'str2num' to the variable 'indices' generated the following error:

Input must be a character vector or string scalar.

What ultimately worked was using cellfun to apply the str2num function:

T.indices = cellfun(@str2num,T.indices,'UniformOutput',false)

But I'm not sure why. What is the difference between passing "T,'InputVariables','indices'" into varfun and passing "T.indices" into cellfun? In what case would you use varfun? And is there any way to pass in my variable so that str2double returns the correct output?

Thanks in advance for any insights,

Rob

Best Answer

varfun passes entire variables from your table into your function. It calls your function once per table variable. The variable in your table is indices, a 5x1 cell array of character vectors:

>> T.indices
ans =
  5×1 cell array
    {'147956, 154414'                                }
    {'1, 7439'                                       }
    {'93053, 101815'                                 }
    {'50151, 54585; 827532, 828570; 5846728, 5848716'}
    {'1063488, 1079019'                              }

This runs without error because str2double can handle cell array input:

T.indices = varfun(@str2double,T,'InputVariables','indices')

It is similar to:

>> str2double(T.indices)
ans =
   NaN
   NaN
   NaN
   NaN
   NaN

You get an error here because str2num cannot accept cell arrays:

T.indices = varfun(@str2num,T,'InputVariables','indices')

It is similar to:

>> str2num(T.indices)
Error using str2num (line 35)
Input must be a character vector or string scalar.
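As for when varfun is the right tool, here is a minimal sketch. The numeric table N and the wrapper perCell are made up for illustration and are not part of the question. varfun fits when the function is meant to operate on a whole column at once; for a cell column, the function you hand to varfun has to do the per-cell work itself. Note also that varfun returns a table, so the result has to be extracted before it can replace a column.

```matlab
% Hypothetical numeric table, used only for illustration.
N = table([1; 2; 3], [10; 20; 30], 'VariableNames', {'a', 'b'});

% varfun calls @mean once per variable, passing the whole column each time.
M = varfun(@mean, N);          % 1x2 table, one mean per variable

% For a cell column like the one in the question, wrap cellfun so that
% the function given to varfun handles the cells one by one.
T = table({'147956, 154414'; '1, 7439'}, 'VariableNames', {'indices'});
perCell = @(c) cellfun(@str2num, c, 'UniformOutput', false);
R = varfun(perCell, T, 'InputVariables', 'indices');

% varfun returns a table, so pull the converted column out of it.
converted = R{:, 1};           % 2x1 cell array of 1x2 doubles
```

In practice, calling cellfun directly on T.indices is simpler for a single column; the wrapper only pays off when the same conversion is applied to several variables in one call.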

cellfun, on the other hand, calls your function over and over again, once per cell in your cell array. cellfun passes the contents of each cell when calling your function. The contents of each cell in T.indices is a character vector, so str2num doesn't complain, and you don't get an error here:

T.indices = cellfun(@str2num,T.indices,'UniformOutput',false)

It is similar to:

>> {str2num(T.indices{1}); str2num(T.indices{2}); str2num(T.indices{3});...
    str2num(T.indices{4}); str2num(T.indices{5})}
ans =
  5×1 cell array
    {1×2 double}
    {1×2 double}
    {1×2 double}
    {3×2 double}
    {1×2 double}
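The same per-cell behaviour can be written as an explicit loop. This is only a sketch on a shortened copy of the data (the variable names c, viaLoop and viaCellfun are illustrative), but it shows what cellfun does and why 'UniformOutput', false is needed here.

```matlab
c = {'147956, 154414'; '50151, 54585; 827532, 828570'};

% Explicit loop: one str2num call per cell, results stored back in a cell.
viaLoop = cell(size(c));
for k = 1:numel(c)
    viaLoop{k} = str2num(c{k});
end

% Same thing with cellfun. 'UniformOutput', false is required because each
% result is a non-scalar array (1x2 and 2x2 here), which cellfun cannot
% pack into a single numeric array.
viaCellfun = cellfun(@str2num, c, 'UniformOutput', false);

sameResult = isequal(viaLoop, viaCellfun);   % true
```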

As for the different behavior of str2num and str2double, consider the following:

>> str2double('1,000')
ans =
        1000
>> str2num('1,000')
ans =
     1     0

From the documentation for str2double (roughly),

"If [input] is a character vector or string scalar, then [output] is a numeric scalar."

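str2double therefore tries to read the whole text as a single number and treats the comma as a thousands separator, whereas str2num evaluates the text as a MATLAB expression, so the comma separates two elements of a row vector. A string that holds several numbers is not a single number, which is why str2double returns NaN for your data.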
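To see the difference on text shaped like the data in the question, here is a small sketch (the variable names s, A, b and c are illustrative):

```matlab
s = '50151, 54585; 827532, 828570';

% str2num evaluates the text like the contents of a bracketed MATLAB
% expression: commas separate columns and semicolons separate rows.
A = str2num(s);            % 2x2 double

% str2double expects the text to represent one number, so a list of
% numbers comes back as NaN.
b = str2double(s);         % NaN

% A single comma-grouped number is read as one value by str2double.
c = str2double('1,000');   % 1000
```

Because str2num evaluates its input, it is best kept to text you trust.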

You could use something like sscanf, regexp, or strsplit to split your string and, if still needed, use str2double to convert the pieces to doubles.
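Two possible ways to do that split are sketched below. Both assume the layout from the question: non-negative integers, two per row, rows separated by semicolons. The variable names are illustrative.

```matlab
s = '50151, 54585; 827532, 828570; 5846728, 5848716';

% Option 1: split into rows on ';', split each row on ',', then convert.
rows  = strsplit(s, ';');                      % 1x3 cell of 'a, b' pieces
pairs = cellfun(@(r) str2double(strsplit(r, ',')), rows, ...
                'UniformOutput', false);       % each cell is a 1x2 double
M1 = vertcat(pairs{:});                        % 3x2 double

% Option 2: let regexp pick out the runs of digits, then reshape.
% Assumes exactly two numbers per row.
tokens = regexp(s, '\d+', 'match');            % 1x6 cell of digit strings
M2 = reshape(str2double(tokens), 2, []).';     % 3x2 double, row by row
```

Either option can be applied to the whole column by wrapping it in an anonymous function and calling cellfun with 'UniformOutput', false.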


Related Solutions

MATLAB: Does STRMATCH in MATLAB cause an error for a cell array containing a numeric value

In this example:

str = 9;
strs = strvcat('This is a string array','with two rows');
ind = strmatch(str,strs);

the normal STRMATCH is used, since the first input is not a cell array. In this case, 9 is treated as an ASCII code, which corresponds to the tab character. That character can then be compared with the other strings.

In this example:

str ={9};
strs = strvcat('This is a cell array of strings','with two rows');
ind = strmatch(str,strs);

@CELL/STRMATCH will be used, since the first input is a cell array. In this case, the code checks that the cell array contains strings, and converts it to a char array:

if iscellstr(str), str = char(str); end

A cell array of strings is a special case of a cell array, commonly used to store strings of different lengths together, and there are many instances where cell arrays of strings and char arrays can be used interchangeably. However, {9} is not a cell array of strings, so it is not converted to a character array, and the following check then throws the error:

if ~ischar(str) | ~ischar(strs)
    error('Requires character array or cell array of strings as inputs.')
end
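The checks described above can be tried directly at the prompt. This is a small sketch using only built-in functions; the variable names are illustrative.

```matlab
% iscellstr is true only when every cell holds a character vector.
isStrCell = iscellstr({'abc'; 'de'});     % true
isNumCell = iscellstr({9});               % false: the cell holds a number

% char pads shorter rows with spaces when converting a cell array of
% character vectors to a char array.
padded = char({'abc'; 'de'});             % 2x3 char array

% For a plain numeric input, char maps the code 9 to the tab character.
isTab = isequal(char(9), sprintf('\t'));  % true
```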

MATLAB: Str2num and commas

I think str2num treats '20' and '000' as two separate numbers because of the comma. What output are you looking for? If you use str2double, it returns 20000. Hope that serves your purpose.
