MATLAB: Cellfun vs. varfun applied to column of table

cellfun, str2double, str2num, type conversion, varfun

Hello,
I have a table (only showing one column here for demo purposes):
T = table({'147956, 154414'; '1, 7439'; '93053, 101815'; '50151, 54585; 827532, 828570; 5846728, 5848716'; '1063488, 1079019'},'VariableNames',{'indices'})
What I need to do is convert the char array in each cell into a 1×n double array.
First I tried to use varfun to apply the str2double function to the indices variable:
T.indices = varfun(@str2double,T,'InputVariables','indices')
That ran without error, but converted each of my char arrays into a scalar double containing NaN.
I tried the same thing with str2num:
T.indices = varfun(@str2num,T,'InputVariables','indices')
But it gave me the error:
Applying the function 'str2num' to the variable 'indices' generated the following error:
Input must be a character vector or string scalar.
What ultimately worked was using cellfun to apply the str2num function:
T.indices = cellfun(@str2num,T.indices,'UniformOutput',false)
But I'm not sure why. What is the difference between passing "T,'InputVariables','indices'" into varfun and passing "T.indices" into cellfun? In what case would you use varfun? And is there any way to pass in my variable so that str2double returns the correct output?
Thanks in advance for any insights,
Rob

Best Answer

varfun passes entire variables from your table into your function. It calls your function once per table variable. The variable in your table is indices, a 5×1 cell array of character vectors:
>> T.indices
ans =
5×1 cell array
{'147956, 154414' }
{'1, 7439' }
{'93053, 101815' }
{'50151, 54585; 827532, 828570; 5846728, 5848716'}
{'1063488, 1079019' }
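A minimal sketch of what varfun hands to the function, assuming a small stand-in table (the names demo, N, seenClass, seenCount and M are made up for illustration). The first two calls show that the function receives the whole cell array in a single call. The last call shows the kind of job varfun suits: one summary per numeric variable.

```matlab
% Stand-in table with one cell-array variable (hypothetical data).
demo = table({'1, 2'; '3, 4'}, 'VariableNames', {'indices'});

% The function is called once and receives the whole variable.
seenClass = varfun(@class, demo, 'InputVariables', 'indices', ...
    'OutputFormat', 'cell');      % class of the argument the function received
seenCount = varfun(@numel, demo, 'InputVariables', 'indices', ...
    'OutputFormat', 'uniform');   % number of elements in that argument

% Typical use: apply one function to each numeric variable.
N = table([1; 2; 3], [4; 5; 6], 'VariableNames', {'a', 'b'});
M = varfun(@mean, N);             % one-row table with mean_a and mean_b
```

Here seenClass reports a cell array and seenCount reports two elements, so the function never sees an individual character vector.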
This runs without error because str2double can handle cell array input:
T.indices = varfun(@str2double,T,'InputVariables','indices')
It is similar to:
>> str2double(T.indices)
ans =
NaN
NaN
NaN
NaN
NaN
You get an error here because str2num cannot accept cell arrays:
T.indices = varfun(@str2num,T,'InputVariables','indices')
It is similar to:
>> str2num(T.indices)
Error using str2num (line 35)
Input must be a character vector or string scalar.
cellfun, on the other hand, calls your function over and over again, once per cell in your cell array. cellfun passes the contents of each cell when calling your function. The contents of each cell in T.indices is a character vector, so str2num doesn't complain, and you don't get an error here:
T.indices = cellfun(@str2num,T.indices,'UniformOutput',false)
It is similar to:
>> {str2num(T.indices{1}); str2num(T.indices{2}); str2num(T.indices{3});...
str2num(T.indices{4}); str2num(T.indices{5})}
ans =
5×1 cell array
{1×2 double}
{1×2 double}
{1×2 double}
{3×2 double}
{1×2 double}
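The per-cell behavior can also be written as an explicit loop. This is only a sketch with a shortened copy of the data (the names C, viaLoop, viaCellfun and same are made up), meant to show that cellfun with 'UniformOutput',false gives the same result as calling str2num on each cell's contents in turn.

```matlab
% Shortened copy of the data from the question.
C = {'147956, 154414'; '1, 7439'; '50151, 54585; 827532, 828570'};

% Explicit loop: one str2num call per cell, on the cell's contents.
viaLoop = cell(size(C));
for k = 1:numel(C)
    viaLoop{k} = str2num(C{k});
end

% Same thing with cellfun; results stay in a cell array because
% the outputs have different sizes.
viaCellfun = cellfun(@str2num, C, 'UniformOutput', false);

same = isequal(viaLoop, viaCellfun);   % true when both approaches agree
```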
As for the different behavior of str2num and str2double, consider the following:
>> str2double('1,000')
ans =
1000
>> str2num('1,000')
ans =
1 0
From the documentation for str2double (roughly),
"If [input] is a character vector or string scalar, then [output] is a numeric scalar."
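Because str2double must return a single scalar, it treats the comma in '1,000' as a thousands separator. str2num instead evaluates the text as a MATLAB expression, so the comma separates the elements of a row vector. Text such as '147956, 154414' does not represent one number, which is why str2double returns NaN for each of your entries.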
You could use something like sscanf, regexp, or strsplit to split each string into its individual numbers and then, if still needed, use str2double to convert the pieces to doubles.
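One possible way to do that, as a sketch rather than the only approach: pull out the runs of digits with regexp, convert them with str2double, and reshape into pairs. It assumes every entry holds non-negative integers grouped in pairs, as in the question; the helper name toPairs is made up for illustration.

```matlab
T = table({'147956, 154414'; '1, 7439'; '93053, 101815'; ...
    '50151, 54585; 827532, 828570; 5846728, 5848716'; ...
    '1063488, 1079019'}, 'VariableNames', {'indices'});

% Hypothetical helper: extract digit runs, convert, reshape to n-by-2.
% Assumes non-negative integers that always come in pairs.
toPairs = @(s) reshape(str2double(regexp(s, '\d+', 'match')), 2, []).';

% cellfun still does the per-cell work; str2double now receives a
% cell array of single numbers, which it converts element by element.
T.indices = cellfun(toPairs, T.indices, 'UniformOutput', false);
```

Unlike str2num, this does not evaluate the text as MATLAB code, which may matter if the strings come from an untrusted source.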