MATLAB: Save in a string the values obtained after varfun and grouping

unique values to string after grouping

Hi everyone, I'm struggling with this problem: I apply the following function to a table:
sectors = varfun(@(v) numel(unique(v)), fullTable, 'GroupingVariables', {'ID1','Date'}, 'InputVariables', 'Country');
In this way, after grouping for ID1 and Date, I obtain the number of unique countries in the output field. Is there a way to obtain a list of such countries in a string? I tried adding a function to the list only for printing the values but I receive an error of too many outputs. If I add mat2str(unique(v)) it does not work since it is a categorical array.
sectors = varfun(@(v) [numel(unique(v)), unique(v)], fullTable, 'GroupingVariables', {'ID1','Date'}, 'InputVariables', 'Country');
Thanks for your help

Best Answer

If you want to create a new table with two new variables, one for the number of unique countries and one for the list of unique countries then a) you have to use rowfun instead of varfun, and b) you can't use an anonymous function (since you need a function with 2 outputs) but a normal m function:
function [count, list] = uniquecountries(countries)
%inputs:
% countries: a list of countries (categorical)
% count: the number of unique countries (double)
% list: the list of unique countries, joined by a ', ' (scalar cell array of a char vector)
%Note: list is returned as a scalar cell array instead of a char vector so that rowfun can concatenate the output of uniquecountries
validateattributes(countries, {'categorical'}, {});
list = unique(countries);
count = numel(list);
list = {strjoin(cellstr(list), ', ')};
end
Then:
sections = rowfun(@uniquecountries, fullTable, ...
'GroupingVariables', {'ID1','Date'}, ...
'InputVariables', 'Country', ...
'NumOutputs', 2, ...
'OutputVariableNames', {'Country_count', 'Country_list'})
Alternatively, if you don't want a separate m function, you can have two calls to rowfun or varfun to create two tables that you then join. It will be much slower however:
sectionscount = rowfun(@(v) numel(unique(v)), fullTable, ...
'GroupingVariables', {'ID1','Date'}, ...
'InputVariables', 'Country', ...
'OutputVariableNames', 'Country_count');
sectionslist = rowfun(@(v) {strjoin(cellstr(unique(v)), ', ')}, fullTable, ...
'GroupingVariables', {'ID1','Date'}, ...
'InputVariables', 'Country', ...
'OutputVariableNames', 'Country_list');
sections = join(sectionscount, sectionslist)