Can somebody explain to me the difference between "multimodal" and "multivariate"?
For example, I have a dataset which contains different information. All information objects are connected together by a timestamp. Is this dataset multimodal or multivariate? If I create an algorithm for clustering these data, should I call this algorithm multimodal or multivariate?
Best Answer
Put very simply, "multi-modal" refers to a variable (or dataset) whose distribution has more than one mode, i.e. more than one peak, whereas "multi-variate" refers to a dataset in which there is more than one variable.
Here is a simple demonstration, coded with R:
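A minimal sketch of such a demonstration, using only base R. The seed, sample sizes, means, and the helper name `dens_at` are assumptions chosen for illustration, not values taken from the original answer.

```r
set.seed(1)  # arbitrary seed, assumed here only for reproducibility

# One variable, one mode: a single normal distribution
x_unimodal <- rnorm(1000, mean = 0, sd = 1)

# One variable, two modes: an equal mixture of two well-separated normals
x_bimodal <- c(rnorm(500, mean = -3, sd = 1), rnorm(500, mean = 3, sd = 1))

# Two variables, each unimodal: multivariate but not multimodal
X_multivariate <- cbind(v1 = rnorm(1000), v2 = rnorm(1000))

# Hypothetical helper: estimated density of x at the points `at`
# (kernel density estimate followed by linear interpolation)
dens_at <- function(x, at) {
  d <- density(x)
  approx(d$x, d$y, xout = at)$y
}

dens_at(x_unimodal, c(-3, 0, 3))  # highest in the middle
dens_at(x_bimodal,  c(-3, 0, 3))  # dip in the middle, peaks near -3 and 3
dim(X_multivariate)               # 1000 rows, 2 columns

# Optional plots when run interactively
if (interactive()) {
  hist(x_bimodal, breaks = 40, main = "univariate, bimodal")
  plot(X_multivariate, main = "bivariate, unimodal")
}
```

The two properties are independent: `x_bimodal` is a single variable with two modes, while `X_multivariate` has two variables and a single mode.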
That's the gist of it. When you have response and regressor variables, and you want to fit a model that relates them, the use of "multivariate" depends on the nature of the mapping. When there is only one response and one covariate, we say this is simple regression; if there is more than one covariate, we say it is multiple regression; and if there is more than one response variable, we call it multivariate regression. In your case, I gather you are interested in clustering / unsupervised learning, so these distinctions don't really apply.
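The three regression terms can be made concrete with `lm()` from base R. This is only a sketch on simulated data; the variable names, coefficients, and seed are assumptions made for illustration.

```r
set.seed(2)  # arbitrary seed (assumption)
n  <- 100
x1 <- rnorm(n)
x2 <- rnorm(n)
y1 <- 1 + 2 * x1 - x2 + rnorm(n)        # simulated response 1
y2 <- 3 - x1 + 0.5 * x2 + rnorm(n)      # simulated response 2

fit_simple       <- lm(y1 ~ x1)                   # one response, one covariate
fit_multiple     <- lm(y1 ~ x1 + x2)              # one response, several covariates
fit_multivariate <- lm(cbind(y1, y2) ~ x1 + x2)   # several responses

length(coef(fit_simple))      # 2: intercept and one slope
length(coef(fit_multiple))    # 3: intercept and two slopes
dim(coef(fit_multivariate))   # 3 x 2: one column of coefficients per response
class(fit_multivariate)       # "mlm" "lm"
```

Only the last fit is "multivariate" in the regression sense, because the left-hand side holds more than one response.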
However, the clustering aspect makes this a little more interesting. In order to cluster successfully, you generally want your data to be multimodal in the full data space. The clusters / latent groupings are then identified by finding a partition that separates the data into unimodal subsets, each more coherent than the original (unpartitioned) superset.
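To illustrate, here is a small sketch with data that are both multivariate (two variables) and multimodal (two groups). It uses `kmeans()` from base R as one possible clustering method; the answer does not name a specific algorithm, and the group centres, sizes, and seed are assumptions.

```r
set.seed(3)  # arbitrary seed (assumption)

# Two latent groups in a two-variable space, centred at (-3, -3) and (3, 3)
g1 <- cbind(rnorm(200, mean = -3), rnorm(200, mean = -3))
g2 <- cbind(rnorm(200, mean =  3), rnorm(200, mean =  3))
X  <- rbind(g1, g2)
truth <- rep(1:2, each = 200)   # known here only because the data are simulated

# Partition the data into two clusters; nstart reruns from several random starts
km <- kmeans(X, centers = 2, nstart = 10)

# Cross-tabulate the simulated groups against the recovered clusters
table(truth, km$cluster)

# The within-cluster sum of squares is smaller than the total sum of squares,
# i.e. each subset is more coherent than the unpartitioned data
km$tot.withinss < km$totss
```

Each recovered cluster is roughly unimodal on its own, whereas the pooled data have two modes in the joint space.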