[Math] Mathematical notation for the unique values in a vector

notation vectors

Can anyone tell me how I can describe the following problem mathematically?

I have two vectors: (i) vector $\boldsymbol{V} = [v_1,…,v_n]^T$, containing values, some of which are repeated; and (ii) vector $\boldsymbol{P} = [p_1,…,p_n]^T$, containing the probability of occurrence of each corresponding value in $\boldsymbol{V}$.

I would like to get a new vector $\boldsymbol{V}^* = [v_1^*,…,v_m^*]^T$, containing only the unique values from vector $\boldsymbol{V}$ (i.e. $m < n$), and the corresponding probability vector $\boldsymbol{P}^* = [p_1^*,…,p_m^*]^T$, in which the probabilities from vector $\boldsymbol{P}$ that have the same value in vector $\boldsymbol{V}$ are summed up.

What would be the best way to denote this using only math symbols?

Thanks in advance!

Best Answer

I'd say this question belongs more to the domain of computer science. I would typically address the problem as follows.

Let $\textbf{V} \in F^n$ be an $n$-dimensional vector over some field $F$. $\textbf{V}$ can additionally be represented as an enumerated list (AKA array), such that each component of $\textbf{V}$ is indexed by an integer from $0$ to $n-1$.

Notation 0: For every vector $\textbf{V}$, let $\mathrm{dim}(\textbf{V})$ be its dimension, that is, the number of its elements.

Notation 1: Let $\textbf{V}[i]$ be the $i$-th element of $\textbf{V}$, the element corresponding to the $i$-th enumerated dimension, such that $\textbf{V}[i] \in F$ and $i \in \mathbb{Z}$, $i \in [0, n)$

Notation 2: An index vector can be defined as $\textbf{IND}_n \in \mathbb{Z}^m$, such that $\textbf{IND}_n[i] \in [0, n) \; \; \forall i \in [0, m)$

Notation 3: We can also use index vectors as indices for other vectors. Let $\textbf{W} = \textbf{V}[\textbf{IND}_n]$ be another vector, such that $\textbf{W} \in F^m$, where $m = \mathrm{dim}(\textbf{IND}_n)$. In particular, $\textbf{W}[i] = \textbf{V}[\textbf{IND}_n[i]] \; \; \forall i \in [0, m)$
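As a rough illustration of Notation 3, here is a minimal sketch in plain Python, with lists standing in for vectors. The function name `take` and the sample values are assumptions made for this example; they do not appear in the notation above.

```python
from typing import List, Sequence, TypeVar

T = TypeVar("T")


def take(v: Sequence[T], ind: Sequence[int]) -> List[T]:
    """Return W with W[i] = v[ind[i]] for every i in [0, len(ind))."""
    n = len(v)
    # Every entry of an index vector must lie in [0, n) (Notation 2).
    for k in ind:
        if not 0 <= k < n:
            raise IndexError(f"index {k} is outside [0, {n})")
    return [v[k] for k in ind]


# Hypothetical sample data: V has n = 4 elements, IND has m = 3 elements.
V = [10, 20, 10, 30]
IND = [0, 1, 3]
W = take(V, IND)  # W has the same length as IND
```

The explicit range check is a design choice for this sketch: a bare Python list would silently accept negative indices, which the definition of an index vector excludes.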

Finally, we need to define a function that finds unique elements. I have never seen a standard notation for it. In MATLAB, for example, a similar function is simply called "unique". Usually people just define a function and explain what it does. We will define a function UniqueIndex$(\textbf{V})$ that returns a vector of indices into $\textbf{V}$, one for each distinct value of $\textbf{V}$. We must also specify their order. For example, we will require that repeated elements of $\textbf{V}$ are skipped and that the order of the first occurrences of each element is preserved.

Then, we can find our indices using $\textbf{IND}_n$ = UniqueIndex$(\textbf{V})$
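One possible reading of UniqueIndex, sketched in plain Python under the ordering rule stated above (keep the index of the first occurrence of each value, skip later repeats). The name `unique_index` and the sample vector are assumptions for illustration, and the sketch also assumes the elements are hashable.

```python
from typing import Hashable, List, Sequence


def unique_index(v: Sequence[Hashable]) -> List[int]:
    """Return the indices of the first occurrence of each distinct value in v."""
    seen = set()  # values encountered so far
    ind = []      # indices of first occurrences, in their original order
    for i, x in enumerate(v):
        if x not in seen:
            seen.add(x)
            ind.append(i)
    return ind


# Hypothetical sample vector with repeated values.
V = [3, 1, 3, 2, 1]
IND = unique_index(V)
```

For this sample, the values 3 and 1 each appear twice, so only their first positions are kept along with the position of 2.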

We can find the vector of unique elements $\textbf{V}^* = \textbf{V}[\textbf{IND}_n]$

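If we have another vector $\textbf{P}$ whose elements correspond to the elements of $\textbf{V}$ in the same order, the reduced vector $\textbf{P}[\textbf{IND}_n]$ keeps only the probability at the first occurrence of each value. To sum the probabilities of repeated values instead, as the question asks, define $\textbf{P}^*[j] = \sum_{i \in [0, n),\ \textbf{V}[i] = \textbf{V}^*[j]} \textbf{P}[i] \; \; \forall j \in [0, m)$, where $m = \mathrm{dim}(\textbf{V}^*)$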

This is more or less exactly what you would do in Python to actually perform this operation.
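A minimal end-to-end sketch in plain Python, assuming the goal stated in the question: the probabilities of repeated values are summed rather than selected by index. The function name `unique_with_summed_probs` and the sample data are hypothetical, and the values are assumed to be hashable.

```python
from typing import Dict, Hashable, List, Sequence, Tuple


def unique_with_summed_probs(
    v: Sequence[Hashable], p: Sequence[float]
) -> Tuple[List[Hashable], List[float]]:
    """Return (V*, P*): distinct values of v in first-occurrence order,
    and the summed probability for each of them."""
    if len(v) != len(p):
        raise ValueError("v and p must have the same length")
    totals: Dict[Hashable, float] = {}
    for x, q in zip(v, p):
        # Accumulate the probability mass of every occurrence of x.
        totals[x] = totals.get(x, 0.0) + q
    # dicts keep insertion order (Python 3.7+), so the keys appear in
    # the order in which each value was first seen.
    return list(totals.keys()), list(totals.values())


# Hypothetical sample data; the probabilities sum to 1.
V = [3, 1, 3, 2, 1]
P = [0.25, 0.125, 0.25, 0.25, 0.125]
V_star, P_star = unique_with_summed_probs(V, P)
```

A dictionary keyed by value does the grouping in a single pass. Note that exact comparison of floating-point values in `V` can be fragile in practice, so real data may need rounding or a tolerance before grouping.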
