Reference: Machine Learning Foundations lecture by Yishay Mansour. I am struggling to understand the explanation when they arrive at $\chi_S(e_j)$.
Could someone provide some details? Thanks.
Best Answer
The VC dimension is the largest value of $N$ for which there exists a data set of $N$ points that can be shattered by the concept class. Shattering a data set means that every possible labeling of the data set can be realized by picking the right function from the concept class.
In this case, the feature space is the set of all possible length-$n$ bit vectors. If we want to say that the VC dimension is at least $N$, we need to exhibit a data set of size $N$ that can be shattered by $C_6 = \{\chi_S: S \subseteq \{1, \dots, n\} \}$. The example above says to pick the data set to be the $n$ unit vector columns of the $n \times n$ identity matrix. Now suppose we have some arbitrary labeling of these vectors $b_1, \dots, b_n$. The claim is that we can pick an element of $C_6$ that achieves this labeling.
Indeed, choose $\chi_S \in C_6$ with $S = \{i: b_i = 1 \}$. Then for any $e_j$ with label $b_j$, our chosen $\chi_S$ will classify this vector as 1 if and only if $j \in S$. But by construction, $j \in S$ if and only if $b_j = 1$. So the classifier recovers the true labels for all $n$ points in the data set. We have exhibited a size $n$ data set that can be shattered by the concept class, so we know the VC dimension is at least $n$.
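If it helps to see the argument mechanically, here is a small Python sketch (my own illustration, not from the lecture) that enumerates every possible labeling of the $n$ unit vectors and verifies that the $\chi_S$ constructed above realizes it, using the bitwise sum mod 2 definition of $\chi_S$:

```python
# Check that the n unit vectors are shattered by the parity functions
# chi_S(x) = sum_{i in S} x_i mod 2, taking S = {i : b_i = 1} for each labeling b.
from itertools import product

n = 4
# unit vectors e_1, ..., e_n (1-based indexing to match the math)
e = {j: tuple(1 if i == j else 0 for i in range(1, n + 1)) for j in range(1, n + 1)}

def chi(S, x):
    return sum(x[i - 1] for i in S) % 2      # bitwise sum mod 2 over the indices in S

for b in product([0, 1], repeat=n):          # every possible labeling b_1, ..., b_n
    S = {i for i in range(1, n + 1) if b[i - 1] == 1}   # the construction from the proof
    assert all(chi(S, e[j]) == b[j - 1] for j in range(1, n + 1))

print("all", 2 ** n, "labelings of the", n, "unit vectors are realized")
```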
As a concrete example, let $n = 3$, so the data set is $e_1 = 100, e_2 = 010, e_3 = 001$. Now dream up any labeling you want of these three vectors, say $(b_1, b_2, b_3) = (0, 1, 1)$. Then the set $S$ is $\{2, 3\}$. Now we check that the classifier $\chi_S$ recovers the labels $b_i$ for each $i = 1,2,3$. Indeed,
$$
\chi_S(e_1) = 0 \\
\chi_S(e_2) = 1 \\
\chi_S(e_3) = 1
$$
which you can check using either definition of $\chi_S$: the general one (the bitwise sum, i.e. sum mod 2, of the coordinates of $x$ indexed by $S$) or the special case for elementary unit vectors, $\chi_S(e_j) = 1$ if and only if $j \in S$.
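For completeness, here is the $n = 3$ example checked both ways (again, just my own illustration):

```python
# Compare the general definition (bitwise sum mod 2) with the unit-vector
# special case chi_S(e_j) = 1 iff j in S, for the labeling (0, 1, 1).
e1, e2, e3 = (1, 0, 0), (0, 1, 0), (0, 0, 1)
S = {2, 3}                                   # S = {i : b_i = 1}

def chi_general(S, x):
    return sum(x[i - 1] for i in S) % 2      # indices in S are 1-based

def chi_unit(S, j):
    return 1 if j in S else 0                # special case for e_j

for j, x in enumerate((e1, e2, e3), start=1):
    print(j, chi_general(S, x), chi_unit(S, j))   # both print 0, 1, 1 in turn
```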
Distortion of statistical properties can occur when you "fit to the data", so I think of this more in terms of specifying the number of parameters that I can afford to estimate and that I want to devote to the portion of the model pertaining to that one predictor. I use regression splines, place knots where $X$ is dense, and specify the number of knots (or the number of parameters, and back-calculate the number of knots) by asking (1) what the sample size and the distribution of $Y$ support and (2) what the signal:noise ratio in this dataset is. When $n$ increases or the signal:noise ratio increases, I can use more knots. There is no set formula for the number of parameters that should be fitted, although in a minority of situations you can use cross-validation or AIC to determine it. As you mentioned, shrinkage is a great alternative, because you can start out with many parameters and then shrink the coefficients down to what cross-validation or effective AIC dictate.
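As a rough illustration of that workflow (my own sketch, not part of the answer), here is how one might compare regression-spline fits with different numbers of parameters by AIC in Python, assuming statsmodels and patsy are available and using a toy dataset; patsy's `cr()` basis places knots at quantiles of $x$, i.e. where $x$ is dense:

```python
# Compare natural cubic regression splines with different numbers of
# parameters (df) by AIC on a toy dataset with a moderate signal:noise ratio.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(0)
n = 200
x = rng.uniform(0, 10, n)
y = np.sin(x) + rng.normal(scale=0.5, size=n)
data = pd.DataFrame({"x": x, "y": y})

# cr(x, df=k) is a natural cubic regression spline with knots at quantiles of x;
# k is the number of parameters spent on this predictor.
for k in (3, 4, 5, 6, 8):
    fit = smf.ols(f"y ~ cr(x, df={k})", data=data).fit()
    print(f"df = {k}: AIC = {fit.aic:.1f}")
```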
The book by Prince, recommended by @seanv507, is indeed an excellent book on the topic (+1). And while it is not really compact, it has a very logical structure, a generous refresher chapter on probability, and a strong focus on machine learning in the computer vision context.
However, I'd like to recommend another excellent book on the topic (also freely downloadable) which, while more focused on computer vision per se, IMHO contains enough machine learning material to qualify as an answer. The book I'm talking about is "Computer Vision: Algorithms and Applications" by Richard Szeliski (Microsoft Research). One of the advantages of this book over the one by Prince is... narrower margins, which allow for a larger font size and thus better readability. Also, the book by Szeliski is very practical. Since the two books share significant content but have somewhat different foci, in my opinion they complement each other very well. All this, among other advantages, makes it very easy for me to highly recommend Szeliski's book.