I am trying to check the importance of the gender variable. I know that if the p-value is less than 0.05 the variable is considered important, and otherwise not, but here the test reports p-value < 2.2e-16. I have tried other methods too, and they give the same result for all of the categorical variables. I have pasted the output for only one variable. So, how should I interpret this? Should the variable be considered or not?

## Frequency of my data points

```
table(data$gender, data$target)
# Output:
#           N       Y
#   F 2107566 2560932
#   M 1307442 1567399
#   U       3      16
```

## Test of statistical significance

```
chisq.test(table(data$gender, data$target))
# Output:
#   Pearson's Chi-squared test
#
# data:  table(data$gender, data$target)
# X-squared = 86.9407, df = 2, p-value < 2.2e-16
```
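
For readers who want to reproduce the output, here is a minimal sketch that rebuilds the table from the counts above and pulls out the stored p-value. The object names `tab` and `res` are my own, and the row/column layout (gender in rows, target in columns) is an assumption read off the printed table. The `< 2.2e-16` in the printout appears to be only a display cutoff at `.Machine$double.eps`, not the actual value.

```r
# Rebuild the contingency table from the counts shown above
# (assumption: rows are gender, columns are target).
tab <- matrix(c(2107566, 2560932,
                1307442, 1567399,
                      3,      16),
              nrow = 3, byrow = TRUE,
              dimnames = list(gender = c("F", "M", "U"),
                              target = c("N", "Y")))

res <- chisq.test(tab)

# The stored p-value is an ordinary (very small) number;
# the printout only truncates it at machine epsilon.
res$p.value
.Machine$double.eps

# Expected counts under independence, useful for checking
# whether the tiny "U" row is a problem for the approximation.
res$expected
```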

Note: I think one possible reason is the size of the data, about 7.5 million rows. If so, would sampling the data before checking significance solve this?

## Best Answer

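The important parts of the question are already answered. A p-value this small lets you reject, with very little doubt, the null hypothesis that gender and target are independent. That is far stronger evidence than a p-value sitting near the critical level, something like 0.05. Keep in mind that it speaks to whether an association exists, not to how large that association is.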
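
To put a number on the size of the association, and on the role of the 7.5 million rows raised in the question, here is a rough sketch using the counts from the question. Cramér's V is my own choice of effect-size measure, not something the question used, and the names `cramers_v`, `prop_y`, and `n_small` are hypothetical. The last step relies on the fact that, with the cell proportions held fixed, the Pearson statistic scales linearly with the sample size, so it only approximates what a real random sample would give.

```r
# Counts copied from the question (rows: gender, columns: target).
tab <- matrix(c(2107566, 2560932,
                1307442, 1567399,
                      3,      16),
              nrow = 3, byrow = TRUE,
              dimnames = list(gender = c("F", "M", "U"),
                              target = c("N", "Y")))

res <- chisq.test(tab)
n   <- sum(tab)
x2  <- unname(res$statistic)

# Cramer's V: effect size for a contingency table, between 0 and 1.
cramers_v <- sqrt(x2 / (n * (min(dim(tab)) - 1)))

# Share of target == "Y" within each gender, to see the raw difference.
prop_y <- prop.table(tab, margin = 1)[, "Y"]

# Same proportions, smaller sample: the statistic shrinks in proportion to n.
n_small  <- 1e4                      # hypothetical sample size
x2_small <- x2 * n_small / n
p_small  <- pchisq(x2_small, df = res$parameter, lower.tail = FALSE)

cramers_v
prop_y
p_small
```

If V comes out close to zero while the full-data p-value is tiny, the test is detecting a real but very small difference. Sampling would mostly reduce the power to see it rather than change what the data say.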