Data Analysis – Continuous and Categorical Variable Analysis

categorical data, continuous data

I have three variables:

  • distance (continuous, ranging from negative infinity to positive infinity)
  • isLand (discrete categorical/Boolean, takes the value 1 or 0)
  • occupants (discrete categorical, ranging from 0 to 7)

I want to answer the following statistical questions:

  • How do I compare distributions that involve both categorical and continuous variables? For example, I would like to determine whether the distribution of distance vs. occupants varies depending on the value of isLand.
  • Given two of the three variables, can I predict the third using some equation?
  • How can I determine independence with more than two variables?

Best Answer

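I would recommend reading about logistic and log-linear models in particular, and methods of categorical data analysis in general. Logistic regression models a binary outcome such as isLand as a function of continuous and categorical predictors, while log-linear models are the usual tool for examining independence among several categorical variables. The notes for the following course are a good place to start: Analysis of Discrete Data. The textbook by Agresti is quite good. You might also consider Kleinbaum for a quick start.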
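To make the logistic-model suggestion concrete, here is a minimal sketch that predicts isLand from distance and occupants. Everything in it is an assumption for illustration: the data are synthetic, the "true" coefficients in TRUE_W are made up, occupants is centered and treated as a numeric predictor for simplicity, and the fit uses plain gradient ascent rather than a statistics package.

```python
import math
import random

random.seed(0)  # fixed seed so the synthetic data is reproducible

def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def predict_proba(w, x):
    """P(isLand = 1 | x) for weights w = [intercept, b_distance, b_occupants]."""
    return sigmoid(w[0] + sum(wj * xj for wj, xj in zip(w[1:], x)))

def fit_logistic(X, y, lr=0.1, epochs=1500):
    """Maximize the mean log-likelihood by batch gradient ascent."""
    n = len(X)
    w = [0.0] * (len(X[0]) + 1)
    for _ in range(epochs):
        grad = [0.0] * len(w)
        for xi, yi in zip(X, y):
            err = yi - predict_proba(w, xi)  # residual on the probability scale
            grad[0] += err
            for j, xj in enumerate(xi):
                grad[j + 1] += err * xj
        w = [wj + lr * g / n for wj, g in zip(w, grad)]
    return w

# Synthetic data (made up for illustration): distance is continuous,
# occupants is an integer in 0..7, centered at 3.5 and treated as numeric.
TRUE_W = [-0.2, 1.5, -0.4]  # hypothetical coefficients, not from real data
X, y = [], []
for _ in range(400):
    distance = random.gauss(0.0, 1.0)
    occupants = random.randint(0, 7)
    x = [distance, occupants - 3.5]
    X.append(x)
    y.append(1 if random.random() < predict_proba(TRUE_W, x) else 0)

w = fit_logistic(X, y)
accuracy = sum(
    (predict_proba(w, xi) >= 0.5) == (yi == 1) for xi, yi in zip(X, y)
) / len(X)

print("intercept, b_distance, b_occupants:", [round(v, 2) for v in w])
print("training accuracy:", round(accuracy, 2))
```

The sign and size of each fitted coefficient indicate how the log-odds of isLand = 1 change with that predictor. If occupants should be treated as truly categorical, one option is to replace the single numeric column with indicator (dummy) variables for its levels. In practice a statistics package would likely be preferable, since it also reports standard errors and test statistics.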
