Solved – How to handle count data (categorical data), when it has been converted to a rate

categorical-data, count-data, incidence-rate-ratio

I am working on disease infection data, and I am unsure whether to treat the data as "categorical" or "continuous".

  • "Infection Count"
    • the number of infection cases found in a specific period of time; the count is generated from categorical data (i.e. the number of patients tagged as "infected")
  • "Patient Bed Days"
    • the total number of days spent in the ward, summed over all patients in that ward; again, the count is generated from categorical data (i.e. the number of patients tagged as "staying in that particular ward")
  • "Infections per patient bed days"
    • "infection count" / "patient bed days"; both were originally count data, but the result is now a rate
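To make the three quantities concrete, here is a minimal sketch of how the rate is derived from the two counts. The function name, the scaling to 1000 bed days, and the ward numbers are all made up for illustration; they are not from my data.

```python
def infection_rate(infection_count, patient_bed_days, per=1000):
    """Return infections per `per` patient bed days (hypothetical helper)."""
    if patient_bed_days <= 0:
        raise ValueError("patient_bed_days must be positive")
    return infection_count / patient_bed_days * per


# Illustrative, made-up counts for two hypothetical wards.
ward_a_rate = infection_rate(infection_count=12, patient_bed_days=3400)
ward_b_rate = infection_rate(infection_count=5, patient_bed_days=2900)

print(f"Ward A: {ward_a_rate:.2f} infections per 1000 patient bed days")
print(f"Ward B: {ward_b_rate:.2f} infections per 1000 patient bed days")
```

Note that the rate on its own no longer carries the underlying counts, so the two original numbers should be kept alongside it.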

Question:

  • Can I use Chi-Square here to assess whether the difference in "infections per patient bed days" is statistically significant or not?

Updates

I have found that I can compare the incidence rates (or call them infection rates) by computing something like the "incidence rate difference" (IRD) or the "incidence rate ratio" (IRR). (I found it from here)

  • What is the difference between using the IRD and using a t-test?
  • Is there a statistical test that goes with the IRR?
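For reference, this is roughly what I understand the IRD and IRR to be. It is a sketch using the usual large-sample (Wald) formulas, which assume the counts are approximately Poisson. The function name `compare_rates` and the numbers are made up, and this is not necessarily the method described in the source I found.

```python
from math import exp, log, sqrt
from statistics import NormalDist


def compare_rates(count_1, days_1, count_2, days_2, alpha=0.05):
    """Wald-type IRD and IRR with confidence intervals (illustrative sketch).

    Assumes both counts are positive and roughly Poisson distributed.
    """
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    rate_1, rate_2 = count_1 / days_1, count_2 / days_2

    # Incidence rate difference and its standard error.
    ird = rate_1 - rate_2
    se_ird = sqrt(count_1 / days_1**2 + count_2 / days_2**2)

    # Incidence rate ratio; the interval is built on the log scale.
    irr = rate_1 / rate_2
    se_log_irr = sqrt(1 / count_1 + 1 / count_2)

    # Two-sided z-test of the null hypothesis IRR = 1.
    z = log(irr) / se_log_irr
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))

    return {
        "ird": ird,
        "ird_ci": (ird - z_crit * se_ird, ird + z_crit * se_ird),
        "irr": irr,
        "irr_ci": (exp(log(irr) - z_crit * se_log_irr),
                   exp(log(irr) + z_crit * se_log_irr)),
        "p_value": p_value,
    }


# Made-up counts and bed days for two hypothetical wards.
result = compare_rates(count_1=12, days_1=3400, count_2=5, days_2=2900)
for name, value in result.items():
    print(name, value)
```

Both quantities work directly from the raw counts and bed days rather than from a sample of rates, which seems to be the main practical difference from a t-test.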

Best Answer

To me, a chi-square test does not sound appropriate here at all.

I guess what you want to do is the following: you have different wards, treatments, or some other kind of nominal variable (i.e., groups) that divides your data. For each of these groups you collected the Infection Count and the Patient Bed Days to calculate the infections per patient bed days. Now you want to check for differences between the groups, right?

If so, an analysis of variance (ANOVA, for more than two groups) or a t-test (for two groups) is probably appropriate, for the reasons given in Srikant Vadali's post, provided the assumptions of homogeneity of variances and comparable group sizes are also met. The beginner tag should also be added.
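As a rough sketch of the two-group case, the pooled-variance (Student's) t statistic can be computed as below. This assumes you have several rate observations per group (for example one rate per month per ward), which is my assumption and not something stated in the question. The helper name and all numbers are made up.

```python
from math import sqrt
from statistics import mean, variance


def pooled_t_statistic(sample_1, sample_2):
    """Return (t, degrees of freedom) for Student's two-sample t-test.

    Assumes equal variances in both groups (hypothetical helper).
    """
    n1, n2 = len(sample_1), len(sample_2)
    df = n1 + n2 - 2
    # Pooled variance: weighted average of the two sample variances.
    pooled_var = ((n1 - 1) * variance(sample_1)
                  + (n2 - 1) * variance(sample_2)) / df
    t = (mean(sample_1) - mean(sample_2)) / sqrt(pooled_var * (1 / n1 + 1 / n2))
    return t, df


# Made-up monthly rates (infections per 1000 patient bed days) for two wards.
ward_a_rates = [3.1, 4.0, 2.7, 3.6, 3.9, 3.3]
ward_b_rates = [1.9, 2.4, 1.5, 2.8, 2.0, 2.2]

t_stat, df = pooled_t_statistic(ward_a_rates, ward_b_rates)
print(f"t = {t_stat:.3f} on {df} degrees of freedom")
```

The p-value then comes from a t distribution with `df` degrees of freedom. If SciPy is available, `scipy.stats.ttest_ind` computes the statistic and p-value in one call.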