Scatter plot in R with non-numeric data

r, scatterplot

I need to represent some non-numeric questionnaire data in a scatter plot in R. By non-numeric I mean that I have two questions whose answers are text. For example, Q1 has the answers ("A", "B", "C") and Q2 has the answers ("X", "Y", "Z").

I need to show these two questions (Q1 and Q2) in a scatter plot so that it is visually clear which respondents who answered "A" to Q1 answered "X" to Q2, and so on. Ideally the plot would look like this:

[image: example of the desired plot]

Thank you.

Best Answer

I'm not sure I understood your question, but I hope the following helps.

First, let's make some data similar to what you might have:

> set.seed(3409)
> df <- data.frame(q1 = sample(LETTERS[1:3],   15, replace = TRUE),
+                  q2 = sample(LETTERS[24:26], 15, replace = TRUE))
> df
   q1 q2
1   A  Z
2   C  Z
3   B  Z
4   B  X
5   C  Z
6   B  Y
7   C  X
8   C  X
9   A  Y
10  A  Y
11  C  Z
12  C  Z
13  C  Y
14  A  X
15  A  Z

Now we'll build the contingency table (counts of each Q1/Q2 answer combination) that the plots are based on:

> tb <- table(df$q1, df$q2)
> tb

    X Y Z
  A 1 2 2
  B 1 1 1
  C 2 1 4
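
The raw counts can be hard to compare when the Q1 groups have different sizes, so it may help to look at row proportions or at a long-format version of the same table. The following is only a sketch: it rebuilds the data from the printout above so that it runs on its own, and the names `row_share`, `counts` and the column name `n` are illustrative choices, not part of the original answer.

```r
# Data copied from the printout above so the block is self-contained
q1 <- c("A","C","B","B","C","B","C","C","A","A","C","C","C","A","A")
q2 <- c("Z","Z","Z","X","Z","Y","X","X","Y","Y","Z","Z","Y","X","Z")
tb <- table(q1, q2)

# Row proportions: share of each Q2 answer within each Q1 answer
row_share <- prop.table(tb, margin = 1)
print(round(row_share, 2))

# Long format: one row per (q1, q2) combination with its count in column n
counts <- as.data.frame(tb, responseName = "n")
print(counts)
```

The long format is convenient if the counts are later passed to a plotting function that expects one row per combination.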

Two popular ways of plotting the data above are through a barplot and a mosaic plot:

> barplot(tb, beside = TRUE, legend.text = TRUE)  # grouped barplot; legend shows the Q1 answers

[image: barplot]

> plot(tb)  # mosaic plot

[image: mosaic plot]

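As far as I know, scatterplots are not well suited to categorical data: with only a few distinct answer combinations, the points land on top of each other and the counts are hidden.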
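
If a scatter-like display is still wanted, two common workarounds are to jitter the points so that identical answers no longer overlap, or to draw one point per answer combination and scale it by its count. The sketch below uses base graphics only and rebuilds the data from the printout above. The jitter amount (0.15), the size scaling and the variable names are assumptions chosen for illustration, not something taken from the original answer.

```r
# Data copied from the printout above so the block is self-contained
q1 <- factor(c("A","C","B","B","C","B","C","C","A","A","C","C","C","A","A"))
q2 <- factor(c("Z","Z","Z","X","Z","Y","X","X","Y","Y","Z","Z","Y","X","Z"))

# Option 1: jitter the integer codes of the factors so that
# respondents with the same pair of answers do not overplot
set.seed(1)
x <- jitter(as.integer(q1), amount = 0.15)
y <- jitter(as.integer(q2), amount = 0.15)
plot(x, y, pch = 19, xaxt = "n", yaxt = "n",
     xlim = c(0.5, 3.5), ylim = c(0.5, 3.5), xlab = "Q1", ylab = "Q2")
axis(1, at = seq_along(levels(q1)), labels = levels(q1))
axis(2, at = seq_along(levels(q2)), labels = levels(q2))

# Option 2: one point per answer combination, with area scaled by the count
counts <- as.data.frame(table(q1, q2), responseName = "n")
cx <- as.integer(counts$q1)
cy <- as.integer(counts$q2)
plot(cx, cy, cex = 2 * sqrt(counts$n), pch = 1, xaxt = "n", yaxt = "n",
     xlim = c(0.5, 3.5), ylim = c(0.5, 3.5), xlab = "Q1", ylab = "Q2")
text(cx, cy, labels = counts$n)  # print the count inside each circle
axis(1, at = seq_along(levels(q1)), labels = levels(q1))
axis(2, at = seq_along(levels(q2)), labels = levels(q2))
```

The first plot keeps one point per respondent, while the second shows the same information as the contingency table laid out on a Q1 by Q2 grid.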