How to best display crosstab data

categorical data, correspondence-analysis, data visualization, multidimensional scaling

I have a 10×10 matrix built from two variables with 10 brands each. One variable is the brand purchased, the other is the brand considered, and the matrix is the crosstabulation of the two. I need an effective way to visualize these data so that proximity suggests similarity between categories and distance suggests dissimilarity. I ran a correspondence analysis but wasn't impressed with the graph. Are there any alternative techniques to consider?

Best Answer

I'm not certain of your exact data or the process you're using to analyze it, but what you describe makes me think of a correlation matrix. In R, generating the matrix and the corresponding heat map (with dendrogram) is easy. The example below uses sample data showing correlations between usage rates of different IT applications and generates the image with the "gplots" and "RColorBrewer" packages.

Note that you do not need to pass a correlation matrix to the following script example; you may pass cross-tab results directly, as any numbers in the matrix will be translated into the heatmap.
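As a rough sketch of that point, the snippet below builds a crosstab from made-up respondent-level data and passes the counts straight to base R's heatmap(). The brands A to D, the survey data frame, and its purchased and considered columns are all hypothetical stand-ins for the real data.

```r
# Hypothetical survey data: one row per respondent (brands A-D are made up)
set.seed(1)
brands <- c("A", "B", "C", "D")
survey <- data.frame(
  purchased  = factor(sample(brands, 200, replace = TRUE), levels = brands),
  considered = factor(sample(brands, 200, replace = TRUE), levels = brands)
)

# Cross-tabulate brand purchased (rows) against brand considered (columns)
xtab <- table(survey$purchased, survey$considered)

# Drop the "table" class so heatmap() receives a plain matrix of counts
xmat <- unclass(xtab)

# scale = "none" plots the raw counts instead of row-standardised values
heatmap(xmat, scale = "none", margins = c(5, 5))
```

With raw counts, large brands dominate the colour scale, so it may be worth converting to row proportions with prop.table(xtab, 1) before plotting.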

Sample data:

,Service Catalog,Incident Management,CMDB,Platform,Change Management,Knowledge,Request Management
Service Catalog,100,95,92,88,85,80,65
Incident Management,95,100,90,79,86,83,50
CMDB,92,90,100,68,85,76,42
Platform,88,79,68,100,79,61,45
Change Management,85,86,85,79,100,58,85
Knowledge,80,83,76,61,58,100,45
Request Management,65,50,42,45,85,45,100
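The sample code refers to a data frame named Example without showing how it is created. One possible way to build it, assuming the sample data is comma-separated text with the application names in the first column, is sketched here; reading from an inline string rather than a file is a choice made only to keep the sketch self-contained.

```r
# Sample data as inline CSV text; the first (unnamed) column holds the row labels
csv_text <- ",Service Catalog,Incident Management,CMDB,Platform,Change Management,Knowledge,Request Management
Service Catalog,100,95,92,88,85,80,65
Incident Management,95,100,90,79,86,83,50
CMDB,92,90,100,68,85,76,42
Platform,88,79,68,100,79,61,45
Change Management,85,86,85,79,100,58,85
Knowledge,80,83,76,61,58,100,45
Request Management,65,50,42,45,85,45,100"

# row.names = 1 turns the first column into row names;
# read.csv converts spaces in the header to dots (e.g. Service.Catalog)
Example <- read.csv(text = csv_text, row.names = 1)

str(Example)
```

The resulting column names (Service.Catalog through Request.Management) match the ones selected in the sample code.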

Sample code:

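# Install the packages once if needed:
# install.packages(c("gplots", "RColorBrewer"))
library("gplots")        # provides heatmap.2()
library("RColorBrewer")  # provides brewer.pal() colour palettes

# Example is a data frame holding the sample data above
MyData <- subset(Example, select=c(Service.Catalog:Request.Management))
MyMatrix <- as.matrix(MyData)

png(filename="MyTest.png", width = 500, height = 500, res=72)
# Heat map with dendrograms; base R alternative: heatmap(MyMatrix, margins=c(15,15))
heatmap.2(MyMatrix, col=brewer.pal(9, "Blues"), margins=c(20,20))
dev.off()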

Example image
