A spineplot (mosaic plot) works well for the example data here, but can be difficult to read or interpret if some combinations of categories are rare or don't exist. Naturally it's reasonable, and expected, that a low frequency is represented by a small tile, and zero by no tile at all, but the psychological difficulty can remain. It's also natural that people fond of spineplots choose examples which work well for their papers or presentations, but I've often produced examples that were too messy to use in public. Conversely, a spineplot does use the available space well.
Some implementations presuppose interactive graphics, so that the user can interrogate each tile to learn more about it.
An alternative which can also work quite well is a two-way bar chart (many other names exist).
See for example `tabplot` within http://www.surveydesign.com.au/tipsusergraphs.html

For these data, one possible plot (produced using `tabplot` in Stata, but it should be easy in any decent software) is
![enter image description here](https://i.stack.imgur.com/mfLhM.png)
The format means it is easy to relate individual bars to row and column identifiers and that you can annotate with frequencies, proportions or percents (don't do that if you think the result is too busy, naturally).
Some possibilities:
1. If one variable can be thought of as a response to another as predictor, then it is worth thinking of plotting it on the vertical axis as usual. Here I think of "importance" as measuring an attitude, the question then being whether it affects behaviour ("often"). The causal issue is often more complicated even for these imaginary data, but the point remains.
2. Suggestion #1 is always to be trumped if the reverse works better, meaning that it is easier to think about and interpret.
3. Percent or probability breakdowns often make sense. A plot of raw frequencies can be useful too. (Naturally, this plot lacks the virtue of mosaic plots of showing both kinds of information at once.)
4. You can of course try the (much more common) alternatives of grouped bar charts or stacked bar charts (or the still fairly uncommon grouped dot charts in the sense of W.S. Cleveland). In this case, I don't think they work as well, but sometimes they work better.
5. Some might want to colour different response categories differently. I've no objection, and if you want that you wouldn't take objections seriously anyway.
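To make the percent breakdown concrete, here is a minimal sketch in plain Python of a two-way bar chart annotated with percents. It is not `tabplot` itself, just a text imitation of the idea. The counts, the category labels, and the function names `column_percents` and `two_way_bar_chart` are all invented for illustration; they are not the data behind the plot shown above. The response ("often") goes on the rows and the predictor ("importance") on the columns, with percents calculated within each column.

```python
# Hypothetical counts, invented for illustration only.
# Rows: response categories ("often"); columns: predictor categories ("importance").
ROWS = ["never", "sometimes", "always"]
COLS = ["low", "medium", "high"]
COUNTS = [
    [20, 10, 5],
    [10, 20, 10],
    [5, 10, 30],
]


def column_percents(counts):
    """Return, for each cell, its percent of the column total."""
    n_cols = len(counts[0])
    col_totals = [sum(row[j] for row in counts) for j in range(n_cols)]
    return [
        [100.0 * row[j] / col_totals[j] if col_totals[j] else 0.0
         for j in range(n_cols)]
        for row in counts
    ]


def two_way_bar_chart(rows, cols, counts, width=20):
    """Render one text bar per cell, annotated with its column percent."""
    pct = column_percents(counts)
    label_w = max(len(r) for r in rows)
    cell_w = width + 6  # room for the longest bar plus " 100%"
    lines = [" " * label_w + " " + "".join(c.ljust(cell_w) for c in cols)]
    for r, row_pct in zip(rows, pct):
        cells = []
        for p in row_pct:
            bar = "#" * round(p / 100 * width)
            cells.append(f"{bar} {p:.0f}%".ljust(cell_w))
        lines.append(r.ljust(label_w) + " " + "".join(cells))
    return "\n".join(line.rstrip() for line in lines)


print(two_way_bar_chart(ROWS, COLS, COUNTS))
```

A zero cell simply gets no bar and a "0%" label, which is easier to read than a missing tile in a mosaic plot. Swapping `COUNTS` for its transpose (and exchanging `ROWS` and `COLS`) gives the reverse layout if that turns out easier to interpret.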
The strategy of hybridising graph and table can be useful more generally, or indeed not what you want at all. An often repeated argument is that the separation of Figures and Tables was just a side-effect of the invention of printing and the division of labour it produced; it's once more unnecessary, just as it was to manuscript writers putting illustrations exactly how and where they liked.
Best Answer
This means that your estimates of Sales as a function of Mvisits are more accurate for low values of Sales. You can see that reflected in the confidence bands around the smoothing regression (the red curvy line), which are relatively narrow for average Sales $\le$ 30, but widen quite a bit thereafter.
Of course the histogram of Mvisits is also skewed to the right, so taken together this means your data are giving you a reliable idea about how these two variables relate only for low values of each.