Solved – How to analyse survey data to find customer satisfaction from weighted scores

analysis, machine learning, survey

I have survey data from multiple customers, where each customer has scored each question on a scale from 1 to 10 to indicate how much they agree with a specific statement, 10 being the highest score.

I'm looking for the best machine learning approach to analyse customer satisfaction, so that I can identify the determining factor that made the customer satisfied with my service.

Through data engineering I can attach various categories to the customers, such as how long I have worked with them, who was on the team for the customer, and many more.

Is this possible, and what would be the best way to go about it? I'm totally new to machine learning, and have only played a little with pandas for Python.

Thank you in advance for your time and help, it is much appreciated.

Best Answer

I am assuming that one of the statements is your outcome of interest; something like, "I am satisfied with my customer experience". You want to tie back both the responses to other questions, as well as demographic/transactional/profile information about your customers to this outcome statement. If so, your question sounds like what's often called "key driver analysis". In my experience, it's never 1 driver, it's multiple; and those drivers change for different customer profiles.

Do you have a hypothesized framework for what drives satisfaction? Do you believe that there is some unmeasured, latent influence on satisfaction that is expressed by the things you can measure? If so, you might use structural equation modeling or a confirmatory factor analysis to confirm or refute your hypotheses.

Otherwise, you might look at techniques such as partial least squares or principal components regression. These tend not to start from a preconceived hypothesis of how the world works. You may even learn a great deal simply by visualizing the correlations between the different survey item responses and your satisfaction outcome measurement, with no formal model needed.
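Since you mention pandas, here is a minimal sketch of that correlation look. The column names and the synthetic data are made up purely for illustration; with real data you would load your own survey table instead. Because 1 to 10 scores are ordinal, the sketch uses rank correlation (Spearman), computed as Pearson correlation on ranks, which is my choice here and not the only reasonable one.

```python
import numpy as np
import pandas as pd

# Synthetic survey with hypothetical item names; scores are on a 1..10 scale.
rng = np.random.default_rng(0)
n = 200
price = rng.integers(1, 11, size=n)
support = rng.integers(1, 11, size=n)
packaging = rng.integers(1, 11, size=n)

# Assumption for the demo: satisfaction depends mostly on support, a bit on price.
raw = 0.7 * support + 0.2 * price + 1 + rng.normal(0, 1, size=n)
satisfaction = np.clip(np.round(raw), 1, 10)

survey = pd.DataFrame({
    "price": price,
    "support": support,
    "packaging": packaging,
    "satisfaction": satisfaction,
})

# Spearman correlation = Pearson correlation of the ranked data.
corr = survey.rank().corr()

# Correlation of each item with the outcome, strongest first.
drivers = corr["satisfaction"].drop("satisfaction").sort_values(ascending=False)
print(drivers)
```

The full `corr` matrix is also worth plotting as a heatmap, since it shows how the items relate to each other and not just to the outcome.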

Please note that unless your key drivers are uncorrelated, which is highly unlikely, you will have to untangle multicollinearity in your analysis. That is, if customer satisfaction is correlated with both price and packaging, but price and packaging are correlated with each other, you'll need an approach that either exploits or accounts for the correlation between price and packaging. The four approaches I listed above, which are certainly not exhaustive, have different ways of dealing with multicollinearity.
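One common way to check how entangled your candidate drivers are is the variance inflation factor (VIF): regress each driver on all the others and compute 1 / (1 - R²). This is a diagnostic I am adding for illustration, not one of the four approaches above. The sketch below uses synthetic data in which packaging is deliberately built from price; the helper name `variance_inflation_factors` is my own.

```python
import numpy as np

def variance_inflation_factors(X):
    """Return the VIF of each column of X (rows = respondents, columns = drivers)."""
    X = np.asarray(X, dtype=float)
    n, p = X.shape
    vifs = np.empty(p)
    for j in range(p):
        y = X[:, j]
        others = np.delete(X, j, axis=1)
        # Least-squares fit of driver j on the remaining drivers plus an intercept.
        A = np.column_stack([np.ones(n), others])
        coef, *_ = np.linalg.lstsq(A, y, rcond=None)
        resid = y - A @ coef
        r2 = 1.0 - resid.var() / y.var()
        vifs[j] = 1.0 / (1.0 - r2)
    return vifs

# Synthetic drivers: packaging is strongly tied to price, support is independent.
rng = np.random.default_rng(1)
n = 300
price = rng.normal(size=n)
packaging = 0.9 * price + 0.3 * rng.normal(size=n)
support = rng.normal(size=n)

vifs = variance_inflation_factors(np.column_stack([price, packaging, support]))
for name, vif in zip(["price", "packaging", "support"], vifs):
    print(f"{name}: VIF = {vif:.2f}")
```

A VIF close to 1 means a driver is nearly independent of the others. Rules of thumb often treat values above roughly 5 or 10 as a sign that ordinary regression coefficients for that driver will be unstable.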

For a gentle introduction to customer satisfaction analysis and psychographics, I like this author's work. Please note I don't know him and am in no way affiliated; I just like his style, and he does examples in R (which I also use). He has postings on network analysis of key drivers, structural equation modeling, and relative importance of drivers, among others.
