Solved – combine several ordinal questions to create an index and/or composite measure

composite, ordinal-data, psychometrics, references, survey

I am using a survey that contains several questions about various dimensions of performance for policy research institutes. Here, performance in the policy arena is unpacked into things like:

  • quality of research,
  • overall ability to engage with policy stakeholders,
  • quality of recommendations, overall support to, and influence on, policymakers,
  • etc.

Each question has a typical 5-level ranking (i.e., ordinal response from "strongly disagree" to "strongly agree", or "very bad" to "very good").

I am thinking of pursuing one of two options:

  1. Creating a sort of composite measure, where all dimensions of performance are aggregated together so as to have one composite variable of "performance". I could then use this composite variable as the dependent variable (perceptions of performance).

  2. Combining 20 of these questions to create a performance index. For each dimension, responses range from 1 to 5, so total scores on the index would range from a minimum of 20 to a maximum of 100 points. This index could also be used as a dependent variable, or perhaps only for descriptive statistics.

Does this make sense? Any advice and reference would be greatly appreciated.

Best Answer

Yes, both of your options make perfect sense and are indeed standard practice - at least in the field of psychometrics.

But I cannot agree with the title question as a general rule: combining ordinal variables is not always valid. In the general case, one cannot add or subtract values measured on an ordinal scale and expect the result to be independent of the arbitrary choices involved in coding an ordinal variable.

An ordinal variable carries less information than an interval variable: its levels are ordered, but we cannot say how far apart adjacent levels are. For instance, education (which in many contexts is a valid ordinal variable) can be measured in 3 levels:

  • Primary education
  • Secondary education
  • Higher education

These 3 levels are usually mapped internally to the numeric values "1", "2" and "3", but this mapping is completely arbitrary. One could equally well map them to "1", "10", "100", or to "8", "12", "17" (the last being a rough estimate of years of education), or employ the procedure from Witkowski's paper. All statistical procedures designed for ordinal variables are invariant under any strictly increasing (order-preserving) transformation of the values assigned to the levels. Now imagine that we asked the subject to state the education level of their mother and father, and that we want to build a parents' education index by simply averaging the two levels.

The outcome now depends heavily on the mapping between education levels and the numbers that represent them internally. In the most typical case ("1", "2" and "3"), averaging yields the same value, 2, whether one parent has Primary education and the other Higher education, or both parents have Secondary education. This may or may not be appropriate, depending on how well the assigned numerical values represent the actual value we attach to each education level.
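To make the dependence on the coding concrete, here is a minimal Python sketch. It averages the parents' education under the three codings mentioned above, plus a fourth coding, "1/5/6", which is purely hypothetical and added only to show that the comparison can also flip the other way. The function names `parents_index` and `compare` are illustrative, not taken from any library.

```python
from statistics import mean

# Order-preserving codings of the same three education levels.
# The first three come from the text; "1/5/6" is a hypothetical extra.
CODINGS = {
    "1/2/3": {"Primary": 1, "Secondary": 2, "Higher": 3},
    "1/10/100": {"Primary": 1, "Secondary": 10, "Higher": 100},
    "8/12/17": {"Primary": 8, "Secondary": 12, "Higher": 17},
    "1/5/6": {"Primary": 1, "Secondary": 5, "Higher": 6},
}


def parents_index(mother, father, coding):
    """Average the numeric codes of the two parents' education levels."""
    return mean([coding[mother], coding[father]])


def compare(coding):
    """Compare a Primary+Higher couple with a Secondary+Secondary couple."""
    mixed = parents_index("Primary", "Higher", coding)
    both_secondary = parents_index("Secondary", "Secondary", coding)
    if mixed == both_secondary:
        return "tie"
    return "mixed higher" if mixed > both_secondary else "mixed lower"


for name, coding in CODINGS.items():
    print(name, "->", compare(coding))
```

All four codings preserve the order Primary < Secondary < Higher, yet the two families tie, or rank one way or the other, depending on which coding is used. That is exactly why an average of ordinal codes is not meaningful unless the spacing between levels can be defended.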

The typical 5-level ranking you mentioned (a.k.a. Likert scale) was specifically designed so that the semantic distance between consecutive levels is kept roughly constant. Because of this property, such variables are commonly treated as interval, so we can proceed with addition (or the arithmetic mean, or other arithmetic manipulations).
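Under that assumption, the index described in the question can be sketched in a few lines of Python. This is only an illustration: the function names, the input checks, and the example respondent are hypothetical, and it assumes every item is coded 1 to 5 in the same direction (reverse-worded items would need recoding first).

```python
def performance_index(responses, n_items=20, low=1, high=5):
    """Sum one respondent's item scores into a single index.

    Assumes each item is an integer from `low` to `high`, all coded in
    the same direction. With 20 items on a 1-5 scale the result lies
    between 20 and 100.
    """
    if len(responses) != n_items:
        raise ValueError(f"expected {n_items} responses, got {len(responses)}")
    if any(not (low <= r <= high) for r in responses):
        raise ValueError(f"every response must be between {low} and {high}")
    return sum(responses)


def mean_item_score(responses, n_items=20, low=1, high=5):
    """Same information as the sum, rescaled to the original 1-5 range."""
    return performance_index(responses, n_items, low, high) / n_items


# Hypothetical respondent: ten items answered 4 and ten answered 3.
respondent = [4] * 10 + [3] * 10
print(performance_index(respondent))
print(mean_item_score(respondent))
```

The sum and the mean carry the same information; the mean is sometimes easier to read because it stays on the original response scale.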