Why would one remove items which load on more than one component or factor in PCA or FA?

factor analysis, pca

Why would a researcher remove items which load onto more than one component after rotation in PCA using Varimax?

A couple of studies I'm using as the basis for a study I'm conducting have done this but did not explain why.

The only explanation I can think of is that, because the factors are supposed to be unrelated with varimax, the presence of items which load heavily onto two factors calls the results into question.

I'm more familiar with regular Factor Analysis, where it doesn't seem to be an issue if an item loads onto two or more factors. I can't seem to find any literature on this specific point, so any help would be appreciated.

My problem is that I am currently doing the Final Year Project for my undergraduate degree, which aims to determine whether there are underlying organisational cultural factors in the restaurant industry in Dublin. I'm using a shortened version of a scale called the Organisational Culture Profile, which uses 40 items to measure culture. The authors originally used PCA and varimax to discern the factors/components in a sample of American accountants. I aim to use the same method to look at the factors underlying Dublin restaurant workers' culture, which I predict will be somewhat different as it is a different industry and country.

Wanting to follow the original authors' method as closely as possible, I have also used PCA and varimax. The authors removed any items which did not load on any factor (fair enough), but also items which loaded highly (>.4) on more than one factor. Following their lead, and also (I thought) getting their logic (I assumed they expected the factors to be unrelated, as they used varimax, an orthogonal rotation), I too removed all highly cross-loading items and ran the PCA again.

My problem now is that 5 different items/variables are cross-loading on factors in the new rotated factor matrix. I can't get my head around how this could happen: surely if they didn't cross-load in the first PCA, they shouldn't in the second? Additionally, if I were to follow the same logic and remove these variables because they cross-load…… Sorry for the long-winded rant, but I am stumped!

Best Answer

PCA has more than one use. One reason to use it is to construct measurement scales, and you want those scales to be distinct, and have their own items. Items which load on more than one factor are kind of weird and confusing, so you don't want them. That's why people take them out.

Imagine I've developed a test and I say: "Questions 1 to 9 measure language ability, questions 11 through 20 measure math ability, and question 10 measures both". That seems weird, so I take out Question 10.
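To make the procedure from the question concrete, here is a minimal sketch in Python (NumPy only) of PCA on the correlation matrix, a varimax rotation, and flagging items that load above a cutoff on two or more components. Everything in it is an assumption for illustration: the function names `pca_loadings`, `varimax` and `cross_loading_items` are made up, the data are simulated rather than the Organisational Culture Profile, and the 0.4 cutoff is just the threshold mentioned in the question. It also shows why a second run can behave differently from the first: after items are dropped, the PCA is computed on a different correlation matrix, so the components and the rotation are re-estimated and loadings can shift across the cutoff.

```python
import numpy as np

def pca_loadings(X, n_components):
    """Unrotated PCA loadings from the correlation matrix of X (rows = respondents)."""
    R = np.corrcoef(X, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(R)              # eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1][:n_components]  # keep the largest ones
    return eigvecs[:, order] * np.sqrt(eigvals[order])

def varimax(L, max_iter=100, tol=1e-6):
    """Orthogonal varimax rotation of a loading matrix (SVD-based iteration)."""
    p, k = L.shape
    R = np.eye(k)
    d_old = 0.0
    for _ in range(max_iter):
        Lr = L @ R
        grad = L.T @ (Lr**3 - Lr @ np.diag((Lr**2).sum(axis=0)) / p)
        u, s, vt = np.linalg.svd(grad)
        R = u @ vt
        d_new = s.sum()
        if d_old != 0 and d_new / d_old < 1 + tol:
            break
        d_old = d_new
    return L @ R

def cross_loading_items(loadings, cutoff=0.4):
    """Indices of items whose absolute loading exceeds the cutoff on 2+ components."""
    return np.where((np.abs(loadings) > cutoff).sum(axis=1) >= 2)[0]

# Simulated data (hypothetical): 300 respondents, 8 items, 2 latent factors.
rng = np.random.default_rng(0)
F = rng.normal(size=(300, 2))
W = np.array([[.8, 0], [.7, 0], [.7, .1], [.5, .5],   # item 3 is built to load on both
              [0, .8], [.1, .7], [0, .7], [.3, .6]])
X = F @ W.T + 0.6 * rng.normal(size=(300, 8))

# Round 1: rotate, flag cross-loading items.
rotated = varimax(pca_loadings(X, 2))
flagged = cross_loading_items(rotated)
print("round 1 cross-loading items:", flagged)

# Round 2: drop the flagged items and re-run. This is a new solution on a
# different correlation matrix, so the flagged set can change.
keep = np.setdiff1d(np.arange(X.shape[1]), flagged)
rotated2 = varimax(pca_loadings(X[:, keep], 2))
print("round 2 cross-loading items:", keep[cross_loading_items(rotated2)])
```

Whether anything gets flagged in the second round depends on the data; the point is only that the second PCA is not a subset of the first one, so there is no guarantee that the remaining items keep their original loadings.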