Solved – How to compare the positive predictive value and negative predictive value of two diagnostic tests

diagnostic

I'm a surgeon trying to compare two diagnostic tests used to diagnose appendicitis. Both tests were applied to 150 patients and the results were compared with a gold standard. I understand that the sensitivities and specificities can be compared using McNemar's chi-square test. But what about the positive and negative predictive values: how do you compare these? Any help would be appreciated!
E.g. [image: example results for the two tests]

Best Answer

UPDATE

I found a relevant paper!

Leisenring W, Alonzo T, Pepe MS. Comparisons of predictive values of binary medical diagnostic tests for paired designs. Biometrics 2000; 56:345-51.

It's not an easy one to read. I think it would be best to get a statistician with experience of hierarchical regression models to read the paper and advise. If that is not an option for you, I will try to find time to have a go with your data, if you can provide it in the following format:

+--+-----+-----+
|  |  D+ |  D- |
|  |B+ B-|B+ B-|
+--+-----+-----+
|A+|a  b |e  f |
|A-|c  d |g  h |
+--+-----+-----+

Where D is your reference standard and A and B are your two index tests (RIPASA and Alvarado). The full cross-tabulation is necessary.
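As a rough illustration of why the full cross-tabulation is needed, here is a minimal Python sketch that recovers the usual accuracy measures for both tests from the eight cells. The function names and the counts are made up for the example (they are not the asker's data), and the sketch only produces point estimates; it does not perform the paired comparison described in the paper.

```python
def accuracy_measures(tp, fn, fp, tn):
    """Point estimates from one test's 2x2 table against the reference standard."""
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
    }


def measures_for_both_tests(a, b, c, d, e, f, g, h):
    """Collapse the paired table (cells a..h as laid out above) into one 2x2 table per test.

    D+ cells: a (A+,B+), b (A+,B-), c (A-,B+), d (A-,B-)
    D- cells: e (A+,B+), f (A+,B-), g (A-,B+), h (A-,B-)
    """
    test_a = accuracy_measures(tp=a + b, fn=c + d, fp=e + f, tn=g + h)
    test_b = accuracy_measures(tp=a + c, fn=b + d, fp=e + g, tn=f + h)
    return test_a, test_b


# Hypothetical counts for 150 patients, for illustration only.
counts = dict(a=70, b=10, c=5, d=5, e=6, f=4, g=10, h=40)
test_a, test_b = measures_for_both_tests(**counts)

for name, result in (("A", test_a), ("B", test_b)):
    summary = ", ".join(f"{key}={value:.3f}" for key, value in result.items())
    print(f"Test {name}: {summary}")
```

Note that the two tests' PPVs have different denominators (patients positive on A versus patients positive on B), which is part of why McNemar's test does not carry over directly to predictive values.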

(Previous answer)

In this example the test on the left is superior according to the diagnostic likelihood ratios (labeled PLR & NLR), which are both further from 1.
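For reference, the likelihood ratios follow directly from sensitivity and specificity. The sketch below uses the standard definitions; the function name and the input values are illustrative assumptions, not figures from the question.

```python
import math


def likelihood_ratios(sensitivity, specificity):
    """Return (PLR, NLR) using the standard definitions.

    PLR = sensitivity / (1 - specificity)
    NLR = (1 - sensitivity) / specificity
    A perfect specificity gives an infinite PLR; a specificity of 0 gives an infinite NLR.
    """
    plr = math.inf if specificity == 1 else sensitivity / (1 - specificity)
    nlr = math.inf if specificity == 0 else (1 - sensitivity) / specificity
    return plr, nlr


# Illustrative values only.
plr, nlr = likelihood_ratios(sensitivity=0.9, specificity=0.8)
print(f"PLR = {plr:.2f}, NLR = {nlr:.3f}")
```

A PLR well above 1 and an NLR well below 1 both indicate a more informative test; a ratio of exactly 1 means the result does not change the odds of disease.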

Diagnostic likelihood ratios, sensitivity and specificity depend far less on prevalence than PPV and NPV do (both of which depend on it directly), but they can still be affected if prevalence differs between studies because of spectrum bias.
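The direct dependence of PPV and NPV on prevalence follows from Bayes' theorem. A small sketch, assuming a fixed sensitivity and specificity (the values and the function name are made up for illustration):

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Return (PPV, NPV) for a given prevalence via Bayes' theorem."""
    true_pos = sensitivity * prevalence
    false_pos = (1 - specificity) * (1 - prevalence)
    false_neg = (1 - sensitivity) * prevalence
    true_neg = specificity * (1 - prevalence)
    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv


# Same hypothetical test (sensitivity 0.9, specificity 0.8) at three prevalences.
for prevalence in (0.05, 0.30, 0.60):
    ppv, npv = predictive_values(0.9, 0.8, prevalence)
    print(f"prevalence={prevalence:.2f}: PPV={ppv:.3f}, NPV={npv:.3f}")
```

With sensitivity and specificity held constant, PPV rises and NPV falls as prevalence increases, so predictive values from populations with different prevalence are not directly comparable.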

Of course, a simple comparison of point estimates is not satisfactory: the apparently better test might have been evaluated in only a small population, so the difference may not be statistically significant. You also want to know whether both tests were evaluated in representative samples (using a single-gate [cohort or nested case-control] design).
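To illustrate what testing a paired difference looks like, here is a minimal sketch of the exact McNemar test for the two sensitivities, which uses only the discordant pairs among diseased patients (cells b and c in the table format given earlier). This is the comparison the question already mentions, not the predictive-value method from the paper; the function name and counts are hypothetical.

```python
from math import comb


def exact_mcnemar_p(n_only_a_positive, n_only_b_positive):
    """Two-sided exact McNemar p-value from the two discordant counts.

    Under the null hypothesis each discordant pair is equally likely to favour
    either test, so the smaller count follows Binomial(n, 0.5).
    """
    n = n_only_a_positive + n_only_b_positive
    if n == 0:
        return 1.0  # no discordant pairs, so no evidence of a difference
    k = min(n_only_a_positive, n_only_b_positive)
    lower_tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * lower_tail)


# Hypothetical discordant counts among patients with the disease (D+):
# b = A positive and B negative, c = A negative and B positive.
b, c = 10, 5
p_value = exact_mcnemar_p(b, c)
print(f"Exact McNemar p-value for sensitivity: {p_value:.3f}")
```

The same function applied to the discordant cells among non-diseased patients (f and g) would compare the specificities.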

If you have multiple studies of each technique then a meta-analysis (using the bivariate or HSROC method) may be appropriate. See chapter 10 (I think) of the Cochrane Handbook for Systematic Reviews of Diagnostic Test Accuracy.

Finally, it could be that the diagnostically superior test is not acceptable to the same population as the inferior test. Better tests are often more expensive and more invasive, both factors which can render such tests poor for, e.g., screening.