Solved – d prime with 100% hit rate probability and 0% false alarm probability

I would like to calculate d prime for a memory task that involves detecting old and new items. The problem is that some of the subjects have a hit rate of 1 and/or a false alarm rate of 0, i.e., probabilities of 100% and 0%, respectively.

The formula for d prime is d' = z(H) - z(F), where z(H) and z(F) are the z transforms of hit rate and false alarm, respectively.

To calculate the z transform, I use the Excel function NORMSINV (i.e., z(H) = NORMSINV(hit rate)). However, if the hit rate is 1 or the false alarm rate is 0, the function returns an error. This is because the z transform is the inverse of the standard normal cumulative distribution function, which is undefined at probabilities of exactly 0 and 1 (the corresponding z scores would be infinite). In this case, I'm not sure how to calculate d' for the subjects with ceiling performance.

One website suggests replacing rates of 1 and 0 with 1 - 1/(2N) and 1/(2N), respectively, with N being the maximum number of hits and false alarms. Another website says "neither H nor F can be 0 or 1 (if so, adjust slightly up or down)". This seems arbitrary. Does anyone have an opinion on this, or can anyone point me to the right resources?

Best Answer

Stanislaw & Todorov (1999) have a good discussion of this under the heading "Hit and False-Alarm Rates of Zero or One".

They discuss the pros and cons of several methods for dealing with these extreme values, including:

1. Use a non-parametric statistic such as $A'$ instead of $d'$ (Craig, 1979).
2. Aggregate data from multiple subjects before calculating the statistic (Macmillan & Kaplan, 1985).
3. Add 0.5 to both the number of hits and the number of false alarms, and add 1 to both the number of signal trials and the number of noise trials; dubbed the loglinear approach (Hautus, 1995) (see note below).
4. Adjust only the extreme values by replacing rates of 0 with $0.5/n$ and rates of 1 with $(n-0.5)/n$, where $n$ is the number of signal or noise trials (Macmillan & Kaplan, 1985).
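As a rough illustration of the two correction-based options (the loglinear approach and the extreme-values-only adjustment), here is a minimal Python sketch. The function names and the example counts are hypothetical and not taken from the cited papers; `statistics.NormalDist().inv_cdf` plays the role of Excel's NORMSINV.

```python
from statistics import NormalDist

# Inverse of the standard normal CDF, i.e. the "z transform" used for d'.
_z = NormalDist().inv_cdf


def dprime_loglinear(hits, n_signal, false_alarms, n_noise):
    """d' with the loglinear correction: add 0.5 to hits and false alarms,
    and 1 to the number of signal and noise trials, for every subject."""
    hit_rate = (hits + 0.5) / (n_signal + 1)
    fa_rate = (false_alarms + 0.5) / (n_noise + 1)
    return _z(hit_rate) - _z(fa_rate)


def dprime_extreme_only(hits, n_signal, false_alarms, n_noise):
    """d' where only rates of exactly 0 or 1 are replaced,
    by 0.5/n and (n - 0.5)/n respectively."""
    def adjusted_rate(count, n):
        if count == 0:
            return 0.5 / n
        if count == n:
            return (n - 0.5) / n
        return count / n

    return _z(adjusted_rate(hits, n_signal)) - _z(adjusted_rate(false_alarms, n_noise))


# Hypothetical subject at ceiling: 20/20 hits and 0/20 false alarms.
print(dprime_loglinear(20, 20, 0, 20))
print(dprime_extreme_only(20, 20, 0, 20))
```

Both functions return a finite d' for a subject at ceiling, whereas the uncorrected rates of 1 and 0 would make `inv_cdf` raise an error. Note that the loglinear version changes every subject's rates slightly, while the extreme-only version leaves non-extreme subjects untouched.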

The choice is ultimately up to you. Personally, I prefer the third approach (the loglinear correction). The first approach has the drawback that $A'$ is less interpretable to readers, who are much more familiar with $d'$. The second approach may not be suitable if you are interested in single-subject behavior. The fourth approach is biased because only the extreme values are adjusted, so your data points are not all treated equally.

Note: the loglinear method calls for adding 0.5 to all cells under the assumption that there are an equal number of signal and noise trials. If this is not the case, then the numbers will be different. If there are, say, 60% signal trials and 40% noise trials, then you would add 0.6 to the number of hits and $2 \times 0.6 = 1.2$ to the number of signal trials, and add 0.4 to the number of false alarms and $2 \times 0.4 = 0.8$ to the number of noise trials.
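The adjustment described in this note can be sketched in Python as follows. This is only an illustration under the assumption that the signal and noise proportions are taken from the trial counts themselves; the function name and example counts are hypothetical.

```python
from statistics import NormalDist


def dprime_loglinear_weighted(hits, n_signal, false_alarms, n_noise):
    """Loglinear-style d' for unequal numbers of signal and noise trials.

    Assumption: the amounts added are the proportions of signal and noise
    trials, computed here from the trial counts.
    """
    z = NormalDist().inv_cdf  # inverse standard normal CDF
    total = n_signal + n_noise
    p_signal = n_signal / total  # e.g. 0.6 for 60% signal trials
    p_noise = n_noise / total    # e.g. 0.4 for 40% noise trials

    # Add p to the count and 2 * p to the number of trials in each category.
    hit_rate = (hits + p_signal) / (n_signal + 2 * p_signal)
    fa_rate = (false_alarms + p_noise) / (n_noise + 2 * p_noise)
    return z(hit_rate) - z(fa_rate)


# Hypothetical subject: 60 signal trials (all hits), 40 noise trials (no false alarms).
print(dprime_loglinear_weighted(60, 60, 0, 40))
```

With equal numbers of signal and noise trials, both proportions are 0.5, so this reduces to adding 0.5 to the counts and 1 to the trial totals.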
