There are many binary similarity measures (e.g. Jaccard, Sørensen), each sensitive to different properties of the compared sets. I would like to use the measure $S=\frac{N_{A\cap B}}{\min(N_{A}, N_{B})}$, where $N_{A}$ denotes the cardinality of set $A$. In other words, I divide the size of the intersection by the size of the smaller set. I am sure I am not the first to come up with this, and perhaps it has an established name. Does anybody know?
A Similarity Measure with Binary Data: Does This One Have a Name?
distance-functions, jaccard-similarity, similarities
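The measure in the question can be sketched in a few lines. The function name `overlap_coefficient` is my own choice here, not something from the original post:

```python
def overlap_coefficient(a: set, b: set) -> float:
    """Size of the intersection divided by the size of the smaller set."""
    if not a or not b:
        return 0.0  # convention for empty sets; the original post doesn't specify this case
    return len(a & b) / min(len(a), len(b))

print(overlap_coefficient({1, 2, 3, 4}, {3, 4, 5}))  # -> 0.666..., since |{3, 4}| / min(4, 3) = 2/3
print(overlap_coefficient({1, 2}, {1, 2, 3}))        # -> 1.0, the smaller set is fully contained
```

Note that $S = 1$ whenever one set is a subset of the other, regardless of how different their sizes are.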
Related Solutions
You can use the generalized Jaccard index, treating the set $s$ as a multiset:
If $\mathbf{x} = (x_1, x_2, \ldots, x_n)$ and $\mathbf{y} = (y_1, y_2, \ldots, y_n)$ are two vectors with all real $x_i, y_i \geq 0$, then their Jaccard similarity coefficient is defined as $$J(\mathbf{x}, \mathbf{y}) = \frac{\sum_i \min(x_i, y_i)}{\sum_i \max(x_i, y_i)}.$$
Here you can read "vector" as "multiset", with $x_i$ the count of element $i$ in the multiset $\mathbf x$.
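The multiset reading of this formula can be sketched as follows, with counts stored in dictionaries (the function name is mine, for illustration):

```python
from collections import Counter

def generalized_jaccard(x: dict, y: dict) -> float:
    """Generalized Jaccard similarity: sum of element-wise minima of counts
    divided by the sum of element-wise maxima."""
    keys = set(x) | set(y)
    num = sum(min(x.get(k, 0), y.get(k, 0)) for k in keys)
    den = sum(max(x.get(k, 0), y.get(k, 0)) for k in keys)
    return num / den if den else 0.0

a = Counter("aabbc")   # multiset {a: 2, b: 2, c: 1}
b = Counter("abccd")   # multiset {a: 1, b: 1, c: 2, d: 1}
print(generalized_jaccard(a, b))  # -> 3/7: minima 1+1+1+0, maxima 2+2+2+1
```

For 0/1 counts this reduces to the ordinary Jaccard index on sets.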
Aside: It sounds like your underlying problem (though not your direct question) is related to calibration, on which a fair bit has been written. If your device is not as close as you'd like to the commercial one, that may not matter much, as long as it responds in a fairly consistent way. A calibration curve (in most cases, just a line) is often used to adjust readings on a device to match some standard, so the scale on which the readings are made can be corrected for any such consistent bias. The methodology of calibration may therefore be of use to you if your device is biased relative to the commercial one.
Your direct question sounds like you probably want equivalence testing; in particular, the two one-sided tests (TOST) procedure.
The usual setup is to place a pair of equivalence bounds around your gold-standard measurement (values "close enough" to call equivalent), and then to show that you would reject both the hypothesis that the population mean of your measurement lies above the upper bound and the hypothesis that it lies below the lower bound. You then conclude that it lies between the bounds.
[This can also be recast as seeing if a two sided confidence interval for the parameter lies entirely within the pair of equivalence bounds.]
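The confidence-interval recast can be sketched with a large-sample interval (TOST at level $\alpha$ corresponds to a $(1-2\alpha)$ CI lying inside the bounds). This is an illustrative sketch only: it assumes roughly normal paired differences and uses the normal critical value, which is only adequate for reasonably large samples; with small $n$ you would use the $t$ critical value instead. The function name and data are hypothetical:

```python
from math import sqrt
from statistics import NormalDist, mean, stdev

def equivalence_ci(diffs, alpha=0.05):
    """Large-sample (1 - 2*alpha) CI for the mean paired difference.
    TOST at level alpha concludes equivalence iff this interval lies
    entirely inside the equivalence bounds."""
    n = len(diffs)
    z = NormalDist().inv_cdf(1 - alpha)       # one-sided critical value
    half = z * stdev(diffs) / sqrt(n)
    m = mean(diffs)
    return m - half, m + half

# Hypothetical paired readings: your device minus the gold standard
diffs = [0.2, -0.1, 0.3, 0.0, 0.1, -0.2, 0.15, 0.05]
low, upp = equivalence_ci(diffs)
print((low, upp), -0.5 < low and upp < 0.5)   # equivalent within bounds of +/- 0.5?
```

With $\alpha = 0.05$ this uses a 90% interval, which is why a TOST conclusion is not the same as an ordinary 95% CI check.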
See for example Walker & Nowacki (2011) [1]; there's a discussion of TOST in industrial applications in Richter & Richter (2002) [2].
However, a caveat: Presumably you're testing your device not at one value but across the range of the device. Given that there may be more bias at one value than another (indeed, it's possible to be biased low in one place and high in another), you probably want to look at equivalence at each value for the standard device rather than a simple TOST setting (in that case establishing equivalence bands, which may not necessarily be equally wide at every value -- e.g. if equivalence is in percentage terms). This brings us back nearer to the calibration problem I mentioned at the start.
[1]: Walker, E., & Nowacki, A. S. (2011). Understanding Equivalence and Noninferiority Testing. Journal of General Internal Medicine, 26(2), 192–196. http://doi.org/10.1007/s11606-010-1513-8
(Ignore the 'noninferiority' material there; you're only after the equivalence part.)
[2]: Richter, S. J., & Richter, C. (2002). A Method for Determining Equivalence in Industrial Applications. Quality Engineering, 14(3), 375–380.
Best Answer
Your measure appears to be the one defined by Simpson, commonly known as the overlap coefficient (or Szymkiewicz–Simpson coefficient). See A Survey of Binary Similarity and Distance Measures, page 44, equation 45.