Not a direct answer: only "hidden variables in quantum logic" (NO physics needed).
Quite long; the subdivisions hopefully help.
Disclaimer
This expands what I said about hidden variables in (more general) lectures in Bologna, 2010
My study of this subject goes back to 1992 - 1993 with two later complements:
1998, approximations of arbitrary theories with theories admitting noncontextual hidden variables;
2010, (nonexistence of) noncontextual embedding in von Neumann's modular quantum logic.
So I have almost no idea about how quantum logicians look at this subject now.
You might consider (besides the already posted answer): Svozil, Dalla Chiara & Giuntini, or the Handbook of Quantum Logic and Quantum Structures. Philosophers might look at this.
But above all the writings of Rédei, because they go back directly to von Neumann, including specific difficulties of noncommutative probability.
General setting for a quantum logic:
$L$ structured by a partial binary operation $\oplus$ (commutative, associative, with 0).
Examples: $L(H)$ (closed subspaces of a Hilbert space, with sum of orthogonal subspaces). Boolean algebras (with disjoint union). Self-adjoint idempotents (projections: $e^2=e=e^*$) in a ring with involution ($e\oplus f=e+f$ when $ef=0=fe$). The real interval $[0,1]$ with ordinary sum (as a partial operation, defined when $x+y\le 1$).
In all examples above the natural pre-order ($x\geq y$ iff $\exists z:x=y\oplus z$) is a partial order since $x\oplus y=x\Rightarrow y=0$. Even more the $z$ in $x=y\oplus z$ is unique when it exists; one writes $z=x\ominus y$ (orthocomplement of $y$ in $x\geq y$, or relative negation; $1\ominus x$ is the orthocomplement $x^\perp$, or negation, of $x$). $x\perp y$ means "$x\oplus y$ exists" (orthogonality relation; mutually exclusive propositions: one is contained in the negation of the other).
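As a concrete check of the projection example, here is a small numerical sketch (Python with numpy; the function names are mine, not standard): for self-adjoint idempotents $e,f$ in a matrix $*$-ring, $e\oplus f=e+f$ is defined exactly when $ef=0=fe$, and the relative orthocomplement $z=s\ominus e$ is again a projection.

```python
import numpy as np

def is_projection(e, tol=1e-9):
    """Self-adjoint idempotent: e^2 = e = e*."""
    return (np.allclose(e @ e, e, atol=tol)
            and np.allclose(e, e.conj().T, atol=tol))

def oplus(e, f, tol=1e-9):
    """Partial sum: defined only when ef = 0 = fe (orthogonality)."""
    if not (np.allclose(e @ f, 0, atol=tol) and np.allclose(f @ e, 0, atol=tol)):
        return None  # e ⊕ f undefined: the propositions are not mutually exclusive
    return e + f

# Two orthogonal rank-1 projections in C^3
e = np.diag([1.0, 0.0, 0.0])
f = np.diag([0.0, 1.0, 0.0])

s = oplus(e, f)
assert s is not None and is_projection(s)

# e <= s via s = e ⊕ z, with the unique z = s ⊖ e = s - e
z = s - e
assert is_projection(z) and oplus(e, z) is not None

# ⊕ is undefined for non-orthogonal projections, e.g. e with itself
assert oplus(e, e) is None
```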
Homomorphisms $f$ preserve $\oplus$ ($x\oplus y=z\Rightarrow f(x)\oplus f(y)=f(z)$); states (a.k.a. probability measures) are homomorphisms with values in the real interval $[0,1]$ with ordinary sum.
$\sigma$-homomorphisms preserve joins (for the partial order) of increasing sequences, i.e. preserve countable $\oplus$ (countable additivity). Complete additivity means preservation of
all existing joins of directed families, hence of all existing (even uncountable) $\oplus$. My preference is for finite additivity: the "no-go" theorems are stronger (not even an embedding preserving finite $\oplus$).
Except for the real interval, the above examples satisfy $x\perp x\Rightarrow x=0$ (orthoalgebras; "partial boolean algebras" are essentially the same thing for our purposes). Orthoalgebras where $\leq$ is a lattice order are the same thing as orthomodular lattices.
Gleason's theorem (1957)
The original version was for the logic of "type I factors" (real or complex):
The only countably additive probability measures on $L(H)$ for a separable Hilbert space $H$ are the well known ones (induced by positive operators with trace 1). Extension to the non-separable cases:
in the natural way, requiring complete additivity.
(These are the normal states, which then are exactly the classical mixtures, i.e. $\sigma$-convex combinations of normal pure states, which in turn are exactly the vectorial ones.)
in the set theoretic way: assume only countable additivity but also that there are no uncountable real-valued measurable cardinals, or that the dimension of the Hilbert space is less than the first of such cardinals.
This really reduces to the natural case above.
All this is quite standard and spelled out in many books, two examples:
G. Kalmbach, Measures and Hilbert lattices (1986);
A. Dvurecenskij, Gleason's Theorem and its Applications (1993)
Desired generalization: the (completely) additive probability measures on the orthoalgebra of projections uniquely extend to (normal) states on the
algebra of observables.
In this form the theorem has been generalized to (real or complex) W$^*$-algebras with no abelian (type I$_1$, boolean logic) or type I$_2$ (subdirect product of "spin factors", see below) components.
A very good exposition (as anything I have seen from this author) is:
S. Maeda, Probability measures on projections in von Neumann algebras, Reviews in Mathematical Physics 1 (1990), 235-290.
Then it was also extended to the Jordan setting (JBW algebras), to states that need not be normal (finitely additive probability measures) on AW$^*$ algebras and their Jordan generalization, to signed, complex and more general measures, and so on. One example,
but Google Scholar can give the idea of how much work was done on the subject.
When the theorem applies, one has an equivalence (extending von Neumann's) between the orthoalgebra of projections, the Jordan ring of self-adjoint operators, the enveloping associative ring with involution (if the Jordan ring has no exceptional component), the "effect algebra", and still other structures. This consequence of the theorem is possibly more important than the application to the hidden variables problem.
How does Gleason's theorem rule out noncontextual hidden variables?
The subject of hidden variables became formal part of mathematical quantum logic (as opposed to other kinds of formalizations) with
N. Zierler and M. Schlessinger, Boolean embeddings of orthomodular sets and quantum logic, Duke Math. J. 32 (1965), 251-262
A non-contextual hidden variable theory for a logic $L$ is a $\oplus$ preserving embedding of $L$ in a boolean algebra $B$.
Since the beginning it was observed that the existence of an embedding as above is the same thing as the existence of a "full" set $S$
of two-valued states (0 and 1, true and false) on $L$ (full: many inequivalent definitions that become equivalent when each $s\in S$ is two-valued; one of them: if $\forall s\in S,s(x)+s(y)=s(z)$ then $x\oplus y=z$).
This follows from the fact that the compact convex set of finitely additive states on a Boolean algebra has as extreme points exactly the two-valued measures. In general the pure states on $L$ are defined as the extreme
points of the convex set of states. Two valued states are always pure, not conversely.
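A minimal finite illustration of the correspondence (a Python sketch, with my own naming): on the Boolean algebra of subsets of a three-point Stone space, the two-valued states are exactly the point evaluations, and this set of states is "full" in the sense above.

```python
from itertools import chain, combinations

X = (0, 1, 2)  # Stone space of the 8-element Boolean algebra

def subsets(X):
    return [frozenset(c)
            for c in chain.from_iterable(combinations(X, r) for r in range(len(X) + 1))]

B = subsets(X)

# Two-valued states = point evaluations: s_p(A) = 1 iff p is in A
def s(p, A):
    return 1 if p in A else 0

# Each point evaluation is finitely additive on disjoint unions
for A in B:
    for C in B:
        if A & C == frozenset():   # A ⊥ C, so A ⊕ C = A ∪ C is defined
            for p in X:
                assert s(p, A) + s(p, C) == s(p, A | C)

# "Fullness": if s_p(A) + s_p(C) = s_p(D) for ALL points p,
# then A and C are disjoint and A ∪ C = D, i.e. A ⊕ C = D holds in B
for A in B:
    for C in B:
        for D in B:
            if all(s(p, A) + s(p, C) == s(p, D) for p in X):
                assert A & C == frozenset() and A | C == D
```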
For the countably additive case one uses embeddings in $\sigma$-algebras of sets, not more general countably complete boolean algebras.
(There are some with no countably additive measures at all: "open subsets of the real line modulo meager sets"; this is essentially the same as the classical example of a commutative AW$^*$-algebra which cannot be W$^*$).
The correct abstract setting is that of measure algebras, see
Fremlin's free volumes about measure theory
Gleason's theorem excludes existence of two-valued states for $L(H)$ when $H$ has dimension at least three (or for the more general cases noted above), hence even the weakest kind
of noncontextual hidden variables (with finite additivity) cannot exist.
Side note: "modular" noncontextual hidden variables.
Besides embeddings of a general $L$ in a distributive case (the noncontextual hidden variable problem above), one can consider orthoalgebra embedding of a general $L$ in a modular $L_f$ associated to a finite von Neumann algebra. For $L(H)$ of infinite dimension
(and the other cases with a properly infinite component where Gleason's theorem has been generalized) this is not possible, even in the weakest sense of finitely additive embeddings.
This is not explicitly noted in the literature, but it follows easily, for example, from S. Pulmannova and A. Dvurecenskij, Sum logics, vector-valued measures and representations, Ann. Inst. H. Poincaré A 53 (1990), 83-95: Cor. 3.3, p. 91 implies that a $\sigma$-additive faithful representation of a non-modular $L$ (associated to a W$^*$-algebra with no type I$_2$ component) into such a modular $L_f$ cannot exist, since it must preserve also the lattice operations. To treat finitely additive faithful representations of $L(H)$ (for an infinite dimensional Hilbert space) in a modular $L_f$, apply the above to finite dimensional initial intervals of $L(H)$: these restrictions are $\sigma$-additive, hence lattice representations; in particular they preserve equidimensionality (by its lattice definition). Hence an infinite orthogonal sequence of atoms in $L(H)$ must go to an infinite orthogonal sequence of nonzero and equidimensional elements, which cannot exist in a $L_f$ with a (separating set of) finite dimension function. Finally, such a $L(H)$ is a sublogic of any non-finite $L$, since by (Loomis - Maeda) dimension theory one finds a homogeneous sequence of nonzero elements. So the impossibility of modular embeddings for $L(H)$ assures the same for $L$.
This "modular no go theorem" shows that, in the same way as one cannot reduce the current quantum
theories to classical theories (but see below), one cannot reduce them to the particular (finite, modular) case that von Neumann wanted.
Now look at the cases not covered by Gleason's theorem.
The "abelian" part is the one which already has classical (boolean) logic, so nothing more has to be said for this question.
Now the remaining I$_2$ part. It is not classical, but (unique among the W$^*$ algebras and similar cases) it has non-contextual hidden variables.
The so called "spin factors" (the matrix $2\times 2$ rings over the real, complex, quaternionic $*$-ring, or the more general Jordan structures) have as logic $L$ a length 2 lattice: to an antichain of incomparable "points" one adds the bottom 0 and the top 1. This is
a projective line, here moreover with orthocomplementation, i.e. a partition of the set of points in subsets of unordered pairs (one the orthocomplement of the other). The general type I$_2$ is subdirect product of these projective ortho-lines, so it suffices to see a boolean embedding for projective lines with orthocomplementation.
Even more generally, for later use: noncontextual hidden variables exist for horizontal sums of boolean algebras.
For a family $\{B_i\}_i$ of boolean algebras (or orthoalgebras), their horizontal sum is the following structure: the base set is the disjoint union of the $B_i$, except that
all of the top elements $1_i\in B_i$ are identified (they give the same element $1$ of the horizontal sum), and the same for the bottom elements $0_i$; the partial binary operation $\oplus$ is given by the trivial cases
($0\oplus a=a=a\oplus 0$) and exactly the following cases: $a\oplus b=c$ iff $a,b,c$ are in the same $B_i$ for one $i$, and this relation holds in $B_i$.
The horizontal sum of a family of orthoalgebras or orthomodular lattices is again such. So for boolean algebras $B_i$ one has an orthomodular lattice. Special case: the projective lines with orthocomplementation
are exactly such horizontal sums with each $B_i$ being a four element boolean algebra (two nontrivial propositions, one the negation of the other).
To see that a horizontal sum of boolean algebras has noncontextual hidden variables, one replaces the horizontal sum with the tensor product of the Boolean algebras (i.e. the topological direct product of their Stone spaces);
for the embedding, a nontrivial proposition $a\in B_i$ goes to the proposition $a\otimes(\otimes_{j\neq i}1_j)$ (using Stone spaces $X_j$ of the $B_j$: a subset $A\subseteq X_i$ goes to $A\times(\prod_{j\neq i}X_j)$).
[This is the "finitely additive" case; the "countably additive case" ($\sigma$-algebras of sets) and the "completely additive case" (measure algebras) are analogue]
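For a finite toy case the embedding can be checked directly (a Python sketch; all names are mine): two Boolean algebras are taken as power sets of their finite Stone spaces, and a nontrivial proposition $A$ of $B_1$ goes to $A\times X_2$ inside the product Stone space.

```python
from itertools import chain, combinations, product

def powerset(X):
    return [frozenset(c)
            for c in chain.from_iterable(combinations(X, r) for r in range(len(X) + 1))]

X1, X2 = (0, 1), ("a", "b", "c")     # Stone spaces of B1, B2 (finite toy case)
prod = frozenset(product(X1, X2))    # Stone space of the Boolean tensor product

def embed1(A):                       # a in B1  ->  a ⊗ 1
    return frozenset((p, q) for p in A for q in X2)

def embed2(A):                       # a in B2  ->  1 ⊗ a
    return frozenset((p, q) for p in X1 for q in A)

# ⊕ inside the summand B1 (disjoint union) is preserved by the embedding
for A in powerset(X1):
    for C in powerset(X1):
        if A & C == frozenset():
            assert embed1(A) | embed1(C) == embed1(A | C)
            assert embed1(A) & embed1(C) == frozenset()

# The identifications of tops and bottoms in the horizontal sum are respected
assert embed1(frozenset(X1)) == prod == embed2(frozenset(X2))
assert embed1(frozenset()) == frozenset() == embed2(frozenset())
```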
"Intuitive picture": if $B_1$ is a logic of propositions referring to something happening in this world, and $B_2$ is instead referring to something happening in a far away galaxy,
you take the horizontal sum when you say "I cannot test together something happening here and something happening there" (so the only propositions are the two trivial ones and the ones
for the two cases separately), but you take the boolean tensor product when you say "in principle one can consider also propositions of the form [$a$ happens here and $b$ happens there] and the boolean algebra they generate".
There are two interesting consequences of the above easy noncontextual hidden variable embedding for horizontal sums of boolean algebras.
1. Each theory has contextual hidden variables.
Take any $L$ (with at least four elements i.e. one non trivial proposition), and a collection of boolean subalgebras $B_i$ that covers $L$.
Three obvious examples of such coverings: (i) the collection of all four element boolean subalgebras (i.e. one takes a nontrivial proposition, its negation, and the two trivial propositions);
(ii) the collection of all finite boolean subalgebras; (iii) the collection of all maximal boolean subalgebras.
Since already (i) covers $L$, the larger collection (ii) also covers, and then (iii) also covers
(each Boolean subalgebra extends to a maximal one). Since an observable can be identified with its spectral resolution
(i.e. the boolean algebra of propositions referring to measurements of that observable, like:
is the observed measure in this set of real numbers?), one can consider boolean subalgebras as ideal observables (with two [ideal or real] observables being compatible when there is an ideal observable that is a common extension of both)
and having a covering intuitively means that a sufficient set of observables has been singled out.
Now replace $L$ with the horizontal sum of the $B_i$, and then take the non-contextual hidden variable theory as above for the horizontal sum.
This gives a "contextual hidden variable theory" for the original $L$.
[When the $B_i$ are the maximal boolean subalgebras of $L$, this is a reformulation of Gudder's universal contextual hidden variable theory for $L$.
S. P. Gudder, On hidden-variable theories, Journal of Mathematical Physics 11, 431-436 (1970).
He considered only the countable additive theory (as it is customary),
but nothing changes for the other two points of view. And yes, "universal" means that one has a categorical universal property that abstractly defines, up to a unique isomorphism, such a contextual embedding among the other possible ones]
What does "contextual" mean? It means that, since incompatible observables cannot be tested together, when a proposition $a\in L$ belongs to an (ideal) observable $B_i$ and also to a (different, possibly incompatible) observable $B_j$,
we "split" $a$: it is no longer a single proposition $a$, but two distinct propositions $a_i$ (to be treated in the context of the observable $B_i$) and $a_j$. When the $B_i$ are the maximal ideal observables, you are splitting
each $a$ into all possible largest ideally experimentable contexts (every experiment tests only a boolean algebra of propositions).
NB: this shows that the heart of the matter is a family $\{B_i\}_i$ of Boolean algebras ($i$ are indexes for the [ideal] observables, $B_i$ is the boolean algebra of propositions about $i$),
and a set $S$ of states $s$ with a function that maps each pair $(s,i)$ to a (finitely additive, or more) probability measure $s_i$ on $B_i$. One can start from a $L$ (with states) and obtain the preceding data,
or one can start from the observables (with states) and recover $L$ (with states) as explained below. [This is not an equivalence, only an adjoint pair of constructions; one can disregard this failure to be an equivalence for reasons I have written about more generally elsewhere].
NB: This is for "sharp" quantum logics; for "unsharp" logics (where orthoalgebras are generalized to effect algebras)
one uses MV-algebras in place of Boolean algebras.
2. Any "theory" with at most a continuum of propositions can be arbitrarily approximated by theories admitting noncontextual hidden variables.
Note: in 1999 I found that some work had been done in Austria (possibly by Svozil?), which on one side I considered very related to what follows, while on the other side it is completely different in method: much less general (the starting $L$ was inside a Hilbert space) but much more concrete; I cannot find it now. The approximation of $L$, if I remember correctly, was obtained not by changing the states but by moving elements of $L$ inside the Hilbert space. The conclusion was much weaker, but interesting for the concreteness of the construction.
Here a theory is a pair of sets (the set $L$ of propositions and the set $S$ of states) with the function $(s,x)\in S\times L\mapsto s(x)\in[0,1]$ that gives the probability $s(x)$ that the proposition $x$ turns out to be true when the experimental test is performed with the system in the state $s$.
The natural topology for the approximation of theories is the one of pointwise convergence in the set of such functions of two variables since
we can practically perform only a finite amount of experiments.
One usually identifies propositions $x,y$ that the states cannot separate ($\forall s,s(x)=s(y)$) and dually one identifies states $s,s'$ that give the same results on propositions ($\forall x,s(x)=s'(x)$).
One can always add (if not already present) the two trivial propositions "always false" $0_L$ ($\forall s,s(0_L)=0$) and "always true" $1_L$ ($\forall s,s(1_L)=1$).
One has a natural orthoalgebra structure on the set of (identified as above) propositions, defined by: $x\oplus y=z$ iff $\forall s,s(x)+s(y)=s(z)$. There might be other compatible orthoalgebra structures
(i.e. structures that make every $s$ a finitely additive probability measure), but this is the minimal (and natural) orthoalgebra structure on $L$ compatible with the states. [And yes, in the usual cases
like the ones where Gleason applies this
"intrinsic logic orthoalgebra structure" on $L$ defined by the usually chosen set of states coincides with the usually taken orthoalgebra structure on $L$].
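The intrinsic orthoalgebra structure can be computed mechanically from a probability table; here is a toy sketch in Python (the table values are invented for illustration, and the names are mine):

```python
# Toy theory: propositions L, states S, and the probability function s(x).
L = ["0", "x", "x'", "1"]
S = {
    "s1": {"0": 0.0, "x": 0.3, "x'": 0.7, "1": 1.0},
    "s2": {"0": 0.0, "x": 0.8, "x'": 0.2, "1": 1.0},
}

def intrinsic_oplus(L, S, tol=1e-12):
    """Minimal orthoalgebra structure: x ⊕ y = z iff s(x) + s(y) = s(z) for every state s."""
    table = {}
    for x in L:
        for y in L:
            for z in L:
                if all(abs(s[x] + s[y] - s[z]) < tol for s in S.values()):
                    table[(x, y)] = z
    return table

ops = intrinsic_oplus(L, S)
assert ops[("x", "x'")] == "1"   # x and its negation sum to "always true"
assert ops[("0", "x")] == "x"    # "always false" is the unit for ⊕
assert ("x", "1") not in ops     # x ⊕ 1 undefined: s(x) + 1 > 1 for every state
```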
Now use this: the real numbers have a Hamel basis $\{h_i\}_i$ (of continuum cardinality) as a vector space over the rationals; one takes (as is always possible) the number $1$ as a member of such a basis. Partition the set of nontrivial propositions into pairs (a proposition and its negation).
To each pair $\{x,x^\perp\}$ associate injectively an $h_i\neq 1$, and note that the rational multiples of $h_i$ can approximate arbitrarily well each $s(x),s(x^\perp)=1-s(x)$. Taking such modifications of
the probability function on $S\times L$ (the ones where, for each pair $x,x^\perp$, the states take values only in the rational multiples of the corresponding $h_i$) one sees that the associated intrinsic logic is a horizontal sum of four element boolean algebras (i.e. a projective line with orthogonality), without changing the identifications (as above) in the sets $L$ or $S$; using density of the rationals in the reals one sees that
the family of such modifications arbitrarily approximates, for pointwise convergence, the initially given probability function.
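A finite-dimensional stand-in for this argument (a Python sketch; it only mimics two Hamel-basis elements with rational coordinate vectors, it is not the actual Hamel basis, which requires the axiom of choice): represent each modified value by its rational coordinates over a "basis" $(1,h_x,h_y)$; additivity relations then survive inside each pair $\{x,x^\perp\}$, but no nontrivial relation holds across pairs, so the intrinsic logic degenerates to a horizontal sum of four-element Boolean algebras.

```python
from fractions import Fraction as F

# A value is a vector of rational coordinates over the "basis" (1, h_x, h_y),
# where 1, h_x, h_y are assumed linearly independent over the rationals.
def add(u, v):
    return tuple(a + b for a, b in zip(u, v))

ONE  = (F(1), F(0), F(0))
ZERO = (F(0), F(0), F(0))

# Modified state: the pair {x, x'} uses only h_x (plus the multiple of 1 needed
# so that the pair sums to 1), and the pair {y, y'} similarly uses only h_y.
s = {
    "0":  ZERO,
    "x":  (F(0), F(1, 3), F(0)),     # (1/3) h_x
    "x'": (F(1), F(-1, 3), F(0)),    # 1 - (1/3) h_x
    "y":  (F(0), F(0), F(2, 5)),     # (2/5) h_y
    "y'": (F(1), F(0), F(-2, 5)),    # 1 - (2/5) h_y
    "1":  ONE,
}

# Within a pair, additivity survives: s(x) + s(x') = 1 = s(y) + s(y')
assert add(s["x"], s["x'"]) == ONE
assert add(s["y"], s["y'"]) == ONE

# Across pairs, no relation survives: s(x) + s(y) equals s(z) for no z,
# by linear independence of 1, h_x, h_y over the rationals
cross = add(s["x"], s["y"])
assert all(cross != s[z] for z in s)
```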
Conclusion: no finite set of experiments can ever rule out the possibility that physical reality is modeled by an intrinsic logic $L$ which is an orthocomplemented projective line,
hence a theory with non-contextual hidden variables.
Bohm's theory of 1952 was the first explicitly with this kind of $L$. For some it was a surprise, because von Neumann's no-go theorem rules out "noncontextual hidden variables" also for two-dimensional Hilbert spaces; but it does so by assuming additivity not only on propositions but also on observables, and sums of non-compatible observables are not covered by the above definition of noncontextual hidden variables.
So the Mackey problem was raised and Gleason's solution
ruled out noncontextual hidden variables in dimension at least three using a really minimal and hopefully un-objectionable hypothesis.
So it is mathematically proved that there is no way to convince a philosopher who wants noncontextual hidden variable theories "whatever it takes".
On the other hand, the practical problem of finding a useful hidden variable theory (of Bohm type, i.e. starting from projective ortholines) is not solved by the above approximation procedure
(absolutely ideal and using the axiom of choice; Hamel bases are not there in the Solovay / Shelah models of ZF + DC). So quantum mechanics is done the way it is done simply because it gives good predictions
of experimental results. Yes, you can prove that in principle you could use only classical-like theories, but NO, nothing will change practically used physical theories
until someone gives a way to make experimental predictions at least as easily and accurately as the present method, whatever philosophical difficulties it might have.
Return now to the noncontextual no-go theorem (for the $L$ where Gleason applies), to concretize it using finite substructures of $L$.
Kochen - Specker
were the first (1967) to construct a finite sub-ortholattice $L$ (of an $L(H)$ for a finite dimensional Hilbert space $H$) where non-contextual hidden variables are impossible
(even more, there are no two-valued probability measures on $L$).
Better finite $L$ were found later (starting with A. Peres, Two simple proofs of the Kochen - Specker theorem, Journal of Physics A24, L175-L178 (1991) and then others, Google Scholar should find them).
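The combinatorial core of such results — searching for two-valued states that assign exactly one "true" to each maximal context — is easy to mechanize. Here is a generic brute-force sketch in Python; the triangle below is an abstract toy hypergraph exhibiting the parity mechanism, not itself a Kochen - Specker set of rays, but actual ray systems (Peres's, for instance) can be fed in as lists of orthogonal bases.

```python
from itertools import product

def two_valued_states(elements, contexts):
    """All 0/1 assignments with exactly one 'true' element in each maximal context."""
    elements = sorted(elements)
    states = []
    for bits in product((0, 1), repeat=len(elements)):
        s = dict(zip(elements, bits))
        if all(sum(s[e] for e in ctx) == 1 for ctx in contexts):
            states.append(s)
    return states

# A classical (Boolean) toy: one two-element context has two-valued states
assert len(two_valued_states("ab", ["ab"])) == 2

# An odd cycle of pairwise-exclusive propositions has no two-valued state at all:
# each element lies in two contexts, so the incidence count of 'true' slots is
# even, while exactly-one-per-context forces it to be 3 (odd)
triangle = ["ab", "bc", "ca"]
assert two_valued_states("abc", triangle) == []
```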
Along these lines, a finite $L$ like that and hypothetical two valued states on $L$ are also used in the "free will theorem"
which for most people is quite possibly a better introduction to this subject.
The smallest dimension of $H$ for such an $L$ is at least three, and in fact a three dimensional example is known. But the known four dimensional examples are better for the following reasons: they are easier to construct / describe, and they can be linked to the 1935 "EPR paradox"
and the 1964 Bell inequalities
Four is the dimension of the tensor product of two two-dimensional Hilbert spaces; that tensor product describes a system with two components (say, two particles completely determined,
for the specific purposes of the experiment, by their spin; this is the reason for the name "spin factors"). Two ways (horizontal sum and boolean tensor product) were given above
to describe a composite system (the boolean tensor product only when the components have a boolean logic embedding); the Hilbert space tensor product
is the orthodox way to go if the two components have orthodox quantum mechanical descriptions (please disregard fermions and bosons, symmetric and anti-symmetric parts of the tensor product) and we again want an orthodox description for the composite system (again, disregard the non-unicity of C$^*$ tensor products in general: here we have two finite dimensional Hilbert spaces and hence no such problem, only an algebraic tensor product).
Suppose that we are mainly interested in propositions $x$ that refer to a single subsystem, and that we are only interested in states $s$ which are "decomposable" (their representing vectors are decomposable:
$v_1\otimes v_2$ with $v_i\in H_i$). There are more states: the pure states are identified with the proportionality classes of their vectors; the vectors in a tensor product are the sums of decomposable vectors, but not all such vectors are decomposable, hence there are states which are superpositions of two decomposable states but are not decomposable. A decomposable state is naturally recovered from its restrictions to the two subsystems separately. How naturally? Analogously to the recovering of a product of two probability measures on boolean algebras from the two component measures [note: probabilistic independence is natural reducibility to a direct product]; so here we are treating only the states of the composite system that are as decomposable as possible into the two components,
disregarding the more complicated propositions and states that depend upon the "superpositions" available in quantum mechanics, and the propositions that are generated by logical operations on the propositions of the two subsystems.
Abstracting from the specific case of the tensor product: one considers only some of the states and some of the propositions, in such a way that certain pairs (or larger sets) of the propositions
to be considered are compatible (meaning that they are contained in a boolean subalgebra of $L$), and other pairs (or larger sets) of propositions, even if not compatible in $L$,
are to be treated as compatible because the chosen $s$ treats them independently.
In that setting it is possible to derive certain inequalities relating linear combinations of $s(p(x,y,\dots))$
($s$ a state as above; $p$ an ortholattice polynomial in the propositions $x,y,\dots$ as above) involving an arbitrary but fixed $s$, and various ortholattice polynomials.
These are the famous Bell's inequalities (either essentially the original ones, or generalizations).
If we consider all $s$ and all $x,y,\dots$ (instead of the more restricted one as above) we can derive other inequalities, less stringent than Bell's inequalities.
In 1992 I saw a derivation of these inequalities in this setting,
but now I cannot find it. However, page 4 of
https://arxiv.org/abs/quant-ph/0207062
resembles what I remember to have read at the time.
A summary, to see a concrete form of a Bell inequality:
Fix a state $s$. Define the distance (relative to $s$) of two propositions $A,B$ as
$d(A, B) = s(A \vee B) - s(A \wedge B)$. In the boolean setting this is the usual $s$-measure of $(A\cup B)\setminus(A\cap B)$, but in a more general setting
$A,B$ might not be "compatible" (no boolean subalgebra contains them both).
In the boolean setting, since $d$ is really a distance (except for separation since $s(C)=0$ is possible for a nonzero $C$),
one has the triangle inequality
$|d(A, B) - d(A, C)| \leq d(B, C) \leq d(A, B) + d(A, C)$. The triangle inequality implies
$d(A, D) \leq d(A, B) + d(B, C) + d(C, D)$. This inequality makes sense (as an inequality involving true distances) in a general $L$
whenever the four propositions are pairwise compatible,
but we have no reasons to claim it true since a boolean algebra containing all four propositions need not exist.
This is an example of Bell's inequality; in each concrete case one specifies the reasons to treat the four propositions as jointly compatible (and so to say that the inequality should be true).
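In the Boolean (fully compatible) case the chained triangle inequality can be verified exhaustively on a small example. A Python sketch (the weights are invented for illustration):

```python
from fractions import Fraction as F
from itertools import chain, combinations, product

# A state s on the Boolean algebra of subsets of a 3-point space
X = (0, 1, 2)
weights = {0: F(1, 6), 1: F(1, 3), 2: F(1, 2)}

def s(A):
    return sum(weights[p] for p in A)

def d(A, B):
    """d(A,B) = s(A ∨ B) - s(A ∧ B): classically, the s-measure of the symmetric difference."""
    return s(A | B) - s(A & B)

subsets = [frozenset(c)
           for c in chain.from_iterable(combinations(X, r) for r in range(len(X) + 1))]

# The chained triangle inequality d(A,D) <= d(A,B) + d(B,C) + d(C,D)
# holds for ALL quadruples in the Boolean (compatible) case
for A, B, C, D in product(subsets, repeat=4):
    assert d(A, D) <= d(A, B) + d(B, C) + d(C, D)
```

In a general $L$ the quantity $d(A,B)$ is still defined for compatible pairs, but nothing forces the chained inequality once no single Boolean subalgebra contains all four propositions, which is exactly the point of the text above.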
In the specific case of EPR this involves "locality" conditions (if the two subsystems are sufficiently spatially separated, so that a light signal cannot be sent and received from one subsystem to the other
between a measurement on the first system and a measurement on the second system, then one should treat the propositions concerning one system as independent of, hence compatible with, the propositions concerning the other system).
Arguing mathematically instead of physically, one can say that if we consider two subsystems that have noncontextual hidden variables (as happens for "spin factors") and then consider only "decomposable" states,
then all propositions should be treated as pairwise compatible (inside one system by noncontextual hidden variables; for different subsystems by the choice of $s$), so the "correct" way to model the system
"should" be not the tensor product of Hilbert spaces, but something like the horizontal sum, then embedded in the boolean tensor product (so that all the interesting sets of propositions are compatible, and the
above Bell's inequality must be true).
The difference from the Hilbert tensor product is evident: in the
orthodox quantum description there are also non-decomposable states (superpositions of decomposable ones) and also new propositions for the composite system not decomposable into a pair of propositions
(one on each subsystem).
If, experimentally, one finds that a state violates one of the Bell inequalities for propositions as above, then the proposed alternative to the orthodox description is disproved (though one could search for another one); and if the weaker form of the Bell inequalities that is true for the orthodox description is not violated, quantum mechanics is not disproved. (Existence of genuine quantum superpositions of states is compatible with experiments, but the proposed hidden variable theory is not.) And exactly this seems to happen up to now.
Which among the EPR assumptions must be relaxed? This discussion requires knowledge of physics and/or philosophy, and so it is completely outside both my interests and suitability as a mathoverflow subject.