Proposition: Let $\{ \left| \psi ^{(k)}\right> :k\}$ be any collection of normalized vectors in a Hilbert space, let $0\leq p_k\leq 1$ be such that $\sum _kp_k=1$, and define $\rho :=\sum _kp_k\left| \psi ^{(k)}\right> \left< \psi ^{(k)}\right|$. Then, (i) $\operatorname{tr}(\rho )=1$ and (ii) $\operatorname{tr}(\rho ^2)\leq 1$.
Remark: "Any" really means any: this collection needs to be neither orthonormal nor a basis (though each $\left| \psi ^{(k)}\right>$ definitely needs to be normalized!).
Proof: Let $\{ \left| e_i\right> :i\}$ be an orthonormal basis of the Hilbert space (not necessarily eigenvectors for $\rho$), and write $\left| \psi ^{(k)}\right> =\sum _ic^{(k)}_i\left| e_i\right>$. (Note that the index $i$ in general runs over a very different index set than $k$ does.) Then,
$$
\operatorname{tr}(\rho ):=\operatorname{tr}\left( \sum _kp_k\left| \psi ^{(k)}\right> \left< \psi ^{(k)}\right|\right) =\sum _kp_k\operatorname{tr}\left( \left| \psi ^{(k)}\right> \left< \psi ^{(k)}\right| \right)=\sum _kp_k\cdot 1=1,
$$
where one can see that $\operatorname{tr}\left( \left| \psi ^{(k)}\right> \left< \psi ^{(k)}\right| \right) =1$ by picking any orthonormal basis of the Hilbert space which contains $\left| \psi ^{(k)}\right>$ and using the definition of the trace.
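Hence, $\rho$ is trace-class (Disclaimer: You actually need to check that $\operatorname{tr}(|\rho |)<\infty$, but it shouldn't be too hard to show that $|\rho |=\rho$, since $\rho$ is a positive operator.), hence compact, and, being self-adjoint as well, it admits an orthonormal basis $\{ \left| f_i\right> :i\}$ consisting of eigenvectors of $\rho$: $\rho \left| f_i\right> =\lambda _i\left| f_i\right>$. Expanding the equation $\lambda _i=\left< f_i\right| \rho \left| f_i\right> =\sum _kp_k\left| \left< f_i|\psi ^{(k)}\right> \right| ^2$ shows that $0\leq \lambda _i\leq 1$ (each $\left| \left< f_i|\psi ^{(k)}\right> \right| ^2\leq 1$ by Cauchy-Schwarz), and so, using $\sum _i\lambda _i=\operatorname{tr}(\rho )=1$,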
$$
\operatorname{tr}(\rho ^2)=\sum _i\left< f_i\right| \rho ^2\left| f_i\right> =\sum _i\lambda _i^2\leq \sum _i\lambda _i=1.
$$
$\square$
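As a quick numerical sanity check of the proposition, here is a minimal sketch in plain Python (no third-party libraries). The three vectors in $\mathbb C^2$ and the weights are arbitrary choices made for illustration; they are deliberately non-orthogonal and, being three vectors in a two-dimensional space, not a basis.

```python
import math

def normalize(v):
    """Rescale a vector (list of numbers) to unit norm."""
    norm = math.sqrt(sum(abs(a) ** 2 for a in v))
    return [a / norm for a in v]

def outer(v):
    """Return the projector |v><v| as a nested list."""
    return [[a * b.conjugate() for b in v] for a in v]

def matmul(x, y):
    """Plain matrix product of two nested-list matrices."""
    return [[sum(x[i][k] * y[k][j] for k in range(len(y)))
             for j in range(len(y[0]))] for i in range(len(x))]

def trace(m):
    """Sum of the diagonal entries."""
    return sum(m[i][i] for i in range(len(m)))

# Illustrative (assumed) data: three non-orthogonal unit vectors in C^2.
states = [normalize(v) for v in ([1, 0], [1, 1], [1, 1j])]
probs = [0.5, 0.3, 0.2]  # non-negative weights summing to 1

dim = 2
projectors = [outer(s) for s in states]
rho = [[sum(p * proj[i][j] for p, proj in zip(probs, projectors))
        for j in range(dim)] for i in range(dim)]

tr_rho = trace(rho)               # claim (i): equals 1
purity = trace(matmul(rho, rho))  # claim (ii): at most 1

print("tr(rho)   =", tr_rho)
print("tr(rho^2) =", purity)
```

Dropping the call to `normalize` is an easy way to see that the normalization hypothesis in the Remark is really needed.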
In this post all manipulations are at a rather formal level and no deep mathematical discussion takes place. For that see the references at the end.
To start, consider a bipartite Hilbert space $H=H_A\otimes H_B$ and define the operator $\mathbb I_A \otimes |\psi\rangle : H_A\longrightarrow H $ for some $|\psi\rangle \in H_B$ and the identity operator $\mathbb I_A$ on $H_A$, such that for $|\varphi\rangle \in H_A$ it holds that
$$ \left(\mathbb I_A \otimes |\psi\rangle \right) |\varphi\rangle := |\varphi\rangle \otimes |\psi\rangle \quad . \tag{1}$$
Its adjoint is the following operator $\left(\mathbb I_A \otimes \langle \psi| \right) : H \longrightarrow H_A$, where for $|\varphi\rangle \in H_A, |\phi\rangle \in H_B$ we have
$$ \left(\mathbb I_A \otimes \langle \psi| \right) (|\varphi\rangle \otimes|\phi\rangle) = |\varphi\rangle \langle \psi|\phi \rangle \quad . \tag{2}$$
For a density operator $\rho$ on $H$ we define the partial trace of $\rho$ as
$$\rho_A:=\mathrm{Tr}_B\,\rho := \sum\limits_{k \in K} \left(\mathbb I_A \otimes \langle \psi_k| \right) \rho\left(\mathbb I_A \otimes |\psi_k\rangle \right) \quad , \tag{3}$$
where $\{|\psi_k\rangle\}_{k\in K}$ for an index set $K$ is an orthonormal basis in $H_B$. Note that $\rho_A$ is a linear operator on $H_A$. We can similarly define an operator $\rho_B:=\mathrm{Tr}_A\,\rho$.
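In finite dimensions, definition $(3)$ can be sketched in a few lines of plain Python. The sketch assumes the standard basis for $H_B$ and the index convention $(a,b)\mapsto a\cdot\dim H_B+b$ for $H_A\otimes H_B$, in which case the matrix elements of $(3)$ reduce to $\langle a|\rho_A|a'\rangle=\sum_b\langle a,b|\rho|a',b\rangle$. The function name `partial_trace_b` and the Bell-state example are illustrative choices, not part of the text above.

```python
import math

def partial_trace_b(rho, dim_a, dim_b):
    """Trace out the second factor of H_A (x) H_B.

    rho is a (dim_a*dim_b) x (dim_a*dim_b) nested list, with the basis
    vector |a>|b> stored at index a * dim_b + b (an assumed convention).
    """
    return [[sum(rho[a * dim_b + b][a2 * dim_b + b] for b in range(dim_b))
             for a2 in range(dim_a)] for a in range(dim_a)]

# Example state (assumed for illustration): (|00> + |11>) / sqrt(2).
amp = 1 / math.sqrt(2)
bell = [amp, 0.0, 0.0, amp]
rho = [[x * y.conjugate() for y in bell] for x in bell]

rho_a = partial_trace_b(rho, 2, 2)
for row in rho_a:
    print(row)
```

For this maximally entangled state the reduced operator comes out as half the identity on $H_A$, up to floating-point rounding.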
Instead of $(3)$ people often write, as a shorthand, something like
$$\rho_A= \sum\limits_{k \in K} \langle \psi_k| \rho |\psi_k\rangle \quad , \tag{4} $$
which can indeed be confusing. Regarding your second concern: I agree (cf. the calculations below), but I guess this could also be a matter of definition and convention, although I've not seen this before. In the end, something like $\langle \varphi|\otimes \langle \psi|$ should denote an element of $H^*$ with $ \langle \varphi|\otimes \langle \psi| \left(|\alpha\rangle \otimes |\beta\rangle\right) = \langle \varphi|\alpha\rangle \langle \psi|\beta\rangle$, for $|\varphi\rangle,|\alpha\rangle \in H_A$ and $|\psi\rangle, |\beta\rangle \in H_B$. So if we put elements of $H_A$ in the first slot of $\otimes$, i.e. on the left side of the tensor product symbol, then it seems natural to me to write $\langle \varphi|\otimes \langle \psi| \in H^*$ instead of $\langle \psi|\otimes \langle \varphi|$.
To proceed, let us now do the explicit calculations: For an operator $O_A$ on $H_A$ we compute
\begin{align}
\mathrm{Tr}^{(A)} \rho_A\,O_A &= \sum\limits_{j\in J} \langle \varphi_j | \rho_A \, O_A|\varphi_j\rangle\\
&= \sum\limits_{j\in J} \langle \varphi_j | \sum\limits_{k \in K} \left(\mathbb I_A \otimes \langle \psi_k| \right) \rho\left(\mathbb I_A \otimes |\psi_k\rangle \right) \, O_A |\varphi_j\rangle \\
&=\sum\limits_{j\in J}\langle \varphi_j | \sum\limits_{k \in K} \left(\mathbb I_A \otimes \langle \psi_k| \right) \rho\, O_A |\varphi_j\rangle \otimes |\psi_k\rangle\tag{5}\\
&=\sum\limits_{j\in J}\sum\limits_{k\in K}\langle \varphi_j|\otimes \langle \psi_k| \,\rho\,O \,|\varphi_j\rangle \otimes |\psi_k\rangle \\
&= \mathrm{Tr}\rho\, O \quad .
\end{align}
Here, $\{|\varphi_j\rangle\}_{j\in J}$ denotes an orthonormal basis in $H_A$, $\mathrm{Tr}^{(A)}$ the trace operation on $H_A$ and $O:= O_A\otimes \mathbb I_B$.
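The identity $\mathrm{Tr}^{(A)}\rho_A\,O_A=\mathrm{Tr}\,\rho\,O$ derived above can also be checked numerically. The following is only a sketch in plain Python for two qubits; the amplitudes of the state and the entries of the Hermitian matrix `obs_a` are arbitrary illustrative values, and the helper names are my own.

```python
def matmul(x, y):
    """Plain matrix product of two nested-list matrices."""
    return [[sum(x[i][k] * y[k][j] for k in range(len(y)))
             for j in range(len(y[0]))] for i in range(len(x))]

def trace(m):
    """Sum of the diagonal entries."""
    return sum(m[i][i] for i in range(len(m)))

def kron(x, y):
    """Kronecker product x (x) y, with |a>|b> stored at index a * dim_b + b."""
    ry, cy = len(y), len(y[0])
    return [[x[i // ry][j // cy] * y[i % ry][j % cy]
             for j in range(len(x[0]) * cy)] for i in range(len(x) * ry)]

def partial_trace_b(rho, dim_a, dim_b):
    """Reduced operator on H_A, using the standard basis of H_B."""
    return [[sum(rho[a * dim_b + b][a2 * dim_b + b] for b in range(dim_b))
             for a2 in range(dim_a)] for a in range(dim_a)]

dim_a, dim_b = 2, 2

# Assumed example: a pure entangled state with arbitrary amplitudes.
amps = [3 + 0j, 1j, -2 + 0j, 1 + 1j]
norm_sq = sum(abs(c) ** 2 for c in amps)
rho = [[x * y.conjugate() / norm_sq for y in amps] for x in amps]

# Assumed example observable O_A on H_A (Hermitian 2x2 matrix).
obs_a = [[1, 2 - 1j], [2 + 1j, -1]]
identity_b = [[1, 0], [0, 1]]

lhs = trace(matmul(partial_trace_b(rho, dim_a, dim_b), obs_a))  # Tr^(A) rho_A O_A
rhs = trace(matmul(rho, kron(obs_a, identity_b)))               # Tr rho (O_A (x) I_B)

print(lhs, rhs)
```

Both numbers agree up to rounding, which is exactly the statement that $\rho_A$ reproduces all expectation values of observables acting only on $H_A$.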
For a more rigorous treatment, see for example S. Attal, Tensor products and partial traces, lecture notes, especially section $2.3$, or Michael M. Wolf, Mathematical Introduction to Quantum Information Processing, lecture notes, especially theorem 1.35. PDFs of both sets of notes are available online.
Best Answer
The matrix $\mathcal{O}_{x^*}$ is huge. If you are looking for a satisfying value for a predicate with a 100-bit input, it is $2^{100}\times 2^{100}$, or more if you consider ancillary qubits. Its value is known only as a product of matrices representing primitive gates. Calculating an element of the matrix is computationally equivalent to calculating whether an input satisfies the predicate. Looking for the solution in the matrix is just a classical brute-force search.
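To make the last point concrete, here is a small Python sketch. It assumes the oracle is the usual diagonal phase oracle, with entry $-1$ on the diagonal where the predicate holds and $+1$ elsewhere; the 8-bit `predicate` is a made-up stand-in for a real problem, and all function names are hypothetical.

```python
N_BITS = 8  # toy size; the example in the text has 100 bits

def predicate(x: int) -> bool:
    """Hypothetical predicate with exactly one satisfying 8-bit input."""
    return (x * 37 + 11) % 256 == 200

def oracle_element(row: int, col: int) -> int:
    """Entry <row|O|col> of the assumed phase oracle.

    The matrix is never stored: each entry is produced on demand, and
    every diagonal entry costs one evaluation of the predicate.
    """
    if row != col:
        return 0
    return -1 if predicate(row) else 1

def search_diagonal():
    """Scan the diagonal for a -1 entry, counting predicate evaluations."""
    evaluations = 0
    for x in range(2 ** N_BITS):
        evaluations += 1
        if oracle_element(x, x) == -1:
            return x, evaluations
    return None, evaluations

solution, evaluations = search_diagonal()
print("found", solution, "after", evaluations, "evaluations")
```

Reading one matrix entry is the same work as testing one input, so scanning the diagonal takes up to $2^n$ predicate evaluations, which is precisely classical brute force.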