I do not know if this helps, but here are some upper bounds, standard in operator space theory. The first inequality, attributed to Haagerup, is an analog of the Cauchy-Schwarz inequality in your setting:
$$\Vert\sum_{\alpha}S_{\alpha}\otimes B_{\alpha}\Vert\leq \Vert\sum_{\alpha}S_{\alpha}\otimes \overline{S_{\alpha}}\Vert^{1/2} \Vert\sum_{\alpha}B_{\alpha}\otimes \overline{B_{\alpha}}\Vert^{1/2}.$$
Here, for a matrix $A = (A_{i,j})$, $\overline{A}$ denotes the entrywise complex conjugate $(\overline{A_{i,j}})$. The expressions appearing on the right-hand side of this inequality are the norms of $(S_\alpha)$ and $(B_\alpha)$ in the operator Hilbert space OH. For a proof, see for example page 123 of Pisier's *Introduction to Operator Space Theory*.
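If a numerical sanity check is useful, here is a small numpy sketch of this inequality (the dimensions, the number of terms, and the random matrices are arbitrary choices on my part, not part of the statement):

```python
import numpy as np

rng = np.random.default_rng(0)

def op_norm(A):
    # Operator norm = largest singular value.
    return np.linalg.norm(A, 2)

# Arbitrary test data: N pairs (S_alpha, B_alpha) of complex matrices.
N, dS, dB = 4, 3, 5
S = [rng.standard_normal((dS, dS)) + 1j * rng.standard_normal((dS, dS)) for _ in range(N)]
B = [rng.standard_normal((dB, dB)) + 1j * rng.standard_normal((dB, dB)) for _ in range(N)]

lhs = op_norm(sum(np.kron(s, b) for s, b in zip(S, B)))
oh_S = op_norm(sum(np.kron(s, s.conj()) for s in S)) ** 0.5  # OH norm of (S_alpha)
oh_B = op_norm(sum(np.kron(b, b.conj()) for b in B)) ** 0.5  # OH norm of (B_alpha)
assert lhs <= oh_S * oh_B + 1e-9
```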
Another inequality (no longer symmetric) that reduces to the usual Cauchy-Schwarz inequality when the matrices are of size $1$ is the following (and the same holds with the roles of $S$ and $B$ reversed):
$$\Vert\sum_{\alpha}S_{\alpha}\otimes B_{\alpha}\Vert\leq \Vert\sum_{\alpha}S_{\alpha}S_{\alpha}^*\Vert^{1/2} \Vert\sum_{\alpha}B_{\alpha}^* B_{\alpha}\Vert^{1/2}.$$
Now the terms appearing on the right-hand side are, in the language of operator spaces, the row (resp. column) norm of $(S_\alpha)$ (resp. $(B_\alpha)$). The row (resp. column) norm of $S=(S_\alpha)$ is just the norm of the matrix $ROW(S)$ (resp. $COLUMN(S)$) obtained, in a block decomposition, by putting the $S_\alpha$'s on the first row (resp. column) and $0$'s on the other rows (resp. columns).
This last inequality is very easy to prove, and more generally we have $\Vert\sum_i a_i b_i\Vert\leq \Vert\sum_i a_i a_i^*\Vert^{1/2} \Vert\sum_i b_i^* b_i\Vert^{1/2}$ for any matrices $a_i$ and $b_i$ of compatible sizes. Indeed, the LHS of this inequality is $\Vert ROW(a)\, COLUMN(b)\Vert$, and its RHS is $\Vert ROW(a)\Vert\, \Vert COLUMN(b)\Vert$, so the inequality just expresses that the operator norm is sub-multiplicative.
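Because the proof is just this factorization, it is also easy to check mechanically; a numpy sketch (random square matrices of a common size $n$, purely for illustration):

```python
import numpy as np

rng = np.random.default_rng(1)
op_norm = lambda A: np.linalg.norm(A, 2)  # largest singular value

n, N = 3, 4
a = [rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n)) for _ in range(N)]
b = [rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n)) for _ in range(N)]

ROW_a = np.hstack(a)   # blocks a_i side by side on one row:  n x (nN)
COL_b = np.vstack(b)   # blocks b_i stacked in one column:   (nN) x n

# The factorization behind the proof: sum_i a_i b_i = ROW(a) COLUMN(b).
assert np.allclose(ROW_a @ COL_b, sum(x @ y for x, y in zip(a, b)))

# Sub-multiplicativity of the operator norm then gives the bound,
# since ||ROW(a)|| = ||sum a_i a_i^*||^(1/2) and ||COL(b)|| = ||sum b_i^* b_i||^(1/2).
lhs = op_norm(sum(x @ y for x, y in zip(a, b)))
rhs = (op_norm(sum(x @ x.conj().T for x in a)) ** 0.5
       * op_norm(sum(y.conj().T @ y for y in b)) ** 0.5)
assert lhs <= rhs + 1e-9
```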
Edit (for a lower bound, without the typo this time). In the case when the $S_\alpha$'s form an orthonormal family for the scalar product $\langle A,B\rangle = Tr(B^* A)/d_S$, you get the following lower bound:
$$\Vert\sum_{\alpha}S_{\alpha}\otimes B_{\alpha}\Vert\geq \max(\Vert\sum_{\alpha}B_{\alpha}^* B_\alpha\Vert^{1/2}, \Vert \sum_{\alpha}B_{\alpha} B_\alpha^*\Vert^{1/2}).$$
This is because $\sum_{\alpha}B_{\alpha}^* B_\alpha$ is $\frac{1}{d_S} Tr \otimes id$ applied to $X^*X$, where $X=\sum_{\alpha}S_{\alpha}\otimes B_{\alpha}$. And since $\frac{1}{d_S} Tr$ is a state, $\frac{1}{d_S} Tr \otimes id$ has norm $1$ from $M_{d_S} \otimes M_{d_B}$ to $M_{d_B}$, whence $\Vert\sum_{\alpha}B_{\alpha}^* B_\alpha\Vert \leq \Vert X^* X\Vert = \Vert X\Vert^2$; applying the same argument to $XX^*$ gives the other term in the maximum.
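To make this concrete, here is a numpy sketch of the identity, taking the Pauli matrices as an example of an orthonormal family of unitaries ($d_S = 2$); the partial trace below is just one way to implement $\frac{1}{d_S} Tr \otimes id$, and the $B_\alpha$'s are arbitrary random matrices:

```python
import numpy as np

# The Pauli matrices: unitaries, orthonormal for <A,B> = Tr(B^* A)/d_S, d_S = 2.
sx = np.array([[0, 1], [1, 0]], dtype=complex)
sy = np.array([[0, -1j], [1j, 0]])
sz = np.array([[1, 0], [0, -1]], dtype=complex)
S, dS = [sx, sy, sz], 2

rng = np.random.default_rng(2)
dB = 4  # arbitrary
B = [rng.standard_normal((dB, dB)) + 1j * rng.standard_normal((dB, dB)) for _ in S]

X = sum(np.kron(s, b) for s, b in zip(S, B))

# (1/d_S Tr (x) id)(X^* X): trace out the first tensor factor, divide by d_S.
M = (X.conj().T @ X).reshape(dS, dB, dS, dB)
reduced = np.einsum('abad->bd', M) / dS
assert np.allclose(reduced, sum(b.conj().T @ b for b in B))

# The map has norm 1, so ||sum B^* B||^(1/2) <= ||X||: the lower bound.
assert np.linalg.norm(reduced, 2) ** 0.5 <= np.linalg.norm(X, 2) + 1e-9
```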
In the specific situation of your problem, here are the bounds one actually gets.
Your assumptions on the $S_\alpha$'s imply that they are all unitary, and orthonormal for $\langle A,B\rangle = Tr(B^* A)/d_S$. Therefore you have that
$$\Vert\sum_{\alpha}S_{\alpha}\otimes \overline{S_{\alpha}} \Vert = \Vert\sum_{\alpha}S_{\alpha}^*S_{\alpha}\Vert= \Vert\sum_{\alpha}S_{\alpha}S_{\alpha}^*\Vert = N$$ where $N$ is the number of terms in the sum (you want to take $N=d_S^2-1$).
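As a concrete check, with $d_S = 2$ you can take the three Pauli matrices for the $S_\alpha$'s (so $N = d_S^2 - 1 = 3$); all three norms indeed come out equal to $N$:

```python
import numpy as np

sx = np.array([[0, 1], [1, 0]], dtype=complex)
sy = np.array([[0, -1j], [1j, 0]])
sz = np.array([[1, 0], [0, -1]], dtype=complex)
S = [sx, sy, sz]  # traceless unitaries, orthonormal for Tr(B^* A)/2

print(np.linalg.norm(sum(np.kron(s, s.conj()) for s in S), 2))  # 3.0
print(np.linalg.norm(sum(s.conj().T @ s for s in S), 2))        # 3.0
print(np.linalg.norm(sum(s @ s.conj().T for s in S), 2))        # 3.0
```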
You therefore get
$$ \max(\Vert\sum_{\alpha}B_{\alpha}^* B_\alpha\Vert^{1/2}, \Vert \sum_{\alpha}B_{\alpha} B_\alpha^*\Vert^{1/2}) \leq \Vert\sum_{\alpha}S_{\alpha}\otimes B_{\alpha} \Vert $$
and $$\Vert\sum_{\alpha}S_{\alpha}\otimes B_{\alpha} \Vert \leq \sqrt N \min(\Vert\sum_{\alpha}B_{\alpha}^* B_\alpha\Vert^{1/2}, \Vert \sum_{\alpha}B_{\alpha} B_\alpha^*\Vert^{1/2}).$$
The lower bound is tight (take all the $B_\alpha$'s but one equal to zero). The upper bound is tight too (take $B_\alpha = \overline{S_{\alpha}}$), and it implies the one you gave in your question.
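Both tightness claims are easy to verify numerically as well; a sketch with the Pauli matrices again ($N = 3$; the dimension $d_B = 4$ and the random matrix $b$ are arbitrary):

```python
import numpy as np

sx = np.array([[0, 1], [1, 0]], dtype=complex)
sy = np.array([[0, -1j], [1j, 0]])
sz = np.array([[1, 0], [0, -1]], dtype=complex)
S = [sx, sy, sz]
N = len(S)
op_norm = lambda A: np.linalg.norm(A, 2)

# Lower bound attained: a single nonzero B_alpha.
rng = np.random.default_rng(3)
b = rng.standard_normal((4, 4)) + 1j * rng.standard_normal((4, 4))
B = [b, np.zeros_like(b), np.zeros_like(b)]
lhs = op_norm(sum(np.kron(s, x) for s, x in zip(S, B)))
low = max(op_norm(sum(x.conj().T @ x for x in B)),
          op_norm(sum(x @ x.conj().T for x in B))) ** 0.5
assert np.isclose(lhs, low)  # both equal ||b||

# Upper bound attained: B_alpha = conj(S_alpha).
B = [s.conj() for s in S]
lhs = op_norm(sum(np.kron(s, x) for s, x in zip(S, B)))
up = np.sqrt(N) * min(op_norm(sum(x.conj().T @ x for x in B)),
                      op_norm(sum(x @ x.conj().T for x in B))) ** 0.5
assert np.isclose(lhs, up)  # both equal N = 3
```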
Best Answer
You don't really have any stricter conditions, and in fact it is simple to reduce the general case to yours: every matrix is diagonal in some basis, and every matrix becomes negative definite if you subtract a suitable multiple of the identity from it.
So no, there can be no better bounds than in the general case.