Let me answer my own question by constructing, for $p>2$ not an even integer (say $2k<p<2k+2$), a matrix $A$ such that $\|A\|_p> \|A'\|_p$. In fact I construct a family of matrices $A_n \in M_n(\mathbb C)$ such that $\|A_n\|_p > \|A_n'\|_p$ whenever $n-k$ is an even positive integer, and $\|A_n\|_p < \|A'_n\|_p$ if $n-k$ is an odd positive integer.
This will imply that $\|A\|_p \leq \|A'\|_p$ for all $n$ and all $A \in M_n$ if and only if $p$ is an even integer or $p=\infty$. For other values of $p$, these two quantities are not comparable, and if we go to the Schatten classes (i.e. we allow infinite matrices), there is no implication between the properties $\|A\|_p<\infty$ and $\|A'\|_p<\infty$.
Here is the construction. For every integer $n$, consider $S_n \in M_n(\mathbb C)$ to be the matrix of a cyclic permutation of $\{1,\dots,n\}$ in which one of the $1$'s is replaced by a $-1$. Take $A_n=Id+S_n$, so that $A_n'=Id+S_n'$. I only sketch the proof that $A_n$ works, since I might be the only one interested ;).
There are two independent claims:
Claim 1: the function $p \mapsto \|A_n\|_p^p - \|A_n'\|_p^p$ has at most $n-1$ zeros (counting multiplicities).
Claim 2: $\|A_n\|_p = \|A_n'\|_p$ for all even integers $2 \leq p \leq 2n-2$.
These two claims together imply that $\|A_n\|_p^p -\|A_n'\|_p^p$ is nonzero outside of $\{2,4,\dots,2n-2\}$ and changes sign at each of these values of $p$. Since it is negative for $p=2n$, we have the announced properties.
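The second claim is easier. A first observation is that $Tr(S_n^k) = Tr({S'_n}^k)$ for every $k$ with $-n+1 \leq k \leq n-1$: $k=0$ is obvious, and if $k \neq 0$ both matrices have a zero diagonal. Now $A_n^* A_n = 2+S_n +S_n^*$ and ${A'_n}^* A'_n = 2+ S'_n +{S'_n}^{*}$, and $S_n^*=S_n^{-1}$, ${S'_n}^*={S'_n}^{-1}$. Hence for an even integer $p=2m$, the quantity $\|A_n\|_p^p = Tr((A_n^*A_n)^m)$ is a combination of the traces $Tr(S_n^k)$ with $|k| \leq m$, and $\|A_n'\|_p^p$ is the same combination of the traces $Tr({S'_n}^k)$. Therefore $\|A_n\|_p = \|A_n'\|_p$ for all even integers $p \leq 2n-2$.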
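Claim 2 can be checked exactly in integer arithmetic, since $\|A\|_p^p = Tr((A^*A)^m)$ when $p=2m$. Here is a minimal sketch in plain Python; the helper names (`signed_cycle`, `gram`, `even_norm_gap`) are made up for illustration, and putting the $-1$ in the wrap-around entry is an arbitrary choice.

```python
def signed_cycle(n, sign):
    """n x n cyclic permutation matrix whose wrap-around entry is `sign` (+1 or -1)."""
    S = [[0] * n for _ in range(n)]
    for i in range(n):
        S[(i + 1) % n][i] = 1
    S[0][n - 1] = sign  # sign = -1 gives S_n, sign = +1 gives S_n'
    return S


def gram(S):
    """A^* A for A = Id + S, using A^* A = 2 + S + S^* (S is a real signed permutation)."""
    n = len(S)
    return [[2 * (i == j) + S[i][j] + S[j][i] for j in range(n)] for i in range(n)]


def matmul(X, Y):
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)] for i in range(n)]


def trace_power(G, m):
    """Tr(G^m), computed with exact integer arithmetic."""
    n = len(G)
    P = [[int(i == j) for j in range(n)] for i in range(n)]
    for _ in range(m):
        P = matmul(P, G)
    return sum(P[i][i] for i in range(n))


def even_norm_gap(n, m):
    """||A_n||_p^p - ||A_n'||_p^p for the even integer p = 2m."""
    return (trace_power(gram(signed_cycle(n, -1)), m)
            - trace_power(gram(signed_cycle(n, 1)), m))


if __name__ == "__main__":
    n = 5
    for m in range(1, n + 1):
        print("p =", 2 * m, "gap =", even_norm_gap(n, m))
```

For $n=5$ the gap is $0$ for $p=2,4,6,8$ and equals $-20=-4n$ at $p=2n=10$, in line with the remark that the difference is negative at $p=2n$.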
To prove the first claim, first notice that the eigenvalues of $S'_n$ (resp. $S_n$) are $\lambda_k=\exp(2ki\pi/n)$, $k=1,\dots,n$ (resp. $\mu_k=\exp((2k+1) i \pi/n)$, $k=1,\dots,n$). Since $A'_n$ and $A_n$ are normal, their singular values are $|1+\lambda_k|$ (resp. $|1+\mu_k|$). In particular if $N(B)$ denotes the number of distinct non-zero singular values of a matrix $B$, we have $N(A_n)+N(A'_n)=n$. Hence $\|A_n\|_p^p - \|A_n'\|_p^p$ can be written in the form $\sum_{j=1}^n \alpha_j e^{\beta_j p}$ with distinct real $\beta_j$, and such a function cannot have more than $n-1$ real zeros (counting multiplicities) unless it is identically zero.
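The sign pattern can also be observed numerically from the explicit singular values $|1+\mu_k|$ and $|1+\lambda_k|$. The sketch below uses only the standard library; the function names `schatten_p_power` and `gap` are illustrative, not from any library.

```python
import cmath
import math


def schatten_p_power(eigs, p):
    """Sum of |1 + z|^p over the eigenvalues z of the unitary part."""
    return sum(abs(1 + z) ** p for z in eigs)


def gap(n, p):
    """||A_n||_p^p - ||A_n'||_p^p, computed from the eigenvalues of S_n and S_n'."""
    mu = [cmath.exp(1j * math.pi * (2 * k + 1) / n) for k in range(1, n + 1)]   # S_n
    lam = [cmath.exp(1j * math.pi * 2 * k / n) for k in range(1, n + 1)]        # S_n'
    return schatten_p_power(mu, p) - schatten_p_power(lam, p)


if __name__ == "__main__":
    for p in (1, 2, 3, 4, 5, 6):
        print("n = 3, p =", p, "gap =", gap(3, p))
```

With $n=3$, `gap(3, 3)` is positive ($k=1$, so $n-k$ is even), while `gap(3, 1)` and `gap(3, 5)` are negative ($n-k$ odd), and the gap vanishes up to rounding at $p=2$ and $p=4$.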
Edit: this now answers the first question for the operator norm and the normalized Hilbert-Schmidt norm.
The answer depends on the norm you are considering. The answer is no for the operator norm, but is yes for the normalized Hilbert-Schmidt norm (at least if you replace $O(\varepsilon)$ by $o(1)$, see the answers to this question).
Here are some details on the counterexample for the operator norm.
1. By a theorem of Lin (see here), a pair of self-adjoint matrices of norm less than $1$ approximately commutes if and only if it can be approximated by a pair of commuting matrices.
2. Voiculescu proved that the preceding does not hold for triples of self-adjoint matrices of norm less than $1$ (see the link I gave here, or the references in the paper by Exel and Loring given in the comments).

1+2 imply that there is a sequence of triples $A_1^n,A_2^n,A_3^n$ of self-adjoint matrices of norm less than $1$ which are pairwise close to commuting (self-adjoint) matrices, but whose distance to the triples of commuting matrices is bounded below by a positive constant.

3. By continuity of the functional calculus and the fact that $t \in [-2,2] \mapsto e^{it}$ is a homeomorphism onto its image, this implies that the unitary matrices $(e^{i A_1^n}, e^{i A_2^n},e^{i A_3^n})$ are pairwise close to pairs of commuting unitaries, but are at positive distance from triples of commuting unitaries. This is what you were looking for.
Best Answer
It turns out that the previous answer had the right ingredients, but in the wrong combination. Here is a cleaner proof.
Notation: Let $s_j(X)$ denote the $j$-th singular value of a matrix $X$ (we assume that singular values are arranged in decreasing order). Similarly, let $\lambda_j(X)$ denote the $j$-th eigenvalue of a Hermitian matrix $X$. Let $S(X)$ denote the diagonal matrix of singular values of $X$.
Lemma 1: Let $B$ be a contraction. Then $s_j(AB) \le s_j(A)$ for all $j$.
Proof: Since $B$ is a contraction, we have \begin{eqnarray*} I &\ge& BB^*\\ AA^* &\ge& ABB^*A^*\\ \lambda_j(AA^*) &\ge& \lambda_j(ABB^*A^*)\\ \lambda_j^{1/2}(AA^*) &\ge& \lambda_j^{1/2}(ABB^*A^*)\\ s_j(A) &\ge& s_j(AB). \end{eqnarray*}
Theorem 2: Let $A$ and $B$ be contractions. Then $\|I-S(A)\| \le \|I-AB\|$ for every unitarily invariant norm.
Proof: Using Lemma 1 and that $A$ is a contraction, we have $0 \le 1 - s_j(A) \le 1 - s_j(AB)$ for all $j$. Consequently, it follows that \begin{equation*} \|I-S(A)\| \le \|I-S(AB)\|, \end{equation*} for any unitarily invariant norm. Now using a corollary of Lidskii's majorization (see e.g., [Theorem IV.3.4 in Bha97]), it follows that the inequality \begin{equation*} \|I-S(AB)\| = \|S(I)-S(AB)\| \le \|I-AB\| \end{equation*} holds for all unitarily invariant norms.
Corollary: Let $A$ and $B$ be contractions and let $U$ be the unitary factor in the polar decomposition of $A$. Then $\|A-U\| \le \|I-AB\|$ for every unitarily invariant norm.
Proof: Let $A=UP$ be the polar decomposition of $A$. Then for any unitarily invariant norm $\|\cdot\|$, \begin{equation*} \|A-U\|=\|P-I\|=\|S(A)-I\|. \end{equation*} Combining this equality with Theorem 2, the result is immediate.
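As a sanity check, the inequalities $s_j(AB) \le s_j(A)$ and $\|I-S(A)\| \le \|I-AB\|$ can be tested on random real $2\times 2$ contractions, where singular values have a closed form. This is only a sketch under those restrictions: it uses the standard library, tests just the operator and Frobenius norms, and all function names are invented for illustration.

```python
import math
import random


def singular_values_2x2(M):
    """Singular values (decreasing) of a real 2x2 matrix via trace and determinant of M^T M."""
    (a, b), (c, d) = M
    t = a * a + b * b + c * c + d * d            # Tr(M^T M) = s1^2 + s2^2
    det = a * d - b * c                          # |det M| = s1 * s2
    disc = math.sqrt(max(t * t - 4 * det * det, 0.0))
    s1 = math.sqrt((t + disc) / 2)
    s2 = abs(det) / s1 if s1 > 0 else 0.0
    return s1, s2


def matmul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)] for i in range(2)]


def random_contraction(rng):
    """Random real 2x2 matrix rescaled so that its operator norm lies in [0, 1]."""
    M = [[rng.uniform(-1, 1) for _ in range(2)] for _ in range(2)]
    scale = rng.uniform(0, 1) / max(singular_values_2x2(M)[0], 1e-12)
    return [[scale * x for x in row] for row in M]


def check(trials=1000, seed=0, tol=1e-9):
    """Return True if both inequalities hold (up to tol) in every random trial."""
    rng = random.Random(seed)
    for _ in range(trials):
        A, B = random_contraction(rng), random_contraction(rng)
        AB = matmul(A, B)
        sA, sAB = singular_values_2x2(A), singular_values_2x2(AB)
        if any(sAB[j] > sA[j] + tol for j in range(2)):      # s_j(AB) <= s_j(A)
            return False
        R = [[(i == j) - AB[i][j] for j in range(2)] for i in range(2)]  # I - AB
        sR = singular_values_2x2(R)
        dev = [1 - s for s in sA]                            # diagonal of I - S(A)
        if max(dev) > sR[0] + tol:                           # operator norm
            return False
        if math.hypot(*dev) > math.hypot(*sR) + tol:         # Frobenius norm
            return False
    return True


if __name__ == "__main__":
    print(check())
```

A passing run is of course no proof; it only confirms that the chain of inequalities is consistent on the sampled matrices.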