MATLAB: How many arithmetic operations does MATLAB require to determine the Schur decomposition?

arithmetic operations, computational complexity, hess, hessenberg decomposition, MATLAB, qr, qr decomposition, schur, schur decomposition, sqrtm, square root matrix

Consider a square matrix A of size n by n.
I would like to determine the square root of this matrix, which can be done like this:
n = 10; % variable size
A = rand(n); % random square matrix A of size n by n
A_sqrt = sqrtm(A); % square root of matrix A
Inside the sqrtm command, MATLAB computes the Schur decomposition to determine the square root of the matrix.
[S, T] = schur(A); % with A = S*T*S' and S*S' = I and T = (quasi) upper triangular matrix
To determine the speed of sqrtm, I would like an expression for the number of required distinguishable operations (addition, subtraction, multiplication, division and square root) in terms of the matrix size n. To get this expression, I would like to know how MATLAB determines the Schur decomposition of a matrix.
I read that the first step is to determine the upper Hessenberg form H by means of
[G,H] = hess(A); % with A = G*H*G' and G*G' = I
After this, a QR decomposition of H is executed
[Q,R] = qr(H); % with H = Q*R and Q*Q' = I and R = upper triangular matrix
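For readers without MATLAB at hand, the relations claimed in the comments above can be checked numerically. Here is a sketch using SciPy's equivalents (`scipy.linalg.schur`, `hessenberg`, `qr`) of the MATLAB calls; note that SciPy returns the outputs of `schur` and `hessenberg` in the opposite order from MATLAB:

```python
import numpy as np
from scipy.linalg import schur, hessenberg, qr

rng = np.random.default_rng(0)
n = 10
A = rng.random((n, n))

# Schur: A = S @ T @ S.T with S orthogonal, T (quasi-)upper triangular.
# SciPy returns (T, S); MATLAB's [S, T] = schur(A) returns them swapped.
T, S = schur(A)
assert np.allclose(A, S @ T @ S.T)
assert np.allclose(S @ S.T, np.eye(n))

# Hessenberg: A = G @ H @ G.T with H zero below the first subdiagonal.
H, G = hessenberg(A, calc_q=True)
assert np.allclose(A, G @ H @ G.T)
assert np.allclose(np.tril(H, -2), np.zeros((n, n)))

# QR decomposition of H: H = Q @ R with R upper triangular.
Q, R = qr(H)
assert np.allclose(H, Q @ R)
assert np.allclose(np.tril(R, -1), np.zeros((n, n)))

print("all three decompositions verified")
```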
How does one continue from here to a Schur decomposition?
And if this is not how MATLAB determines the Schur decomposition, how does it do it?

Best Answer

The second step in computing the Schur decomposition is the QR algorithm, not the QR decomposition (unfortunately the names are very similar). The Hessenberg form is computed in advance because it is preserved by the iteration, reduces the cost of each QR step from O(n^3) to O(n^2), and allows the shifts in the QR algorithm to be applied implicitly. There are other ways of computing the Schur decomposition, but the QR algorithm is the most standard and the simplest.
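The core of the method can be sketched in a few lines. This is the unshifted variant for illustration only; production code (LAPACK, which MATLAB calls) uses implicit shifts and deflation, but the basic move is the same: factor T = QR, then form RQ.

```python
import numpy as np

def qr_algorithm(A, iters=500):
    """Unshifted QR iteration: factor T = Q @ R, then set T = R @ Q.
    Each step is a similarity transform (R @ Q = Q.T @ T @ Q), so the
    eigenvalues are preserved; for eigenvalues of distinct magnitude
    the iterates converge to an upper triangular matrix."""
    T = A.copy()
    for _ in range(iters):
        Q, R = np.linalg.qr(T)
        T = R @ Q
    return T

# Build a test matrix with known, distinct-magnitude eigenvalues
# so the unshifted iteration is guaranteed to converge.
rng = np.random.default_rng(1)
Q0, _ = np.linalg.qr(rng.random((5, 5)))
d = np.array([5.0, 4.0, 3.0, 2.0, 1.0])
A = Q0 @ np.diag(d) @ Q0.T

T = qr_algorithm(A)
print(np.round(np.sort(np.diag(T)), 6))  # approximates the eigenvalues of A
```

The diagonal of the converged T carries the eigenvalues, which is exactly what the (quasi-)triangular factor of the Schur decomposition delivers.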
It's not very practical to count the number of operations in this algorithm, particularly because the QR algorithm is iterative, so the number of passes through the data depends on the convergence speed, which in turn depends on the eigenvalue distribution of the matrix. In practice, the locality of the code (how many times a block of memory is accessed) is also more relevant to performance than simply counting how many times each operation is done.
In practice, I would normally say that the Schur decomposition has complexity O(n^3), and check the actual performance by taking time measurements on a specific machine.
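For reference, standard texts (e.g. Golub & Van Loan, Matrix Computations) put the Hessenberg reduction at roughly (10/3)n^3 flops and the full QR algorithm with accumulated Schur vectors at roughly 25n^3 flops, both consistent with the O(n^3) estimate. The scaling is easy to probe empirically; here is a sketch using SciPy's `schur` (timings are machine-dependent, so the ratio is only indicative):

```python
import time
import numpy as np
from scipy.linalg import schur

def time_schur(n, reps=3):
    """Best-of-reps wall-clock time for one Schur decomposition."""
    A = np.random.default_rng(2).random((n, n))
    best = float("inf")
    for _ in range(reps):
        t0 = time.perf_counter()
        schur(A)
        best = min(best, time.perf_counter() - t0)
    return best

t1, t2 = time_schur(200), time_schur(400)
# For an O(n^3) algorithm, doubling n should multiply the time by
# roughly 8; BLAS blocking and the iterative QR phase blur this.
print(f"n=200: {t1:.4f}s   n=400: {t2:.4f}s   ratio: {t2 / t1:.1f}")
```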