[Math] How to combine two matrix equations into one

finite-differences, matrices, numerical-methods, partial-differential-equations

I have a discretely sampled 2D function:

S =
     1     2     3     4
     1     2     3     4
     1     2     3     4

I want to find finite difference matrices DX and DY such that:

$S_x = DX \cdot S, \quad S_y = DY \cdot S$

where a subscript denotes the partial derivative with respect to that variable.

Finding DY was easy. Using central differences:

DY =
   -1.0000    1.0000         0
   -0.5000         0    0.5000
         0   -1.0000    1.0000

Test:

>> DY*S

ans =

     0     0     0     0
     0     0     0     0
     0     0     0     0
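For anyone following along outside MATLAB, the same check can be sketched in Python/NumPy (values copied from the matrices above):

```python
import numpy as np

# S from above: constant along y, so DY @ S should vanish.
S = np.array([[1., 2., 3., 4.],
              [1., 2., 3., 4.],
              [1., 2., 3., 4.]])

# 3x3 operator: one-sided differences at the boundary rows,
# central difference in the middle row (values copied from DY above).
DY = np.array([[-1.0,  1.0,  0.0],
               [-0.5,  0.0,  0.5],
               [ 0.0, -1.0,  1.0]])

Sy = DY @ S   # derivative along y: identically zero for this S
```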

Finding DX seems more difficult. The matrix product works "rows times columns", and x runs along the rows of S, so I try to find DX such that $S_x = S \cdot DX$ instead:

DX =
   -1.0000   -0.5000         0         0
    1.0000         0   -0.5000         0
         0    0.5000         0   -1.0000
         0         0    0.5000    1.0000

Test:

>> S*DX

ans =

     1     1     1     1
     1     1     1     1
     1     1     1     1
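The right-multiplication test, again sketched in NumPy (note that this DX is the transpose of the 4-point version of the difference operator used for DY):

```python
import numpy as np

S = np.array([[1., 2., 3., 4.],
              [1., 2., 3., 4.],
              [1., 2., 3., 4.]])

# DX copied from above: the transpose of the 4x4 central-difference
# matrix, so applied from the right it differentiates along x.
DX = np.array([[-1.0, -0.5,  0.0,  0.0],
               [ 1.0,  0.0, -0.5,  0.0],
               [ 0.0,  0.5,  0.0, -1.0],
               [ 0.0,  0.0,  0.5,  1.0]])

Sx = S @ DX   # derivative along x: identically one for this S
```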

Now I have the two matrix equations:
$S_x = S \cdot DX, \quad S_y = DY \cdot S$

How can I combine these two into one matrix equation?

The thing is that I know $S_x$ and $S_y$ and am trying to solve for S using a least-squares approach. I read that the conjugate gradient method does this. If I can cast my two equations into the shape of a single matrix equation $\mathbf{A}\mathbf{x}=\mathbf{b}$, I could use the conjugate gradient method to find a least-squares solution.
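A sketch of that stacking, in Python/NumPy for illustration: put the two equations on top of each other as A = [dx; dy], b = [sx; sy], and solve min ‖As − b‖. The 12×12 operators are built here with Kronecker products, mirroring the construction in the accepted answer below; `Dr`/`Dc` are the question's difference matrices (the question's DX equals `Dc.T`).

```python
import numpy as np

# Central-difference matrices for 3 rows and 4 columns respectively.
Dr = np.array([[-1.0,  1.0,  0.0],
               [-0.5,  0.0,  0.5],
               [ 0.0, -1.0,  1.0]])
Dc = np.array([[-1.0,  1.0,  0.0,  0.0],
               [-0.5,  0.0,  0.5,  0.0],
               [ 0.0, -0.5,  0.0,  0.5],
               [ 0.0,  0.0, -1.0,  1.0]])

dy = np.kron(np.eye(4), Dr)   # applies Dr down each column of S
dx = np.kron(Dc, np.eye(3))   # applies Dc along each row of S

# Synthetic gradient data built from the question's S.
S = np.tile([1.0, 2.0, 3.0, 4.0], (3, 1))
s = S.flatten(order='F')              # column-major, like MATLAB's S(:)
b = np.concatenate([dx @ s, dy @ s])  # b = [sx; sy]
A = np.vstack([dx, dy])               # A = [dx; dy]

# A annihilates constants, so S is only determined up to an additive
# constant; lstsq returns the minimum-norm least-squares solution.
s_ls, *_ = np.linalg.lstsq(A, b, rcond=None)
```

Because only gradients are prescribed, the recovered s matches the original up to that additive constant.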

I guess I will have to reshape S into a vector:

s=S(:)

s =

     1
     1
     1
     2
     2
     2
     3
     3
     3
     4
     4
     4

Next I will have to find a 12×12 matrix dx such that:

$sx = dx \cdot s$

where sx now is also reshaped as a vector.

Using my example data and solving for dx I find:

>> sx/s

ans =

         0         0         0         0         0         0         0         0         0    0.2500         0         0
         0         0         0         0         0         0         0         0         0    0.2500         0         0
         0         0         0         0         0         0         0         0         0    0.2500         0         0
         0         0         0         0         0         0         0         0         0    0.2500         0         0
         0         0         0         0         0         0         0         0         0    0.2500         0         0
         0         0         0         0         0         0         0         0         0    0.2500         0         0
         0         0         0         0         0         0         0         0         0    0.2500         0         0
         0         0         0         0         0         0         0         0         0    0.2500         0         0
         0         0         0         0         0         0         0         0         0    0.2500         0         0
         0         0         0         0         0         0         0         0         0    0.2500         0         0
         0         0         0         0         0         0         0         0         0    0.2500         0         0
         0         0         0         0         0         0         0         0         0    0.2500         0         0

How is this related to DX? I do not understand how to generate this matrix when s is unknown.
BTW I am using Matlab.
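Just to convince myself the system is underdetermined, here is a NumPy sketch: a single data pair (s, sx) cannot pin down a 12×12 matrix, since any M with M s = sx fits equally well. mrdivide apparently just returns one such solution with a single nonzero column; a rank-one outer product is another:

```python
import numpy as np

# Any M with M @ s == sx reproduces the data; the rank-one choice
# below works just as well as the one-column matrix from sx/s above.
s = np.repeat([1.0, 2.0, 3.0, 4.0], 3)   # S(:) from the example
sx = np.ones(12)                          # vectorized S_x

M = np.outer(sx, s) / (s @ s)             # rank-one solution of M @ s == sx
```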

Thanks in advance for any answers!

Best Answer

I'll show you the code for doing this with square matrices, and leave the generalization to rectangular matrices to you. First we'll generate a data matrix S:

>> n = 4;
>> S=reshape(1:n*n, [n n]);

Now define the shift-by-one matrix on vectors:

>> E = sparse(2:n, 1:n-1, 1, n, n);

and the corresponding central difference matrix:

>> D = (E' - E) / 2;

Now you can define X and Y gradient matrices, which act on matrices (suitably reshaped as vectors) by:

>> I = speye(n);                      % identity matrix
>> Dx = kron(D,I);                    % acts along rows: vec of S*D'
>> Dy = kron(I,D);                    % acts down columns: vec of D*S

and use them in the following way:

>> Sx = reshape( Dx * S(:), [n n]);
>> Sy = reshape( Dy * S(:), [n n]);

i.e. whenever you want to calculate a gradient, you first reshape the matrix S into a vector, apply the appropriate derivative operator, and then reshape the result back into a matrix.
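The reason the Kronecker construction works is the column-major vectorization identity vec(A S B) = kron(Bᵀ, A) vec(S): kron(I, D) applies D down each column of S, while kron(D, I) applies Dᵀ from the right, i.e. along each row. A NumPy check of both identities, for illustration:

```python
import numpy as np

n = 4
rng = np.random.default_rng(0)
S = rng.standard_normal((n, n))

E = np.diag(np.ones(n - 1), -1)   # dense analogue of sparse(2:n, 1:n-1, 1, n, n)
D = (E.T - E) / 2                 # central-difference matrix

vec = lambda M: M.flatten(order='F')  # column-major, like MATLAB's S(:)
I = np.eye(n)

lhs_cols = np.kron(I, D) @ vec(S)   # derivative down columns
rhs_cols = vec(D @ S)
lhs_rows = np.kron(D, I) @ vec(S)   # derivative along rows
rhs_rows = vec(S @ D.T)
```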


Note that I've defined E, D, etc. as sparse matrices for computational efficiency. If you want to see what any of these matrices looks like, you can convert it to a dense matrix using the function full, for example:

>> full(D)
ans =
         0    0.5000         0         0
   -0.5000         0    0.5000         0
         0   -0.5000         0    0.5000
         0         0   -0.5000         0