If you have functions $f: \mathbb{R}^{n} \longrightarrow \mathbb{R}^{m}$ and $g: \mathbb{R}^{k} \longrightarrow \mathbb{R}^{n}$, the chain rule behaves just the same as in the scalar case, as mentioned in the comments: the derivative of the function $f\circ g: \mathbb{R}^{k} \longrightarrow \mathbb{R}^{m}$ is given by
$$(f \circ g)'(x) = f'\big(g(x)\big) \cdot g'(x).$$
Only now you have to take into account that $\cdot$ denotes composition of the respective derivatives, which are linear transformations:
$$(f \circ g)'(x): \mathbb{R}^{k} \longrightarrow \mathbb{R}^{m}$$
$$f'\big(g(x)\big): \mathbb{R}^{n} \longrightarrow \mathbb{R}^{m}$$
$$g'(x): \mathbb{R}^{k} \longrightarrow \mathbb{R}^{n}$$
You can also fix bases and think of these derivatives as matrices, in which case $(f \circ g)'(x)$ is $m\times k$, $f'\big(g(x)\big)$ is $m\times n$, and $g'(x)$ is $n\times k$; as you can verify, such a product of matrices makes sense.
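A quick numerical sketch of these shapes (the concrete $f$, $g$, and evaluation point are assumed examples, with $m = 2$, $n = 3$, $k = 4$); the Jacobians are approximated by central finite differences:

```python
import numpy as np

def jacobian(h, x, eps=1e-5):
    """Approximate the Jacobian matrix of h at x by central differences."""
    x = np.asarray(x, dtype=float)
    cols = []
    for i in range(x.size):
        e = np.zeros_like(x)
        e[i] = eps
        cols.append((h(x + e) - h(x - e)) / (2 * eps))
    return np.column_stack(cols)

f = lambda y: np.array([y[0] * y[1], np.sin(y[2])])            # R^3 -> R^2
g = lambda x: np.array([x[0] + x[3], x[1] * x[2], x[0] ** 2])  # R^4 -> R^3

x = np.array([0.5, -1.0, 2.0, 0.3])
Jf = jacobian(f, g(x))                 # m x n = 2 x 3
Jg = jacobian(g, x)                    # n x k = 3 x 4
Jfg = jacobian(lambda t: f(g(t)), x)   # m x k = 2 x 4

print(Jf.shape, Jg.shape, Jfg.shape)         # (2, 3) (3, 4) (2, 4)
print(np.allclose(Jfg, Jf @ Jg, atol=1e-5))  # True
```

The final check confirms the chain rule itself: the $2\times 4$ Jacobian of the composition agrees with the product of the $2\times 3$ and $3\times 4$ factors.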
Now, in your case, $n = m = k$ and $g(x) = Ax$ is a linear transformation. For a linear transformation $g$, we have $g'(x) = g$ for any $x\in \mathbb{R}^{n}$. This states that the best linear approximation of a linear transformation, near $x$, is the linear transformation itself, which is quite intuitive. Back to the chain rule, for $g(x)=Ax$, we have
$$(f \circ g)'(x) = f'(Ax) \cdot A.$$
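Both facts can be checked numerically; here $A$, $f$, and $x$ are assumed examples. The finite-difference Jacobian of $g(x) = Ax$ recovers $A$ itself at any point, and the Jacobian of $f \circ g$ matches $f'(Ax) \cdot A$:

```python
import numpy as np

def jacobian(h, x, eps=1e-5):
    """Approximate the Jacobian matrix of h at x by central differences."""
    x = np.asarray(x, dtype=float)
    cols = []
    for i in range(x.size):
        e = np.zeros_like(x)
        e[i] = eps
        cols.append((h(x + e) - h(x - e)) / (2 * eps))
    return np.column_stack(cols)

A = np.array([[1.0, 2.0, 0.0],
              [0.0, -1.0, 3.0],
              [2.0, 0.0, 1.0]])
f = lambda y: np.array([np.exp(y[0]), y[1] * y[2], y[0] + y[2]])  # R^3 -> R^3
x = np.array([0.2, -0.4, 0.7])

Jg = jacobian(lambda t: A @ t, x)
print(np.allclose(Jg, A))  # True: g'(x) = A at every x

Jfg = jacobian(lambda t: f(A @ t), x)
print(np.allclose(Jfg, jacobian(f, A @ x) @ A, atol=1e-5))  # True
```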
I am not sure how you would get a transpose, but some references use different conventions and perhaps your $\mathrm{d}/\mathrm{d}x$ notation means something else (some kind of gradient?). For instance, for $m=1$ and $f: \mathbb{R}^{n} \longrightarrow \mathbb{R}$, the gradient $\nabla f(x)$ is defined as the unique vector that satisfies
$$f'(x)[v] = \langle v, \nabla f(x) \rangle.$$
Here, we denote by $f'(x)[v]$ the linear functional $f'(x)$ applied to the vector $v \in \mathbb{R}^n$, which gives a number. In this case,
$$(f \circ g)'(x) = f'(Ax) \cdot A$$
while
$$\nabla(f \circ g)(x) = A^T \nabla f (Ax).$$
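A numerical sketch of both identities (the concrete $f$, $A$, $x$, and $v$ are assumed examples): first the defining property $f'(x)[v] = \langle v, \nabla f(x) \rangle$, then the gradient form of the chain rule $\nabla(f \circ g)(x) = A^T \nabla f(Ax)$, all via central differences:

```python
import numpy as np

def grad(h, x, eps=1e-5):
    """Approximate the gradient of scalar-valued h at x by central differences."""
    g = np.zeros_like(x)
    for i in range(x.size):
        e = np.zeros_like(x)
        e[i] = eps
        g[i] = (h(x + e) - h(x - e)) / (2 * eps)
    return g

f = lambda y: y[0] ** 2 * np.sin(y[1])   # f: R^2 -> R
A = np.array([[1.0, 2.0], [3.0, -1.0]])
x = np.array([0.3, 0.8])
v = np.array([0.5, -2.0])

# f'(x)[v] as a difference quotient along v, compared with <v, grad f(x)>:
eps = 1e-5
fprime_v = (f(x + eps * v) - f(x - eps * v)) / (2 * eps)
print(np.allclose(fprime_v, v @ grad(f, x), atol=1e-6))  # True

# gradient form of the chain rule for g(x) = Ax:
lhs = grad(lambda t: f(A @ t), x)
rhs = A.T @ grad(f, A @ x)
print(np.allclose(lhs, rhs, atol=1e-6))  # True
```

Note where the transpose enters: the derivative-as-matrix composes on the right with $A$, so after dualizing to a gradient vector, $A$ acts on the left as $A^T$.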
Best Answer
As long as you interpret the RHS correctly regarding where the point of evaluation goes, then yes, it is correct. If we explicitly mention the point of evaluation, and we are super pedantic about the order of evaluation, then for every $\xi \in \mathbb{R}^m$, we have
$$\partial_v(g \circ f)(\xi) = \big[Dg\big(f(\xi)\big)\big] \cdot \big[\partial_v f(\xi)\big],$$
where the $\cdot$ means matrix multiplication.
The reason I used the word "interpret" above is because, from a purely technical standpoint, when you leave out the variable $\xi$, you need to ensure that both sides of the equation have functions with the same domain and target space. In this example, we have the following domains and target spaces:
$$\partial_v(g \circ f): \mathbb{R}^m \longrightarrow \mathbb{R}^q$$
$$(Dg) \circ f: \mathbb{R}^m \longrightarrow M_{q \times n}(\mathbb{R})$$
$$\partial_v f: \mathbb{R}^m \longrightarrow \mathbb{R}^n$$
So, strictly speaking, the RHS of your equation is not defined properly. If you wanted to be super formal and write the equation above in a correct form, without explicitly mentioning the variable $\xi$, then we would have to introduce the following "auxiliary" functions:
$$\omega: M_{q \times n}(\mathbb{R}) \times \mathbb{R}^n \longrightarrow \mathbb{R}^q, \qquad \omega(A, \eta) = A \cdot \eta$$
$$\iota_1: M_{q \times n}(\mathbb{R}) \longrightarrow M_{q \times n}(\mathbb{R}) \times \mathbb{R}^n, \qquad \iota_1(A) = (A, 0)$$
$$\iota_2: \mathbb{R}^n \longrightarrow M_{q \times n}(\mathbb{R}) \times \mathbb{R}^n, \qquad \iota_2(\eta) = (0, \eta)$$
Here, $\omega$ is a sort of "evaluation map", which evaluates the matrix $A$ on the vector $\eta$ by multiplication. $\iota_1$ and $\iota_2$ are the "canonical injections", which allow us to think of an element as being part of a larger product space. With this, we can then write the precise (but cumbersome) statement:
$$\partial_v(g \circ f) = \omega \circ \big(\iota_1 \circ (Dg) \circ f + \iota_2 \circ \partial_v f\big).$$
On the RHS, you can see that both $\iota_1 \circ (Dg) \circ f$ and $\iota_2 \circ \partial_v f$ are functions from $\mathbb{R}^m$ into $M_{q \times n}(\mathbb{R}) \times \mathbb{R}^n$, so their sum is a function of the same kind. Thus, composing this sum with $\omega$ makes sense and the result is a function from $\mathbb{R}^m$ into $\mathbb{R}^q$; this agrees with the LHS.
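Unwinding the composition at a point $\xi$ gives back $\partial_v(g \circ f)(\xi) = Dg(f(\xi)) \cdot \partial_v f(\xi)$, which can be sketched numerically; the concrete $f$, $g$, $\xi$, and $v$ below are assumed examples (with $m = 2$, $n = 2$, $q = 3$), and $Dg$ is approximated by central differences:

```python
import numpy as np

def dir_deriv(h, xi, v, eps=1e-5):
    """Approximate the directional derivative of h at xi along v."""
    return (h(xi + eps * v) - h(xi - eps * v)) / (2 * eps)

def jacobian(h, y, eps=1e-5):
    """Approximate the Jacobian matrix of h at y by central differences."""
    cols = []
    for i in range(y.size):
        e = np.zeros_like(y)
        e[i] = eps
        cols.append((h(y + e) - h(y - e)) / (2 * eps))
    return np.column_stack(cols)

f = lambda xi: np.array([xi[0] * xi[1], np.cos(xi[0])])       # R^2 -> R^2 (m=2, n=2)
g = lambda y: np.array([y[0] + y[1], y[0] * y[1], y[1] ** 2]) # R^2 -> R^3 (n=2, q=3)

xi = np.array([0.4, -0.9])
v = np.array([1.0, 2.0])

lhs = dir_deriv(lambda t: g(f(t)), xi, v)       # d_v(g o f)(xi), in R^q
rhs = jacobian(g, f(xi)) @ dir_deriv(f, xi, v)  # Dg(f(xi)) . d_v f(xi)
print(np.allclose(lhs, rhs, atol=1e-6))  # True
```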
Final Remarks: