For $\frac{\partial }{\partial \beta} e^{(X_i \beta)}$
where $X_i$ is a row vector and $\beta$ is a column vector, so that the argument of $e$ is a scalar.
I know the answer is
$ e^{(X_i \beta)}X'_i$
but why is the row vector $X_i$ transposed into a column vector?
In addition, what does it mean to take the derivative in this case with respect to the vector $\beta$?
Best Answer
Using two-dimensional vectors as an example, $ X=(X_1,X_2) $ and $ \beta=(\beta_1, \beta_2)^T $, we have $$ e^{(X\beta)}=e^{(X_1\beta_1+X_2\beta_2)} $$ and (see here for the derivative of a scalar with respect to a vector) $$ \frac{\partial}{\partial \beta}e^{(X\beta)}= \begin{pmatrix} \frac{\partial}{\partial \beta_1}e^{(X_1\beta_1+X_2\beta_2)}\\ \frac{\partial}{\partial \beta_2}e^{(X_1\beta_1+X_2\beta_2)} \end{pmatrix}= \begin{pmatrix} X_1e^{(X_1\beta_1+X_2\beta_2)}\\ X_2e^{(X_1\beta_1+X_2\beta_2)} \end{pmatrix}= e^{(X\beta)}X^T $$ The derivative is taken componentwise, so the result stacks the partials into a column vector of the same shape as $\beta$; that is why $X$, a row vector, appears transposed. The same computation extends to vectors of any finite dimension $n$.
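The identity can be checked numerically. The sketch below (assuming NumPy; the variable names `X`, `beta`, and the helper `f` are illustrative) compares the analytic gradient $e^{X\beta}X^T$ against a central finite-difference approximation:

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=3)       # row vector X (stored as a 1-D array)
beta = rng.normal(size=3)    # column vector beta (stored as a 1-D array)

f = lambda b: np.exp(X @ b)  # scalar-valued function e^{X b}

# Analytic gradient: e^{X beta} * X^T (a vector with the same shape as beta)
analytic = f(beta) * X

# Central finite differences, one coordinate of beta at a time
eps = 1e-6
numeric = np.empty_like(beta)
for j in range(beta.size):
    e = np.zeros_like(beta)
    e[j] = eps
    numeric[j] = (f(beta + e) - f(beta - e)) / (2 * eps)

print(np.allclose(analytic, numeric, atol=1e-6))  # True
```

Because the gradient is formed coordinate by coordinate, the finite-difference loop mirrors exactly the componentwise differentiation in the derivation above.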