C: Vector and Matrix Calculus
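It is assumed that all vectors are of size \(n\) and, similarly, that all matrices are of size \(n \times n\). Wherever \(\mathbf{A}^{-1}\) appears, \(\mathbf{A}(\theta)\) is assumed to be invertible and differentiable with respect to \(\theta\).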
C1
\(\frac{\partial}{\partial \theta}\left[\mathbf{A}^{-1}(\theta)\right] = -\mathbf{A}^{-1}\frac{\partial}{\partial \theta}\left[\mathbf{A}\right]\mathbf{A}^{-1}\)
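C1 follows from differentiating \(\mathbf{A}\mathbf{A}^{-1} = \mathbf{I}\) with the product rule and solving for the derivative of the inverse. As a sanity check rather than a proof, the sketch below compares the right-hand side of C1 with a central finite difference. The \(2 \times 2\) matrix function `A`, its derivative `dA`, and the values of `theta` and `h` are hypothetical choices made only for this illustration.

```python
import numpy as np


def A(theta):
    # Hypothetical smooth, invertible matrix-valued function (illustration only).
    return np.array([[2.0 + theta, np.sin(theta)],
                     [theta ** 2, 3.0 + np.cos(theta)]])


def dA(theta):
    # Element-wise derivative of A with respect to theta, computed by hand.
    return np.array([[1.0, np.cos(theta)],
                     [2.0 * theta, -np.sin(theta)]])


theta, h = 0.7, 1e-6

# Left-hand side of C1: central finite difference of the inverse.
numeric = (np.linalg.inv(A(theta + h)) - np.linalg.inv(A(theta - h))) / (2.0 * h)

# Right-hand side of C1: -A^{-1} (dA/dtheta) A^{-1}.
A_inv = np.linalg.inv(A(theta))
analytic = -A_inv @ dA(theta) @ A_inv

# Largest element-wise discrepancy between the two sides.
max_abs_error = np.max(np.abs(numeric - analytic))
print(max_abs_error)
```

The printed discrepancy should be small compared with the entries of the derivative itself, limited only by finite-difference and round-off error.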
C2
\(\frac{\partial}{\partial \theta}\left[|\mathbf{A}(\theta)|\right] = |\mathbf{A}|\text{Tr}\left[\mathbf{A}^{-1}\frac{\partial}{\partial \theta}\left[\mathbf{A}\right]\right]\)
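Starting with \(|\mathbf{I}| = 1\), observe how this value changes if we add a small perturbation \(h\mathbf{A}\), i.e.
\[|\mathbf{I} + h\mathbf{A}| = 1 + h\,\text{Tr}\left[\mathbf{A}\right] + \mathcal{O}(h^2).\]
By first principles, we can then show that the derivative of the determinant at \(\mathbf{I}\) in the direction \(\mathbf{A}\) is
\[\lim_{h \to 0}\frac{|\mathbf{I} + h\mathbf{A}| - |\mathbf{I}|}{h} = \text{Tr}\left[\mathbf{A}\right].\]
If \(\mathbf{A} = \mathbf{A}(\theta)\) then by the chain rule,
\[\frac{\partial}{\partial \theta}\left[|\mathbf{I} + h\mathbf{A}(\theta)|\right] = h\,\text{Tr}\left[\frac{\partial}{\partial \theta}\left[\mathbf{A}\right]\right] + \mathcal{O}(h^2).\]
So far we have shown how the determinant changes around the identity matrix. Let's define an invertible \(\mathbf{B}(\theta)\in\mathbb{R}^{n \times n}\). A small step in \(\theta\) gives \(\mathbf{B}(\theta + h) = \mathbf{B}\left(\mathbf{I} + h\mathbf{B}^{-1}\frac{\partial}{\partial \theta}\left[\mathbf{B}\right]\right) + \mathcal{O}(h^2)\), so using the fact that \(|\mathbf{B}\mathbf{A}| = |\mathbf{B}||\mathbf{A}|\) together with the expansion above, we have
\[|\mathbf{B}(\theta + h)| = |\mathbf{B}|\left(1 + h\,\text{Tr}\left[\mathbf{B}^{-1}\frac{\partial}{\partial \theta}\left[\mathbf{B}\right]\right]\right) + \mathcal{O}(h^2),\]
and therefore
\[\frac{\partial}{\partial \theta}\left[|\mathbf{B}(\theta)|\right] = |\mathbf{B}|\text{Tr}\left[\mathbf{B}^{-1}\frac{\partial}{\partial \theta}\left[\mathbf{B}\right]\right].\]
Substituting \(\mathbf{B} = \mathbf{A}\) we finally have
\[\frac{\partial}{\partial \theta}\left[|\mathbf{A}(\theta)|\right] = |\mathbf{A}|\text{Tr}\left[\mathbf{A}^{-1}\frac{\partial}{\partial \theta}\left[\mathbf{A}\right]\right].\]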
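The two key steps of this argument can be checked numerically. The sketch below is an illustration under assumed inputs, not part of the derivation: it tests the first-order expansion of \(|\mathbf{I} + h\mathbf{A}|\) and then compares C2 with a central finite difference. The matrix function `A`, its derivative `dA`, and the step sizes `eps` and `h` are hypothetical.

```python
import numpy as np


def A(theta):
    # Hypothetical smooth, invertible matrix-valued function (illustration only).
    return np.array([[2.0 + theta, np.sin(theta)],
                     [theta ** 2, 3.0 + np.cos(theta)]])


def dA(theta):
    # Element-wise derivative of A with respect to theta, computed by hand.
    return np.array([[1.0, np.cos(theta)],
                     [2.0 * theta, -np.sin(theta)]])


theta, h, eps = 0.7, 1e-6, 1e-4
A0 = A(theta)

# Step 1: |I + eps*A| should agree with 1 + eps*Tr[A] up to terms of order eps^2.
expansion_error = abs(np.linalg.det(np.eye(2) + eps * A0)
                      - (1.0 + eps * np.trace(A0)))

# Step 2: central finite difference of the determinant versus the C2 formula.
numeric = (np.linalg.det(A(theta + h)) - np.linalg.det(A(theta - h))) / (2.0 * h)
analytic = np.linalg.det(A0) * np.trace(np.linalg.inv(A0) @ dA(theta))
det_error = abs(numeric - analytic)

print(expansion_error, det_error)
```

Halving `eps` should reduce `expansion_error` by roughly a factor of four, which is consistent with the neglected term being of order \(h^2\).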
C3
\(\frac{\partial}{\partial \theta}\left[\log |\mathbf{A}(\theta)|\right] = \text{Tr}\left[\mathbf{A}^{-1}\frac{\partial}{\partial \theta}\left[\mathbf{A}\right]\right]\)
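Using the chain rule we know that \(\frac{\partial}{\partial \theta}\left[f(g(\theta))\right] = g'(\theta)f'(g(\theta))\). Let \(f\) be the log function and \(g(\theta) = |\mathbf{A}(\theta)|\), assumed positive so that the logarithm is defined. Then, using the chain rule and C2, we have
\[\frac{\partial}{\partial \theta}\left[\log |\mathbf{A}(\theta)|\right] = \frac{1}{|\mathbf{A}|}\frac{\partial}{\partial \theta}\left[|\mathbf{A}|\right] = \frac{1}{|\mathbf{A}|}|\mathbf{A}|\text{Tr}\left[\mathbf{A}^{-1}\frac{\partial}{\partial \theta}\left[\mathbf{A}\right]\right] = \text{Tr}\left[\mathbf{A}^{-1}\frac{\partial}{\partial \theta}\left[\mathbf{A}\right]\right].\]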
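The log-determinant identity can be checked in the same way. The sketch below is a minimal example under assumed inputs: `A` and `dA` are hypothetical, `np.linalg.slogdet` is used to evaluate \(\log|\mathbf{A}|\) stably, and `np.linalg.solve` forms \(\mathbf{A}^{-1}\frac{\partial}{\partial \theta}\left[\mathbf{A}\right]\) without an explicit inverse.

```python
import numpy as np


def A(theta):
    # Hypothetical smooth matrix-valued function with positive determinant near theta = 0.7.
    return np.array([[2.0 + theta, np.sin(theta)],
                     [theta ** 2, 3.0 + np.cos(theta)]])


def dA(theta):
    # Element-wise derivative of A with respect to theta, computed by hand.
    return np.array([[1.0, np.cos(theta)],
                     [2.0 * theta, -np.sin(theta)]])


def log_det(theta):
    # slogdet returns the sign and the log of the absolute determinant.
    sign, logabsdet = np.linalg.slogdet(A(theta))
    if sign <= 0:
        raise ValueError("log|A| is only defined here for a positive determinant")
    return logabsdet


theta, h = 0.7, 1e-6

# Left-hand side of C3: central finite difference of log|A|.
numeric = (log_det(theta + h) - log_det(theta - h)) / (2.0 * h)

# Right-hand side of C3: Tr[A^{-1} dA/dtheta], via a linear solve.
analytic = np.trace(np.linalg.solve(A(theta), dA(theta)))

logdet_error = abs(numeric - analytic)
print(logdet_error)
```

Using a linear solve in place of an explicit inverse is a design choice for numerical stability; it evaluates the same trace expression as the formula above.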