TransWikia.com

Taking the matrix derivative of the product of one matrix and a Hadamard Product.

Mathematics Asked on November 16, 2021

Consider the three matrices $\mathbf{C}$, $\mathbf{A}$, and $\mathbf{T}$. The matrix $\mathbf{C}$ is an $\mathit{m} \times \mathit{k}$ matrix, $\mathbf{A}$ is a $\mathit{k} \times \mathit{n}$ matrix, and $\mathbf{T}$ is an $\mathit{m} \times \mathit{n}$ matrix.

I’d like to evaluate the following matrix derivative:
$$\frac{\partial}{\partial\mathbf{C}}\bigl( (\mathbf{C}\mathbf{A}) \circ \mathbf{T} \bigr)$$

where $\circ$ denotes the Hadamard product. The dimensions of this expression are consistent since $\mathbf{CA}$ is an $\mathit{m} \times \mathit{n}$ matrix. Both $\mathbf{A}$ and $\mathbf{T}$ are constant matrices with respect to $\mathbf{C}$.

I’m wondering how I can evaluate and then express this result. Since the expression I am differentiating is an $\mathit{m} \times \mathit{n}$ matrix and $\mathbf{C}$ is an $\mathit{m} \times \mathit{k}$ matrix, I know the result will be a fourth-order object with $\mathit{m} \times \mathit{n} \times \mathit{m} \times \mathit{k}$ entries.

I’d appreciate any answer, including one in index notation.

Thank you for your time.

2 Answers

It's probably simpler to vectorize the matrix equation, and then to eliminate the Hadamard product in favor of multiplication by a diagonal matrix, i.e. $$\eqalign{ F &= T\circ CA \\ {\rm vec}(F) &= {\rm vec}(T)\circ {\rm vec}(CA) \\ &= {\rm Diag}\big({\rm vec}(T)\big)\,(A^T\otimes I)\,{\rm vec}(C) \\ \frac{\partial f}{\partial c} = \frac{\partial\,{\rm vec}(F)}{\partial\,{\rm vec}(C)} &= {\rm Diag}\big({\rm vec}(T)\big)\,(A^T\otimes I) \\ }$$ If you really want a fourth order tensor, there is a straightforward one-to-one mapping between the matrix and its vectorized form, e.g. $$\eqalign{ F &\in {\mathbb R}^{m\times n} \quad\iff\quad f \in {\mathbb R}^{mn\times 1} \\ F_{ij} &= f_{\alpha} \\ \alpha &= i+(j-1)\,m \\ i &= 1+(\alpha-1)\,{\rm mod}\,m \\ j &= 1+(\alpha-1)\,{\rm div}\,m \\ }$$ So you can convert the gradient matrix into a tensor
$$\eqalign{ \frac{\partial f_\alpha}{\partial c_\beta} = \frac{\partial F_{ij}}{\partial C_{k\ell}} \\ }$$
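
For anyone who wants to sanity-check the vectorized formula numerically, here is a minimal NumPy sketch. It is not part of the derivation above; the dimensions, the random seed, and the helper `vec` are assumptions chosen for illustration. It takes $I$ to be the $m\times m$ identity and uses column-major (column-stacking) vectorization, which matches $\alpha = i+(j-1)\,m$.

```python
import numpy as np

# Arbitrary small dimensions and random data, purely for illustration.
rng = np.random.default_rng(0)
m, k, n = 3, 4, 2
C = rng.standard_normal((m, k))
A = rng.standard_normal((k, n))
T = rng.standard_normal((m, n))


def vec(M):
    """Column-major vectorization: stack the columns of M."""
    return M.reshape(-1, order="F")


# Jacobian d vec(F) / d vec(C) = Diag(vec(T)) (A^T kron I_m), shape (mn, mk).
J = np.diag(vec(T)) @ np.kron(A.T, np.eye(m))

# F is linear in C, so J @ vec(C) should reproduce vec(F) exactly.
F = (C @ A) * T
assert np.allclose(J @ vec(C), vec(F))

# Unfold the Jacobian into a 4th-order tensor G[i, j, p, l] = dF_ij / dC_pl.
# Column-major reshaping undoes the vec index maps on both rows and columns.
G = J.reshape((m, n, m, k), order="F")

# Entrywise closed form delta_ip * A_lj * T_ij (no index is summed over).
expected = np.einsum("ip,lj,ij->ijpl", np.eye(m), A, T)
assert np.allclose(G, expected)
```

The `order="F"` arguments matter: with NumPy's default row-major order the Kronecker factor would have to be $I\otimes A^T$ instead.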

Answered by greg on November 16, 2021

Because $C \mapsto (CA) \circ T$ is a linear map, it is very easy to compute the derivative in differential form. In particular, we have $$ D_C(C_0)(dC) = (dC\,A) \circ T. $$ Now, let $E_{ij}$ denote the matrix with a $1$ in the $i,j$ entry and zeros elsewhere. Let $e_i$ denote the column vector with a $1$ in the $i$ entry and zeros elsewhere. We have $$ \frac{\partial f}{\partial C_{ij}}\Big|_{C = C_0} = D_C(C_0)(E_{ij}) = (E_{ij}A) \circ T = (e_ie_j^\top A) \circ T = \bigl(e_i(A^\top e_j)^\top\bigr) \circ T = E_{ii}\, T \operatorname{diag}(A^\top e_j). $$ In index notation, one might write $$ \frac{\partial f_{pq}}{\partial C_{ij}} = e_p^\top E_{ii}\, T \operatorname{diag}(A^\top e_j)e_q = \delta_{ip}\, e_i^\top T A_{jq}e_q = \delta_{ip} A_{jq}T_{iq}, $$ with no summation implied.
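
The formulas above can be checked numerically. The following is a rough NumPy sketch, with made-up dimensions and random data and with hypothetical helper names `f` and `partial`; indices are 0-based here. Since the map is linear, evaluating it at $E_{ij}$ gives the partial derivative with respect to $C_{ij}$ directly, so no finite differences are needed.

```python
import numpy as np

# Arbitrary small dimensions and random data, purely for illustration.
rng = np.random.default_rng(1)
m, k, n = 3, 4, 2
A = rng.standard_normal((k, n))
T = rng.standard_normal((m, n))


def f(C):
    """The map C -> (C A) o T, with o the Hadamard product."""
    return (C @ A) * T


def partial(i, j):
    """Closed form dF/dC_ij = E_ii T diag(A^T e_j), using 0-based i, j."""
    E_ii = np.zeros((m, m))
    E_ii[i, i] = 1.0
    return E_ii @ T @ np.diag(A[j, :])  # A^T e_j is row j of A


max_err = 0.0
for i in range(m):
    for j in range(k):
        E_ij = np.zeros((m, k))
        E_ij[i, j] = 1.0
        D = partial(i, j)
        # Linearity: the derivative in direction E_ij is f(E_ij) itself.
        max_err = max(max_err, np.abs(f(E_ij) - D).max())
        # Index form: D[p, q] = delta_ip * A_jq * T_iq.
        for p in range(m):
            for q in range(n):
                entry = (1.0 if i == p else 0.0) * A[j, q] * T[i, q]
                max_err = max(max_err, abs(D[p, q] - entry))

assert max_err < 1e-12
```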

Answered by Ben Grossmann on November 16, 2021
