Given the product of a matrix and a vector
A.v
with A of shape (n,m) and v of dimension m, where m and n are symbols, I need to calculate the derivative with respect to the matrix elements.
I haven't found a way to use a proper vector, so I started with two MatrixSymbol objects:
from sympy import MatrixSymbol, diff, symbols, tensor
n, m = symbols('n m')
j = tensor.Idx('j')
i = tensor.Idx('i')
l = tensor.Idx('l')
h = tensor.Idx('h')
A = MatrixSymbol('A', n, m)
B = MatrixSymbol('B', m, 1)
C = A*B
Now, if I try to differentiate with respect to one of A's elements using the indices, I get back the unevaluated expression:
diff(C, A[i,j])
>>>> Derivative(A*B, A[i, j])
If I also index C (it won't let me use only one index on the resulting vector), I get back the product expressed as a Sum:
C[l,h]
>>>> Sum(A[l, _k]*B[_k, h], (_k, 0, m - 1))
If I differentiate this with respect to the matrix element, I end up getting 0 instead of an expression with the KroneckerDelta, which is the result that I would like to get:
diff(C[l,h], A[i,j])
>>>> 0
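For reference, the expected result can be worked out by hand by differentiating the sum term by term. This is only a sketch, assuming the shapes from the code above (A is n x m, B is m x 1) and an index j that lies in the summation range 0..m-1:

```latex
\frac{\partial C_{lh}}{\partial A_{ij}}
  = \frac{\partial}{\partial A_{ij}} \sum_{k=0}^{m-1} A_{lk} B_{kh}
  = \sum_{k=0}^{m-1} \delta_{li}\, \delta_{kj}\, B_{kh}
  = \delta_{li}\, B_{jh}
```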
I wonder if maybe I shouldn't be using MatrixSymbols to start with. How should I go about implementing the behaviour that I want to get?
The git version of SymPy (and the next version) handles this better:
In [55]: print(diff(C[l,h], A[i,j]))
Sum(KroneckerDelta(_k, j)*KroneckerDelta(i, l)*B[_k, h], (_k, 0, m - 1))
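A self-contained way to try this on your own installation might look like the sketch below. It assumes a SymPy release that includes the behaviour described above, and it uses plain integer symbols in place of the tensor.Idx indices from the question (that substitution is my assumption, not something the question requires). The exact printed form may differ between versions, so the sketch only checks that KroneckerDelta terms are present.

```python
from sympy import KroneckerDelta, MatrixSymbol, diff, symbols

# Symbolic dimensions, matching the shapes used in the question's code.
n, m = symbols('n m', positive=True, integer=True)
# Assumption: plain integer symbols stand in for the tensor.Idx indices.
i, j, l, h = symbols('i j l h', integer=True)

A = MatrixSymbol('A', n, m)
B = MatrixSymbol('B', m, 1)
C = A * B

# Differentiate one entry of the product with respect to one entry of A.
result = diff(C[l, h], A[i, j])
print(result)

# Check for the delta terms rather than relying on the exact printed form.
contains_delta = result.has(KroneckerDelta)
print(contains_delta)
```

If contains_delta comes out False, or result is 0, the installed version most likely predates the fix.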