I am trying to speed up a function within my code.
The initial function I had written was:
    def f(self):
        temp = 0
        for i in range(self.N):
            for j in range(self.N):
                a = self.Asolution[:, j]
                b = self.Bsolution[:, i]
                c = self.Matrix[j][i]
                d = c * np.multiply(a, b)
                temp += simps(d, self.time)
        return temp
where self.Asolution = odeint(...), and the same for self.Bsolution. self.Matrix is a square matrix of size self.N x self.N, and simps is Simpson integration. self.Asolution and self.Bsolution have dimensions (t x N).
However, I need to call this function many times, and it takes too long as self.N is quite large. Therefore, I decided to give NumPy's built-in functions a go, since I am mostly dealing with matrix multiplication. I tend to use for loops for everything, which is not the smartest option, so I am a bit unfamiliar with the built-in NumPy functions. I modified the function to:
    def f(self):
        d = np.dot(np.dot(self.Asolution, self.Matrix), self.Bsolution.transpose())
        d = np.array(d)
        temp = simps(d, self.time)
        temp = sum(temp)
        return temp
This is significantly faster, but I am not getting the same result as above. I think I have misunderstood the use of np.dot, or I am missing something in the way I am multiplying the matrices. My main goal is to remove the double for loop from the first code, because it takes forever. What am I missing here? Thanks in advance for any hints!
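For the record, the mismatch can be seen with small random arrays standing in for the ODE solutions: Asolution @ Matrix @ Bsolution.T has shape (t x t), so it pairs A at one time with B at a *different* time, whereas the double loop only ever pairs equal times. Only the diagonal of that (t x t) array matches the loop's integrand (a sketch, not the original class code):

```python
import numpy as np

rng = np.random.default_rng(1)
t, N = 6, 3
A = rng.random((t, N))   # stands in for self.Asolution
B = rng.random((t, N))   # stands in for self.Bsolution
M = rng.random((N, N))   # stands in for self.Matrix

d = A @ M @ B.T                    # shape (t, t): mixes different times
needed = np.diag(d)                # equal-time terms only, shape (t,)
same = np.sum((A @ M) * B, axis=1) # the same values without the (t x t) temporary

print(d.shape, np.allclose(needed, same))
```

Summing the Simpson result of the full (t x t) array therefore adds in all the off-diagonal cross terms, which is why the totals disagree.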
EDIT:
self.Asolution and self.Bsolution have sizes (t x N) - each column is a different position and the rows indicate how that position evolves in time. self.Matrix has size (N x N).
After much trial and error, I managed to find a way to get the same result faster than with the double for loop. I post it just for completeness and to close the question.
    n = len(self.time)
    # Asolution[:][i] is just row i, so index it explicitly; a 1-D row
    # needs no transpose, and simps comes from scipy.integrate (np.simps
    # does not exist)
    d = [np.matmul(np.matmul(self.Asolution[i, :], self.Matrix),
                   self.Bsolution[i, :]) for i in range(n)]
    temp = simps(d, self.time)
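The remaining list comprehension can also be removed entirely with np.einsum, which contracts both matrix indices in one call while keeping the time axis. A self-contained sketch with random stand-ins for the ODE solutions (scipy.integrate.simpson is the current name for simps in recent SciPy):

```python
import numpy as np
from scipy.integrate import simpson  # called simps in older SciPy

rng = np.random.default_rng(0)
t, N = 50, 4
time = np.linspace(0.0, 1.0, t)
Asolution = rng.random((t, N))  # stands in for the odeint output
Bsolution = rng.random((t, N))
Matrix = rng.random((N, N))

# Double for loop, as in the original function
temp_loop = 0.0
for i in range(N):
    for j in range(N):
        d = Matrix[j][i] * Asolution[:, j] * Bsolution[:, i]
        temp_loop += simpson(d, x=time)

# Fully vectorised: contract j and i, keep the time axis, then integrate once
integrand = np.einsum('tj,ji,ti->t', Asolution, Matrix, Bsolution)
temp_vec = simpson(integrand, x=time)

print(np.isclose(temp_loop, temp_vec))
```

Because Simpson integration is linear, integrating the summed integrand gives the same total as summing the individual integrals, so the two results agree to floating-point precision.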