
Partials w.r.t a single array where two different parts of the array are used in compute?


I have an array, a, of (x,y) coordinate pairs. This array is the only input to the component.

To generate an array of the angle of each coordinate pair, I use np.arctan2(a[1,:], a[0,:])

I'm not sure how to declare the partials now. If I used two inputs (i.e. x_coordinates, y_coordinates), it'd be quite easy to define the partials, as per SymPy: Can I safely differentiate atan2()?.

However, since I only have the one array, I'm not sure how to go about this. My first idea was to do:

y = a[1, :]
x = a[0, :]
partials['angles', 'a'] = x/(x**2 + y**2) - y/(x**2 + y**2)

but I suspect this isn't correct, since I'm missing the dx and dy parts of the total derivative of arctan2.

d(arctan2(y, x)) = x/(x^2 + y^2) dy - y/(x^2 + y^2) dx

What's the best way to go about this, without having to split the input array?


Solution

  • Here a and theta are both vector valued. When OpenMDAO computes partials, it flattens the outputs and inputs in C-order (row-major).
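    To see what C-order flattening means for a (5, 2) input like a, here is a quick illustration (not part of the original answer): each point's pair of values stays adjacent in the flattened view, which is why the Jacobian columns come in adjacent pairs.

```python
import numpy as np

# 5 points, each a (y, x) pair, mirroring the shape of input 'a'
a = np.arange(10).reshape(5, 2)

# C-order (row-major) flattening keeps each point's two values adjacent
flat = a.flatten()
print(flat)  # [0 1 2 3 4 5 6 7 8 9] -> pairs (0,1), (2,3), ... stay together
```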

    Let's assume we compute atan2 at 5 points.

    So in this case we have a single scalar output at 5 points (5 rows in the Jacobian) and two inputs per point (10 columns in the Jacobian).

    Furthermore, each atan2 calculation only depends on the corresponding (y, x) pair in a. So we'd expect that only the first two elements of the first row of the Jacobian will be nonzero. In the second row, it will be the 3rd and 4th elements, and so on. This is a banded structure.

    Sometimes in situations like this it's useful to see the structure of the partial derivatives Jacobian obtained by approximating the derivatives:

        import openmdao.api as om
        import numpy as np
    
        class Atan2Comp(om.ExplicitComponent):
    
            def initialize(self):
                self.options.declare('num_points', default=5, desc='number of points at which atan2 is computed')
    
            def setup(self):
                n = self.options['num_points']
                self.add_input('a', shape=(n, 2))
                self.add_output('theta', shape=(n,))
    
                self.declare_partials(of='theta', wrt='a')
    
            def compute(self, inputs, outputs):
                a = inputs['a']
                y = a[:, 0]
                x = a[:, 1]
                outputs['theta'] = np.arctan2(y, x)
    
            def compute_partials(self, inputs, partials):
                partials['theta', 'a'] = 1.0
    
        n = 5
        p = om.Problem(model=om.Group())
        ivc = p.model.add_subsystem('ivc', om.IndepVarComp(), promotes_outputs=['a'])
        ivc.add_output('a', shape=(n, 2))
        p.model.add_subsystem('atan2', Atan2Comp(num_points=n), promotes_inputs=['a'])
    
        p.setup()
    
        p.set_val('a', np.random.rand(n, 2))
    
        p.run_model()
    
        with np.printoptions(linewidth=1024):
            p.check_partials()
    

    This shows the structure of the Jacobian as being:

    ----------------------------
    Component: Atan2Comp 'atan2'
    ----------------------------
      atan2: 'theta' wrt 'a'
        Forward Magnitude : 7.071068e+00
             Fd Magnitude : 3.177623e+00 (fd:forward)
        Absolute Error (Jfor  - Jfd) : 7.503866e+00 *
    
        Relative Error (Jfor  - Jfd) : 2.361472e+00 *
    
        Raw Forward Derivative (Jfor)
    [[1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
     [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
     [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
     [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
     [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]]
    
        Raw FD Derivative (Jfd)
    [[ 0.93986629 -0.35863362  0.          0.          0.          0.          0.          0.          0.          0.        ]
     [ 0.          0.          0.72103352 -0.80351616  0.          0.          0.          0.          0.          0.        ]
     [ 0.          0.          0.          0.          2.06027062 -0.73005685  0.          0.          0.          0.        ]
     [ 0.          0.          0.          0.          0.          0.          0.48842487 -0.71201471  0.          0.        ]
     [ 0.          0.          0.          0.          0.          0.          0.          0.          1.22969926 -0.9404304 ]]
    

    So the FD result shows the expected banded structure. We just crammed 1's into the matrix, so clearly the analytic form has the wrong values. Let's fix that.

    First, let's declare this partial as sparse, since doing so can make a significant difference in performance.

    In our example with 5 points, the column indices of the nonzero elements are just [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]. The row indices repeat: [0, 0, 1, 1, 2, 2, 3, 3, 4, 4]. The declare_partials call thus becomes:

    cs = np.arange(2 * n, dtype=int)
    rs = np.repeat(np.arange(n, dtype=int), 2)
    self.declare_partials(of='theta', wrt='a', rows=rs, cols=cs)
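
    As a quick sanity check (an illustration added here, not from the original answer), these index arrays evaluate to exactly the pattern described above:

```python
import numpy as np

n = 5
cs = np.arange(2 * n, dtype=int)                 # column of each nonzero
rs = np.repeat(np.arange(n, dtype=int), 2)       # row of each nonzero, repeated in pairs

print(cs)  # [0 1 2 3 4 5 6 7 8 9]
print(rs)  # [0 0 1 1 2 2 3 3 4 4]
```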
    

    Now, in compute_partials, we need to put the partials with respect to y on the even-indexed elements of the sparse value array, and those with respect to x on the odd-indexed elements (based on the way I ordered y, x in a).

    def compute_partials(self, inputs, partials):
        a = inputs['a']
        y = a[:, 0]
        x = a[:, 1]

        # The partials wrt y go on the even-index elements of the sparse partial.
        partials['theta', 'a'][::2] = x / (x**2 + y**2)

        # The partials wrt x go on the odd-index elements of the sparse partial.
        partials['theta', 'a'][1::2] = -y / (x**2 + y**2)
    

    With those changes implemented, the result of check_partials shows

    ----------------------------
    Component: Atan2Comp 'atan2'
    ----------------------------
      atan2: 'theta' wrt 'a'
        Forward Magnitude : 3.326988e+00
             Fd Magnitude : 3.326985e+00 (fd:forward)
        Absolute Error (Jfor  - Jfd) : 3.489868e-06 *
    
        Relative Error (Jfor  - Jfd) : 1.048958e-06 *
    
        Raw Forward Derivative (Jfor)
    [[ 1.55572771 -1.03761371  0.          0.          0.          0.          0.          0.          0.          0.        ]
     [ 0.          0.          1.05925649 -0.05293248  0.          0.          0.          0.          0.          0.        ]
     [ 0.          0.          0.          0.          1.71871308 -1.07781134  0.          0.          0.          0.        ]
     [ 0.          0.          0.          0.          0.          0.          0.99912434 -0.1373792   0.          0.        ]
     [ 0.          0.          0.          0.          0.          0.          0.          0.          1.13627733 -0.15229154]]
    
        Raw FD Derivative (Jfd)
    [[ 1.5557261  -1.03761209  0.          0.          0.          0.          0.          0.          0.          0.        ]
     [ 0.          0.          1.05925643 -0.05293242  0.          0.          0.          0.          0.          0.        ]
     [ 0.          0.          0.          0.          1.71871122 -1.07780948  0.          0.          0.          0.        ]
     [ 0.          0.          0.          0.          0.          0.          0.9991242  -0.13737906  0.          0.        ]
     [ 0.          0.          0.          0.          0.          0.          0.          0.          1.13627715 -0.15229136]]
    

    Just note that if the "point" dimension of 'a' is not the first axis (for instance, if 'a' has shape (2, n) instead of (n, 2)), you're going to have a different sparsity pattern, but the same process applies.