# What is the logic behind Nvidia's arctan2?

Nvidia describes some functions in its Cg 3.1 Toolkit Documentation.

There, `atan2` is implemented as follows:

```
float2 atan2(float2 y, float2 x)
{
  float2 t0, t1, t2, t3, t4;

  t3 = abs(x);
  t1 = abs(y);
  t0 = max(t3, t1);
  t1 = min(t3, t1);
  t3 = float(1) / t0;
  t3 = t1 * t3;

  t4 = t3 * t3;
  t0 =         - float(0.013480470);
  t0 = t0 * t4 + float(0.057477314);
  t0 = t0 * t4 - float(0.121239071);
  t0 = t0 * t4 + float(0.195635925);
  t0 = t0 * t4 - float(0.332994597);
  t0 = t0 * t4 + float(0.999995630);
  t3 = t0 * t3;

  t3 = (abs(y) > abs(x)) ? float(1.570796327) - t3 : t3;
  t3 = (x < 0) ? float(3.141592654) - t3 : t3;
  t3 = (y < 0) ? -t3 : t3;

  return t3;
}
```

What is the formula or logic behind this? I couldn't find any references in their libraries.

Solution

The code has three parts.

First, the input, which can be interpreted as a point (x,y), gets mapped to the sector below the diagonal in the first quadrant. The absolute values and max/min operations effectively act as reflections (or the identity), first on the coordinate axes and then on the diagonal.
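The reduction step can be sketched in Python as follows. The function name `reduce_to_sector` and the returned flags are illustrative assumptions, not part of the Cg code; the sketch only makes the three reflections explicit.

```python
def reduce_to_sector(x, y):
    """Map (x, y) into the sector 0 <= y <= x and record the reflections used.

    Hypothetical helper for illustration; the Cg code does the same with
    abs/max/min and re-derives the flags from the inputs at the end.
    """
    flip_x_axis = y < 0          # reflection on the x axis: y -> -y
    flip_y_axis = x < 0          # reflection on the y axis: x -> -x
    ax, ay = abs(x), abs(y)
    swap_diagonal = ay > ax      # reflection on the diagonal: swap x and y
    big, small = max(ax, ay), min(ax, ay)
    return big, small, (flip_x_axis, flip_y_axis, swap_diagonal)


print(reduce_to_sector(-1.0, 3.0))   # (3.0, 1.0, (False, True, True))
```

The reduced point `(big, small)` always satisfies `0 <= small <= big`, so the ratio `small/big` lies in `[0, 1]`, which is the only interval the polynomial has to cover.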

In the middle, an approximation of the arcus tangent (inverse tangent) of `r=y/x` is computed. Note that the Taylor series is `a = 1*r-1/3*r^3+1/5*r^5-1/7*r^7+...`

However, the Taylor series is far more accurate than needed near the origin and rapidly loses accuracy away from it. Using some fitting procedure, a polynomial whose error is roughly equal in size over the whole interval was determined. Its coefficients are close to the Taylor coefficients, especially in the lower degrees. The polynomial is evaluated most efficiently with the Horner scheme, where the coefficients are used starting from the highest degree.
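How Nvidia actually obtained the coefficients is not documented here. As a rough illustration of what such a fitting procedure can look like, the following sketch does an ordinary least-squares fit of the odd powers `r, r^3, ..., r^11` to `arctan` on `[0, 1]`, sampled at Chebyshev nodes. This is an assumption chosen for simplicity and not a true minimax fit, so the resulting coefficients will be similar to, but not identical with, those in the Cg code.

```python
import numpy as np

# Sample arctan on [0, 1] at Chebyshev nodes, which cluster near the ends
# of the interval and help keep the fitted error evenly spread.
n = 200
k = np.arange(n)
r = 0.5 * (1.0 - np.cos(np.pi * (k + 0.5) / n))

# Basis of odd powers r, r^3, ..., r^11, matching the shape of the Cg polynomial.
powers = np.arange(1, 12, 2)
A = r[:, None] ** powers[None, :]

# Ordinary least squares; an assumption for illustration, not a minimax fit.
coeffs, *_ = np.linalg.lstsq(A, np.arctan(r), rcond=None)

# Measure the maximum error of the fitted polynomial on a fine grid.
grid = np.linspace(0.0, 1.0, 2001)
fit = (grid[:, None] ** powers[None, :]) @ coeffs
max_err = np.max(np.abs(fit - np.arctan(grid)))

# Print fitted coefficients next to the Taylor coefficients (-1)^k / (2k+1).
for pw, c in zip(powers.tolist(), coeffs):
    taylor = (-1) ** (pw // 2) / pw
    print(f"r^{pw}: {c:+.9f}   (Taylor: {taylor:+.9f})")
print("max abs error on [0, 1]:", max_err)
```

The printed table lets the fitted coefficients be compared directly with the Taylor coefficients and with the constants in the Cg code.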

And finally, the signs and magnitudes of the input are used to undo the original reflections in reverse order, only that now the angle `a` gets transformed. So if there was a reflection on the diagonal, `a` gets mapped to `pi/2-a`. If there was a reflection on the `y` axis, `a` gets mapped to `pi-a`. And finally, in case of a reflection on the `x` axis, `a` gets changed to `-a`.
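To check the reflection logic in isolation, the sketch below applies the three corrections in the same order as the last lines of the Cg code. The names `undo_reflections` and `atan2_via_reduction` are made up for this example, and `math.atan` stands in for the polynomial so that any mismatch with `math.atan2` would come from the reflections alone.

```python
import math

def undo_reflections(a, x, y):
    """Turn the reduced angle a in [0, pi/4] back into the full atan2 angle.

    Mirrors the last three conditional lines of the Cg code, in the same order.
    """
    if abs(y) > abs(x):      # a reflection on the diagonal was applied
        a = math.pi / 2 - a
    if x < 0:                # a reflection on the y axis was applied
        a = math.pi - a
    if y < 0:                # a reflection on the x axis was applied
        a = -a
    return a

def atan2_via_reduction(y, x):
    # math.atan replaces the polynomial here (an assumption of this sketch),
    # so only the range reduction and its inverse are exercised.
    big, small = max(abs(x), abs(y)), min(abs(x), abs(y))
    return undo_reflections(math.atan(small / big), x, y)

for x, y in [(2.0, 1.0), (-2.0, 1.0), (-1.0, -2.0), (1.0, -2.0)]:
    print((x, y), atan2_via_reduction(y, x), math.atan2(y, x))
```

Note that the point `(0, 0)` is not handled: the ratio `small / big` is undefined there, both in this sketch and in the Cg code.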

With some recursive function calls, the procedure in question could also be compactly formulated as (here in Python)

```
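from math import pi

def atan2(y,x):
    if y<0: return -atan2(-y,x)        # undo reflection on the x axis
    if x<0: return pi-atan2(y,-x)      # undo reflection on the y axis
    if x<y: return 0.5*pi-atan2(x,y)   # undo reflection on the diagonal
    return p(y/x)                      # now 0 <= y <= x, so 0 <= y/x <= 1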
```

where the polynomial is evaluated as

```
def p(r):
    r2 = r*r
    res =        - 0.013480470 # *r^11
    res = res*r2 + 0.057477314 # *r^9
    res = res*r2 - 0.121239071 # *r^7
    res = res*r2 + 0.195635925 # *r^5
    res = res*r2 - 0.332994597 # *r^3
    res = res*r2 + 0.999995630 # *r^1
    return r*res
```

For comparison the Taylor series can be implemented as

```
def a11(r):
    r2 = r*r
    res = 0
    for k in range(11,0,-2):
        res = 1/k-r2*res
    return r*res
```

To compare the errors of both approximations in one plot, use a logarithmic vertical axis, because the Taylor error grows too fast for a linear one.

```
import numpy as np
import matplotlib.pyplot as plt

r = np.linspace(0,1,500)
plt.semilogy(r,abs(p(r)-np.arctan(r)), r, abs(a11(r)-np.arctan(r)))
plt.legend(["residual of p", "residual of Taylor"])
plt.grid(); plt.show()
```

This gives an error plot that shows the described error behavior of the minimaxed polynomial and the Taylor polynomial of equal degree.
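Where a plot is inconvenient, the same comparison can be reduced to two numbers. This is a minimal sketch using only the standard library, with `p` and `a11` repeated so that the block is self-contained; the grid of 1001 points is an arbitrary choice.

```python
import math

def p(r):
    # Fitted polynomial from the Cg code, evaluated with the Horner scheme.
    r2 = r * r
    res = -0.013480470
    for c in (0.057477314, -0.121239071, 0.195635925, -0.332994597, 0.999995630):
        res = res * r2 + c
    return r * res

def a11(r):
    # Degree-11 Taylor polynomial of arctan, also evaluated via Horner.
    r2 = r * r
    res = 0.0
    for k in range(11, 0, -2):
        res = 1.0 / k - r2 * res
    return r * res

# Largest absolute deviation from math.atan on a uniform grid over [0, 1].
grid = [i / 1000 for i in range(1001)]
err_p = max(abs(p(r) - math.atan(r)) for r in grid)
err_taylor = max(abs(a11(r) - math.atan(r)) for r in grid)

print("max error of p on [0, 1]:     ", err_p)
print("max error of Taylor on [0, 1]:", err_taylor)
```

The Taylor polynomial's largest error occurs at `r = 1`, the end of the interval farthest from the expansion point, while the fitted polynomial keeps its error small across the whole interval.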
