Tags: c++, c++20, automatic-differentiation

automatic differentiation and getting the next representable floating point value


Getting the next representable floating point number of type T greater than a given T x can be achieved by calling std::nextafter(x, std::numeric_limits&lt;T&gt;::infinity()) or, assuming T = double, next_double_up(x), where

#include <bit>      // std::bit_cast (C++20)
#include <cmath>    // std::isinf
#include <cstdint>  // std::uint64_t

double next_double_up(double v)
{
    // +infinity has no successor.
    if (std::isinf(v) && v > 0)
        return v;
    // Map -0.0 to +0.0 so the increment below yields the smallest
    // positive subnormal instead of stepping into the negative range.
    if (v == -0.0)
        v = 0.0;
    std::uint64_t ui = std::bit_cast<std::uint64_t>(v);
    // IEEE-754 doubles of the same sign are ordered like their bit
    // patterns: increment the bits for v >= 0, decrement for v < 0.
    if (v >= 0)
        ++ui;
    else
        --ui;
    return std::bit_cast<double>(ui);
}

However, now I need to do this for T = CppAD::AD&lt;U&gt;, where U is a floating point type. I clearly cannot use std::nextafter or my next_double_up here. (And even if I could, the operations involved are clearly not differentiable.)

What can I do here? Would it be a good idea to simply add a "nice" positive epsilon? If so, how should I choose it? Maybe epsilon = std::numeric_limits<U>::epsilon()?

I probably cannot avoid ending up with a value larger than the true "next" representable value. What matters more, I suppose, is that the value I obtain is actually larger.


Solution

  • For CppAD::AD<U>, where U is a floating point type, you're correct that typical bitwise operations won't work and are not differentiable.

    Adding a "nice" positive epsilon is a viable workaround. Using std::numeric_limits<U>::epsilon() as epsilon could be a reasonable choice.

    Here's a simplified code snippet:

    #include <cppad/cppad.hpp>
    #include <limits>
    
    template<typename U>
    template<typename U>
    CppAD::AD<U> next_ad_up(CppAD::AD<U> x) {
        // Machine epsilon of the underlying type, scaled by the magnitude
        // of x so the step survives rounding even when |x| >= 2.
        CppAD::AD<U> epsilon = CppAD::numeric_limits<CppAD::AD<U>>::epsilon();
        return x + epsilon * (CppAD::fabs(x) + U(1));
    }
    

    This approach is differentiable and fulfills your requirement of getting a larger value, but two caveats apply. First, the step is based on machine epsilon, so it is much coarser than the "next representable" value in the sense of std::nextafter or next_double_up. Second, a fixed epsilon must be scaled by the magnitude of x: machine epsilon is the gap between 1 and the next double, so adding it unscaled to a value of magnitude 2 or more is absorbed by rounding and leaves x unchanged.

    Keep in mind:

    • If you're optimizing a function that is sensitive to very small changes, adding epsilon could introduce numerical instability.
    • The choice of epsilon could be problem-specific. You might need to tune it based on your specific use case.