Tags: macros, julia, metaprogramming

Symbol to gradient with @generated macro in Julia


For performance reasons, I need gradients and Hessians that run as fast as user-defined functions (the ForwardDiff library, for example, makes my code significantly slower). I then tried metaprogramming with the @generated macro, testing it on a simple function:

using Calculus
hand_defined_derivative(x) = 2x - sin(x)

symbolic_primal = :( x^2 + cos(x) )
symbolic_derivative = differentiate(symbolic_primal,:x)
@generated functional_derivative(x) = symbolic_derivative

This gave me exactly what I wanted:

rand_x = rand(10000);
exact_values = hand_defined_derivative.(rand_x)
test_values = functional_derivative.(rand_x)

isequal(exact_values,test_values)        # >> true

using BenchmarkTools
@btime hand_defined_derivative.(rand_x); # >> 73.358 μs (5 allocations: 78.27 KiB)
@btime functional_derivative.(rand_x);   # >> 73.456 μs (5 allocations: 78.27 KiB)
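
For reference, here is roughly what my ForwardDiff version looks like (a minimal sketch; ForwardDiff.derivative is the library's entry point for scalar derivatives):

using ForwardDiff
f(x) = x^2 + cos(x)
forwarddiff_derivative(x) = ForwardDiff.derivative(f, x)   # noticeably slower than the versions above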

I now need to generalize this to functions with more arguments. The obvious extrapolation is:

symbolic_primal = :( x^2 + cos(x) + y^2  )
symbolic_gradient = differentiate(symbolic_primal,[:x,:y])

The symbolic_gradient behaves as expected (just as in the 1-dimensional case), but the @generated macro does not handle multiple arguments the way I expected it to:

@generated functional_gradient(x,y) = symbolic_gradient
functional_gradient(1.0,1.0)

>> 2-element Array{Any,1}:
    :(2 * 1 * x ^ (2 - 1) + 1 * -(sin(x)))
    :(2 * 1 * y ^ (2 - 1))

That is, it returns the symbolic expressions themselves instead of compiling them into a function. Is there an easy way to solve this?

P.S.: I know I could define the derivative with respect to each argument as a one-dimensional function and bundle them together into a gradient (that's what I'm currently doing, roughly as sketched below), but I'm sure there must be a better way.
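
For concreteness, a minimal sketch of that workaround (the names dfdx, dfdy, and bundled_gradient are made up for illustration; the pattern is the same as my 1D case above):

dfdx_expr = differentiate(symbolic_primal, :x)
dfdy_expr = differentiate(symbolic_primal, :y)
@generated dfdx(x, y) = dfdx_expr        # one generated function per component
@generated dfdy(x, y) = dfdy_expr
bundled_gradient(x, y) = (dfdx(x, y), dfdy(x, y))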


Solution

  • First, I think you don't need to use @generated here: this is a "simple" case of code generation, where I'd argue using @eval is simpler and less surprising.

    So the 1D case could be rewritten like this:

    julia> using Calculus
    
    julia> symbolic_primal = :( x^2 + cos(x) )
    :(x ^ 2 + cos(x))
    
    julia> symbolic_derivative = differentiate(symbolic_primal,:x)
    :(2 * 1 * x ^ (2 - 1) + 1 * -(sin(x)))
    
    julia> hand_defined_derivative(x) = 2x - sin(x)
    hand_defined_derivative (generic function with 1 method)
    
    # Let's check first what code we'll be evaluating
    # (`quote` returns the unevaluated expression passed to it)
    julia> quote
               functional_derivative(x) = $symbolic_derivative
           end
    quote
        functional_derivative(x) = begin
                2 * 1 * x ^ (2 - 1) + 1 * -(sin(x))
            end
    end
    
    # Looks OK => let's evaluate it now
    # (since `@eval` is a macro, its argument will be left unevaluated
    #  => no `quote` here)
    julia> @eval begin
               functional_derivative(x) = $symbolic_derivative
           end
    functional_derivative (generic function with 1 method)
    
    julia> rand_x = rand(10000);
    julia> exact_values = hand_defined_derivative.(rand_x);
    julia> test_values = functional_derivative.(rand_x);
    
    julia> @assert isequal(exact_values,test_values)
    
    # Don't forget to interpolate array arguments when using `BenchmarkTools`
    julia> using BenchmarkTools
    julia> @btime hand_defined_derivative.($rand_x);
      104.259 μs (2 allocations: 78.20 KiB)
    
    julia> @btime functional_derivative.($rand_x);
      104.537 μs (2 allocations: 78.20 KiB)
    

    Now, the 2D case does not work because the output of differentiate is an array of expressions (one expression per component), which you need to transform into a single expression that builds an array (or, for performance, a Tuple) of the components. This is symbolic_gradient_expr in the example below:

    julia> symbolic_primal = :( x^2 + cos(x) + y^2  )
    :(x ^ 2 + cos(x) + y ^ 2)
    
    julia> hand_defined_gradient(x, y) = (2x - sin(x), 2y)
    hand_defined_gradient (generic function with 1 method)
    
    # This is a vector of expressions
    julia> symbolic_gradient = differentiate(symbolic_primal,[:x,:y])
    2-element Array{Any,1}:
     :(2 * 1 * x ^ (2 - 1) + 1 * -(sin(x)))
     :(2 * 1 * y ^ (2 - 1))
    
    # Wrap expressions for all components of the gradient into a single expression
    # generating a tuple of them:
    julia> symbolic_gradient_expr = Expr(:tuple, symbolic_gradient...)
    :((2 * 1 * x ^ (2 - 1) + 1 * -(sin(x)), 2 * 1 * y ^ (2 - 1)))
    
    julia> @eval functional_gradient(x, y) = $symbolic_gradient_expr
    functional_gradient (generic function with 1 method)
    

    Like in the 1D case, this performs identically to the hand-written version:

    julia> rand_x = rand(10000); rand_y = rand(10000);
    julia> exact_values = hand_defined_gradient.(rand_x, rand_y);
    julia> test_values = functional_gradient.(rand_x, rand_y);
    
    julia> @assert isequal(exact_values,test_values)
    
    julia> @btime hand_defined_gradient.($rand_x, $rand_y);
      113.182 μs (2 allocations: 156.33 KiB)
    
    julia> @btime functional_gradient.($rand_x, $rand_y);
      112.283 μs (2 allocations: 156.33 KiB)
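
    For completeness: the same tuple-wrapping also rescues the original @generated approach, because the generated body now returns a single expression instead of an array of expressions. A sketch, reusing symbolic_gradient_expr from above (note that relying on a non-const global inside a generated function is fragile, which is one more reason @eval is the less surprising tool here):

    julia> @generated functional_gradient_gen(x, y) = symbolic_gradient_expr
    functional_gradient_gen (generic function with 1 method)
    
    julia> functional_gradient_gen(1.0, 1.0) == functional_gradient(1.0, 1.0)
    true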