Tags: macros, julia, metaprogramming

Symbol to gradient with @generated macro in Julia


For performance reasons, I need gradients and Hessians that run as fast as user-defined functions (the ForwardDiff library, for example, makes my code significantly slower). I then tried metaprogramming with the @generated macro, testing it on a simple function:

using Calculus
hand_defined_derivative(x) = 2x - sin(x)

symbolic_primal = :( x^2 + cos(x) )
symbolic_derivative = differentiate(symbolic_primal,:x)
@generated functional_derivative(x) = symbolic_derivative

This gave me exactly what I wanted:

rand_x = rand(10000);
exact_values = hand_defined_derivative.(rand_x)
test_values = functional_derivative.(rand_x)

isequal(exact_values,test_values)        # >> true

using BenchmarkTools
@btime hand_defined_derivative.(rand_x); # >> 73.358 μs (5 allocations: 78.27 KiB)
@btime functional_derivative.(rand_x);   # >> 73.456 μs (5 allocations: 78.27 KiB)
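
For reference, here is roughly what my ForwardDiff version looks like (a minimal sketch; ForwardDiff.derivative is the library's entry point for scalar derivatives):

using ForwardDiff
f(x) = x^2 + cos(x)
forwarddiff_derivative(x) = ForwardDiff.derivative(f, x)   # noticeably slower than the versions above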

I now need to generalize this to functions with more arguments. The obvious extrapolation is:

symbolic_primal = :( x^2 + cos(x) + y^2  )
symbolic_gradient = differentiate(symbolic_primal,[:x,:y])

The symbolic_gradient behaves as expected (just as in the 1-dimensional case), but the @generated macro does not handle multiple arguments the way I expected it to:

@generated functional_gradient(x,y) = symbolic_gradient
functional_gradient(1.0,1.0)

>> 2-element Array{Any,1}:
    :(2 * 1 * x ^ (2 - 1) + 1 * -(sin(x)))
    :(2 * 1 * y ^ (2 - 1))

That is, it returns the symbolic expressions themselves instead of compiling them into a function. Is there an easy way to solve this?

P.S.: I know I could define the derivative with respect to each argument as a one-dimensional function and bundle them together into a gradient (that's what I'm currently doing, roughly as sketched below), but I'm sure there must be a better way.
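
For concreteness, a minimal sketch of that workaround (the names dfdx, dfdy, and bundled_gradient are made up for illustration; the pattern is the same as my 1D case above):

dfdx_expr = differentiate(symbolic_primal, :x)
dfdy_expr = differentiate(symbolic_primal, :y)
@generated dfdx(x, y) = dfdx_expr        # one generated function per component
@generated dfdy(x, y) = dfdy_expr
bundled_gradient(x, y) = (dfdx(x, y), dfdy(x, y))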


Solution

  • First, I think you don't need to use @generated here: this is a "simple" case of code generation, where I'd argue using @eval is simpler and less surprising.

    So the 1D case could be rewritten like this:

    julia> using Calculus
    
    julia> symbolic_primal = :( x^2 + cos(x) )
    :(x ^ 2 + cos(x))
    
    julia> symbolic_derivative = differentiate(symbolic_primal,:x)
    :(2 * 1 * x ^ (2 - 1) + 1 * -(sin(x)))
    
    julia> hand_defined_derivative(x) = 2x - sin(x)
    hand_defined_derivative (generic function with 1 method)
    
    # Let's check first what code we'll be evaluating
    # (`quote` returns the unevaluated expression passed to it)
    julia> quote
               functional_derivative(x) = $symbolic_derivative
           end
    quote
        functional_derivative(x) = begin
                2 * 1 * x ^ (2 - 1) + 1 * -(sin(x))
            end
    end
    
    # Looks OK => let's evaluate it now
    # (since `@eval` is a macro, its argument will be left unevaluated
    #  => no `quote` here)
    julia> @eval begin
               functional_derivative(x) = $symbolic_derivative
           end
    functional_derivative (generic function with 1 method)
    
    julia> rand_x = rand(10000);
    julia> exact_values = hand_defined_derivative.(rand_x);
    julia> test_values = functional_derivative.(rand_x);
    
    julia> @assert isequal(exact_values,test_values)
    
    # Don't forget to interpolate array arguments when using `BenchmarkTools`
    julia> using BenchmarkTools
    julia> @btime hand_defined_derivative.($rand_x);
      104.259 μs (2 allocations: 78.20 KiB)
    
    julia> @btime functional_derivative.($rand_x);
      104.537 μs (2 allocations: 78.20 KiB)
    

    Now, the 2D case does not work because the output of differentiate is an array of expressions (one expression per component), which you need to transform into a single expression that builds an array (or, for performance, a Tuple) of the components. This is symbolic_gradient_expr in the example below:

    julia> symbolic_primal = :( x^2 + cos(x) + y^2  )
    :(x ^ 2 + cos(x) + y ^ 2)
    
    julia> hand_defined_gradient(x, y) = (2x - sin(x), 2y)
    hand_defined_gradient (generic function with 1 method)
    
    # This is a vector of expressions
    julia> symbolic_gradient = differentiate(symbolic_primal,[:x,:y])
    2-element Array{Any,1}:
     :(2 * 1 * x ^ (2 - 1) + 1 * -(sin(x)))
     :(2 * 1 * y ^ (2 - 1))
    
    # Wrap expressions for all components of the gradient into a single expression
    # generating a tuple of them:
    julia> symbolic_gradient_expr = Expr(:tuple, symbolic_gradient...)
    :((2 * 1 * x ^ (2 - 1) + 1 * -(sin(x)), 2 * 1 * y ^ (2 - 1)))
    
    julia> @eval functional_gradient(x, y) = $symbolic_gradient_expr
    functional_gradient (generic function with 1 method)
    

    Like in the 1D case, this performs identically to the hand-written version:

    julia> rand_x = rand(10000); rand_y = rand(10000);
    julia> exact_values = hand_defined_gradient.(rand_x, rand_y);
    julia> test_values = functional_gradient.(rand_x, rand_y);
    
    julia> @assert isequal(exact_values,test_values)
    
    julia> @btime hand_defined_gradient.($rand_x, $rand_y);
      113.182 μs (2 allocations: 156.33 KiB)
    
    julia> @btime functional_gradient.($rand_x, $rand_y);
      112.283 μs (2 allocations: 156.33 KiB)
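
    For completeness: the same tuple-wrapping also rescues the original @generated approach, because the generated body now returns a single expression instead of an array of expressions. A sketch, reusing symbolic_gradient_expr from above (note that relying on a non-const global inside a generated function is fragile, which is one more reason @eval is the less surprising tool here):

    julia> @generated functional_gradient_gen(x, y) = symbolic_gradient_expr
    functional_gradient_gen (generic function with 1 method)
    
    julia> functional_gradient_gen(1.0, 1.0) == functional_gradient(1.0, 1.0)
    true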