
Syntax for `apply` pandas function in ruby


I need to convert a Python script to Ruby. For that I use the Pandas and Numpy gems, which make the work quite simple.

For example, I have lines like these:

# python
# DF is a dataframe from Pandas

DF['VAL'].ewm(span = vDAY).mean()
DF['VOLAT'].rolling(vDAY).std()

So, no questions asked, I convert them like this:

# ruby
df['VAL'].ewm(span: vDAY).mean
df['VOLAT'].rolling(vDAY).std

Easy.


But I also use the Pandas `apply` function, which takes a function as its first argument, and I really don't know how to convert that to Ruby. It looks something like this:

# python
import numpy as np

DF['VAL'].rolling(vDAY).apply(lambda x: np.polyfit(range(len(x)), x, 1)[0])
# output=> NaN or Float
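
For reference, the value this lambda produces for each window is the slope of a degree-1 least-squares fit of the window's values against their positions 0, 1, 2, and so on. A minimal plain-Ruby sketch of that computation, without Numpy, might look like the following. The helper name `window_slope` is made up for illustration and is not part of any gem.

```ruby
# Hypothetical helper: least-squares slope of ys against positions 0, 1, 2, ...
# Intended to mirror np.polyfit(range(len(x)), x, 1)[0] for one window.
def window_slope(ys)
  n = ys.size
  t_mean = (n - 1) / 2.0          # mean of the positions 0..n-1
  y_mean = ys.sum / n.to_f        # mean of the window values
  # Covariance-style numerator and variance-style denominator of the slope formula
  num = ys.each_with_index.sum { |y, t| (t - t_mean) * (y - y_mean) }
  den = (0...n).sum { |t| (t - t_mean)**2 }
  num / den
end

puts window_slope([1.0, 2.0, 3.0])  # prints 1.0
```
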

I tried to pull the lambda out like this:

# ruby
polyfit = ->(x) { t = Numpy.polyfit((0...x.size).to_a, x, 1); t[0] }

puts polyfit.call(<insert Array argument>) 
#=> I have a satisfying output for my lambda

# but...
df['VAL'].rolling(vDAY).apply(&polyfit)
# output=> `apply': <class 'TypeError'>: must be real number, not NoneType (PyCall::PyError)

# or
df['VAL'].rolling(vDAY).apply{ |x| polyfit.call(x) }
# output=> `apply': <class 'TypeError'>: apply() missing 1 required positional argument: 'func' (PyCall::PyError)

# or
df['VAL'].rolling(vDAY).apply(polyfit)
# output=> `apply': <class 'TypeError'>: must be real number, not NoneType (PyCall::PyError)

# or
df['VAL'].rolling(vDAY).apply(:polyfit)
# output=> `apply': <class 'TypeError'>: 'str' object is not callable (PyCall::PyError)

None of these work. The problem is the "x" argument in the Python inline syntax: I really don't know how to express it "the Ruby way".

If someone can "translate" this apply call from Python syntax to Ruby, it would be really nice :)

I should point out that I'm a Ruby/Rails developer and have no professional experience with Python.


UPDATE:

OK, this was a complete misunderstanding of the Python code on my part: `apply` needs its function argument to be a callable object. So in Ruby it's not a lambda I need but a Proc.

So here is the solution for anyone who runs into the same problem:

# ruby
polyfit = Proc.new { |x| t = Numpy.polyfit((0...x.size).to_a, x, 1); t[0] }
df['VAL'].rolling(vDAY).apply(polyfit)

Solution

  • The solution is to use a Proc (see the "UPDATE" section in the original question)

    # ruby
    polyfit = Proc.new { |x| t = Numpy.polyfit((0...x.size).to_a, x, 1); t[0] }
    df['VAL'].rolling(vDAY).apply(polyfit)
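
One difference between the two that may be relevant here: a Ruby lambda enforces its argument count, while a Proc silently drops extra arguments. The sketch below shows only that pure-Ruby behavior. Whether PyCall actually passes an extra argument when Pandas invokes the callback is an assumption, not something confirmed above, and the helper name `call_with_extra_arg` is made up for illustration.

```ruby
# Hypothetical helper: calls `callable` with one more argument than it declares,
# and reports :argument_error if the callable rejects the extra argument.
def call_with_extra_arg(callable)
  callable.call(21, :unexpected)
rescue ArgumentError
  :argument_error
end

strict  = ->(x) { x * 2 }          # lambda: argument count is checked
lenient = Proc.new { |x| x * 2 }   # proc: extra arguments are ignored

p call_with_extra_arg(lenient)  # prints 42
p call_with_extra_arg(strict)   # prints :argument_error
```

Both objects are instances of `Proc`; `strict.lambda?` returns true and `lenient.lambda?` returns false, which is a quick way to check which kind you are passing.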