I have a pandas Series, s1, and I want to create a new Series, s2, by applying a function that takes two inputs and returns one value. The function would be applied to a two-value window on s1, so the resulting Series, s2, should have one fewer value than s1. There are many ways to accomplish this, but I'm looking for a very efficient one. This is on Linux; I'm currently running Python 2.7 and 3.4 with pandas 0.15.2, though I can update pandas if necessary. Here's a simplification of my problem. My series consists of musical pitches represented as strings.
import pandas
s1 = pandas.Series(['C4', 'E-4', 'G4', 'A-4'])
I'd like to use this function:
import music21

def interval_func(event1, event2):
    # Build music21 notes from the pitch strings and name the interval between them
    ev1 = music21.note.Note(event1)
    ev2 = music21.note.Note(event2)
    intrvl = music21.interval.Interval(ev1, ev2)
    return intrvl.name
Applied to s1 and a shifted version of s1, it should produce the following series:
s2 = pandas.Series(['m3', 'M3', 'm2'])
In response to your edit, we could try to use a similar .rolling method, but pandas does not currently support non-numeric types in rolling windows.
So, we can use a list comprehension:
[music21.interval.Interval(music21.note.Note(s1[i]),
                           music21.note.Note(s1[i + 1])).name
 for i in range(len(s1) - 1)]
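As a variation on the comprehension, the adjacent pairs can be produced with zip rather than by index. The sketch below is only an illustration: `pairwise_apply`, `to_midi`, and `semitones` are made-up names, and `semitones` is a simplified stand-in for interval_func (it returns a signed semitone count rather than a music21 interval name) so that the example runs without music21.

```python
# Hypothetical stand-in for interval_func: semitone distance between two
# pitch strings. The parser only handles simple names such as 'C4' or 'E-4'
# ('-' marks a flat and '#' a sharp, following music21 pitch names).
_STEPS = {'C': 0, 'D': 2, 'E': 4, 'F': 5, 'G': 7, 'A': 9, 'B': 11}

def to_midi(pitch):
    step = _STEPS[pitch[0]]
    accidentals = pitch[1:-1]   # everything between the letter and the octave
    octave = int(pitch[-1])     # assumes a single-digit octave
    alter = accidentals.count('#') - accidentals.count('-')
    return 12 * (octave + 1) + step + alter

def semitones(event1, event2):
    return to_midi(event2) - to_midi(event1)

def pairwise_apply(values, func):
    # Copy to a list so any iterable of values is accepted
    values = list(values)
    # zip stops at the shorter input, giving len(values) - 1 adjacent pairs
    return [func(a, b) for a, b in zip(values, values[1:])]

pitches = ['C4', 'E-4', 'G4', 'A-4']
result = pairwise_apply(pitches, semitones)
print(result)  # [3, 4, 1]
```

Because `pairwise_apply` copies its input into a list first, it should also accept a pandas Series, for example `pandas.Series(pairwise_apply(s1, interval_func))`.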
Alternatively, build a DataFrame from the series and a shifted copy, then use apply:
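import music21
import pandas as pd

s1 = pd.Series(['C4', 'E-4', 'G4', 'A-4'])
# Column 0 holds each pitch, column 1 the pitch that follows it
df = pd.DataFrame({0: s1, 1: s1.shift(-1)})

def myfunc(x):
    # The last row has no following pitch, so it falls through and returns None
    if not any([pd.isnull(x[0]), pd.isnull(x[1])]):
        return music21.interval.Interval(music21.note.Note(x[0]),
                                         music21.note.Note(x[1])).name

# Drop the empty last row so the result has one fewer value than s1
s2 = df.apply(myfunc, axis=1).dropna()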
N.B. I would be surprised if the apply is any faster than the comprehension.
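If the same pitch pairs recur often in the data (an assumption about the input, not something measured here), caching the result per pair might save more time than switching between apply and a comprehension, since each distinct pair is then computed only once. A minimal sketch using a plain dict so it runs on both Python 2.7 and 3; `memoize_pairs` and `slow_pair_label` are hypothetical names, and `slow_pair_label` is only a placeholder for interval_func:

```python
def memoize_pairs(func):
    # Cache results keyed by the (first, second) argument pair
    cache = {}

    def wrapper(a, b):
        key = (a, b)
        if key not in cache:
            cache[key] = func(a, b)
        return cache[key]

    return wrapper

calls = []

def slow_pair_label(a, b):
    # Hypothetical placeholder for interval_func; records each real call
    calls.append((a, b))
    return a + '>' + b

cached_label = memoize_pairs(slow_pair_label)

melody = ['C4', 'E-4', 'C4', 'E-4', 'C4']
labels = [cached_label(a, b) for a, b in zip(melody, melody[1:])]

print(labels)      # four labels, one per adjacent pair
print(len(calls))  # only the two distinct pairs reached the wrapped function
```

Whether this helps in practice depends on how many distinct pairs the series contains, so it is worth timing against the uncached version on real data.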