In Stata, mkspline automatically creates variables containing a linear spline, given a series of knot point values:
mkspline knot1 30 knot2 40 knot3 50 knot4 = v1
Here is the result of running this on a series of values in Stata. It essentially distributes each value across the segments between the knots. Sorry, I don't know the technical mathematical or statistical term for this, just the overall concept.
v1 knot1 knot2 knot3 knot4
10 10 0 0 0
20 20 0 0 0
30 30 0 0 0
40 30 10 0 0
50 30 10 10 0
60 30 10 10 10
70 30 10 10 20
80 30 10 10 30
90 30 10 10 40
100 30 10 10 50
Is there an equivalent to this in Python with Numpy or Pandas or similar?
I don't think there is a built-in function for that in NumPy or Pandas.
You can try it with NumPy broadcasting:
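import numpy as np  # df is assumed to be a pandas DataFrame with a numeric column v1
thresh = [0, 30, 40, 50]  # lower bound 0, then the knots
# excess of each value over each threshold, floored at 0
diffs = np.maximum(df[['v1']].to_numpy() - thresh, 0)
# cap every segment but the last at its width (30, 10, 10)
diffs[:, :-1] = np.minimum(diffs[:, :-1], np.diff(thresh))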
Output:
array([[10,  0,  0,  0],
       [20,  0,  0,  0],
       [30,  0,  0,  0],
       [30, 10,  0,  0],
       [30, 10, 10,  0],
       [30, 10, 10, 10],
       [30, 10, 10, 20],
       [30, 10, 10, 30],
       [30, 10, 10, 40],
       [30, 10, 10, 50]])
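If you want the result as named columns, the same idea can be wrapped in a small helper. This is only a sketch: `linear_spline` and its parameters are names made up here, not part of NumPy or Pandas. It assumes the knots are sorted in increasing order and that the first variable is min(v, first knot), the middle ones are clipped segments, and the last is the excess over the final knot, which is my understanding of what mkspline does. One difference from the snippet above: it does not floor the first variable at 0, so negative inputs stay negative.

```python
import numpy as np
import pandas as pd


def linear_spline(values, knots, names):
    """Hypothetical helper: split values into linear spline segments.

    Assumes knots are sorted ascending and len(names) == len(knots) + 1.
    """
    x = np.asarray(values, dtype=float)
    knots = np.asarray(knots, dtype=float)
    if len(names) != len(knots) + 1:
        raise ValueError("need exactly one more name than knots")

    cols = {}
    # First segment: everything up to the first knot (no lower bound).
    cols[names[0]] = np.minimum(x, knots[0])
    # Middle segments: the part of x that falls between consecutive knots.
    for i in range(1, len(knots)):
        cols[names[i]] = np.clip(x, knots[i - 1], knots[i]) - knots[i - 1]
    # Last segment: whatever exceeds the final knot.
    cols[names[-1]] = np.maximum(x, knots[-1]) - knots[-1]
    return pd.DataFrame(cols)


# Usage with the data from the question
df = pd.DataFrame({"v1": range(10, 101, 10)})
spline = linear_spline(df["v1"], [30, 40, 50],
                       ["knot1", "knot2", "knot3", "knot4"])
print(pd.concat([df, spline], axis=1))
```

For the values 10 through 100 this gives the same numbers as the Stata table in the question, as floats rather than integers.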