I have a pandas Dataframe set out as follows. There are in fact 192 X,Y,Z triplet columns, this is just the first three.
shot V0e V0n V0d S0_Pe S0_Pn S0_Pd S0_Se S0_Sn S0_Sd
0 1001 457950.4 7331695.2 2.5 458004.5 7331794.1 2.2 457950.4 7331695.2 2.1
1 1002 457948.0 7331689.4 2.3 457999.5 7331782.5 2.3 457993.6 7331792.8 2.3
2 1003 457945.6 7331683.5 2.4 457999.5 7331782.5 2.4 457945.6 7331683.5 2.6
3 1004 457943.3 7331677.8 2.3 457995.4 7331770.8 2.3 457988.8 7331781.2 2.5
4 1005 457940.9 7331672.1 2.2 457995.4 7331770.8 2.6 457948.0 7331689.4 2.4
What I want to do is rearrange them to look as follows so I can animate them in a Plotly animation. To do this in need the data in Long format.
Shot Easting Northing Depth
1001 457950.4 7331695.2 2.5 #V0e,V0n,V0d
1001 458004.5 7331794.1 2.2 #S0_Pe,S0_Pn,S0_Pd
1001 457950.4 7331695.2 2.1 #S0_Se,S0_Sn,S0_Sd
1002 457948.0 7331689.4 2.3 #V0e,V0n,V0d
1002 457999.5 7331782.5 2.3 #S0_Pe,S0_Pn,S0_Pd
1002 457993.6 7331792.8 2.3 #S0_Se,S0_Sn,S0_Sd
1003 457945.6 7331683.5 2.4 #V0e,V0n,V0d
1003 457999.5 7331782.5 2.4 #S0_Pe,S0_Pn,S0_Pd
1003 457945.6 7331683.5 2.6 #S0_Se,S0_Sn,S0_Sd
1004 457943.3 7331677.8 2.3 #V0e,V0n,V0d
1004 457995.4 7331770.8 2.3 #S0_Pe,S0_Pn,S0_Pd
1004 457995.4 7331770.8 2.3 #S0_Se,S0_Sn,S0_Sd
1005 457943.3 7331677.8 2.3 #V0e,V0n,V0d
1005 457995.4 7331770.8 2.3 #S0_Pe,S0_Pn,S0_Pd
1005 457948.0 7331689.4 2.4 #S0_Se,S0_Sn,S0_Sd
This is also acceptable.
Shot Easting Northing Depth
1001 457950.4 7331695.2 2.5 #V0e,V0n,V0d
1002 457948.0 7331689.4 2.3 #V0e,V0n,V0d
1003 457945.6 7331683.5 2.4 #V0e,V0n,V0d
1004 457943.3 7331677.8 2.3 #V0e,V0n,V0d
1005 457943.3 7331677.8 2.3 #V0e,V0n,V0d
1001 458004.5 7331794.1 2.2 #S0_Pe,S0_Pn,S0_Pd
1002 457999.5 7331782.5 2.3 #S0_Pe,S0_Pn,S0_Pd
1003 457999.5 7331782.5 2.4 #S0_Pe,S0_Pn,S0_Pd
1004 457995.4 7331770.8 2.3 #S0_Pe,S0_Pn,S0_Pd
1005 457995.4 7331770.8 2.3 #S0_Pe,S0_Pn,S0_Pd
1001 457950.4 7331695.2 2.1 #S0_Se,S0_Sn,S0_Sd
1002 457993.6 7331792.8 2.3 #S0_Se,S0_Sn,S0_Sd
1003 457945.6 7331683.5 2.6 #S0_Se,S0_Sn,S0_Sd
1004 457995.4 7331770.8 2.3 #S0_Se,S0_Sn,S0_Sd
1005 457948.0 7331689.4 2.4 #S0_Se,S0_Sn,S0_Sd
I have looked at pandas.melt.
dfMELT=pd.melt(df,id_vars=['shot'],value_vars=["V0e","V0n","V0d","S0_Pe","S0_Pn","S0_Pd","S0_Se","S0_Sn","S0_Sd"])
but it doesn't rearrange as above, it only works with 3 columns at a time so my X is dealt with, then my Y then my Z. This clearly won't work for plotting coordinates in a Scatter plot.
shot variable value
0 1001 V0e 457950.4
1 1002 V0e 457948.0
2 1003 V0e 457945.6
3 1004 V0e 457943.3
....
shot variable value
779 1780 V0e 456009.1
780 1001 V0n 7331695.2
781 1002 V0n 7331689.4
You can use pivot_longer from pyjanitor to abstract the process. Your columns have a pattern (some end with e
, some with d
, some with n
). Let's pass a list of regular expressions that capture this pattern:
# pip install janitor
import janitor
import pandas as pd
df.pivot_longer(index='shot',
names_to = ("Easting", "Northing", "Depth"),
names_pattern = (r".+e$", r".+n$", r".+d$"))
shot Easting Northing Depth
0 1001 457950.4 7331695.2 2.5
1 1002 457948.0 7331689.4 2.3
2 1003 457945.6 7331683.5 2.4
3 1004 457943.3 7331677.8 2.3
4 1005 457940.9 7331672.1 2.2
5 1001 458004.5 7331794.1 2.2
6 1002 457999.5 7331782.5 2.3
7 1003 457999.5 7331782.5 2.4
8 1004 457995.4 7331770.8 2.3
9 1005 457995.4 7331770.8 2.6
10 1001 457950.4 7331695.2 2.1
11 1002 457993.6 7331792.8 2.3
12 1003 457945.6 7331683.5 2.6
13 1004 457988.8 7331781.2 2.5
14 1005 457948.0 7331689.4 2.4
And if you want it in the order of appearance:
df.pivot_longer(index='shot',
names_to = ("Easting", "Northing", "Depth"),
names_pattern = (r".+e$", r".+n$", r".+d$"),
sort_by_appearance = True)
shot Easting Northing Depth
0 1001 457950.4 7331695.2 2.5
1 1001 458004.5 7331794.1 2.2
2 1001 457950.4 7331695.2 2.1
3 1002 457948.0 7331689.4 2.3
4 1002 457999.5 7331782.5 2.3
5 1002 457993.6 7331792.8 2.3
6 1003 457945.6 7331683.5 2.4
7 1003 457999.5 7331782.5 2.4
8 1003 457945.6 7331683.5 2.6
9 1004 457943.3 7331677.8 2.3
10 1004 457995.4 7331770.8 2.3
11 1004 457988.8 7331781.2 2.5
12 1005 457940.9 7331672.1 2.2
13 1005 457995.4 7331770.8 2.6
14 1005 457948.0 7331689.4 2.4
You could also use the .value
approach - any values associated with .value
stays as a header:
(df.pivot_longer(index = 'shot',
names_to = ".value",
names_pattern = r".+(.)$")
.rename(columns = {"e" : "Easting",
"n" : "Northing",
"d" : "Depth"}
)
)
shot Easting Northing Depth
0 1001 457950.4 7331695.2 2.5
1 1002 457948.0 7331689.4 2.3
2 1003 457945.6 7331683.5 2.4
3 1004 457943.3 7331677.8 2.3
4 1005 457940.9 7331672.1 2.2
5 1001 458004.5 7331794.1 2.2
6 1002 457999.5 7331782.5 2.3
7 1003 457999.5 7331782.5 2.4
8 1004 457995.4 7331770.8 2.3
9 1005 457995.4 7331770.8 2.6
10 1001 457950.4 7331695.2 2.1
11 1002 457993.6 7331792.8 2.3
12 1003 457945.6 7331683.5 2.6
13 1004 457988.8 7331781.2 2.5
14 1005 457948.0 7331689.4 2.4
pivot_longer aims to make the reshaping process easier; however, you may not be interested in importing/installing another library - let's see one way we can resolve this using pandas' wide_to_long
:
First, lets rename the columns :
res = df.copy()
res = res.set_index('shot')
res = res.rename(columns = lambda col: f"Easting_{col[:-1]}"
if col.endswith("e")
else f"Northing_{col[:-1]}"
if col.endswith("n")
else f"Depth_{col[:-1]}")
Now we can reshape:
(pd.wide_to_long(res.reset_index(),
i = 'shot',
stubnames = ['Easting', 'Northing', 'Depth'],
j = 'wateva',
sep = "_",
suffix = ".+")
.droplevel('wateva')
.reset_index()
)
shot Easting Northing Depth
0 1001 457950.4 7331695.2 2.5
1 1002 457948.0 7331689.4 2.3
2 1003 457945.6 7331683.5 2.4
3 1004 457943.3 7331677.8 2.3
4 1005 457940.9 7331672.1 2.2
5 1001 458004.5 7331794.1 2.2
6 1002 457999.5 7331782.5 2.3
7 1003 457999.5 7331782.5 2.4
8 1004 457995.4 7331770.8 2.3
9 1005 457995.4 7331770.8 2.6
10 1001 457950.4 7331695.2 2.1
11 1002 457993.6 7331792.8 2.3
12 1003 457945.6 7331683.5 2.6
13 1004 457988.8 7331781.2 2.5
14 1005 457948.0 7331689.4 2.4