With the below pd dataframe:
print(df)
A B C
X 1 2 3
Y 4 5 6
Z 7 8 9
I need to create a simple interaction network file, or SIF file, of the format:
node1 xx node2
node1 xx node2
node1 yy node2
.
.
.
Where each line is an interaction on a df: row label, value, column label. Below is an iterative (and naive) approach to writing such a file:
with open ('interaction.sif', 'w') as sif:
for row in df.index:
for col in df.columns:
sif.write('{}\t{}\t{}'.format(row, df[col][row], col))
The inefficient code above offers the ideal sif file for the dataframe df
:
X 1 A
X 2 B
X 3 C
Y 4 A
Y 5 B
Y 6 C
Z 7 A
Z 8 B
Z 9 C
Is there a dataframe method to write to a csv or table, for example, in the format above? Or is there a way to vectorize this operation?
You need stack
with reset_index
:
df = df.stack().reset_index()
df.columns = list('ABC')
df = df[['A','C','B']]
print (df)
A C B
0 X 1 A
1 X 2 B
2 X 3 C
3 Y 4 A
4 Y 5 B
5 Y 6 C
6 Z 7 A
7 Z 8 B
8 Z 9 C
And then DataFrame.to_csv
:
print (df.to_csv(sep='\t', index=None, header=None))
X 1 A
X 2 B
X 3 C
Y 4 A
Y 5 B
Y 6 C
Z 7 A
Z 8 B
Z 9 C
df.to_csv('interaction.sif', sep='\t', index=None, header=None)