I have a cross-tabulation .txt file of this shape:
A | B | C | D | |
---|---|---|---|---|
A | 9.7 | |||
B | 9.7 | |||
C | 9.7 | 9.7 | ||
D | 9.7 | 9.7 |
However, when I try to read it using
df = pd.read_csv(file, sep='\t', index_col=0)
(where file is the location of this .txt file), I get this data frame:
Unnamed: 0 | A | B | C | D | |
---|---|---|---|---|---|
A | 9.7 | NaN | |||
B | 9.7 | NaN | |||
C | 9.7 | 9.7 | NaN | ||
D | 9.7 | 9.7 | NaN |
As you can see, it shifts the column labels to the right by adding a new label ("Unnamed: 0").
At this point, I'd only need to do three things:
As I'm an absolute beginner in Python, I can't figure out how to do this, although it's probably a very simple task.
For more information on how my .txt file is built (probably the origin of my problem), here is the Python code I'm using to create the cross-tab shown above (the first table pasted):
file_path is the path where the cross-tab is created.
unique_nodes is a list where I store the unique nodes I work with. They are the column and row names of the cross-tab: A, B, C, D.
node1 and node2 are nodes from that list.
node_weights are the values of the cells in the table (in this case, 9.7).
with open(file_path, 'w') as file:
    # Write header row
    file.write('\t' + '\t'.join(unique_nodes) + '\n')
    # Write rows of cross-tab data
    for node1 in unique_nodes:
        file.write(node1 + '\t')
        for node2 in unique_nodes:
            file.write(str(node_weights[node1][node2]) + '\t')
        file.write('\n')
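The tab written after each row's last value gives every data row one more field than the header, which is why the labels end up shifted. You can just shift the column names to the left by 1 and drop the last column.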
df.columns = df.columns.to_list()[1:] + ['temp']
df = df.drop('temp', axis = 1)
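As a rough end-to-end check, here is a minimal sketch that rebuilds a file like the one in the question in memory, applies the rename-and-drop fix, and then shows an alternative that avoids the trailing tab when writing. The weights are made up for illustration, and io.StringIO is only a stand-in for the real .txt file on disk.

```python
import io

import pandas as pd

# Hypothetical stand-ins for the question's data (the values are made up).
unique_nodes = ["A", "B", "C", "D"]
node_weights = {
    row: {col: float(10 * i + j) for j, col in enumerate(unique_nodes)}
    for i, row in enumerate(unique_nodes)
}

# 1) Write the table the way the question does: every data row ends in '\t'.
buf = io.StringIO()
buf.write("\t" + "\t".join(unique_nodes) + "\n")
for node1 in unique_nodes:
    buf.write(node1 + "\t")
    for node2 in unique_nodes:
        buf.write(str(node_weights[node1][node2]) + "\t")
    buf.write("\n")
buf.seek(0)

shifted = pd.read_csv(buf, sep="\t", index_col=0)

# 2) Shift the labels left by one, then drop the leftover last column.
fixed = shifted.copy()
fixed.columns = fixed.columns.to_list()[1:] + ["temp"]
fixed = fixed.drop("temp", axis=1)

# 3) Alternative: join each row's cells so no trailing tab is written at all.
clean = io.StringIO()
clean.write("\t" + "\t".join(unique_nodes) + "\n")
for node1 in unique_nodes:
    cells = [str(node_weights[node1][node2]) for node2 in unique_nodes]
    clean.write(node1 + "\t" + "\t".join(cells) + "\n")
clean.seek(0)

direct = pd.read_csv(clean, sep="\t", index_col=0)

print(fixed)
print(direct)
```

If you control the code that writes the file, the third step is probably the cleaner option, since the data frame then needs no repair after reading.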