I have a dataframe like this:
import pandas as pd
df = pd.DataFrame({'col1': ['abc', 'def', 'tre'],
'col2': ['foo', 'bar', 'stuff']})
  col1   col2
0  abc    foo
1  def    bar
2  tre  stuff
and a dictionary like this:
d = {'col1': [0, 2], 'col2': [1]}
The dictionary contains column names and indices of values to be extracted from the dataframe to generate strings like this:
abc (0, col1)
So each string starts with the element itself, followed in parentheses by the index and the column name.
I tried the following list comprehension:
l = [f"{df.loc[{indi}, {ci}]} ({indi}, {ci})"
for ci, vali in d.items()
for indi in vali]
which yields
[' col1\n0 abc (0, col1)',
' col1\n2 tre (2, col1)',
' col2\n1 bar (1, col2)']
So it is almost right; only the col1\n0 parts need to be avoided.
If I try
f"{df.loc[0, 'col1']} is great"
I get
'abc is great'
as desired, however, with
x = 0
f"{df.loc[{x}, 'col1']} is great"
I get
'0 abc\nName: col1, dtype: object is great'
How could this be fixed?
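What you are seeing is the string representation, ugly newline \n characters included, of the pd.Series object returned by the loc accessor. Inside the replacement field, {x} is a set literal, so loc receives a list-like indexer and returns a Series (or a DataFrame when both labels are wrapped) instead of a scalar.
Use pd.DataFrame.at to return scalars, and note that there is no need for nested {} around your index labels: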
L = [f'{df.at[indi, ci]} ({indi}, {ci})'
     for ci, vali in d.items()
     for indi in vali]
print(L)
['abc (0, col1)', 'tre (2, col1)', 'bar (1, col2)']
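To make the difference concrete, here is a minimal sketch that compares a scalar label with a list-like label and then builds the strings with at. The helper name describe_cells is hypothetical and only used for illustration. A list stands in for the set literal from the question, on the assumption that your pandas version may reject sets as indexers, as recent versions do.

```python
import pandas as pd

df = pd.DataFrame({'col1': ['abc', 'def', 'tre'],
                   'col2': ['foo', 'bar', 'stuff']})
d = {'col1': [0, 2], 'col2': [1]}

# A scalar row label plus a scalar column label returns the cell value itself.
scalar = df.loc[0, 'col1']

# A list-like row label returns a Series, even when it holds a single label.
# The list [0] stands in for the set literal {x} used in the question.
series = df.loc[[0], 'col1']

print(type(scalar).__name__)
print(type(series).__name__)


def describe_cells(frame, positions):
    """Hypothetical helper: format each selected cell as 'value (index, column)'."""
    return [f"{frame.at[row, col]} ({row}, {col})"
            for col, rows in positions.items()
            for row in rows]


labels = describe_cells(df, d)
print(labels)
```

Because at only accepts a single row label and a single column label, it fails loudly if a list-like indexer slips in, rather than silently formatting a whole Series into the string.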