I want to select the elements in each list that start with '6', but I haven't found a way to do it.
The lists are converted from a dataframe:
import pandas as pd

d = {'c1': ['64774', '60240', '60500', '19303', '38724', '11402'],
     'c2': ['', '95868', '95867', '60271', '60502', '19125'],
     'c3': ['', '', '', '', '95867', '60500']}
df = pd.DataFrame(data=d)
df
      c1     c2     c3
0  64774
1  60240  95868
2  60500  95867
3  19303  60271
4  38724  60502  95867
5  11402  19125  60500
list = df.values.tolist()
list = str(list)
list
[['64774', '', ''],
['60240', '95868', ''],
['60500', '95867', ''],
['19303', '60271', ''],
['38724', '60502', '95867'],
['11402', '19125', '60500']]
I tried code like this:
[x for x in list if x.startswith('6')]
However, it only returned '6' for the elements that meet the condition:
['6', '6', '6', '6', '6', '6', '6', '6', '6']
What I'm looking for is a group of lists like:
"[['64774'], ['60240'], ['60500'], ['60271'], ['60502'], ['60500']]"
When you do list = str(list), you're converting your list to its string representation, i.e. list becomes
"[['64774', '', ''], ['60240', '95868', ''], ['60500', '95867', ''], ['19303', '60271', ''], ['38724', '60502', '95867'], ['11402', '19125', '60500']]"
You then loop through the string with the list comprehension
[x for x in list if x.startswith('6')]
which yields each individual character of the string, so the filter just finds every occurrence of 6 in the string, hence your result of
['6', '6', '6', '6', '6', '6', '6', '6', '6']
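A minimal sketch of why this happens, using a shortened version of the data (the names `rows`, `as_text`, `first_chars` and `sixes` are mine, not from the question). Iterating over a str yields one-character strings, so `startswith('6')` can only match a character that is exactly '6':

```python
rows = [['64774', '', ''], ['60240', '95868', '']]
as_text = str(rows)  # "[['64774', '', ''], ['60240', '95868', '']]"

# Iterating over a string yields single characters, not the list elements.
first_chars = list(as_text)[:5]
print(first_chars)  # ['[', '[', "'", '6', '4']

# A one-character string starts with '6' only if it is '6' itself.
sixes = [ch for ch in as_text if ch.startswith('6')]
print(sixes)  # ['6', '6', '6']
```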
Sidenote: Don't use variable names that shadow built-ins, like list, dict and so on; it will almost certainly cause issues down the line.
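One possible fix, sketched under the assumption that you drop the str() call and keep the nested list returned by df.values.tolist(). The list is written out literally here so the snippet runs without pandas, and `rows` and `result` are names I chose to avoid shadowing list:

```python
# Stand-in for: rows = df.values.tolist()
rows = [
    ['64774', '', ''],
    ['60240', '95868', ''],
    ['60500', '95867', ''],
    ['19303', '60271', ''],
    ['38724', '60502', '95867'],
    ['11402', '19125', '60500'],
]

# Filter each inner list separately so the row grouping is preserved.
result = [[x for x in row if x.startswith('6')] for row in rows]
print(result)
# [['64774'], ['60240'], ['60500'], ['60271'], ['60502'], ['60500']]
```

Note that with other data a row with no match would produce an empty list, and a row with several matches would produce a list with more than one element.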
I'm not sure whether there is a specific reason to use a pandas dataframe for your question. If not, you could simply use a list comprehension:
d = {
    'c1': ['64774', '60240', '60500', '19303', '38724', '11402'],
    'c2': ['', '95868', '95867', '60271', '60502', '19125'],
    'c3': ['', '', '', '', '95867', '60500'],
}
d2 = [[x] for v in d.values() for x in v if x.startswith('6')]
# d2: [['64774'], ['60240'], ['60500'], ['60271'], ['60502'], ['60500']]
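The comprehension above walks the dictionary column by column and wraps every match in its own list. If the grouping by row matters, one possible variant is to transpose the columns with zip first. This is only a sketch, and `rows` and `by_row` are hypothetical names:

```python
d = {
    'c1': ['64774', '60240', '60500', '19303', '38724', '11402'],
    'c2': ['', '95868', '95867', '60271', '60502', '19125'],
    'c3': ['', '', '', '', '95867', '60500'],
}

# zip(*columns) turns the column lists into row tuples.
rows = list(zip(*d.values()))

# Keep, per row, only the values that start with '6'.
by_row = [[x for x in row if x.startswith('6')] for row in rows]
print(by_row)
# [['64774'], ['60240'], ['60500'], ['60271'], ['60502'], ['60500']]
```

For this data both approaches give the same result, because each row happens to contain exactly one matching value.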