Working with pandas dataframe suppose i have data frame with similar structure as following:
import pandas as pd
a_choise = ["True", "False", "False", "False", "True", "False", "False", "True", "True"]
b_choise = ["True", "True", "False", "False", "False", "False", "True", "True", "True"]
c_choise = ["False", "False", "True", "False", "True", "True", "False", "True", "False"]
a_n = ["a1", "a2", "a3", "a4", "a5", "a6", "a7", "a8", "a9"]
b_n = ["b1", "b2", "b3", "b4", "b5", "b6", "b7", "b8", "b9"]
c_n = ["c1", "c2", "c3", "c4", "c5", "c6", "c7", "c8", "c9"]
df = pd.DataFrame(
{"a": list(range(1, 10)), "b": list(range(11, 20)), "c": range(21, 30),
"a_Cho":a_choise, "b_Cho":b_choise, "c_Cho":c_choise,
"a_n":a_n, "b_n":b_n, "c_n":c_n}
)
a b c a_Cho b_Cho c_Cho a_n b_n c_n
0 1 11 21 True True False a1 b1 c1
1 2 12 22 False True False a2 b2 c2
2 3 13 23 False False True a3 b3 c3
3 4 14 24 False False False a4 b4 c4
4 5 15 25 True False True a5 b5 c5
5 6 16 26 False False True a6 b6 c6
6 7 17 27 False True False a7 b7 c7
7 8 18 28 True True True a8 b8 c8
8 9 19 29 True True False a9 b9 c9
I want a new 2 columns (Choise, Value) that meet the follwoing conditions for all values in "a_Cho", "b_Cho", And "c_Cho"
Thank you for updating your question. Following your edit, I believe .loc
will be useful to you. .loc
allows us to perform "logical indexing", grabbing rows based on column equalities.
For example, the following gets all rows where column a_Cho
is equal to "True"
,
>>> df.loc[df.a_Cho.eq('True'), ['a_n', 'a']]
a b c a_Cho b_Cho c_Cho a_n b_n c_n
0 1 11 21 True True False a1 b1 c1
4 5 15 25 True False True a5 b5 c5
7 8 18 28 True True True a8 b8 c8
8 9 19 29 True True False a9 b9 c9
We can also select a subset of columns using .loc
.
>>> df.loc[df.a_Cho.eq("True"), ["a_n", "a"]].rename(columns={"a_n": "choise", "a": "value"})
where I've used .rename()
to rename the columns.
Using the above approach, we can perform logical indexing for each of the conditions you state then concatenate the results.
def new_col_names(x):
return {x + "_n": "choise", x: "value"}
# logical criteria
only_a = df.a_Cho.eq("True")
only_b = df.a_Cho.eq("False") & df.b_Cho.eq("True")
only_c = df.a_Cho.eq("False") & df.b_Cho.eq("False") & df.c_Cho.eq("True")
invalid = df.a_Cho.eq("False") & df.b_Cho.eq("False") & df.c_Cho.eq("False")
df_a = df.loc[only_a, ["a_n", "a"]].rename(columns=new_col_names("a"))
df_b = df.loc[only_b, ["b_n", "b"]].rename(columns=new_col_names("b"))
df_c = df.loc[only_c, ["c_n", "c"]].rename(columns=new_col_names("c"))
df_inv = df.loc[invalid].assign(choise="invalide", value="invalide").copy()
df_inv = df_inv[["choise", "value"]]
df_new = pd.concat([df_a, df_b, df_c, df_inv])
The final DataFrame
looks like:
choise value
0 a1 1
4 a5 5
7 a8 8
8 a9 9
1 b2 12
6 b7 17
2 c3 23
5 c6 26
3 invalide invalide
Note that the index values along the left-hand side show the original row numbers of each entry. If you don't care about these numbers you can pass the ignore_index=True
option to .concat
.
Is this closer to what you wanted?
Hello and welcome to StackOverflow! I'm not sure if I completely understand your question. For example, in your sample code it does not appear that you use the loop variable x
in each iteration. It may help to see an example DataFrame
that has the same structure as the one you are working with.
My impression is that your question may be similar to this question which uses pd.melt
.
Is your DataFrame
structured like the following?
>>> import pandas as pd
>>> df = pd.DataFrame(
{"A": list(range(1, 10)), "B": list(range(11, 20)), "C": range(21, 30)}
)
>>> df.head()
A B C
0 1 11 21
1 2 12 22
2 3 13 23
3 4 14 24
4 5 15 25
5 6 16 26
6 7 17 27
7 8 18 28
8 9 19 29
If so, you can use pd.melt
to restructure it to have two columns, a "column names" column and a "column values" column, as you describe in your question.
The command and output for the example DataFrame
above would be:
>>> pd.melt(df, value_vars=['A', 'B', 'C'])
variable value
0 A 1
1 A 2
2 A 3
3 A 4
4 A 5
5 A 6
6 A 7
7 A 8
8 A 9
9 B 11
10 B 12
11 B 13
12 B 14
13 B 15
14 B 16
15 B 17
16 B 18
17 B 19
18 C 21
19 C 22
20 C 23
21 C 24
22 C 25
23 C 26
24 C 27
25 C 28
26 C 29
Is this similar to what you are asking? If not, could you provide an example of the DataFrame
you are working with and an example of what you want the final result to look like? It could be a simplified or "mock" example.