I am still very new to Python programming.
I have an array I am trying to break down into chunks. My array seems to have multiple arrays within it (I think).
The output looks something like this:
[array([None, '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0',
'0', '0', '0', '0', '0', '0', '0', '0', None, None, None],
dtype=object)
array([None, None, '0', '0', '0', '1', '0', '0', '0', '0', None, None,
None, None, None, None, None, None, None, None, None, None, None,
None], dtype=object)
array([None, None, '0', '0', '0', '0', '0', '0', None, None, None, None,
None, None, None, None, None, None, None, None, None, None, None,
None], dtype=object)
This is a snippet of the printed output. Is there any way to display this output as one array with 24 columns?
I created my array based on a dataframe with 24 columns. I wanted to populate those columns using a for loop. The loop runs, but it only populates the array, not the dataframe columns.
Here is some sample output from my dataframe. I have 24 "status" columns and a column named "AccountOpenedDate".
This is the output of one of the status columns:
0       1
1       0
2       P
3       0
4    None
Name: status6, dtype: object
The idea is to take the output of all 24 status columns and place it in new columns named "stat", which are also numbered 1 to 24, in reverse order: the output of status24 would populate stat1, status23 would populate stat2, and so on.
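The mapping described above can be sketched on toy data. This is only a minimal illustration under stated assumptions: it uses 4 status columns instead of 24, the names `n`, `status_cols`, and `stat_cols` are made up for the example, and it shows the plain reversal only, not the month-based shift attempted in the code further down.

```python
import pandas as pd

# Assumption: 4 status columns stand in for the real 24.
n = 4
df = pd.DataFrame({
    "status1": ["1", "0"],
    "status2": ["0", "P"],
    "status3": ["0", None],
    "status4": ["P", "0"],
})

# Hypothetical helper lists of column labels.
status_cols = [f"status{k}" for k in range(1, n + 1)]
stat_cols = [f"stat{k}" for k in range(1, n + 1)]

# Reverse the column order, then relabel:
# status4 -> stat1, status3 -> stat2, ..., status1 -> stat4.
reversed_values = df[status_cols[::-1]].to_numpy()
stat_df = pd.DataFrame(reversed_values, columns=stat_cols, index=df.index)

# Attach the new columns to the original frame.
df = df.join(stat_df)
print(df[stat_cols])
```

Building the new columns as a separate frame and joining it keeps the original status columns untouched.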
I saw this example of how to break an array into chunks but I couldn't get the output I wanted. https://www.geeksforgeeks.org/break-list-chunks-size-n-python/
from datetime import date
import pandas as pd
df = pd.read_sql(sql,cnxn)
#add stat1-24 into the data frame
df = df.join(pd.DataFrame({
    'stat1':'','stat2':'','stat3':'','stat4':'',
    'stat5':'','stat6':'','stat7':'','stat8':'',
    'stat9':'','stat10':'','stat11':'','stat12':'',
    'stat13':'','stat14':'','stat15':'','stat16':'',
    'stat17':'','stat18':'','stat19':'','stat20':'',
    'stat21':'','stat22':'','stat23':'','stat24':'',},index=df.index))
#call status1-24 from the data frame and store the columns in an array
status = df.as_matrix(columns=df.columns[6:30])
#call stat1-24 from the data frame and store the columns in an array
stat = df.as_matrix(columns=df.columns[31:55])
l = len(df)
#calculate difference in months between startDate and AccountOpenedDate
def monthly_diff(d2,startDate):
    return (d2.year - startDate.year) * 12 + d2.month - startDate.month
startDate = date(year=2017, month = 7, day = 1)
df['Difference_IN_Months'] = df['AccountOpenedDate']
for x in range(l):
    d2_1=df['AccountOpenedDate'][x]
    d2=d2_1.date()
    df['Difference_IN_Months'][x]= monthly_diff(d2,startDate)
    for i in range(0,23):
        if 3 <= 24 - monthly_diff(d2,startDate) - i + 1 <=24:
            stat[x,i] = status[24 - monthly_diff(d2,startDate) - i + 1]
        else: stat[x,i]=''
print(stat[1,:])
I hope my code isn't too confusing. Everything works fine except the part where my array "stat" should populate my dataframe columns (stat1-stat24) with the relevant data.
This is the best I can understand from your code and question.
import pandas as pd
import numpy as np
start=0
l=[np.array([None, '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0',
'0', '0', '0', '0', '0', '0', '0', '0', None, None, None],
dtype=object),
np.array([None, None, '0', '0', '0', '1', '0', '0', '0', '0', None, None,
None, None, None, None, None, None, None, None, None, None, None,
None], dtype=object),
np.array([None, None, '0', '0', '0', '0', '0', '0', None, None, None, None,
None, None, None, None, None, None, None, None, None, None, None,
None], dtype=object)]
d={'stat1':'','stat2':'','stat3':'','stat4':'','stat5':'','stat6':'','stat7':'','stat8':'','stat9':'','stat10':'','stat11':'','stat12':'','stat13':'','stat14':'','stat15':'','stat16':'','stat17':'','stat18':'','stat19':'','stat20':'','stat21':'','stat22':'','stat23':'','stat24':''}
df = pd.DataFrame(d,index=[0])
print(df)
for i in l:
    df.loc[len(df)] = i
print(df)
output:
   stat1  stat2  stat3  stat4  stat5  stat6  stat7  stat8  stat9  ...  stat16  stat17  stat18  stat19  stat20  stat21  stat22  stat23  stat24
0                                                                 ...
[1 rows x 24 columns]
   stat1  stat2  stat3  stat4  stat5  stat6  stat7  stat8  stat9  ...  stat16  stat17  stat18  stat19  stat20  stat21  stat22  stat23  stat24
0                                                                 ...
1   None      0      0      0      0      0      0      0      0  ...       0       0       0       0       0       0    None    None    None
2   None   None      0      0      0      1      0      0      0  ...    None    None    None    None    None    None    None    None    None
3   None   None      0      0      0      0      0      0   None  ...    None    None    None    None    None    None    None    None    None
[4 rows x 24 columns]
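If the goal is a single 2-D array rather than a list of arrays, `np.vstack` may be a simpler route than appending row by row. The sketch below is only an illustration under assumptions: it uses 4 columns instead of 24, and `rows`, `stat_cols`, and the `account` column are hypothetical names. It also shows one way to write the array back into existing dataframe columns, since an array pulled out of a dataframe is often a copy, so filling it does not necessarily change the dataframe.

```python
import numpy as np
import pandas as pd

# Assumption: three 4-wide rows stand in for the 24-wide arrays above.
rows = [
    np.array([None, "0", "0", None], dtype=object),
    np.array([None, None, "1", "0"], dtype=object),
    np.array(["0", "0", None, None], dtype=object),
]
stat_cols = [f"stat{k}" for k in range(1, 5)]  # hypothetical column labels

# np.vstack turns the list of 1-D arrays into one 2-D array (rows x columns).
stat = np.vstack(rows)

# Build a frame from the 2-D array in a single call.
stat_df = pd.DataFrame(stat, columns=stat_cols)
print(stat_df)

# Writing into an existing frame whose stat columns start out empty,
# as in the question: assign the array to the column labels explicitly.
df = pd.DataFrame({"account": ["a", "b", "c"]})  # hypothetical existing frame
for col in stat_cols:
    df[col] = ""
df[stat_cols] = stat
print(df)
```

The explicit assignment `df[stat_cols] = stat` is the step that moves the filled array back into the dataframe.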