Search code examples
pythonexcelpandastextunique

Python Pandas df.Unique display/print contents


I am very new to python and am running into an issue when I am trying to display the data frame I made from unique values in my excel column.

So what is happening is I try to write "aa" I get

[u'a' u'b' u'c' u'd' u'e' u'f']

when what I want is

a, b, c, d, e, f   

or even

[a,b,c,d,e,f]  

whichever is more straightforward. My thought is that since I am using str() it is taking the data frame as is but, if I don't include str() when I write it I get

∞*

p!`!@‹!@˛

as output...

Here is my code:

df = pd.read_excel(open('/Users/keatonmaclean/Desktop/abcc.xlsx','rb'), sheetname='Sheet1')
# Set ipython's max row display
pd.set_option('display.max_row', 1000)

# Set iPython's max column width to 50
pd.set_option('display.max_columns', 50)

df.columns = df.iloc[0]
df = df[1:]

aa = str(df.loc[:,"Supplier"].unique())
#bb = str(df.loc[:,"CT #"].unique())
#cc = str(df.loc[:,"CT DESC"].unique())
#dd = str(df.loc[:,"CT START"].unique())
#ee = str(df.loc[:,"CT END"].unique())



import os.path

save_path = '/Users/keatonmaclean/Desktop/'

#name_of_file = raw_input("What is the name of the file: ")
name_of_file = "test"

completeName = os.path.join(save_path, name_of_file+".txt")         

file1 = open(completeName, "w+")

toFile =  aa 


file1.write(toFile)

file1.close()

Solution

  • I think you need DataFrame or Series constructor with DataFrame.to_csv or Series.to_csv:

    aa = np.array(['a','b','c','d','e','f'])
    
    df = pd.DataFrame([aa])
    print (df)
       0  1  2  3  4  5
    0  a  b  c  d  e  f
    
    df.to_csv(file, index=False)
    

    Or if need one columns:

    s = pd.Series(aa)
    print (s)
    0    a
    1    b
    2    c
    3    d
    4    e
    5    f
    dtype: object
    
    s.to_csv(file, index=False)
    

    But if need create file from all unique values is possible apply function unique.

    But if different length of unique values per column get NaNs and in output NaNs are replace with empty space.

    df = pd.DataFrame({'Supplier':list('abcceb'),
                       'CT #':[4,5,4,5,5,4],
                       'CT DESC':[7,8,9,4,2,3],
                       'CT START':[1,3,5,7,1,0],
                       'CT END':[5,3,6,9,2,4]})
    
    print (df)
       CT #  CT DESC  CT END  CT START Supplier
    0     4        7       5         1        a
    1     5        8       3         3        b
    2     4        9       6         5        c
    3     5        4       9         7        c
    4     5        2       2         1        e
    5     4        3       4         0        b
    
    df = df.apply(lambda x: pd.Series(x.unique())).astype(object)
    print (df)
      CT # CT DESC CT END CT START Supplier
    0    4       7      5        1        a
    1    5       8      3        3        b
    2  NaN       9      6        5        c
    3  NaN       4      9        7        e
    4  NaN       2      2        0      NaN
    5  NaN       3      4      NaN      NaN
    
    df.to_csv(file, index=False)