Search code examples
python-2.7csvnumpydouble-quotes

python/pandas:need help adding double quotes to columns


I need to add double quotes to specific columns in a csv file that my script generates.

Below is the goofy way I thought of doing this. For these two fixed-width fields, it works:

df['DATE'] = df['DATE'].str.ljust(9,'"')
df['DATE'] = df['DATE'].str.rjust(10,'"')

df['DEPT CODE'] = df['DEPT CODE'].str.ljust(15,'"')
df[DEPT CODE'] = df['DEPT CODE'].str.rjust(16,'"')

For the following field, it doesn't. It has a variable length. So, if the value is shorter than the standard 6-digits, I get extra double-quotes: "5673"""

df['ID'] = df['ID'].str.ljust(7,'"')
df['ID'] = df['ID'].str.rjust(8,'"')

I have tried zfill, but the data in the column is a series-- I get "pandas.core.series.Series" when i run

print type(df['ID'])

and I have not been able to convert it to string using astype. I'm not sure why. I have not imported numpy.

I tried using len() to get the length of the ID number and pass it to str.ljust and str.rjust as its first argument, but I think it got hung up on the data not being a string.

Is there a simpler way to apply double-quotes as I need, or is the zfill going to be the way to go?


Solution

  • You can add a speech mark before / after:

    In [11]: df = pd.DataFrame([["a"]], columns=["A"])
    
    In [12]: df
    Out[12]:
       A
    0  a
    
    In [13]: '"' + df['A'] + '"'
    Out[13]:
    0    "a"
    Name: A, dtype: object
    

    Assigning this back:

    In [14]: df['A'] = '"' + df.A + '"'
    
    In [15]: df
    Out[15]:
         A
    0  "a"
    

    If it's for exporting to csv you can use the quoting kwarg:

    In [21]: df = pd.DataFrame([["a"]], columns=["A"])
    
    In [22]: df.to_csv()
    Out[22]: ',A\n0,a\n'
    
    In [23]: df.to_csv(quoting=1)
    Out[23]: '"","A"\n"0","a"\n'