Strange output using Pandas when converting StringIO with read_csv

When the pandas CSV reader function 'read_csv" is used to convert StringIO values strange characters ('.1') are being appended at the end of the second field when delimiting certain fields. The desired results is the first test, but all the fields do not have spaces after the delimiter (','). Splitting "1.5M, 1.5M" should always return "1.5M", but when there is no spaces it returns the second field with "1.5M.1" (adding '.1' at the end of the field). Is there a way to resolve this issue?

>>>import pandas as pd
>>>from io import StringIO
>>>pd.read_csv(StringIO("1.5M, 1.5M"))
Empty DataFrame
Columns: [1.5M,  1.5M]
Index: []
>>> pd.read_csv(StringIO("1.5M,1.5M"))
Empty DataFrame
Columns: [1.5M, 1.5M.1]
Index: []
>>>

Solution

Notice in the first example with the space, your dataframe has zero rows and your column names include the space in the second column.

 df = pd.read_csv(StringIO("1.5M, 1.5M"))
 df.columns

 Index(['1.5M', ' 1.5M'], dtype='object')

In the second case, zero rows also, but you have duplicate column names without the space.

 df = pd.read_csv(StringIO("1.5M,1.5M"))
 df.columns

 Index(['1.5M', '1.5M.1'], dtype='object')

Hence, Pandas adds the '.1' to the duplicated column name.

However, if you want this '1.5M' as data in the dataframe and not as column headings.

Use

df = pd.read_csv(StringIO("1.5M, 1.5M"), header=None)

OR, it doesn't make a difference in this case:

df = pd.read_csv(StringIO("1.5M,1.5M"), header=None)

Output:

      0     1
0  1.5M  1.5M