python-3.x pandas pandas-datareader

How to handle such errors?


import pandas as pd

companies = pd.read_csv("http://www.richard-muir.com/data/public/csv/CompaniesRevenueEmployees.csv", index_col=0)

companies.head()

I'm getting the error below. What approaches should I try?

"'utf-8' codec can't decode byte 0xb7 in position 7"


Solution

  • Downloading the file and opening it in Notepad++ shows that it is ANSI-encoded. On a Windows system, this should fix it:

    import pandas as pd
    
    url = "http://www.richard-muir.com/data/public/csv/CompaniesRevenueEmployees.csv"
    
    # 'ansi' is a Windows-only alias for Python's 'mbcs' codec
    companies = pd.read_csv(url, index_col=0, encoding='ansi')
    
    print(companies)
    

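    If you are not on Windows, the 'ansi' alias is not available. Look up the codec name that matches the file's Windows code page (often cp1252 for Western European text, though that is an assumption for this file) and pass it as encoding instead.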

    See: https://docs.python.org/3/library/codecs.html#standard-encodings

    Output:

                                           Name              Industry  \
    0                                   Walmart                Retail
    1                             Sinopec Group           Oil and gas
    2      China National Petroleum Corporation           Oil and gas
    ...                                     ...                   ...
    47               Hewlett Packard Enterprise           Electronics
    48                               Tata Group          Conglomerate
    
        Revenue (USD billions)  Employees
    0                      482    2200000
    1                      455     358571
    2                      428    1636532
    ...                    ...        ...
    47                     111     302000
    48                     108     600000
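
If you would rather not hard-code a platform-specific name, one option is to test a few candidate encodings against the raw bytes and use the first one that decodes cleanly. The following is a minimal sketch, not part of the original answer: the helper name `pick_encoding`, the candidate list, and the sample bytes are all hypothetical, and cp1252 is only assumed to be the code page behind "ANSI" here.

```python
def pick_encoding(raw: bytes, candidates=("utf-8", "cp1252")) -> str:
    """Return the first candidate encoding that decodes `raw` without error."""
    for name in candidates:
        try:
            raw.decode(name)
        except UnicodeDecodeError:
            continue  # this codec rejects the bytes, try the next one
        return name
    raise ValueError(f"none of {candidates!r} could decode the data")


# Hypothetical sample data: byte 0xb7 is a middle dot in cp1252, but it is
# not a valid start byte in UTF-8, which mirrors the error in the question.
sample = b"Name,Industry\nA\xb7B,Retail\n"
encoding = pick_encoding(sample)
print(encoding)  # cp1252
```

The returned name can then be passed on, for example `pd.read_csv(url, index_col=0, encoding=encoding)`. A successful decode only shows that the bytes are valid in that encoding, not that it is the right one, so it is worth checking a few non-ASCII values in the resulting DataFrame.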