How to replace multiple values with one in Python 3


I am trying to extract countries from the rows of a data frame. Here is the data I currently have:

import pandas as pd

# Each row: [raw author/affiliation string, year]
l = [
    ['[Aydemir, Deniz\', \' Gunduz, Gokhan\', \' Asik, Nejla] Bartin '
     'Univ, Fac Forestry, Dept Forest Ind Engn, TR-74100 Bartin, '
     'Turkey\', \' [Wang, Alice] Lulea Univ Technol, Wood Technol, '
     'Skelleftea, Sweden', 1990],
    ['[Fang, Qun\', \' Cui, Hui-Wang] Zhejiang A&F Univ, Sch Engn, Linan '
     '311300, Peoples R China\', \' [Du, Guan-Ben] Southwest Forestry '
     'Univ, Kunming 650224, Yunnan, Peoples R China', 2005],
    ['[Blumentritt, Melanie\', \' Gardner, Douglas J.\', \' Shaler '
     'Stephen M.] Univ Maine, Sch Resources, Orono, ME USA\', \' [Cole, '
     'Barbara J. W.] Univ Maine, Dept Chem, Orono, ME 04469 USA', 2012],
    ['[Kyvelou, Pinelopi; Gardner, Leroy; Nethercot, David A.] Univ '
     'London Imperial Coll Sci Technol & Med, London SW7 2AZ, '
     'England', 1998],
]
dataf = pd.DataFrame(l, columns=['Authors', 'Year'])

That builds the data frame. Here is the code that extracts the countries:

import geotext

df = (dataf['Authors']
  .replace(r"\bUSA\b", "United States", regex=True)
  .apply(lambda x: geotext.GeoText(x).countries))

The original problem was that GeoText didn't recognize "USA", which the .replace call above fixes. I have since noticed that I also need to change "England", "Scotland", "Wales" and "Northern Ireland" to "United Kingdom". How can I extend .replace to handle all of these?


Solution

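  • You can pass a dictionary of replacements to Series.replace with regex=True. Each key is treated as a regular expression, so the \b word boundaries from your existing call still apply. (Series.str.translate is not suitable here, because it maps single characters rather than whole words.)

    dataf['Authors'].replace({
        r"\bUSA\b": "United States",
        r"\bEngland\b": "United Kingdom",
        r"\bScotland\b": "United Kingdom",
        r"\bWales\b": "United Kingdom",
        r"\bNorthern Ireland\b": "United Kingdom"
    }, regex=True)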
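
If the replacements are kept as a plain dictionary of literal names, another option is to compile them into one regular expression and substitute through a callback. The following is only a minimal sketch using the standard re module. The names COUNTRY_ALIASES and normalize_countries, and the sample string, are made up for illustration and are not part of pandas or GeoText.

```python
import re

# Hypothetical alias table: keys are literal names, not regex patterns.
COUNTRY_ALIASES = {
    "USA": "United States",
    "England": "United Kingdom",
    "Scotland": "United Kingdom",
    "Wales": "United Kingdom",
    "Northern Ireland": "United Kingdom",
}

# One alternation of all aliases, longest first, wrapped in word boundaries.
# re.escape keeps any special characters in a key from being read as regex syntax.
_ALIAS_PATTERN = re.compile(
    r"\b(?:"
    + "|".join(re.escape(k) for k in sorted(COUNTRY_ALIASES, key=len, reverse=True))
    + r")\b"
)


def normalize_countries(text):
    """Replace every whole-word alias in text with its mapped country name."""
    return _ALIAS_PATTERN.sub(lambda m: COUNTRY_ALIASES[m.group(0)], text)


# Illustrative input only, loosely modelled on the affiliation strings above.
sample = "Univ Maine, Orono, ME 04469 USA; Imperial Coll, London SW7 2AZ, England"
normalized = normalize_countries(sample)
```

With the data frame from the question, this could be applied as dataf['Authors'].map(normalize_countries) before handing each value to GeoText. Bear in mind that a whole-word match also rewrites "Wales" inside "New South Wales", so the alias table may need tightening for real data.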