Tags: python, dictionary, web-scraping, anki

Python: How to replace every space in a dictionary's values with an underscore?


I'm still a beginner, so maybe the answer is very easy, but I could not find a solution online (at least not one I could understand). I am currently learning famous works of art with the app "Anki", so I imported a deck for it containing over 700 pieces.

Sadly, the names of the pieces are in English, and I would like to learn them in my native language (German). So I wanted to write a script that automates translating all the names inside the app. I started by creating a dictionary with every artist and their art pieces (filling this dictionary automatically by reading from the app is a task for another time).

art_dictionary = {
    "Wassily Kandinsky": "Composition VIII",
    "Zhou Fang": "Ladies Wearing Flowers in Their Hair",
}

My plan is to look each painting up on Wikipedia (or any other artwork database) that stores its German name, because translating the name with an English-German dictionary often returns wrong results, since the German title can differ drastically from a literal translation. The steps I have in mind:

  1. replacing every space character in the name with an underscore

  2. letting Python access the Wikipedia page of said painting:

    import re
    from urllib.request import urlopen

    painting_name = "Composition_VIII"  # manual input for now

    # urlopen needs a full URL including the scheme, otherwise it raises ValueError
    url = "https://en.wikipedia.org/wiki/" + painting_name
    page = urlopen(url)
    
    
  3. somehow accessing the German version of the page and extracting the German name of the painting:

    html = page.read().decode("utf-8")

    # I think Wikipedia stores the title like <i>Title</i>
    pattern = "<title.*?>.*?</title.*?>"
    match_results = re.search(pattern, html, re.IGNORECASE)
    # re.search returns None when nothing matches, so guard before calling .group()
    title = re.sub("<.*?>", "", match_results.group()) if match_results else None
    
  4. storing it in a list or variable

  5. inserting it in the Anki app

Maybe this is impossible or "over-engineering", but I'm learning a lot along the way.

I tried to search for a solution online, but could not find anything similar to my problem.


Solution

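  • You can use a dictionary comprehension together with the str.replace method to build a new dictionary in which every value (here, the name of an art piece) has its spaces replaced by underscores. The keys (the artist names) stay unchanged.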

    art_dictionary = {
        "Wassily Kandinsky": "Composition VIII",
        "Zhou Fang": "Ladies Wearing Flowers in Their Hair",
    }

    # Build a new dictionary with the same keys and underscored values
    art_dictionary = {key: value.replace(" ", "_") for key, value in art_dictionary.items()}
    
    print(art_dictionary)
    
    # Output: {'Wassily Kandinsky': 'Composition_VIII', 'Zhou Fang': 'Ladies_Wearing_Flowers_in_Their_Hair'}
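
  • For the later steps of the plan (building the page URL and finding the German name), the following is only a rough sketch, not a tested solution for the whole deck. It assumes the MediaWiki Action API's `prop=langlinks` query with `lllang=de`, and that the JSON response in its default format lists the linked title under the `"*"` key. The function names (`article_url`, `langlinks_url`, `extract_langlink`, `fetch_german_title`) and the User-Agent string are made up for illustration. A painting without a German article simply yields `None`.

```python
import json
from urllib.parse import quote, urlencode
from urllib.request import Request, urlopen

# Assumption: the English Wikipedia endpoint of the MediaWiki Action API.
API_URL = "https://en.wikipedia.org/w/api.php"


def article_url(title, lang="en"):
    """Build an article URL: spaces become underscores, other special characters are percent-encoded."""
    return f"https://{lang}.wikipedia.org/wiki/" + quote(title.replace(" ", "_"))


def langlinks_url(title, target_lang="de"):
    """Build an API query asking for the article's link to one other language edition."""
    params = {
        "action": "query",
        "prop": "langlinks",
        "titles": title,
        "lllang": target_lang,
        "format": "json",
    }
    return API_URL + "?" + urlencode(params)


def extract_langlink(response):
    """Return the first linked title in a parsed API response, or None if there is none."""
    pages = response.get("query", {}).get("pages", {})
    for page in pages.values():
        for link in page.get("langlinks", []):
            # Assumption: the default JSON format stores the linked title under "*".
            return link.get("*")
    return None


def fetch_german_title(title):
    """Query the API over the network (hypothetical helper, not called here)."""
    request = Request(
        langlinks_url(title),
        headers={"User-Agent": "anki-title-script/0.1 (illustrative example)"},
    )
    with urlopen(request) as page:
        return extract_langlink(json.load(page))
```

    Looping over `art_dictionary.values()` and calling `fetch_german_title` on each name would then cover steps 2 to 4. Asking the API for the language link avoids parsing the page HTML with a regular expression. Titles that are ambiguous on Wikipedia would still need manual checking.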