How do I loop through a Python-generated list without manually copying and pasting?


Context

I am pretty new to coding and have been learning through videos and trial and error, but that approach seems to have run out of steam on this one.

I was able to collect a group of YouTube links using helium, a simpler wrapper around Selenium. Now I want to loop through this list and download the transcript for each video.

# Get the links
from helium import find_all, S

def Get_links():
    # Find every element with id="thumbnail". The XPath searches the whole
    # page, so one search is enough; there is no need to repeat it once per
    # video container.
    ind_links = find_all(S('//*[@id="thumbnail"]'))
    # Read the href of each thumbnail (None when a thumbnail has no href)
    href_list = [e.web_element.get_attribute('href') for e in ind_links]
    # Keep each link only once, in the order it was found
    fin = []
    for i in href_list:
        if i not in fin:
            fin.append(i)

    print(fin)

The output is the list of links, including one None entry:

 ['https://www.youtube.com/watch?v=eHnXgh0j500', None, 
  'https://www.youtube.com/watch?v=wDHtXXApfbc', 
  'https://www.youtube.com/watch?v=CJhOGDU636k', 
  'https://www.youtube.com/watch?v=xIB6uNsgFb8', 
  'https://www.youtube.com/watch?v=u7Ckt6A6du8', 
  'https://www.youtube.com/watch?v=PnSC2BY4e7c', 
  'https://www.youtube.com/watch?v=UkIAsYWgciQ', 
  'https://www.youtube.com/watch?v=MqC_k2WxZro', 
  'https://www.youtube.com/watch?v=B0BpL20QHPU', 
  'https://www.youtube.com/watch?v=UujbkSBzuI0', 
  'https://www.youtube.com/watch?v=7Q8ZvFDyjhA', 
  'https://www.youtube.com/watch?v=Z8pVlfulkcw', 
  'https://www.youtube.com/watch?v=fy0clsby3v8', 
  'https://www.youtube.com/watch?v=oYJaLgJL0Ok', 
  'https://www.youtube.com/watch?v=rampRBuDIIQ', 
  'https://www.youtube.com/watch?v=BuhUXD0KH8k', 
  'https://www.youtube.com/watch?v=27mtHjDTgvQ', 
  'https://www.youtube.com/watch?v=kebonpz4bD0', 
  'https://www.youtube.com/watch?v=2KgH0UpiRiw', 
  'https://www.youtube.com/watch?v=TA-P5ilI_Vg', 
  'https://www.youtube.com/watch?v=TOTmOToM6zQ', 
  'https://www.youtube.com/watch?v=CRVYXC2OH7U', 
  'https://www.youtube.com/watch?v=g4TrGD2tDek', 
  'https://www.youtube.com/watch?v=tAO-Ff7_4CE', 
  'https://www.youtube.com/watch?v=fwe-PjrX23o', 
  'https://www.youtube.com/watch?v=Gu7-vlVFUnw', 
  'https://www.youtube.com/watch?v=oXOqExfdKNg', 
  'https://www.youtube.com/watch?v=zrh7P9fgga8', 
  'https://www.youtube.com/watch?v=HVdZ-ccwkj8', 
  'https://www.youtube.com/watch?v=vCdTLteTPtM']

Problem

Is there a way to open each of these links in the browser with helium (or Selenium) and download the transcripts, without manually copying the printed links and pasting them into a new list?


Solution

  • Example

    Your list of URLs:

    fin = ['https://www.youtube.com/watch?v=eHnXgh0j500', None, 
      'https://www.youtube.com/watch?v=wDHtXXApfbc', 
      'https://www.youtube.com/watch?v=CJhOGDU636k'
      ]
    

    Loop over the list and do something with each URL:

    for url in fin:
        if url:  # skip the None values
            # do something in Selenium, e.g. driver.get(url)
            print(url)  # or just print
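
To avoid copying the printed list by hand, the function that collects the links can hand the list back to the caller, and a small helper can then walk through it. The sketch below is one possible way to do that, not the only one. The helper names clean_links and visit_all are hypothetical, and open_url is a placeholder for whatever opens a page in your setup; print stands in for it here so the example runs without a browser.

```python
def clean_links(links):
    """Drop None entries and duplicates, keeping first-seen order."""
    seen = set()
    cleaned = []
    for link in links:
        if link and link not in seen:
            seen.add(link)
            cleaned.append(link)
    return cleaned


def visit_all(links, open_url):
    """Call open_url(link) for each usable link; return the links visited."""
    visited = []
    for link in clean_links(links):
        open_url(link)  # e.g. load the page, then scrape the transcript
        visited.append(link)
    return visited


if __name__ == "__main__":
    fin = [
        'https://www.youtube.com/watch?v=eHnXgh0j500', None,
        'https://www.youtube.com/watch?v=wDHtXXApfbc',
        'https://www.youtube.com/watch?v=CJhOGDU636k',
    ]
    # print is a stand-in for a real browser call
    visit_all(fin, open_url=print)
```

With helium, this would presumably look like visit_all(Get_links(), open_url=go_to) after "from helium import go_to", assuming Get_links is changed to end with "return fin" instead of only printing the list. The transcript download itself would go inside the loop, after the page has been opened.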