I am trying to read from and append to a file, but it doesn't seem to work when I use a context manager.
In this code I am trying to get all links on a site that contain one of the items in my 'serien' list. If a link matches, I first check whether it is already in the file; if it is, it should not be appended again. But it gets appended anyway.
My guess is that I am either not using the right mode or that I somehow messed up the context manager. Or am I completely wrong?
import requests
from bs4 import BeautifulSoup

serien = ['izombie', 'grandfathered', 'new-girl']
serien_links = []

# Gets chapter links
def episode_links(index_url):
    r = requests.get(index_url)
    soup = BeautifulSoup(r.content, 'lxml')
    links = soup.find_all('a')
    url_list = []
    for url in links:
        url_list.append(url.get('href'))
    return url_list
urls_unfiltered = episode_links('http://watchseriesus.tv/last-350-posts/')

with open('link.txt', 'a+') as f:
    for serie in serien:
        for x in urls_unfiltered:
            # check whether link is already in file. If not, write link to file
            if serie in x and serie not in f.read():
                f.write('{}\n'.format(x))
This is my first time using context managers. Tips would be appreciated.
Edit: Here is a similar project without a context manager. I also tried using a context manager there, but gave up after hitting the same problem.
file2_out = open('url_list.txt', 'a')  # local url list for chapter check
for x in link_list:
    # Checking chapter existence in folder and downloading chapter
    if x not in open('url_list.txt').read():  # Is url of chapter in local url list?
        # push = pb.push_note(get_title(x), x)
        file2_out.write('{}\n'.format(x))  # adding downloaded chapter to local url list
        print('{} saved.'.format(x))
file2_out.close()
And with a context manager:

with open('url_list.txt', 'a+') as f:
    for x in link_list:
        # Checking chapter existence in folder and downloading chapter
        if x not in f.read():  # Is url of chapter in local url list?
            # push = pb.push_note(get_title(x), x)
            f.write('{}\n'.format(x))  # adding downloaded chapter to local url list
            print('{} saved.'.format(x))
As @martineau mentioned, f.read() consumes the whole file on the first call, and every call after that returns an empty string. Try the code below: it reads the file's contents into a list once, and the later comparisons happen against that list.
import requests
from bs4 import BeautifulSoup

serien = ['izombie', 'grandfathered', 'new-girl']
serien_links = []

# Gets chapter links
def episode_links(index_url):
    r = requests.get(index_url)
    soup = BeautifulSoup(r.content, 'lxml')
    links = soup.find_all('a')
    url_list = []
    for url in links:
        url_list.append(url.get('href'))
    return url_list
urls_unfiltered = episode_links('http://watchseriesus.tv/last-350-posts/')
with open('link.txt', 'a+') as f:
    f.seek(0)  # 'a+' opens with the pointer at the end, so move it to the start before reading
    cont = f.read().splitlines()
    for serie in serien:
        for x in urls_unfiltered:
            # check whether link is already in file. If not, write link to file
            if (serie in x) and (x not in cont):
                f.write('{}\n'.format(x))
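To see the idea in isolation (no scraping), here is a minimal sketch of the same dedup-append pattern with a set instead of a list, so membership checks are O(1) and duplicates within the same run are also caught. The file name and link values are made up for the example:

```python
# Hypothetical input; note the second entry is a deliberate duplicate.
new_links = ['http://example.com/izombie-s01e01',
             'http://example.com/izombie-s01e01',
             'http://example.com/new-girl-s02e03']

with open('link.txt', 'a+') as f:
    f.seek(0)                          # 'a+' opens with the pointer at the end
    seen = set(f.read().splitlines())  # read the file once, keep lines in a set
    for link in new_links:
        if link not in seen:
            f.write('{}\n'.format(link))
            seen.add(link)             # catches duplicates within this run too

with open('link.txt') as f:
    print(f.read().splitlines())       # each link appears exactly once
```

Updating `seen` as you write is what the list version above cannot do cheaply; with a set it comes for free.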