Search code examples
pythonpython-3.xfor-loopreadlines

readlines() error with for-loop in python


This error is hard to describe because I can't figure out how the loop is even affecting the readline() and readlines() Methods. When I try using the former, I get these unexpected Traceback errors. When I try the latter, my code runs and nothing happens. I have determined that the bug is located in the first eight lines. The first few lines of the Topics.txt file is posted.

Code

import requests
from html.parser import HTMLParser
from bs4 import BeautifulSoup

Url = "https://ritetag.com/best-hashtags-for/"
Topicfilename = "Topics.txt"
Topicfile = open(Topicfilename, 'r')
Line = Topicfile.readlines()
Linenumber = 0
for Line in Topicfile:
    Linenumber += 1
    print("Reading line", Linenumber)

    Topic = Line
    Newtopic = Topic.strip("\n").replace(' ', '').replace(',', '')
    print(Newtopic)
    Link = Url.join(Newtopic)
    print(Link)
    Sourcecode = requests.get(Link)

When I run this bit here, it prints the the URL preceded by the first character of the line.For example, it prints as 2https://ritetag.com/best-hashtags-for/4https://ritetag.com/best-hashtags-for/Hhttps://ritetag.com/best-hashtags-for/ etc. for 24 Hour Fitness.

Topics.txt

  • 21st Century Fox
  • 24 Hour Fitness
  • 2K Games
  • 3M

Full Error

Reading line 1 24HourFitness 2https://ritetag.com/best-hashtags-for/4https://ritetag.com/best-hashtags-for/Hhttps://ritetag.com/best-hashtags-for/ohttps://ritetag.com/best-hashtags-for/uhttps://ritetag.com/best-hashtags-for/rhttps://ritetag.com/best-hashtags-for/Fhttps://ritetag.com/best-hashtags-for/ihttps://ritetag.com/best-hashtags-for/thttps://ritetag.com/best-hashtags-for/nhttps://ritetag.com/best-hashtags-for/ehttps://ritetag.com/best-hashtags-for/shttps://ritetag.com/best-hashtags-for/s

Traceback (most recent call last): File "C:\Users\Caden\Desktop\Programs\LususStudios\AutoDealBot\HashtagScanner.py", line 17, in Sourcecode = requests.get(Link) File "C:\Python34\lib\site-packages\requests-2.10.0-py3.4.egg\requests\api.py", line 71, in get return request('get', url, params=params, **kwargs) File "C:\Python34\lib\site-packages\requests-2.10.0-py3.4.egg\requests\api.py", line 57, in request return session.request(method=method, url=url, **kwargs) File "C:\Python34\lib\site-packages\requests-2.10.0-py3.4.egg\requests\sessions.py", line 475, in request resp = self.send(prep, **send_kwargs) File "C:\Python34\lib\site-packages\requests-2.10.0-py3.4.egg\requests\sessions.py", line 579, in send adapter = self.get_adapter(url=request.url) File "C:\Python34\lib\site-packages\requests-2.10.0-py3.4.egg\requests\sessions.py", line 653, in get_adapter raise InvalidSchema("No connection adapters were found for '%s'" % url) requests.exceptions.InvalidSchema: No connection adapters were found for '2https://ritetag.com/best-hashtags-for/4https://ritetag.com/best-hashtags-for/Hhttps://ritetag.com/best-hashtags-for/ohttps://ritetag.com/best-hashtags-for/uhttps://ritetag.com/best-hashtags-for/rhttps://ritetag.com/best-hashtags-for/Fhttps://ritetag.com/best-hashtags-for/ihttps://ritetag.com/best-hashtags-for/thttps://ritetag.com/best-hashtags-for/nhttps://ritetag.com/best-hashtags-for/ehttps://ritetag.com/best-hashtags-for/shttps://ritetag.com/best-hashtags-for/s'


Solution

  • I think there are two issues:

    1. You seem to be iterating over Topicfile instead of Topicfile.readLines().
    2. Url.join(Newtopic) isn't returning what you think it is. .join takes a list (in this case, a string is a list of characters) and will insert Url in between each one.

    Here is code with these problems addressed:

    import requests
    
    Url = "https://ritetag.com/best-hashtags-for/"
    Topicfilename = "topics.txt"
    Topicfile = open(Topicfilename, 'r')
    Lines = Topicfile.readlines()
    Linenumber = 0
    for Line in Lines:
        Linenumber += 1
        print("Reading line", Linenumber)
    
        Topic = Line
        Newtopic = Topic.strip("\n").replace(' ', '').replace(',', '')
        print(Newtopic)
        Link = '{}{}'.format(Url, Newtopic)
        print(Link)
        Sourcecode = requests.get(Link)
    

    As an aside, I also recommend using lowercased variable names since camel case is generally reserved for class names in Python :)