
Python read from a file, and only do work if a string isn't found


So I'm trying to make a Reddit bot that will exec code from a submission. I have my own subreddit for controlling these clients.

while __name__ == '__main__':
    string = open('config.txt').read()
    for submission in subreddit.get_new(limit = 1):
        if submission.url not in string:
            f.write(submission.url + "\n")
            f.close()
            f = open('config.txt', "a")
            string = open('config.txt').read()

So what this is supposed to do is read from the config file, then only do work if the submission URL isn't in config.txt. However, it always sees the most recent post and does its work. This is how f is opened:

if not os.path.exists('file'):
    open('config.txt', 'w').close()
f = open('config.txt', "a")

Solution

  • First a critique of your existing code (in comments):

    # the next two lines are not needed; open('config.txt', "a") 
    # will create the file if it doesn't exist.
    if not os.path.exists('file'):
        open('config.txt', 'w').close()
    f = open('config.txt', "a")
    
    # this is an unusual condition which will confuse readers
    while __name__ == '__main__':
        # the next line will open a file handle and never explicitly close it
        # (it will probably get closed automatically when it goes out of scope,
        # but it's not good form)
        string = open('config.txt').read()
        for submission in subreddit.get_new(limit = 1):
            # the next line should check for a full-line match; as written,
            # "http://www.test.com" will be treated as already seen if
            # "http://www.test.com/level2" is in config.txt
            if submission.url not in string:
                f.write(submission.url + "\n")
                # the next two lines could be replaced with f.flush()
                f.close()
                f = open('config.txt', "a")
                # this is a cumbersome way to keep your string synced with the file,
                # and it never explicitly releases the new file handle
                string = open('config.txt').read()
        # If subreddit.get_new() doesn't return any results, this will act as
        # a busy loop, repeatedly requesting new results as fast as possible.
        # If that is undesirable, you might want to sleep here.
    # file handle f should get closed after the loop
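
To illustrate the imprecise matching noted in the comments above, here is a minimal sketch. The file contents and URLs are made-up example values, not taken from your actual config.txt.

```python
# Contents of a hypothetical config.txt, one URL per line.
contents = "http://www.test.com/level2\n"

url = "http://www.test.com"

# Substring test: True, because url is a prefix of the stored line.
substring_hit = url in contents

# Full-line test: False, because no stored line equals url exactly.
seen = set(contents.splitlines())
line_hit = url in seen

print(substring_hit, line_hit)  # prints: True False
```

With the substring test the bot would skip http://www.test.com as if it had already been handled; with the set of whole lines it would not.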
    

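    None of the problems pointed out above should keep your code from working within a single run (except maybe the imprecise matching). One thing to check, though: the existence test looks for 'file' rather than 'config.txt', so unless something named 'file' exists, config.txt is truncated every time the script starts, and the most recent post would be processed again after each restart. Simpler code may also be easier to debug. Here's some code that does the same thing. Note: I assume there is no chance any other process is writing to config.txt at the same time. You could step through this code (or your code) line by line with pdb to see whether it works as expected.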

    import time
    import praw
    r = praw.Reddit(...)
    subreddit = r.get_subreddit(...)
    
    if __name__ == '__main__':
        # open config.txt for reading and writing without truncating. 
        # moves pointer to end of file; closes file at end of block
        with open('config.txt', "a+") as f:
            # move pointer to start of file
            f.seek(0) 
            # make a set of existing lines; also move pointer to end of file
            lines = set(f.read().splitlines())
    
            while True:
                got_one = False
                for submission in subreddit.get_new(limit=1):
                    got_one = True
                    if submission.url not in lines:
                        lines.add(submission.url)
                        f.write(submission.url + "\n")
                        # write data to disk immediately
                        f.flush()
                        ...
                if not got_one:
                    # wait a little while before trying again
                    time.sleep(10)
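
If you want to check the file-handling logic without talking to Reddit, a sketch like the following may help. The helper names load_seen and record_if_new, the demo function, and the temporary directory are all made up for illustration; the URLs are example values. It assumes Python 3 and, as above, that nothing else writes to the file at the same time.

```python
import os
import tempfile


def load_seen(f):
    """Return the set of lines already stored in the open file f."""
    f.seek(0)
    return set(f.read().splitlines())


def record_if_new(f, seen, url):
    """Append url to f and to seen; return True only if it was new."""
    if url in seen:
        return False
    seen.add(url)
    f.write(url + "\n")
    # write data to disk immediately
    f.flush()
    return True


def demo():
    # Use a throwaway directory so no real config.txt is touched.
    path = os.path.join(tempfile.mkdtemp(), "config.txt")
    results = []
    # "a+" creates the file if needed and never truncates it.
    with open(path, "a+") as f:
        seen = load_seen(f)
        results.append(record_if_new(f, seen, "http://www.test.com/level2"))
        results.append(record_if_new(f, seen, "http://www.test.com/level2"))
        results.append(record_if_new(f, seen, "http://www.test.com"))
    # Reopen the file to simulate restarting the bot.
    with open(path, "a+") as f:
        seen = load_seen(f)
        results.append(record_if_new(f, seen, "http://www.test.com"))
    return results


if __name__ == "__main__":
    print(demo())  # prints: [True, False, True, False]
```

The second and fourth calls return False because the URL is already recorded, in the fourth case from the previous "run". That is the behavior the bot needs in order to skip a submission it has handled before.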