Tags: python, macos, pytagcloud

Pytagcloud on OS X: ValueError: invalid literal for int() with base 10: '3),'


I always get this error:

  ValueError: invalid literal for int() with base 10: '3),'

The text file I am reading from looks like this:

[('cloud', 3), 
('words', 2), 
('code', 1), 
('word', 1), 
('appear', 1)]

As you can see, I tried to remove the extra characters with word.replace():

from pytagcloud import create_tag_image, make_tags
from pytagcloud.lang.counter import get_tag_counts


counts = []
with open("terms.txt") as FIN:
   for line in FIN:
  
       # Assume lines look like: word, number
       word,n = line.strip().split()
       word = word.replace(',', '')
       word = word.replace("'", "")
       word = word.replace("(", "")
       word = word.replace("[", "")
       word = word.replace(")", "")
       word = word.replace(" ", "")
       n = n.replace("'", "")
       n = n.replace(" ", "")

       counts.append([word,int(n.strip())])

       tags = make_tags(counts, maxsize=120)
create_tag_image(tags, 'cloud_large.png', size=(1200, 800), fontname='Crimson Text')


Solution

  • This happens because the replace() calls on n remove only quotes and spaces, so the trailing "),"  (or ")]" on the last line of the file) is still attached when n reaches int(). The simplest fix, with minimal changes to your existing code, is to replace this line:

    counts.append([word,int(n.strip())])
    

    with:

    counts.append([word, int(n.strip(",)]"))])
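
To see why the original call fails and why the strip fixes it, here is a minimal sketch that uses the first line of the sample file as a hard-coded string (an assumption made so the snippet runs without terms.txt):

```python
# First line of the sample file, including its trailing space and newline.
line = "[('cloud', 3), \n"

# Same split as in the question: whitespace separates the two pieces.
word, n = line.strip().split()  # word == "[('cloud',", n == "3),"

try:
    int(n)
    failed = False
except ValueError:
    failed = True  # int() rejects the trailing "),"

# str.strip(chars) removes any of the listed characters from both ends.
count = int(n.strip(",)]"))

print(repr(n), failed, count)
```

Note that strip(",)]") treats its argument as a set of characters, not as a literal suffix, so it handles both the "),"  ending and the ")]" ending of the last line.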
    

    The code can also be simplified, at the cost of a few more changes. For example, replace this chunk of your snippet:

    with open("terms.txt") as FIN:
        for line in FIN:
    
            # Assume lines look like: word, number
            word,n = line.strip().split()
            word = word.replace(',', '')
            word = word.replace("'", "")
            word = word.replace("(", "")
            word = word.replace("[", "")
            word = word.replace(")", "")
            word = word.replace(" ", "")
            n = n.replace("'", "")
            n = n.replace(" ", "")
    
            counts.append([word,int(n.strip())])
    

    with:

    with open("terms.txt") as FIN:
        for line in FIN:
            word, n = line.strip("[](), \r\n").split()
            counts.append([word.strip("',"), int(n.strip())])
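
The shorter version can be checked against the sample data without touching the file system. In this sketch, sample_lines and parse_line are hypothetical names introduced for illustration, and the lines are copied from the question's file contents:

```python
# Lines as they would come out of iterating over the file object.
sample_lines = [
    "[('cloud', 3), \n",
    "('words', 2), \n",
    "('code', 1), \n",
    "('word', 1), \n",
    "('appear', 1)]\n",
]


def parse_line(line):
    """Turn one line such as "('cloud', 3), " into ['cloud', 3]."""
    # Strip brackets, parentheses, commas and whitespace from both ends,
    # leaving something like "'cloud', 3", then split on whitespace.
    word, n = line.strip("[](), \r\n").split()
    # The word still carries its quotes and the separating comma.
    return [word.strip("',"), int(n)]


counts = [parse_line(line) for line in sample_lines]
print(counts)
```

This relies on each word containing no whitespace. A tag made of several words would produce more than two pieces and break the two-name unpacking.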
    

    There is a third option, but it uses eval, which is strongly discouraged because it executes whatever code the file contains. It would build counts like this (note that the result is a list of tuples, not a list of lists):

    counts = []
    with open("terms.txt") as FIN:
        counts = eval(FIN.read())
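
Because the file contents are a plain Python literal, the standard library's ast.literal_eval is a safer substitute for eval: it accepts only literals (strings, numbers, tuples, lists and so on) and raises an error on anything else. This is a sketch rather than part of the original answer. The text is held in a string here so the snippet is self-contained; with the real file you would pass FIN.read() instead.

```python
import ast

# Stand-in for the contents of terms.txt, copied from the question.
text = """[('cloud', 3),
('words', 2),
('code', 1),
('word', 1),
('appear', 1)]
"""

# Parses the literal without executing arbitrary code.
counts = ast.literal_eval(text)  # list of (word, count) tuples

# Optional: convert to a list of lists to match the other two approaches.
as_lists = [list(pair) for pair in counts]

print(counts)
```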