I used the random module in my project. I'm not a programmer, but I need to write a script that creates a randomized database from another database. In my case, the script needs to pick random words from the "wordlist.txt" file, which has one word per line, arrange them in random order on a line of 12 words, write that line to another file (for example, "result1.txt"), move to a new line, and repeat this indefinitely. Everything seems to be working, but I noticed that it adds all words except those consisting of 8 letters. I would also like to improve the performance of this code.
Code:
import random
# Initial wordlists
fours = []
fives = []
sixes = []
sevens = []
eights = []
# Fill above lists with corresponding word lengths from wordlist
with open('wordlist.txt') as wordlist:
    for line in wordlist:
        if len(line) == 4:
            fours.append(line.strip())
        elif len(line) == 5:
            fives.append(line.strip())
        elif len(line) == 6:
            sixes.append(line.strip())
        elif len(line) == 7:
            sevens.append(line.strip())
        elif len(line) == 8:
            eights.append(line.strip())
# Create new lists and fill with number of items in fours
fivesLess = []
sixesLess = []
sevensLess = []
eightsLess = []
fivesCounter = 0
while fivesCounter < len(fours):
    randFive = random.choice(fives)
    if randFive not in fivesLess:
        fivesLess.append(randFive)
        fivesCounter += 1
sixesCounter = 0
while sixesCounter < len(fours):
    randSix = random.choice(sixes)
    if randSix not in sixesLess:
        sixesLess.append(randSix)
        sixesCounter += 1
sevensCounter = 0
while sevensCounter < len(fours):
    randSeven = random.choice(sevens)
    if randSeven not in sevensLess:
        sevensLess.append(randSeven)
        sevensCounter += 1
eightsCounter = 0
while eightsCounter < len(fours):
    randEight = random.choice(eights)
    if randEight not in eightsLess:
        eightsLess.append(randEight)
        eightsCounter += 1
choices = [eights]
# Generate n number of seeds and print
seedCounter = 0
while seedCounter < 1:
    seed = []
    while len(seed) < 12:
        wordLengthChoice = random.choice(choices)
        wordChoice = random.choice(wordLengthChoice)
        seed.append(wordChoice)
    seedCounter += 0
    with open("result1.txt", "a") as f:
        f.write(' '.join(seed))
        f.write('\n')
If I'm understanding you correctly, something like
import random
from collections import defaultdict
words_by_length = defaultdict(list)
def generate_line(*, n_words, word_length):
    return ' '.join(random.choice(words_by_length[word_length]) for _ in range(n_words))
with open('wordlist.txt') as wordlist:
    for line in wordlist:
        line = line.strip()
        words_by_length[len(line)].append(line)
for x in range(10):
    print(generate_line(n_words=12, word_length=8))
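should be enough – you can just use a single dict-of-lists to contain all of the words, no need for separate variables. (Also, I suspect your original bug stemmed from not strip()ing the line before looking at its length: a line read from the file normally still ends with its newline character, so len(line) is one more than the word's length, and an 8-letter word has a len() of 9, which none of your branches match.)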
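To illustrate the strip() point, here is a small sketch that uses an in-memory stand-in for wordlist.txt (the two sample words are made up for the demonstration). It only shows how the measured length changes once the trailing newline is removed.

```python
import io

# Hypothetical stand-in for wordlist.txt: iterating over a text file
# yields each line together with its trailing newline.
fake_wordlist = io.StringIO("tree\nabsolute\n")

# Lengths as the original loop measured them (newline included).
raw_lengths = [len(line) for line in fake_wordlist]

# Rewind and measure again after stripping surrounding whitespace.
fake_wordlist.seek(0)
stripped_lengths = [len(line.strip()) for line in fake_wordlist]

print(raw_lengths)       # [5, 9]
print(stripped_lengths)  # [4, 8]
```

So a check like `len(line) == 8` on the unstripped line actually selects 7-letter words, and 8-letter words fall through every branch.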
If you need a single line to never repeat a word, you'll want
def generate_line(*, n_words, word_length):
    return ' '.join(random.sample(words_by_length[word_length], n_words))
instead.
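Since the question asks for the lines to end up in a file and for better performance, here is one possible sketch that combines the ideas above. The helper names `load_words_by_length` and `write_lines` are my own, not from the question, and I am assuming a finite number of lines rather than an endless loop. The main saving is opening the output file once instead of once per line. random.sample raises ValueError when the list has fewer than n_words entries, so the sketch checks for that up front.

```python
import random
from collections import defaultdict


def load_words_by_length(path):
    """Group the words of a one-word-per-line file by their stripped length."""
    words_by_length = defaultdict(list)
    with open(path) as wordlist:
        for line in wordlist:
            word = line.strip()
            if word:  # skip blank lines
                words_by_length[len(word)].append(word)
    return words_by_length


def write_lines(words, out_path, *, n_lines, n_words=12):
    """Append n_lines lines, each holding n_words distinct random words."""
    if len(words) < n_words:
        raise ValueError(f"need at least {n_words} words, got {len(words)}")
    # Open the output file once, not once per generated line.
    with open(out_path, "a") as f:
        for _ in range(n_lines):
            f.write(" ".join(random.sample(words, n_words)) + "\n")
```

With the files from the question, `write_lines(load_words_by_length("wordlist.txt")[8], "result1.txt", n_lines=1000)` would append 1000 lines of twelve 8-letter words. Words can still repeat across different lines; only repeats within a single line are ruled out.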