I am creating a set of NUM_RECORDS tuples in Python. This is my code.
record_key_list = {(choice(tuple(studentID_list)),
                    choice(tuple(courseID_list)),
                    randint(2012, 2016),
                    choice(semesters),
                    choice(grades)[0])
                   for no_use in range(NUM_RECORDS)}
An alternative is to code the problem like this.
record_key_list = set()
while len(record_key_list) < NUM_RECORDS:
    record_key_list.add((choice(tuple(studentID_list)),
                         choice(tuple(courseID_list)),
                         randint(2012, 2016),
                         choice(semesters),
                         choice(grades)[0]))
I timed the two code snippets and they are roughly equally fast for 20000 records. Stylistically, I prefer the first version.
Is the first version a correct use of a set comprehension? Or should I always stick to the second method?
EDIT: Improved formatting as suggested. I mostly just copied and pasted from the IDE. Sorry about that, guys.
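The first code snippet looks totally fine as a set comprehension. One thing to keep in mind: it draws exactly NUM_RECORDS tuples, so any duplicate draws leave the set with fewer than NUM_RECORDS elements, while the while loop keeps drawing until the set has exactly NUM_RECORDS. If anything, I would extract the record creation into a function for clarity and easier refactoring.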
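from random import choice, randint

def random_record():
    # tuple() is kept from the question: choice() needs an indexable sequence,
    # and the ID collections there appear not to be one (e.g. sets).
    studentID = choice(tuple(studentID_list))
    courseID = choice(tuple(courseID_list))
    year = randint(2012, 2016)
    semester = choice(semesters)
    grade = choice(grades)[0]
    return (studentID, courseID, year, semester, grade)

# ...

record_key_list = {random_record() for _ in range(NUM_RECORDS)}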
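If the exact number of records matters, it may help to compare the sizes the two approaches from the question produce. The following is only a minimal sketch: the names `comprehension_records`, `loop_records`, and `pool` are made up for illustration, and the records are simplified to pairs drawn from a tiny pool so that collisions are likely.

```python
from random import Random


def comprehension_records(rng, n, pool):
    # Draws exactly n pairs; duplicates collapse, so the set may hold fewer than n.
    return {(rng.choice(pool), rng.choice(pool)) for _ in range(n)}


def loop_records(rng, n, pool):
    # Keeps drawing until the set holds exactly n distinct pairs.
    max_unique = len(pool) ** 2
    if n > max_unique:
        # Without this guard the loop below would never terminate.
        raise ValueError(f"only {max_unique} distinct records are possible")
    records = set()
    while len(records) < n:
        records.add((rng.choice(pool), rng.choice(pool)))
    return records


rng = Random(0)          # seeded only to make runs repeatable
pool = ("a", "b", "c")   # hypothetical tiny pool: 9 possible pairs
n = 8

by_comprehension = comprehension_records(rng, n, pool)
by_loop = loop_records(rng, n, pool)

print(len(by_comprehension), "records from the comprehension (at most", n, ")")
print(len(by_loop), "records from the while loop (exactly", n, ")")
```

With a large space of possible records, as in the question, collisions are rare and the two sizes will usually be close, so which version to use mostly depends on whether "about NUM_RECORDS" is good enough.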