Generating a List of random strings, then using a for/in loop and also a List Comprehension expression to find the longest string and the length of that string.
Both techniques compute the max length correctly, but sometimes the for/in loop finds the same longest word as the List Comprehension, sometimes not. Why? What's the logic error?
import random
import string
def cobble_large_dataset(dataset_number_of_elements):
    '''
    Build a List of Lists; each inner List holds one String made of a random sequence of 1-10 characters
    '''
    myList = [] # Empty List
    for i in range(0,dataset_number_of_elements):
        string_length = random.randint(1, 10)
        tmp = ''.join(random.choices(string.ascii_uppercase + string.digits, k=string_length)) # https://stackoverflow.com/questions/2257441/random-string-generation-with-upper-case-letters-and-digits
        tmp = [tmp]
        #print(tmp)
        myList.extend([tmp])
    return myList
def list_comprehension_test(wordsList):
    '''
    Process a List of Lists using List Comprehension.
    Each List in the List of Lists is a single String
    '''
    start_time = time.time()
    maximumWordLength, longest_word = max([(len(x[0]), x[0]) for x in wordsList]) # This works because x is a List of strings
    return ((time.time() - start_time), longest_word, maximumWordLength)
def brute_force_test(wordsList):
    '''
    Process a List of Lists using a brute-force for/in loop.
    Each List in the List of Lists is a single String
    '''
    start_time = time.time()
    maximumWordLength = 0
    for word in wordsList:
        tmp = word[0]
        #print(tmp)
        if (len(tmp) >= maximumWordLength):
            maximumWordLength = len(tmp)
            longest_word = tmp
            #print(tmp)
        #print(longest_word + " : " + str(maximumWordLength))
    return ((time.time() - start_time), longest_word, maximumWordLength)
import time
start_time = time.time()
dataset = cobble_large_dataset(100)
print (str(len(dataset)) + ' Strings generated in ' + str((time.time() - start_time)) + ' seconds.')
# Let's see if both techniques produce the same results:
result_brute_force = brute_force_test(dataset)
print('Results from Brute Force = ' + result_brute_force[1] + ', ' + str(result_brute_force[2]) + ' characters' )
result_list_comprehension = list_comprehension_test(dataset)
print('Results from List Comprehension = ' + result_list_comprehension[1] + ', ' + str(result_list_comprehension[2]) + ' characters' )
if (result_list_comprehension[1] == result_brute_force[1]):
    print("Techniques produced the same results.")
else:
    print("Techniques DID NOT PRODUCE the same results")
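In your list comprehension case, you want to tell max to operate on just the first item in each of the pairs of values in the list. Without a key, max compares the (length, string) tuples element by element, so when several strings share the maximum length it breaks the tie by comparing the strings themselves. Keying on the length alone is the equivalent of what the for-loop case is doing, since it only considers the length of each string. So you want: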
maximumWordLength, longest_word = max(
    [(len(x[0]), x[0]) for x in wordsList],
    key=lambda x: x[0]) # This works because x is a List of strings
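To make the difference concrete, here is a small sketch using a hand-picked list (the sample words and the names words_list, pairs, no_key and with_key are made up for illustration, they are not from your code). It assumes the same list-of-single-string-lists shape as your dataset and contains three strings of length 3, so the tie-breaking is visible:

```python
# Hypothetical sample data in the same shape as the question's dataset.
words_list = [["AB"], ["ABC"], ["ZZZ"], ["Q"], ["MMM"]]

pairs = [(len(x[0]), x[0]) for x in words_list]

# No key: tuples are compared element by element, so among the
# length-3 ties the greatest string ("ZZZ") wins.
no_key = max(pairs)

# Key on the length only: the strings are never compared, and max
# returns the first item that has the greatest key.
with_key = max(pairs, key=lambda pair: pair[0])

print(no_key)    # (3, 'ZZZ')
print(with_key)  # (3, 'ABC')
```

Both calls report a length of 3, which is why the maximum length always agreed even when the words did not.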
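As others have already pointed out, you also want to change the >= comparison in the brute-force case to >. With >=, the loop keeps the last string of maximum length, whereas max with a key returns the first one it encounters; with >, the loop keeps the first one as well. If you make these two changes, you will get the same result from your two methods.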
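As a quick check of that claim, the following sketch runs both comparisons on a hand-picked list and compares them with the keyed max. The helper longest_by_loop, its strict flag and the sample words are hypothetical names introduced only for this illustration:

```python
def longest_by_loop(words_list, strict=True):
    """Return (longest_word, length) using > when strict, >= otherwise."""
    best_len = 0
    best_word = ""
    for word in words_list:
        candidate = word[0]
        if strict:
            is_better = len(candidate) > best_len
        else:
            is_better = len(candidate) >= best_len
        if is_better:
            best_len = len(candidate)
            best_word = candidate
    return best_word, best_len

# Hypothetical sample data with three strings tied at length 3.
sample = [["AB"], ["ABC"], ["ZZZ"], ["Q"], ["MMM"]]

loop_strict = longest_by_loop(sample, strict=True)   # keeps the first tie
loop_loose = longest_by_loop(sample, strict=False)   # keeps the last tie

keyed_len, keyed_word = max(
    [(len(x[0]), x[0]) for x in sample],
    key=lambda pair: pair[0])

print(loop_strict)            # ('ABC', 3)
print(loop_loose)             # ('MMM', 3)
print(keyed_word, keyed_len)  # ABC 3
```

Only the strict (>) loop agrees with the keyed max, since both settle on the first string of maximum length.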