python, pandas, csv, random, iteration

Random iteration over a dataframe


I'm trying to iterate over the rows of a DataFrame in random order. The data comes from a CSV file:

pyquiz.csv

variables,statements,True or False
f1, f_state1, F
f2, true_state1,T
f3, f_state2, F
f20, f_state20, F

Dataframe

import pandas as pd

df = pd.DataFrame({
    'variables': ['f1', 'f2', 'f3', 'f20'],
    'statements': ['f_state1', 'true_state1', 'f_state2', 'f_state20'],
    'True or False': ['F', 'T', 'F', 'F']
})

While iterating over the rows in random order, I want to apply a condition based on the third column.

Below is code I wrote previously for a plain list. I want to accomplish something similar with Pandas, reading from the CSV file instead:

if user_input == 'pyquiz':
    # visit the statements in random order
    for value in sorted(quiz_list, key=lambda _: random.random()):
        print(value)

        x = input("Enter T or F: ")

        if value in true_statements and x == 'T':
            print("Correct!")
            y = input('\nPress enter to continue: ')

        if value in true_statements and x == 'F':
            print("Incorrect.")

        if value in false_statements and x == 'F':
            print("Correct!")
            y = input("\nPress enter to continue:\n ")

        if value in false_statements and x == 'T':
            print("Incorrect.")

Solution

  • You can try:

    from random import shuffle
    
    idx = df.index.to_list()  # get index to a list
    shuffle(idx)              # shuffle the list using `random.shuffle()`
    
    for i in idx:             # iterate over the shuffled list
        print(df.loc[i])      # look up each row by its index label using `.loc`
    

    Prints (for example):

    variables                 f2
    statements       true_state1
    True or False              T
    Name: 1, dtype: object
    
    variables              f1
    statements       f_state1
    True or False           F
    Name: 0, dtype: object
    
    variables              f3
    statements       f_state2
    True or False           F
    Name: 2, dtype: object
    
    variables              f20
    statements       f_state20
    True or False            F
    Name: 3, dtype: object
    

    Dataframe used:

      variables   statements True or False
    0        f1     f_state1             F
    1        f2  true_state1             T
    2        f3     f_state2             F
    3       f20    f_state20             F
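
To cover the condition on the third column as well, here is a minimal sketch of the quiz loop, under a few assumptions. It shuffles the rows with `df.sample(frac=1)` instead of shuffling the index, and it reads the CSV with `skipinitialspace=True` because the sample file has a space after some commas. The inline `csv_text` stands in for `pyquiz.csv`, and the `run_quiz` function and its `ask` parameter are hypothetical names introduced only for illustration:

```python
import io
import pandas as pd

# Stand-in for pyquiz.csv so the sketch is self-contained; with the real
# file, pass its path to pd.read_csv instead.
csv_text = """variables,statements,True or False
f1, f_state1, F
f2, true_state1,T
f3, f_state2, F
f20, f_state20, F
"""

# skipinitialspace=True drops the blank after each comma, so the answer
# column holds 'F' rather than ' F'.
df = pd.read_csv(io.StringIO(csv_text), skipinitialspace=True)


def run_quiz(frame, ask=input):
    """Ask every statement once, in random order; return the number correct."""
    score = 0
    # sample(frac=1) returns all rows of the frame in a random order.
    for _, row in frame.sample(frac=1).iterrows():
        print(row["statements"])
        answer = ask("Enter T or F: ").strip().upper()
        # Compare the reply with the third column of the current row.
        if answer == row["True or False"]:
            print("Correct!")
            score += 1
        else:
            print("Incorrect.")
    return score
```

Calling `run_quiz(df)` prompts interactively. Passing a different callable as `ask` (for example `lambda prompt: "F"`) makes it possible to try the loop without typing answers.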