Analysing data from a CSV file in Python


I have been trying to simplify the way I analyse response-time data, as it would be impossible to do it manually for each participant. However, my code does not work. I want to look at the response times for blocks 1 to 4 on trials with an Accuracy of 1 and a Probability trial of 1, but my code does not let me do that. Do you have any suggestions?

My csv file content looks like this:

Block,Trial_number,Position,Probability Position,Probability State,Probability trial,Response,Accuracy,RT (ms)
1,1,N,None,None,1,N,1,976.451326394
1,2,X,None,None,1,X,1,935.360659205
1,3,M,0.9,0.81,2,M,1,936.700751889
1,4,Z,0.81,None,2,Z,1,904.942057532
1,5,X,0.9,0.81,2,X,1,952.641545009
1,6,Z,0.81,None,2,Z,1,553.098919248

My code is this:

for fnam in d_list:
    if fnam[-4:] == '.csv':

        f_in = path1 + '/' + fnam



        with open(f_in) as csvfile:
            reader = csv.DictReader(csvfile)

            for row in reader:


                block_no.append(int(row['Block']))
                trial_no.append(int(row['Trial_number']))
                prob_trial.append(int(row['Probability trial']))
                accuracy.append(int(row['Accuracy']))
                rt.append(float(row['RT (ms)']))

        for x in block_no:
            if x < 5:
                for y in accuracy:
                    if y == 1:
                        for z in prob_trial:
                            if z == 1:
                                epoch1_improbable.append(rt)

        epoch1_improbable_rt = mean(epoch1_improbable)

Solution

  • This is a good use case for pandas. A boolean mask combines the three conditions row by row, and the mean is then taken over the matching rows only:

    import pandas as pd
    df = pd.read_csv('data.csv')
    mask = (df['Block'] < 5) & (df['Accuracy'] == 1) & (df['Probability trial'] == 1)
    print(df[mask]['RT (ms)'].mean())  # 955.9059927994999
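
  • If you would rather stay with the csv module, note why the original loops cannot work: the three nested loops walk over the whole block_no, accuracy and prob_trial lists independently, so the conditions are never tested on the same trial, and append(rt) adds the entire rt list each time instead of a single value. Below is a minimal sketch of a row-by-row filter. The function names mean_rt and mean_rt_per_file are made up for illustration, and the sketch assumes every participant file has the same columns as the sample above.

```python
import csv
from pathlib import Path
from statistics import mean


def mean_rt(csv_path, max_block=4, accuracy=1, prob_trial=1):
    """Mean RT over the rows that satisfy all three conditions at once.

    Hypothetical helper: the name and default arguments are assumptions.
    """
    rts = []
    with open(csv_path, newline="") as csvfile:
        for row in csv.DictReader(csvfile):
            # Test the conditions on the same row, then keep that row's RT only.
            if (int(row["Block"]) <= max_block
                    and int(row["Accuracy"]) == accuracy
                    and int(row["Probability trial"]) == prob_trial):
                rts.append(float(row["RT (ms)"]))
    # No matching trials: return None rather than let mean() raise.
    return mean(rts) if rts else None


def mean_rt_per_file(folder):
    """Map each .csv file name in `folder` to its mean RT (hypothetical helper)."""
    return {p.name: mean_rt(p) for p in sorted(Path(folder).glob("*.csv"))}
```

    Calling mean_rt_per_file(path1) would then give one value per participant file, which replaces the outer loop over d_list.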