python arrays csv opencv histogram

Saving a cv2.calcHist result into a single cell in a CSV file using Python


I am working on a Python project in which I need to save the properties of each detected person, such as the clothing type, the clothing colour and the histogram of the detection. I want to store each property in a single cell so that each row represents one detection.

What is the best way to save a histogram array into a single cell? Is this possible?

[Image: the CSV file, with an arrow showing where I want the histogram to be saved]

[Image: the ndarray that I want to save in the CSV]


Solution

  • You need to decide on a format that can be read back in easily. The following approach first creates some random histogram data for 3 rows. It writes a header, then converts each histogram into text using the np.array2string() function, with a ; separator so that it is not confused with the standard , used between the other values. Passing threshold=sys.maxsize stops NumPy from abbreviating large arrays with "...". All spaces and newlines are then removed so that the resulting two-dimensional array sits on a single line, and a standard csv.writer() writes the rows.

    I then show how the data could be read back in again and converted back to numpy arrays:

    import numpy as np
    import csv
    import ast
    import sys
    
    # Create some data and histograms to save
    row = [(0, "Tops", "maroon"), (1, "Socks", "blue"), (2, "Shirts", "white")]
    hists = [np.random.randint(0, 20, size=(3, 10), dtype=np.int32) for _ in range(3)]
    
    with open('output.csv', 'w', newline='') as f_output:
        csv_output = csv.writer(f_output)
        csv_output.writerow(['index', 'type', 'colour', 'hist'])
        
        for item, hist in zip(row, hists):
            hist_text = np.array2string(hist, separator=';', threshold=sys.maxsize).replace(' ','').replace('\n', '')
            csv_output.writerow([*item, hist_text])
    
    # To read it back from the CSV into a numpy array
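    with open('output.csv', newline='') as f_input:
        csv_input = csv.reader(f_input)
        header = next(csv_input)  # skip the header row
        
        # 'kind' rather than 'type', to avoid shadowing the built-in type()
        for index, kind, colour, hist_text in csv_input:
            print(index, kind, colour)
            # Swap ';' back to ',' so the text parses as a nested Python list
            hist = np.array(ast.literal_eval(hist_text.replace(';', ',')), dtype=np.int32)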
            print(hist)
    

    So this would give you output.csv containing something like:

    index,type,colour,hist
    0,Tops,maroon,[[12;6;9;10;2;7;10;8;0;4];[16;10;5;16;15;17;7;14;2;4];[18;17;11;10;19;15;1;1;2;8]]
    1,Socks,blue,[[15;5;4;7;18;12;9;5;6;14];[11;6;12;1;2;11;11;15;0;12];[9;11;7;2;14;18;8;13;15;17]]
    2,Shirts,white,[[11;17;12;9;2;13;19;5;10;14];[16;16;3;8;5;8;6;0;18;15];[5;7;2;4;16;5;11;17;9;19]]
    

    And whilst reading it back in, it would display:

    0 Tops maroon
    [[12  6  9 10  2  7 10  8  0  4]
     [16 10  5 16 15 17  7 14  2  4]
     [18 17 11 10 19 15  1  1  2  8]]
    1 Socks blue
    [[15  5  4  7 18 12  9  5  6 14]
     [11  6 12  1  2 11 11 15  0 12]
     [ 9 11  7  2 14 18  8 13 15 17]]
    2 Shirts white
    [[11 17 12  9  2 13 19  5 10 14]
     [16 16  3  8  5  8  6  0 18 15]
     [ 5  7  2  4 16  5 11 17  9 19]]
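
    The example above uses random int32 data. A histogram returned by cv2.calcHist is normally a float32 array (shape (256, 1) for a single channel with 256 bins), so reading it back with dtype=np.int32 would not be appropriate. As an alternative sketch, not part of the approach above, the array could be stored as JSON text instead; csv.writer quotes the cell automatically because it contains commas. The helper names hist_to_cell and cell_to_hist are made up for illustration, and the random array is only a stand-in for real cv2.calcHist output:

```python
import csv
import io
import json

import numpy as np


def hist_to_cell(hist):
    """Serialise a histogram array to JSON text that fits in one CSV cell."""
    return json.dumps(np.asarray(hist).tolist())


def cell_to_hist(cell, dtype=np.float32):
    """Rebuild the array from the JSON text stored in the cell."""
    return np.array(json.loads(cell), dtype=dtype)


# Stand-in for cv2.calcHist output (assumed: float32, shape (256, 1)).
rng = np.random.default_rng(0)
hist = rng.integers(0, 500, size=(256, 1)).astype(np.float32)

# An in-memory buffer keeps the sketch self-contained; a real file
# opened with newline='' would be used in the same way.
buffer = io.StringIO(newline='')
writer = csv.writer(buffer)
writer.writerow(['index', 'type', 'colour', 'hist'])
writer.writerow([0, 'Tops', 'maroon', hist_to_cell(hist)])

# Read the row back and restore the histogram.
buffer.seek(0)
reader = csv.reader(buffer)
header = next(reader)
index, kind, colour, hist_text = next(reader)
restored = cell_to_hist(hist_text)

print(restored.shape, restored.dtype)
print(np.array_equal(hist, restored))
```

    JSON keeps the full float representation, whereas np.array2string rounds floats to its print precision (8 digits by default), which could lose information for normalised histograms.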