python histogram recover raw-data

How to reconstruct the raw data from a histogram?


I need to recover the "raw data" from a timing histogram provided by a timing counter as a .csv file.

I have the code below, but the actual data has several thousand counts in each bin, so the nested for loop takes a very long time. Is there a better way?

import numpy as np

# Example histogram with 1 second bins
hist = np.array([[1., 2., 3., 4., 5., 6., 7., 8., 9., 10.], [0, 17, 3, 34, 35, 100, 101, 107, 12, 1]])

# Array for bins and counts
time_bins = hist[0]
counts = hist[1]

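# Start from an empty array and append one entry per count
data = np.empty(0)

for i in range(np.size(counts)):
    # counts is a float array, so cast to int before passing it to range()
    for j in range(int(counts[i])):
        data = np.append(data, [time_bins[i]])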

I understand that the resolution of the reconstructed data is limited to the width of a time bin, but that is fine for my purposes. The end goal is to produce another histogram with logarithmic bins, which I can do from the raw data.

EDIT

The code I'm using to load the CSV is

x = np.loadtxt(fname, delimiter=',', skiprows=1).T
a = x[0]
b = x[1]

data = np.empty(0)
for i in range(np.size(b)):
    for j in range(int(b[i])):  # np.int was removed in NumPy 1.24; use the built-in int
        data = np.append(data, [a[i]])

Solution

  • You can do this with a list comprehension and np.concatenate:

    import numpy as np
    hist = np.array([[1., 2., 3., 4., 5., 6., 7., 8., 9., 10.], [0, 17, 3, 34, 35, 100, 101, 107, 12, 1]])
    new_array = np.concatenate([[hist[0][i]]*int(hist[1][i]) for i in range(len(hist[0]))])
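As an alternative to the list comprehension, np.repeat can expand each bin value by its count in a single vectorized call, which should scale better when each bin holds thousands of counts. The sketch below reuses the example histogram from the question. The number of logarithmic bins (five) and the names log_edges, log_hist and log_hist_weighted are arbitrary choices for illustration, not something taken from the question.

```python
import numpy as np

# Same example histogram as in the question: bin times and counts per bin
time_bins = np.array([1., 2., 3., 4., 5., 6., 7., 8., 9., 10.])
counts = np.array([0, 17, 3, 34, 35, 100, 101, 107, 12, 1])

# np.repeat needs integer repeat counts, so cast in case they were loaded as floats
data = np.repeat(time_bins, counts.astype(int))

# Logarithmic bin edges spanning the data (6 edges = 5 bins, an arbitrary choice)
log_edges = np.logspace(np.log10(time_bins.min()), np.log10(time_bins.max()), 6)

# Histogram of the reconstructed data
log_hist, _ = np.histogram(data, bins=log_edges)

# Same histogram without reconstructing the data, using the counts as weights
log_hist_weighted, _ = np.histogram(time_bins, bins=log_edges, weights=counts)
```

If the only goal is the re-binned histogram, the weights form may be enough on its own, since it never materializes the expanded array.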