I want to extract the least significant bit of each RGB value in an image and convert those bits to their ASCII equivalent. The problem is that my approach, looping over the numpy matrix in Python, is extremely slow. The same strategy in Java is about 100 times faster. The image is no larger than 1024 * 1024, so the matrix is at most 1024 * 1024 * 3 in size.
According to the Python documentation, append has O(1) time complexity, and my loop is O(n^2) where n <= 1024. I understand that Python is interpreted whereas Java uses a JIT compiler and is therefore much faster. However, the time difference here is far too great.
Can this operation be done in a much more efficient way?
def extract_info_from_lsb(self, path):
    lsb_message_result = []
    matrix = self.image_to_matrix(path)
    for row in matrix:
        lsb_message_list = []
        for pixel in row:
            for color in pixel:
                lsb = color & 1
                lsb_message_list.append(lsb)
        lsb_message_result.append(lsb_message_list)
    for i, lsb_message in enumerate(lsb_message_result):
        lsb_message_result[i] = self.text_from_bits(lsb_message)
    return lsb_message_result
The function I use to convert the binary values to ASCII is as follows:
def text_from_bits(self, bits):
    chars = []
    # Integer division so that range() receives an int on Python 3 as well
    for b in range(len(bits) // 8):
        byte = bits[b * 8:(b + 1) * 8]
        chars.append(chr(int(''.join([str(bit) for bit in byte]), 2)))
    return ''.join(chars)
The function for converting the image to a matrix is:
# Module-level imports needed by the methods shown here
import numpy as np
from PIL import Image

def image_to_matrix(self, path):
    image = Image.open(path)
    matrix = np.array(image)
    return matrix
A fast way to get the LSB from an ndarray is to vectorize the modulo operation (i.e. apply it to the whole array) to let numpy do the looping (see comments for explanation):
def extract_info_from_lsb(self, path):
    lsb_message_result = []
    matrix = self.image_to_matrix(path)
    matrix = matrix.astype(int)  # make sure the data type is integer (redundant)
    lsb_matrix = matrix % 2  # modulo two to get the LSB of each element
    lsb_message_result = lsb_matrix.ravel()  # flatten to a 1D array
    lsb_message_result = lsb_message_result.tolist()  # optional: convert to list
    return lsb_message_result  # flat list of bits, ready for text_from_bits
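As a quick sanity check of the vectorized step, here is a minimal sketch on a made-up 2 x 2 RGB array (the pixel values are purely illustrative and stand in for the result of np.array(image)). It also shows that a bitwise AND with 1 gives the same bits as modulo 2 on unsigned integers, so either form can be used:

```python
import numpy as np

# Hypothetical 2x2 RGB image stored as uint8, standing in for np.array(image)
matrix = np.array([[[254, 255, 0], [1, 2, 3]],
                   [[10, 11, 12], [13, 14, 15]]], dtype=np.uint8)

# Both expressions operate on the whole array at once, with no Python-level loop
lsb_mod = (matrix % 2).ravel()  # modulo two, as in the answer
lsb_and = (matrix & 1).ravel()  # bitwise AND, as in the question's inner loop

# For non-negative integers the two results are identical
assert np.array_equal(lsb_mod, lsb_and)
print(lsb_and.tolist())  # [0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1]
```

The flattened order is row by row, pixel by pixel, channel by channel, which matches the order in which the nested loops in the question visit the values.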
Vectorized conversion to ASCII (it assumes the total number of bits, i.e. pixels times color channels, is an exact multiple of 8):
def text_from_bits(self, bits):
    bits = np.reshape(bits, (-1, 8))  # matrix with 8 elements per row (1 byte)
    bitvalues = [128, 64, 32, 16, 8, 4, 2, 1]
    byte_values = np.sum(bits * bitvalues, axis=1)  # rows to bytes
    chars = [chr(b) for b in byte_values]  # convert each byte to a character and put into a list
    return ''.join(chars)
Note that you will get ASCII values in the range 0 - 255. This is not strictly ASCII, which traditionally is only in the range 0 - 127.
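A possible alternative, not part of the code above, is to let numpy assemble the bytes with np.packbits. The sketch below is a hedged illustration: the helper name text_from_bits_packed is hypothetical, it drops any trailing bits that do not fill a whole byte (instead of requiring an exact multiple of 8), and it decodes as Latin-1 so that every value 0 - 255 maps to the same character that chr() would give:

```python
import numpy as np

def text_from_bits_packed(bits):
    """Hypothetical helper: turn a flat sequence of 0/1 values into a string."""
    bits = np.asarray(bits, dtype=np.uint8)
    # Keep only as many bits as fill whole bytes (assumption: extra bits are noise)
    usable = bits[:(bits.size // 8) * 8]
    # packbits treats the first bit of each group of 8 as the most significant bit
    byte_values = np.packbits(usable)
    # Latin-1 maps each byte value 0-255 to the code point with the same number
    return byte_values.tobytes().decode('latin-1')

# 01001000 -> 'H', 01101001 -> 'i', followed by one stray bit that is dropped
bits = [0, 1, 0, 0, 1, 0, 0, 0,
        0, 1, 1, 0, 1, 0, 0, 1,
        1]
print(text_from_bits_packed(bits))  # Hi
```

This avoids building a Python list of characters one by one, which may matter for a 1024 * 1024 * 3 array; whether dropping leftover bits is acceptable depends on how the message was embedded.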
Relevant performance related concepts: