mysql python-3.x face-recognition

Why can't I save and retrieve my binary vector and special characters from the database?


I am using the Python modules 'face_recognition' and 'pickle' with a MySQL database (through 'mysql.connector') to save face vectors and retrieve them again, but I'm having a problem with encodings.

Without UTF-8 encoding enabled, I can't save the special character '𝕲' to my database.

With UTF-8 encoding enabled, I can't get my pickled object back from the database.

import cv2
import pickle
import face_recognition
import mysql.connector as mysql

conn = mysql.connect(
    host='localhost',
    user='root',
    passwd='superpassword'
)

# Open image
img = face_recognition.load_image_file('test.jpg')
# Get vector
face_vector = face_recognition.face_encodings(img)[0]

cur = conn.cursor(buffered=True)
cur.execute("CREATE DATABASE IF NOT EXISTS test;")
cur.execute("USE test;")

# WITHOUT UTF-8 I GET THIS ERROR:
# Error: mysql.connector.errors.DataError: 1366 (22007): Incorrect string value: '\xF0\x9D\x95\xB2' for column `test`.`test`.`bug_char` at row 1
#cur.execute("SET NAMES 'utf8';")
#cur.execute("SET CHARACTER SET utf8;")

cur.execute("CREATE TABLE faces(bug_char VARCHAR(32), vectors BLOB)")

data_insert = ('𝕲', pickle.dumps(face_vector))
cur.execute('INSERT INTO faces(bug_char, vectors) VALUES(%s, %s)', data_insert)

cur.execute("SELECT * FROM faces;")
face_data = cur.fetchall()

for f in face_data:
    print(pickle.loads(f[1]))

# AND WITH UTF-8 I GET THIS ERROR WHEN I TRY TO GET MY OBJECT FROM THE DB:
'''
Traceback (most recent call last):
File "/home/user/Desktop/parser_steam/image_recognition/test/./test.py", line 203, in <module>
print(pickle.loads(f[1]))
_pickle.UnpicklingError: invalid load key, '?'.
'''

Sorry for my English.
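For illustration only, both error messages above can be reproduced without a database. The sketch below makes two assumptions that are not confirmed by the post: that the character is rejected because it needs four bytes in UTF-8 (MySQL's `utf8` character set stores at most three bytes per character, while `utf8mb4` stores four), and that the pickled bytes pass through a text charset conversion somewhere between insert and select. The `vector` list is a hypothetical stand-in for a real face encoding, and the ASCII round trip only simulates such a conversion.

```python
import pickle

# '\U0001D572' is the character from the question. It lies outside the
# Basic Multilingual Plane, so UTF-8 needs four bytes to encode it.
char_bytes = '\U0001D572'.encode('utf-8')
print(char_bytes)       # the same four bytes quoted in the DataError message
print(len(char_bytes))

# Hypothetical stand-in for a face encoding.
vector = [0.1, 0.2, 0.3]
blob = pickle.dumps(vector)

# Simulate a lossy text conversion: every byte that is not valid in the
# target charset is replaced by '?'. Pickle data starts with byte 0x80,
# which is not valid ASCII, so the first byte becomes '?'.
mangled = blob.decode('ascii', errors='replace').encode('ascii', errors='replace')

error_message = None
try:
    pickle.loads(mangled)
except pickle.UnpicklingError as exc:
    error_message = str(exc)
    print(error_message)
```

If this is what happens in the real setup, the pickled data itself is fine and the damage occurs in transit, which would point at the connection and column character sets rather than at `pickle`.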


Solution

  • Every face encoding is a list of 128 floats. The following may be a good explanation of why your issue is being raised and what to do about it.

    "pickling is recursive, not sequential. Thus, to pickle a list, pickle will start to pickle the containing list, then pickle the first element… diving into the first element and pickling dependencies and sub-elements until the first element is serialized. Then moves on to the next element of the list, and so on, until it finally finishes the list and finishes serializing the enclosing list. In short, it's hard to treat a recursive pickle as sequential, except for some special cases. It's better to use a smarter pattern on your dump, if you want to load in a special way.

    The most common pickle is to pickle everything with a single dump to a file -- but then you have to load everything at once with a single load. However, if you open a file handle and do multiple dump calls (e.g. one for each element of the list, or a tuple of selected elements), then your load will mirror that… you open the file handle and do multiple load calls until you have all the list elements and can reconstruct the list. It's still not easy to selectively load only certain list elements, however. To do that, you'd probably have to store your list elements as a dict (with the index of the element or chunk as the key) using a package like klepto, which can break up a pickled dict into several files transparently, and enables easy loading of specific elements."

    (source)
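The "multiple dump calls, multiple load calls" pattern described in the quote can be sketched as follows. This is a minimal illustration, not code from the quoted source: `encodings` is a hypothetical stand-in for a list of face encodings, and `io.BytesIO` is used in place of a real file handle so the example runs without touching disk.

```python
import io
import pickle

# Hypothetical stand-ins for face encodings (real ones hold 128 floats each).
encodings = [[0.1, 0.2, 0.3], [0.4, 0.5, 0.6], [0.7, 0.8, 0.9]]

# In-memory stand-in for a file opened in binary mode.
buffer = io.BytesIO()

# One dump call per element, all written to the same handle.
for enc in encodings:
    pickle.dump(enc, buffer)

# Loading mirrors the dumping: call load repeatedly until the stream
# is exhausted, which pickle signals with EOFError.
buffer.seek(0)
restored = []
while True:
    try:
        restored.append(pickle.load(buffer))
    except EOFError:
        break

print(restored == encodings)
```

Each `load` call returns exactly one previously dumped object, so elements come back in the order they were written. Reading the third element still requires reading the first two, which is the limitation the quote points out.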