I want to access the content of a binary file in a git repository using GitPython. Unfortunately, repo.git.show
returns a Unicode string rather than a bytes object, so I tried to convert the string to bytes, but the conversion fails.
#!/usr/bin/env python
from io import BytesIO
import git
# initialize repository
repo = git.Repo('.')
# use git show to get the content of example.jpg in revision 4cb2a02
u = repo.git.show("4cb2a02:example.jpg")
b = BytesIO(u.encode('utf-8'))
When I run this, I get:
UnicodeEncodeError: 'utf-8' codec can't encode character '\udcff' in position 0: surrogates not allowed
Which is not a surprise.
How can I convert this Unicode string into bytes? Or better, how do I fetch the content of the file as a bytes object?
Try encoding with the surrogateescape error handler:
b = BytesIO(u.encode('utf-8','surrogateescape'))
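To see why this should work, here is a minimal sketch using only the standard library. It assumes the string was produced by decoding the raw bytes as UTF-8 with the surrogateescape handler, which is what the '\udcff' in the error message suggests. The sample bytes are made up for illustration, and read_blob_bytes is a hypothetical helper name; it relies on GitPython's blob objects exposing data_stream, and is one possible way to skip the string round trip entirely.

```python
# Bytes that are not valid UTF-8, e.g. the start of a JPEG file (illustrative data).
raw = b"\xff\xd8\xff\xe0\x00\x10JFIF"

# Decoding with 'surrogateescape' maps each undecodable byte 0xNN to the
# lone surrogate U+DCNN instead of raising; that is where '\udcff' comes from.
text = raw.decode("utf-8", "surrogateescape")
assert text[0] == "\udcff"

# Encoding with the same handler turns the surrogates back into the original bytes.
restored = text.encode("utf-8", "surrogateescape")
assert restored == raw

# Without the handler, encoding fails in the same way as in the question.
strict_error = None
try:
    text.encode("utf-8")
except UnicodeEncodeError as exc:
    strict_error = exc


def read_blob_bytes(repo_path, rev, file_path):
    """Hypothetical helper: read a file at a given revision as raw bytes."""
    import git  # GitPython; imported lazily so the demo above runs without it

    # Look the file up in the commit's tree and read the blob directly.
    blob = git.Repo(repo_path).commit(rev).tree / file_path
    return blob.data_stream.read()
```

With the values from the question, the call would be read_blob_bytes('.', '4cb2a02', 'example.jpg'). Reading the blob directly is probably the safer choice for binary files, since the command wrapper may also strip a trailing newline from the output before you ever see it.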