Context
I'm converting a PNG sequence into a video using FFmpeg. The images are semi-transparent portraits whose background has been removed digitally.
Issue
The edge pixels of the subject are stretched all the way to the frame border, creating a fully opaque video.
Cause Analysis
The process worked fine in the previous workflow, which used rembg from the command line. However, since I started calling rembg from a Python script with alpha_matting enabled (to obtain higher-quality results), the resulting video has these issues.
The issue is present in both WebM (the target format) and MP4 (used for testing).
Command Used
Command used for webm is:
ffmpeg -thread_queue_size 64 -framerate 30 -i <png sequence location> -c:v libvpx -b:v 0 -crf 18 -pix_fmt yuva420p -auto-alt-ref 0 -c:a libvorbis <webm output location>
Troubleshooting Steps Taken
Image of the Issue
Sample PNG Image from sequence
Code Context
The rembg call via the Python API with alpha_matting seems to generate a premultiplied alpha, leaving non-black RGB values in fully transparent (alpha = 0) pixels.
remove(input_data, alpha_matting=True,
       alpha_matting_foreground_threshold=250,
       alpha_matting_background_threshold=250,
       alpha_matting_erode_size=12)
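This can be verified directly with a small diagnostic sketch: count how many fully transparent pixels carry non-black RGB values (the frame filename in the commented usage is an assumption, substitute a frame from your sequence):

```python
from PIL import Image
import numpy as np

def count_dirty_transparent_pixels(rgba):
    """Count fully transparent pixels (alpha == 0) whose RGB is not black."""
    rgb = rgba[:, :, :3]
    alpha = rgba[:, :, 3]
    mask = (alpha == 0) & (rgb != 0).any(axis=2)
    return int(mask.sum())

# Hypothetical usage on one frame of the sequence:
# rgba = np.array(Image.open('frame-00001.png').convert('RGBA'))
# print(count_dirty_transparent_pixels(rgba))
```

A clean rembg output should report 0; with alpha_matting enabled the count is expected to be large.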
A test using a rough RGB reset of zero-alpha pixels confirms that the player renders the images' RGB values while ignoring alpha.
def reset_alpha_pixels(img):
    # Process each pixel: blank out the RGB of fully transparent pixels.
    new_data = []
    for item in img.getdata():
        if item[3] == 0:
            new_data.append((0, 0, 0, 0))  # Reset RGB where alpha is 0
        else:
            new_data.append(item)  # Keep the pixel unchanged
    # Update the image data in place
    img.putdata(new_data)
    return img
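The per-pixel loop above is slow for large frames; a NumPy-vectorized equivalent of the same RGB reset (a sketch, with an assumed function name) could look like:

```python
import numpy as np
from PIL import Image

def reset_alpha_pixels_np(img):
    """Zero the RGB of fully transparent pixels, keeping all other pixels intact."""
    data = np.array(img.convert('RGBA'))
    data[data[:, :, 3] == 0] = (0, 0, 0, 0)  # Blank out pixels with alpha == 0
    return Image.fromarray(data)
```

Boolean-mask assignment replaces the Python loop, so the whole frame is processed in a few array operations.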
Updates
The issue is related to the video player.
Most video players don't support transparency and simply ignore the alpha channel.
Such a player displays the RGB content of the background even though it is supposed to be hidden (the background pixels are fully transparent according to their alpha values).
Apparently, rembg's output background is not filled with solid black or white, but contains the stretched edge pixels.
When opening the PNG image, or when playing the video in the Chrome browser for example, the background is transparent (the RGB values are hidden) and we can't see the "stretched effect".
Solving the issue using FFmpeg alone is challenging.
It is better to fix the issue in the Python code, after applying rembg.
To fix the issue, we may select a solid background color, such as (200, 200, 200) light gray, and alpha-composite the RGB channels over that background.
foreground_rgb = image_after_rembg[:, :, 0:3]  # Extract RGB channels.
alpha_ch = image_after_rembg[:, :, 3]  # Extract alpha (transparency) channel.
alpha = alpha_ch.astype(np.float32) / 255  # Convert alpha from range [0, 255] to [0, 1].
alpha = alpha[..., np.newaxis]  # Add axis - new alpha shape is (1024, 1024, 1). We need it for scaling the 3D RGB by the 2D alpha channel.
background_rgb = np.full_like(foreground_rgb, (200, 200, 200))  # Set background RGB to a light gray color (for example).
composed_rgb = foreground_rgb.astype(np.float32) * alpha + background_rgb.astype(np.float32) * (1 - alpha)
composed_rgb = composed_rgb.round().astype(np.uint8)  # Convert to uint8 with rounding.
composed_rgba = np.dstack((composed_rgb, alpha_ch))  # Keep the original alpha channel.
Complete Python code sample:
from PIL import Image
import numpy as np
#from rembg import remove
#image_file_before_rembg = 'input.png'
image_file_after_rembg = 'frame-00001.png'
# Assume code for removing background looks as follows:
#image_before_rembg = Image.open(image_file_before_rembg)
#image_after_rembg = remove(image_before_rembg)
#image_after_rembg.save(image_file_after_rembg)
image_after_rembg = Image.open(image_file_after_rembg) # Skip background removing, and read the result from a file.
image_after_rembg = np.array(image_after_rembg) # Convert PIL to NumPy array.
foreground_rgb = image_after_rembg[:, :, 0:3] # Extract RGB channels.
alpha_ch = image_after_rembg[:, :, 3] # Extract alpha (transparency) channel
alpha = alpha_ch.astype(np.float32) / 255 # Convert alpha from range [0, 255] to [0, 1].
alpha = alpha[..., np.newaxis] # Add axis - new alpha shape is (1024, 1024, 1). We need it for scaling 3D rgb by 2D alpha channel.
background_rgb = np.full_like(foreground_rgb, (200, 200, 200)) # Set background RGB color to light gray color (for example).
# Apply "alpha compositing" of rgb and background_rgb
composed_rgb = foreground_rgb.astype(np.float32) * alpha + background_rgb.astype(np.float32) * (1 - alpha)
composed_rgb = composed_rgb.round().astype(np.uint8) # Convert to uint8 with rounding.
composed_rgba = np.dstack((composed_rgb, alpha_ch)) # Add the original alpha channel to composed_rgb
Image.fromarray(composed_rgba).save('new_frame-00001.png') # Save the RGBA output image to PNG file
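To fix the whole sequence rather than a single frame, the same compositing steps can be wrapped in a function and applied per file (a sketch; the function name, glob pattern and output directory are assumptions):

```python
import glob
import os
from PIL import Image
import numpy as np

def composite_on_background(rgba, bg_color=(200, 200, 200)):
    """Alpha-composite RGB over a solid background, keeping the original alpha channel."""
    rgba = np.asarray(rgba)
    rgb = rgba[:, :, :3].astype(np.float32)
    alpha_ch = rgba[:, :, 3]
    alpha = (alpha_ch.astype(np.float32) / 255)[..., np.newaxis]
    bg = np.full_like(rgb, bg_color)
    composed = (rgb * alpha + bg * (1 - alpha)).round().astype(np.uint8)
    return np.dstack((composed, alpha_ch))

# Assumed layout: input frames in ./frames, fixed frames written to ./fixed.
# for path in sorted(glob.glob('frames/frame-*.png')):
#     rgba = np.array(Image.open(path).convert('RGBA'))
#     Image.fromarray(composite_on_background(rgba)).save(
#         os.path.join('fixed', os.path.basename(path)))
```

The fixed frames can then be fed to the same FFmpeg command as before.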
Executing FFmpeg:
ffmpeg -y -framerate 30 -loop 1 -t 5 -i new_frame-00001.png -vf "format=rgba" -c:v libvpx -crf 18 -pix_fmt yuva420p -auto-alt-ref 0 out.webm
When playing with Chrome browser, the background is transparent.
When playing with VLC Player, the background is light gray:
Using the FFmpeg CLI, we have to use the alphaextract, overlay and alphamerge filters.
Example (5 seconds at 3 fps for testing):
ffmpeg -y -framerate 3 -loop 1 -i frame-00001.png -filter_complex "color=white:r=3[bg];[0:v]format=rgba,split=2[va][vb];[vb]alphaextract[alpha];[bg][va]scale2ref[bg0][v0];[bg0][v0]overlay=shortest=1,format=rgb24[rgb];[rgb][alpha]alphamerge" -c:v libvpx -crf 18 -pix_fmt yuva420p -auto-alt-ref 0 -t 5 out.webm