I am a student who is quite new to C++ and I wanted to code something to get used to this language. A while ago, I coded a Python program that makes ASCII art "videos". Basically, it converts every frame of the video to strings that are stored in a list, and then displays them on the terminal. But the computations were way too slow, so I decided to switch to C++. In the end, I managed to speed up the conversion process by a lot just by switching to C++.
However, as I tried to print the ASCII frames on the terminal, I noticed that there was a weird stuttering problem that didn't occur with Python (see link): Video link
Here is the C++ code snippet that handles the printing.
(...)
auto start_time(chrono::high_resolution_clock::now());
int current_index(0);
int last_index(0);
while (current_index < number_of_frames - 1) {
    // Computing the current index
    double elapsed_time(chrono::duration_cast<chrono::milliseconds>(chrono::high_resolution_clock::now() - start_time).count() / 1000.);
    double percentage(elapsed_time / audio_duration);
    current_index = percentage * number_of_frames;
    // Print the new frame if it's different from the last one to avoid spamming them
    if (current_index != last_index) {
        _write(_fileno(stdout), frames[current_index].c_str(), frames[current_index].size());
    }
    last_index = current_index;
}
(...)
In order to compare it with Python, here is the Python version of that piece of code. It uses the time and sys modules.
(...)
start_time = time.time()
current_frame = 0  # number of frames already shown
last_frame = 0
while current_frame < len(frames) - 1:
    elapsed_time = time.time() - start_time
    percentage = (elapsed_time / float(audio_duration))
    current_frame = int(percentage * number_of_frames)
    if current_frame != last_frame:
        sys.stdout.write(frames[current_frame])
    last_frame = current_frame
(...)
While it may seem like a very specific question, it just doesn't make sense that Python would perform better than C++ in this situation. Google didn't help either, so there must be a problem somewhere in my source code.
I tried calling different types of functions like std::cout, std::printf and the raw write, and even messing a bit with the frame data (like resizing some of the frames), but to no avail. I really don't know what is causing this issue.
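Concretely, the output variants I tried look roughly like this (frame stands for one entry of the frames vector; the explicit flush is only there to push the text out immediately):

// Needs <iostream>, <cstdio> and <io.h>; frame is one entry of the frames vector.
std::cout << frame << std::flush;                                 // iostream
std::printf("%s", frame.c_str());                                 // C stdio
_write(_fileno(stdout), frame.c_str(), (unsigned)frame.size());   // raw CRT write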
Here are some steps to reproduce the issue in C++ (pseudocode/general outline): create a string of terminal_height*terminal_width characters so that it covers your whole screen (it can be anything, but I recommend an easily identifiable pattern like ASCII art), then write it to stdout repeatedly in a timed loop like the one above; a minimal sketch is given below.
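For reference, a minimal, self-contained sketch of these steps might look like the following (the terminal size, the fill pattern and the assumed 30 frames per second are placeholders, not values from my program):

#include <chrono>
#include <string>
#include <io.h>
#include <cstdio>

int main()
{
    // Placeholder terminal size and frame count; adjust to your own screen.
    const int width(120), height(30), total_frames(300);
    // Build one screen-sized frame with an easily identifiable pattern.
    std::string frame;
    for (int row(0); row < height; ++row) {
        frame += std::string(width, row % 2 ? '#' : '.');
        frame += '\n';
    }
    auto start(std::chrono::steady_clock::now());
    int current_index(0), last_index(-1);
    while (current_index < total_frames) {
        double elapsed(std::chrono::duration_cast<std::chrono::milliseconds>(std::chrono::steady_clock::now() - start).count() / 1000.);
        current_index = static_cast<int>(elapsed * 30.); // pretend the video runs at 30 FPS
        // Only rewrite the screen when the frame index changes, as in my program.
        if (current_index != last_index) {
            _write(_fileno(stdout), frame.c_str(), static_cast<unsigned>(frame.size()));
        }
        last_index = current_index;
    }
    return 0;
}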
Here is the full C++ code. It uses OpenCV and FFmpeg. I did not try to optimise anything yet.
#include <iostream>
#include <filesystem>
#include <Windows.h>
#include <fstream>
#include <string>
#include <cmath>
#include <vector>
#include <chrono>
#include <algorithm> // for reverse
#include <cstdio>    // for FILE, fgets, _popen
#include <cstdlib>   // for atof
#include <io.h>
#include <opencv2/core/core.hpp>
#include <opencv2/imgcodecs/imgcodecs.hpp>
#pragma comment (lib, "winmm.lib")
using namespace std;
namespace fs = filesystem;
// Prototypes
size_t number_of_files(fs::path path);
int main()
{
    // Defining the grey scale ramp
    string char_ramp("$@B%8&WM#*oahkbdpqwmZO0QLCJUYXzcvunxrjft/\\|()1{}[]?-_+~<>i!lI;:,\"^`'. ");
    reverse(char_ramp.begin(), char_ramp.end());
    // Ask user for input about the video
    string video_name("");
    int brightness(0);
    cout << "Put the terminal in full screen." << endl;
    cout << "Video name and brightness threshold: ";
    cin >> video_name >> brightness;
    // Get terminal info
    CONSOLE_SCREEN_BUFFER_INFO csbi;
    GetConsoleScreenBufferInfo(GetStdHandle(STD_OUTPUT_HANDLE), &csbi);
    const unsigned int target_frame_width(csbi.srWindow.Right - csbi.srWindow.Left + 1);
    const unsigned int target_frame_height(csbi.srWindow.Bottom - csbi.srWindow.Top + 1);
    // Get video and audio path
    const string video_path("videos/" + video_name);
    const string audio_path("tmp/audio.wav");
    cout << "[1/3] Cleaning up..." << endl;
    // Delete previous folders and make new ones for the new video
    fs::remove_all(audio_path);
    fs::remove_all("tmp/frames");
    fs::create_directory("tmp/frames");
    cout << "[2/3] Extracting frames and audio..." << endl;
    // FFmpeg extracts frames and audio
    system(("ffmpeg -loglevel warning -i " + video_path + " -vf scale=" + to_string(target_frame_width) + ":" + to_string(target_frame_height) + " tmp/frames/%0d.bmp").c_str());
    system(("ffmpeg -loglevel warning -i " + video_path + " tmp/audio.wav").c_str());
    cout << "[3/3] Converting..." << endl;
    const size_t number_of_frames(number_of_files("tmp/frames"));
    vector<string> frames(number_of_frames);
    // Frame to ASCII frame conversion
    for (int frame_index(1); frame_index <= number_of_frames; ++frame_index) {
        string frame_path("tmp/frames/" + to_string(frame_index) + ".bmp");
        string current_frame("");
        cv::Mat current_bmp(cv::imread(frame_path));
        for (int i(0); i < current_bmp.rows; ++i) {
            for (int j(0); j < current_bmp.cols; ++j) {
                // The B, G, R colour vector of pixel (i, j)
                cv::Vec3b bgr(current_bmp.at<cv::Vec3b>(i, j));
                // Use the luminance of the pixel
                double greyscale_index(sqrt(0.299 * bgr[2] * bgr[2] + 0.587 * bgr[1] * bgr[1] + 0.114 * bgr[0] * bgr[0]));
                current_frame += char_ramp[(int)(greyscale_index / 255 * (char_ramp.size() - 1)) / brightness];
            }
            current_frame += "\n";
        }
        frames[frame_index - 1] = current_frame;
    }
    // Grabbing audio duration
    FILE* fp;
    char var[40];
    double audio_duration(0);
    fp = _popen(("ffprobe -loglevel warning -i " + audio_path + " -show_entries format=duration -v quiet -of csv=\"p=0\"").c_str(), "r");
    while (fgets(var, sizeof(var), fp) != NULL) {
        audio_duration = atof(var);
    }
    _pclose(fp);
    PlaySound(TEXT("E:/Programmation/Cpp/SeuillageImage/tmp/audio.wav"), NULL, SND_FILENAME | SND_ASYNC | SND_LOOP);
    // Begin printing frames
    auto start_time(chrono::steady_clock::now());
    int current_index(0);
    int last_index(0);
    while (current_index < number_of_frames - 1) {
        // Computing the current index
        double elapsed_time(chrono::duration_cast<chrono::milliseconds>(chrono::steady_clock::now() - start_time).count() / 1000.);
        double percentage(elapsed_time / audio_duration);
        current_index = percentage * number_of_frames;
        // Print the new frame if it's different from the last one to avoid spamming them
        if (current_index != last_index) {
            _write(_fileno(stdout), frames[current_index].c_str(), frames[current_index].size());
        }
        last_index = current_index;
    }
    // Stop audio
    PlaySound(NULL, NULL, 0);
    return 0;
}
size_t number_of_files(fs::path path) {
    return distance(fs::directory_iterator(path), fs::directory_iterator{});
}
Here is the Python version of that code. It uses the Python Imaging Library (PIL) and an optional progress.bar module. It is much slower than the C++ code.
import argparse
import os
import shutil
import subprocess
import time
import sys
import wave
import winsound
from math import sqrt
from PIL import Image
from progress.bar import IncrementalBar
parser = argparse.ArgumentParser()
parser.add_argument("video_name", type=str, help="Nom de la vidéo")
parser.add_argument("brightness", type=int, help="Seuil de luminosité")
arguments = parser.parse_args()
# http://paulbourke.net/dataformats/asciiart/
# "Standard" character ramp for grey scale pictures from black to white
char_ramp = """$@B%8&WM#*oahkbdpqwmZO0QLCJUYXzcvunxrjft/\|()1{}[]?-_+~<>i!lI;:,"^`'. """[::-1]
video_name = arguments.video_name
brightness = arguments.brightness
video_path = os.path.join(os.getcwd(), fr"videos\{video_name}")
audio_path = os.path.join(os.getcwd(), fr"tmp\audio.wav")
# STEP 1: delete any files left over from a previous run
print("[INFO] Step 1 of 3: cleaning up...")
# Check whether a temporary storage folder already exists
if os.path.isdir("tmp"):
    # Delete the folder containing the video frames and create a new one
    if os.path.isdir("tmp/frames"):
        shutil.rmtree("tmp/frames", ignore_errors=False)
    os.mkdir("tmp/frames")
    # Do the same for the corresponding audio file
    if os.path.isfile("tmp/audio.wav"):
        os.remove("tmp/audio.wav")
# Otherwise, create a new temporary storage folder
else:
    os.mkdir("tmp")
    os.mkdir("tmp/frames")
# STEP 2: extract the successive frames of the video with ffmpeg so they can be converted
print("[INFO] Step 2 of 3: extracting the frames and the audio...")
# Get the width and height of the terminal window
window_size = os.get_terminal_size()
target_frame_width = window_size[0] - 1
target_frame_height = window_size[1]
# Use the external ffmpeg tool to extract the successive frames of the video
subprocess.call(
    f"ffmpeg -loglevel warning -i {video_path} -vf scale={target_frame_width}:{target_frame_height} tmp/frames/%0d.bmp")
subprocess.call(f"ffmpeg -loglevel warning -i {video_path} tmp/audio.wav")
# STEP 3: convert the video to ASCII
print("[INFO] Step 3 of 3: converting the video, this may take a while...")
# Create a list to store the frames, plus a few other useful variables
frames = []
number_of_frames = len(
    [frame for frame in os.listdir("tmp/frames") if os.path.isfile(os.path.join("tmp/frames", frame))])
bar = IncrementalBar("Progress", max=number_of_frames)
for frame_index in range(1, number_of_frames + 1):
    frame_path = os.path.join(os.getcwd(), fr"tmp\frames\{frame_index}.bmp")
    # Build the ASCII image from the grey levels of the resized bitmap
    frame_builder = ""
    with Image.open(frame_path) as frame:
        # For every pixel of the image
        for y in range(frame.height):
            for x in range(frame.width):
                # Convert the pixel to a grey level
                rgb = frame.getpixel((x, y))
                # Use the luminance rather than the plain RGB average
                greyscale_index = sqrt(0.299 * rgb[0] ** 2 + 0.587 * rgb[1] ** 2 + 0.114 * rgb[2] ** 2)
                # Append the matching ASCII character to the string (linear mapping)
                frame_builder += char_ramp[int(greyscale_index / 255 * (len(char_ramp) - 1)) // brightness]
            # Start a new line after each processed row
            frame_builder += "\n"
    # Add the finished ASCII frame to the list of frames
    frames.append(frame_builder)
    bar.next()
bar.finish()
# Open the audio file to compute its duration
audio_wave = wave.open(audio_path, "rb")
audio_sample_rate = audio_wave.getframerate()
audio_frames = audio_wave.getnframes()
audio_duration = audio_frames / audio_sample_rate
# Play the audio and start the clock
winsound.PlaySound(audio_path, winsound.SND_ASYNC)
start_time = time.time()
# Print the frames according to their position in time relative to the audio (synchronisation)
current_frame = 0  # number of frames already shown
last_frame = 0
while current_frame < len(frames) - 1:
    elapsed_time = time.time() - start_time
    percentage = (elapsed_time / float(audio_duration))
    current_frame = int(percentage * number_of_frames)
    if current_frame != last_frame:
        sys.stdout.write(frames[current_frame])
    last_frame = current_frame
I hope that this extra information can help you understand my problem better.
Basically, cmd.exe is too slow for my program. I switched to Windows Terminal and it worked perfectly fine. I also had to adjust the cursor's position so that it stays at (0, 0) to prevent the screen from scrolling.
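For completeness, one way to keep the cursor at (0, 0) with the console API looks roughly like this (a small helper around the same _write call; the helper name is just for illustration, not the exact code from my program):

#include <Windows.h>
#include <io.h>
#include <cstdio>
#include <string>

// Move the cursor back to (0, 0) before each frame so the new frame
// overwrites the previous one instead of making the screen scroll.
void print_frame(const std::string& frame)
{
    SetConsoleCursorPosition(GetStdHandle(STD_OUTPUT_HANDLE), COORD{ 0, 0 });
    _write(_fileno(stdout), frame.c_str(), static_cast<unsigned>(frame.size()));
}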