I'm trying to extract the audio from a pytube video, then convert it into wav format. For extracting the audio from the video, I tried to use moviepy, but I can't find a way to open a video file from bytes with VideoFileClip. I don't want to keep saving files then reading them.
My attempt:
from io import BytesIO
from pytube import YouTube
import moviepy.editor as mp
yt_video = BytesIO()
yt_audio = BytesIO()
yt = YouTube(text)
videoStream = yt.streams.get_highest_resolution()
videoStream.stream_to_buffer(yt_video) # save video to buffer
my_clip = mp.VideoFileClip(yt_video) # processing video
my_clip.audio.write_audiofile(yt_audio) # extracting audio from video
You can get the URL of the stream and extract the audio using ffmpeg-python.
The ffmpeg-python module executes FFmpeg as a sub-process and reads the audio into a memory buffer.
FFmpeg transcodes the audio to the PCM codec in a WAV container (in a memory buffer).
The audio is read from the stdout pipe of the sub-process.
Here is a code sample:
from pytube import YouTube
import ffmpeg
text = 'https://www.youtube.com/watch?v=07m_bT5_OrU'
yt = YouTube(text)
# https://github.com/pytube/pytube/issues/301
stream_url = yt.streams[0].url  # Get the URL of the first stream (.all() is deprecated in newer pytube; the query can be indexed like a list)
# Probe the audio streams (use it in case you need information like sample rate):
#probe = ffmpeg.probe(stream_url)
#audio_streams = next((stream for stream in probe['streams'] if stream['codec_type'] == 'audio'), None)
#sample_rate = audio_streams['sample_rate']
# Read audio into memory buffer.
# Get the audio using stdout pipe of ffmpeg sub-process.
# The audio is transcoded to PCM codec in WAV container.
audio, err = (
    ffmpeg
    .input(stream_url)
    .output("pipe:", format='wav', acodec='pcm_s16le')  # Select WAV output format and pcm_s16le audio codec. May add ar=sample_rate
    .run(capture_stdout=True)
)
# Write the audio buffer to file for testing
with open('audio.wav', 'wb') as f:
    f.write(audio)
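Since the question asks to avoid temporary files, the `audio` bytes can also be used directly instead of being written to `audio.wav`. Below is a minimal sketch of pulling the raw PCM samples out of a WAV byte buffer. It is not part of pytube or ffmpeg-python: `pcm_from_wav_bytes` and `make_test_wav` are hypothetical helper names. It assumes uncompressed PCM (as produced by `acodec='pcm_s16le'`), and it assumes the size fields in the header may be placeholders, because FFmpeg cannot seek back in a pipe to fill them in. That is why the data size is clamped to the buffer length.

```python
import io
import struct
import wave


def pcm_from_wav_bytes(wav_bytes):
    """Return (channels, sample_rate, sample_width_bytes, pcm_data) from WAV bytes."""
    if wav_bytes[:4] != b"RIFF" or wav_bytes[8:12] != b"WAVE":
        raise ValueError("not a RIFF/WAVE buffer")
    pos = 12  # first chunk starts right after the 12-byte RIFF header
    fmt = None
    while pos + 8 <= len(wav_bytes):
        chunk_id, size = struct.unpack_from("<4sI", wav_bytes, pos)
        body = pos + 8
        if chunk_id == b"fmt ":
            # format tag, channels, sample rate, byte rate, block align, bits per sample
            _, channels, rate, _, _, bits = struct.unpack_from("<HHIIHH", wav_bytes, body)
            fmt = (channels, rate, bits // 8)
        elif chunk_id == b"data":
            if fmt is None:
                raise ValueError("data chunk found before fmt chunk")
            # A streamed header may carry a placeholder size, so clamp to the buffer.
            end = min(body + size, len(wav_bytes))
            return fmt + (wav_bytes[body:end],)
        pos = body + size + (size & 1)  # skip other chunks; chunks are padded to even length
    raise ValueError("no data chunk found")


def make_test_wav():
    """Build a tiny mono 16-bit WAV in memory (stand-in for the ffmpeg output)."""
    buf = io.BytesIO()
    with wave.open(buf, "wb") as w:
        w.setnchannels(1)
        w.setsampwidth(2)
        w.setframerate(8000)
        w.writeframes(struct.pack("<4h", 0, 1000, -1000, 0))
    return buf.getvalue()


channels, rate, width, pcm = pcm_from_wav_bytes(make_test_wav())
```

With the sample above, the call would be `pcm_from_wav_bytes(audio)`. The returned `pcm` bytes can then be wrapped in `io.BytesIO` or passed to whatever library needs the samples, with no file on disk.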
Notes: