I want to:
Download audio files from YouTube,
which I have done with pytube. However, the result is an mp4 file even though I set only_audio to True.
Then turn the audio files into numpy arrays.
There are libraries that work on mp3, for example, pydub, but not mp4. When I tried moviepy, it failed because there is no video and therefore no framerate. I don't want to download the video because it will take much longer.
Note that I want the audio, not the video.
How can I download audio from YouTube and turn it into numpy arrays?
Thanks for any help :)
EDIT
Thanks to the comments, I've managed to turn the mp4 into mp3 using ffmpeg.
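The exact ffmpeg command is not shown here. As a rough sketch of that conversion step, assuming ffmpeg is installed and reachable by name, it could be driven from Python roughly like this. The helper names `build_convert_cmd` and `convert_to_mp3` are made up for illustration; `-vn` drops any video stream and `-y` overwrites an existing output file.

```python
import subprocess


def build_convert_cmd(src, dst, ffmpeg="ffmpeg"):
    """Build the ffmpeg argument list that converts src to dst.

    Hypothetical helper: ffmpeg picks the output format from dst's extension.
    """
    return [ffmpeg, "-y", "-i", src, "-vn", dst]


def convert_to_mp3(src, dst, ffmpeg="ffmpeg"):
    """Run ffmpeg and raise CalledProcessError if the conversion fails."""
    subprocess.run(build_convert_cmd(src, dst, ffmpeg), check=True)
```

Passing the arguments as a list rather than one string avoids quoting problems with paths that contain spaces, such as "Google Drive".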
However, when I tried to turn it into numpy arrays using the code from this question, which looks like this:
import numpy as np
import pydub

def read(f, normalized=False):
    """MP3 to numpy array"""
    a = pydub.AudioSegment.from_mp3(f)
    y = np.array(a.get_array_of_samples())
    if a.channels == 2:
        y = y.reshape((-1, 2))
    if normalized:
        # Scale 16-bit samples into the range [-1, 1)
        return a.frame_rate, np.float32(y) / 2**15
    else:
        return a.frame_rate, y
it raised this error:
Traceback (most recent call last):
  File "C:\Users\myname\Google Drive\Python\Projects\Music\Downloads\Music Read.py", line 63, in <module>
    print(read(x,True))
......
  File "C:\Users\myname\AppData\Local\Programs\Python\Python36\lib\subprocess.py", line 1017, in _execute_child
    startupinfo)
FileNotFoundError: [WinError 2] The system cannot find the file specified
This is weird because, as demonstrated below, the path should work perfectly:
for f in os.listdir(path):
    if (f.endswith(".mp3")):
        print(f)
        x = 'C:/Users/myname/Google Drive/Python/Projects/Music/Downloads/{}'.format(f)
        print(os.path.exists(x))
        print(open(x))
        print(read(x,True))
outputs:
test-Copy.mp3
True
c:/users/myname/google drive/python/projects/music/downloads/test-copy.mp3
<_io.TextIOWrapper name='c:/users/myname/google drive/python/projects/music/downloads/test-copy.mp3' mode='r' encoding='cp1252'>
Also, when I input a file path that actually doesn't exist, it outputs a different error:
......
  File "C:\Users\myname\AppData\Local\Programs\Python\Python36\lib\site-packages\pydub\utils.py", line 57, in _fd_or_path_or_tempfile
    fd = open(fd, mode=mode)
FileNotFoundError: [Errno 2] No such file or directory: 'c:/users/myname/google drive/python/projects/music/downloads/hi'
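The two tracebacks fail in different places: this one comes from `open()` inside pydub, while the earlier one comes from `subprocess`. That hints that the file Windows could not find may be the ffmpeg executable that pydub tries to launch, rather than the mp3 itself. A small sketch for checking that, using only the standard library (the helper name `find_tool` is hypothetical):

```python
import shutil


def find_tool(name="ffmpeg"):
    """Return the full path of an executable found on PATH, or None if absent."""
    return shutil.which(name)


if __name__ == "__main__":
    path = find_tool()
    if path is None:
        print("ffmpeg is not on PATH, so subprocess cannot launch it by name")
    else:
        print("ffmpeg found at", path)
```

If this reports that ffmpeg is missing, adding its `bin` folder to PATH or calling it by full path would be the next thing to try.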
How can I use the code from this question to turn the mp3 into numpy arrays? If I can't, how else can I do it?
By the way, I'm running Windows 10 with Python 3.6.
I really hope I have made myself clear enough, and again thanks in advance for any bits of advice :)
It feels weird to answer my own question, but:
I got around the pydub issue by using this code:
from subprocess import Popen, PIPE
import numpy as np

def decode(fname):
    # If you are on Windows, use the full path to ffmpeg.exe
    cmd = ["C:/Users/allen/Google Drive/Python/Tools/ffmpeg-20190604-d3f236b-win64-static/bin/ffmpeg.exe", "-i", fname, "-f", "wav", "-"]
    # On Windows, add the argument creationflags=0x8000000 to keep another console window from popping up
    p = Popen(cmd, stdin=PIPE, stdout=PIPE, stderr=PIPE)
    data = p.communicate()[0]
    # Skip the b"data" tag and its 4-byte size field to reach the raw 16-bit samples
    return np.frombuffer(data[data.find(b"data") + 8:], np.int16)
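Slicing at a fixed offset ignores the WAV header fields, so stereo audio comes back as one interleaved 1-D array and the sample rate is lost. As a sketch of a slightly more careful alternative to the slicing in `decode`, the function below walks the RIFF chunks of the WAV bytes, reads the channel count and sample rate from the `fmt ` chunk, and reshapes the samples. It assumes 16-bit PCM, which as far as I know is what ffmpeg writes for `-f wav` by default; the name `wav_bytes_to_array` is hypothetical.

```python
import struct

import numpy as np


def wav_bytes_to_array(data):
    """Parse 16-bit PCM WAV bytes into (sample_rate, samples).

    samples has shape (n_frames,) for mono, (n_frames, channels) otherwise.
    """
    if data[:4] != b"RIFF" or data[8:12] != b"WAVE":
        raise ValueError("not a RIFF/WAVE stream")
    pos, channels, rate = 12, None, None
    while pos + 8 <= len(data):
        chunk_id = data[pos:pos + 4]
        (size,) = struct.unpack("<I", data[pos + 4:pos + 8])
        body = pos + 8
        if chunk_id == b"fmt ":
            # fmt layout: format tag, channels, sample rate, byte rate,
            # block align, bits per sample (all little-endian)
            _, channels, rate, _, _, bits = struct.unpack(
                "<HHIIHH", data[body:body + 16])
            if bits != 16:
                raise ValueError("only 16-bit PCM is handled in this sketch")
        elif chunk_id == b"data":
            if channels is None:
                raise ValueError("data chunk came before fmt chunk")
            # A WAV written to a pipe may carry a placeholder size, so clamp
            # to the bytes that are actually present.
            end = min(body + size, len(data))
            end -= (end - body) % (2 * channels)  # drop a partial last frame
            samples = np.frombuffer(data[body:end], dtype="<i2")
            if channels > 1:
                samples = samples.reshape(-1, channels)
            return rate, samples
        pos = body + size + (size & 1)  # chunks are padded to even length
    raise ValueError("no data chunk found")
```

With this, `rate, samples = wav_bytes_to_array(p.communicate()[0])` could replace the last two lines of `decode`. Note that `np.frombuffer` returns a read-only view, so call `.copy()` on the result if you need to modify the samples.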