Search code examples
pythonperformanceaudiofile-manipulation

(python3.6)How to check audio data from video file and move it effectively


I have a folder full of videos. Some of them have audio and others are mute( literally no audio stream). My goal with the follwoing small program i've made is to move the videos without audio to a folder named gifs.

My questions is : How can i optimize?

Here it is the progamm:

from subprocess import check_output
from os.path import join,splitext
from os import rename,listdir
from shutil import move




def noAudio(path):
    cmd =("ffprobe -i {0} -show_streams -select_streams a  -loglevel error".format(path))
    output = check_output(cmd,shell=True)
    boolean = (output == b'')
    return boolean

def del_space(file):
    rename(join(src,file),join(src,file.replace(' ','')))
    Newf = file.replace(' ','')
    return Newf

def StoreNoAudio(src,dist):

    target = [".mp4",".MP4",".gif"]
    GifMoved = 0
    print("processing...")

    for file in listdir(src):   
        direction,extension = splitext(file) 

        try:

            if extension in target:
                #find space related errors and correct them
                if ' ' in file:
                    file = del_space(file)

                path = join(src,file)
                distination = join(dist,file)
                #move file without audio streams
                if(extension == '.gif' or noAudio(path) == True):
                    move(path,distination)
                    GifMoved += 1
        except Exception as e:
            print(e)


    print('Mute videos moved:',GifMoved)
    print('finished!')


dist = (r"C:\Users\user\AppData\Roaming\Phyto\G\Gif")
src =  (r"C:\Users\user\AppData\Roaming\Phyto\G\T")

StoreNoAudio(src,dist)

*I'm new to stackoverflow feel free to tell me if i'm doing something wrong.


Solution

  • If I understand correctly, your program works correctly already and you are looking for ways to reduce running time.

    You could use the multiprocessing package to parallelize your program into per-file subprocesses.

    To do so, put the code in your for loop into a function (let's call it process_file), and then:

    import multiprocessing
    pool = multiprocessing.Pool(multiprocessing.cpu_count())
    pool.map(process_file, listdir(src))
    

    This will create as many subprocesses as there are cpus/cores and will distribute the work onto those. This should result in a significant reduction of running time, depending on the number of available cores in your machine of course.

    Keeping track of the number of moved files does not directly work anymore, however, because the variable GifMoved is not accessible to the child processes. Your function could return a 1 if the file was moved and a 0 if not, then you could sum up the results of all calls to process_file like this:

    GifMoved = sum(pool.map(process_file, listdir(src))) # instead of last line above