
Total length of all audio files (various formats) in directory (with several layers of subfolders) (Linux)


I have a large directory with various kinds of audio files, mp3, ogg, opus, m4a, and possibly others.

I'd like to get the combined length of all these files. If it were only a small number of files, I'd copy them all into another folder and use mp3info, but it's several hundred GB so that's not an option.

I don't think I can use mp3info as is for my use case, because I can't find an option to make it search recursively. mp3info2 seems to have a recursive option, -R, but I can't find much information on mp3info2, so I'm not sure how to use it.

I've tried this

tot=0; while read -r i; do tmp=0;  tmp=`ffprobe "$i" -show_format 2>/dev/null | grep "^duration" | cut -d '=' -f 2 | cut -d '.' -f 1`; if [ -n "$tmp" ]; then let tot+=$tmp; fi;    done < <(find . -type f -iname "*[.mp3,.wav,.m3u,.m4a,.m4b,.mpga,.opus,.opus]"); echo "Total duration: $(($tot/60)) minutes"

but get

bash: let: tot+=N/A: division by 0 (error token is "A")
bash: let: tot+=N/A: division by 0 (error token is "A")
bash: let: tot+=N/A: division by 0 (error token is "A")
bash: let: tot+=N/A: division by 0 (error token is "A")

repeating.

I've also tried soxi -D *.mp3, planning to repeat it for the other file types, but I get this:

soxi FAIL formats: can't open input file `*.mp3': No such file or directory

The directory format is Letter/Author/Book Title, e.g. K/King, Stephen/The Stand/The Stand.mp3

As a bonus question: how would I do the same thing for video files (in a different directory)?

Thanks


Solution

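  • It's most probably because your find matches a lot of files that are not multimedia files. The brackets in -iname "*[.mp3,...]" form a character class, so the pattern matches any file whose name ends in one of the listed characters, not in one of the listed extensions. When encountering such a file, ffprobe may output something like this: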

    [FORMAT]
    filename=./pong.cpp
    nb_streams=1
    nb_programs=0
    format_name=xbm_pipe
    format_long_name=piped xbm sequence
    start_time=N/A
    duration=N/A
    size=17182
    bit_rate=N/A
    probe_score=99
    [/FORMAT]
    

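    As you can see, duration is N/A here. let then evaluates N/A as an arithmetic expression in which N and A are unset variables with the value 0, hence the "division by 0" error.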

    • First, fix the find command. Either use multiple -iname tests with -o (or) between them, grouped in \( ... \) so that -type f applies to all of them, or use -regex instead of -iname if your find supports it.
    • Even if a file has a matching filename, it may be corrupt and the duration may still come back as N/A, so make the check safer by verifying that you got a number back.

    Example:

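    #!/bin/bash
    
    tot=0
    # IFS= and -r keep leading spaces and backslashes in file names intact
    while IFS= read -r i; do
        # keep only the whole-second part of the duration line
        tmp=$(ffprobe "$i" -show_format -loglevel -8 |
              sed -nE 's/^duration=([0-9]+).*$/\1/p')
    
        # skip files for which no numeric duration came back (e.g. N/A)
        if [[ $tmp =~ ^[0-9]+$ ]]; then
            (( tot+=tmp ))
        fi
    # \( ... \) groups the -iname tests so that -type f applies to all of them
    done < <(find . -type f \( -iname '*.mp3' -o -iname '*.wav' -o -iname '*.m3u' -o \
                            -iname '*.m4[ab]' -o -iname '*.mpga' -o -iname '*.opus' \))
    
    echo "Total duration: $((tot/60)) minutes"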
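
    The -regex route mentioned above could look roughly like the sketch below. It assumes GNU find (-regextype and -iregex are GNU extensions), and list_audio_files is a name made up for illustration, not part of the original answer. The extension list mirrors the one used in the example above.

```shell
#!/bin/bash

# List audio files below a directory (default: current directory) using a
# single case-insensitive regex instead of a chain of -iname tests.
# Assumption: GNU find. The regex has to match the whole path, which is
# why it starts with ".*".
list_audio_files() {
    find "${1:-.}" -type f -regextype posix-extended \
         -iregex '.*\.(mp3|wav|m3u|m4[ab]|mpga|opus)$'
}
```

    Its output can take the place of the find command in the process substitution of the loop above, for example done < <(list_audio_files .).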
    

    An alternative could be to collect all the durations, including the subsecond part, in an array and then form an arithmetic expression that you pass on to bc:

    #!/bin/bash
    
    join() {
        local IFS="$1"
        shift
        echo "$*"
    }
    
    readarray -t durations < <(
        find . -type f \( -iname '*.mp3' -o -iname '*.wav' -o -iname '*.m3u' -o \
                          -iname '*.m4[ab]' -o -iname '*.mpga' -o -iname '*.opus' \) \
                          -exec ffprobe {} -show_format -loglevel -8 \; |
        sed -nE 's/^duration=([0-9\.]+)$/\1/p')
    
    # add + between all durations:
    expression=$(join + "${durations[@]}")
    
    # calculate the total, including subseconds, then truncate to whole minutes
    # (the :-0 default keeps bc from failing when no durations were found):
    tot=$(bc -q <<< "scale=0;(${expression:-0})/60")
    
    echo "Total duration: $tot minutes"
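
    A further variation, offered as a sketch and not as part of the original answer: ask ffprobe for the bare duration value and let awk do the floating-point sum, which avoids building a bc expression. The function names sum_minutes and total_minutes are hypothetical, and the sketch assumes an ffprobe build that supports -show_entries and -of.

```shell
#!/bin/bash

# Sum durations (seconds, one per line) read from stdin and print the
# total as whole minutes, truncated. Lines that are not plain decimal
# numbers, such as "N/A" or empty lines, are skipped.
sum_minutes() {
    awk '/^[0-9]+(\.[0-9]+)?$/ { total += $1 }
         END { printf "%d\n", total / 60 }'
}

# Probe every matching file below a directory (default: current directory)
# and print the combined duration in minutes.
# -show_entries format=duration with nokey=1 prints only the number.
total_minutes() {
    find "${1:-.}" -type f \( -iname '*.mp3' -o -iname '*.wav' -o -iname '*.m3u' -o \
                              -iname '*.m4[ab]' -o -iname '*.mpga' -o -iname '*.opus' \) \
         -exec ffprobe -v error -show_entries format=duration \
               -of default=noprint_wrappers=1:nokey=1 {} \; |
    sum_minutes
}
```

    Calling total_minutes /path/to/audiobooks prints the total in whole minutes. For the bonus question, the same approach presumably carries over to video files by swapping in video extensions such as '*.mkv' or '*.mp4', since the duration is read from the container format.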