Populate field values in a VB form by parsing ffprobe output


Ok. I am attempting to grab pertinent media information from video files. To do this, I run ffprobe from my script. Something like this:

Shell("cmd /c [ffpath]\ffprobe.exe -probesize 1000000 -hide_banner -i ""[path]\[video].mp4"" >""[path]\[video]_probe.log"" 2>&1")

Which creates a file containing something like this:

Input #0, mov,mp4,m4a,3gp,3g2,mj2, from '[path]\[video].mp4':
  Metadata:
    major_brand     : isom
    minor_version   : 512
    compatible_brands: isomiso2avc1mp41
    encoder         : Lavf57.8.102
  Duration: 00:04:34.41, start: 0.033333, bitrate: 957 kb/s
    Stream #0:0(und): Video: h264 (High) (avc1 / 0x31637661), yuv420p, 720x480 [SAR 32:27 DAR 16:9], 820 kb/s, 29.97 fps, 29.97 tbr, 30k tbn, 59.94 tbc (default)
    Metadata:
      handler_name    : VideoHandler
    Stream #0:1(und): Audio: aac (LC) (mp4a / 0x6134706D), 48000 Hz, stereo, fltp, 128 kb/s (default)
    Metadata:
      handler_name    : SoundHandler

I say, "something like this," since the data on the Stream #0:0(und): line is not always in the same order. I need to grab the following information from the log:

  • Duration (00:04:34.41)
  • Codec (h264)
  • Resolution (720x480)
  • Width (from Resolution before the x)
  • Height (from Resolution after the x)
  • Frame rate (29.97 fps)
  • Audio Codec (aac)
  • Audio Sampling Freq (48 kHz)
  • Audio Quality (128 kb/s)

In a batch script, I would use tokens and delimiters to scan for my information, and check to see that I grabbed the correct data; if not, try the next token.

Here is an example of how I performed the task in batch script:

REM Find frame rate and resolution
set count=0
for /f "tokens=1-18* delims=," %%a in (input.log) do (
    REM we only need to parse the first line
    if !count!==0 (
        set fps=%%e
        echo !fps! >res.tmp
        findstr "fps" res.tmp >nul
        if not !errorlevel!==0 (
            set fps=%%f
        )
        set resolution=%%c
        echo !resolution! >res.tmp
        findstr "x" res.tmp >nul
        rem echo errorlevel = !errorlevel!
        if not !errorlevel!==0 (
            set resolution=%%d
        )
        del res.tmp
        set fps=!fps:~1,-4!
    )
    set /A count+=1
)

How can I do this in VB? I am using Visual Studio Express 2015 for desktop.

Ok, after doing some more digging and playing, here is how I managed to accomplish the task:

 Sub main()
        Dim strDuration As String
        Dim strCodec As String
        Dim strRes As String
        Dim lngWidth, lngHeight As Long
        Dim strAudCodec As String
        Dim dblFPS As Double
        Dim audFreq As Double
        Dim audQual As Double
        Using logReader As New Microsoft.VisualBasic.FileIO.TextFieldParser("..\..\Sirach and Matthew003_probe.log")
            logReader.TextFieldType = Microsoft.VisualBasic.FileIO.FieldType.Delimited
            logReader.SetDelimiters(",")
            Dim curRow As String()
            While Not logReader.EndOfData
                Try
                    curRow = logReader.ReadFields()
                    'look for and assign duration
                    If Mid(curRow(0), 1, 8) = "Duration" Then
                        strDuration = Mid(curRow(0), 11, 11)
                    End If
                    'look for the video stream row
                    If Mid(curRow(0), 19, 6) = "Video:" Then
                        'Assign the video codec
                        strCodec = Mid(curRow(0), 26, Len(curRow(0)))
                        strCodec = Mid(strCodec, 1, InStr(1, strCodec, " ", CompareMethod.Text) - 1)
                        'Look in each field of current row
                        For i = 0 To curRow.Length - 1
                            'look for the field containing the resolution ("x" should be the 4th or 5th character)
                            If InStr(1, curRow(i), "x", CompareMethod.Text) = 4 Or InStr(1, curRow(i), "x", CompareMethod.Text) = 5 Then
                                'Assign resolution (the text before the first space, or the whole field if there is none)
                                strRes = curRow(i).Split(" "c)(0)
                                'Assign Width
                                lngWidth = Mid(strRes, 1, InStr(1, strRes, "x", CompareMethod.Text) - 1)
                                'Assign Height
                                lngHeight = Mid(strRes, InStr(1, strRes, "x", CompareMethod.Text) + 1, Len(strRes))
                            End If
                            'look for fps suffix
                            If Mid(curRow(i), Len(curRow(i)) - 2, 3) = "fps" Then
                                'Assign frame rate
                                dblFPS = Mid(curRow(i), 1, Len(curRow(i)) - 4)
                            End If
                        Next i
                    End If
                    'Look for the audio stream row
                    If Mid(curRow(0), 19, 6) = "Audio:" Then
                        'Assign the audio codec
                        strAudCodec = Mid(curRow(0), 26, Len(curRow(0)))
                        strAudCodec = Mid(strAudCodec, 1, InStr(1, strAudCodec, " ", CompareMethod.Text) - 1)
                        For i = 0 To curRow.Length - 1
                            'look for the field containing the audio sampling frequency
                            If InStr(1, curRow(i), "Hz", CompareMethod.Text) Then
                                'Assign Audio Sampling Frequency
                                audFreq = Mid(curRow(i), 1, InStr(1, curRow(i), " ", CompareMethod.Text) - 1)
                            End If
                            'look for the field containing the audio quality
                            If InStr(1, curRow(i), "kb/s", CompareMethod.Text) Then
                                'assign audio quality
                                audQual = Mid(curRow(i), 1, InStr(1, curRow(i), " ", CompareMethod.Text) - 1)
                            End If
                        Next
                    End If
                Catch ex As Exception
                    'Skip lines that cannot be parsed
                End Try
            End While
        End Using
        Dim strMsg As String
        strMsg = "Duration: " & strDuration & Chr(13) _
            & "Codec: " & strCodec & Chr(13) _
            & "Resolution: " & strRes & Chr(13) _
            & "Width: " & lngWidth & Chr(13) _
            & "Height: " & lngHeight & Chr(13) _
            & "Frame Rate: " & dblFPS & " fps" & Chr(13) _
            & "Audio Codec: " & strAudCodec & Chr(13) _
            & "Audio Sampling Freq: " & audFreq & " Hz" & Chr(13) _
            & "Audio Quality: " & audQual & " kb/s" & Chr(13)
        MsgBox(strMsg)
    End Sub

Which yields the result:

[screenshot of the resulting message box]

All I need to do now is to populate my form's fields using the variables in the form's code: Me.[field].text = [variable] and it should be good to go.

As you can see, I took advantage of the curRow() array to narrow down the fields using the Mid() and InStr() functions. Though it seems to have accomplished the task, I'm not sure whether it is the best way to do it, or whether there might be another way.
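
For comparison, here is a minimal sketch of the same extraction using regular expressions instead of fixed character positions, so the order of the comma-separated fields no longer matters. The module name, the FirstGroup helper and the patterns are illustrative assumptions; they have only been checked against the sample log lines above, not against other ffprobe output.

```vbnet
Imports System.Globalization
Imports System.Text.RegularExpressions

Module ProbeLogRegexSketch
    ' Hypothetical helper: first capture group of pattern in text, or "" if there is no match
    Function FirstGroup(text As String, pattern As String) As String
        Dim m = Regex.Match(text, pattern)
        Return If(m.Success, m.Groups(1).Value, "")
    End Function

    Sub Main()
        ' Sample lines copied from the log above
        Dim durationLine = "  Duration: 00:04:34.41, start: 0.033333, bitrate: 957 kb/s"
        Dim videoLine = "    Stream #0:0(und): Video: h264 (High) (avc1 / 0x31637661), yuv420p, 720x480 [SAR 32:27 DAR 16:9], 820 kb/s, 29.97 fps, 29.97 tbr, 30k tbn, 59.94 tbc (default)"
        Dim audioLine = "    Stream #0:1(und): Audio: aac (LC) (mp4a / 0x6134706D), 48000 Hz, stereo, fltp, 128 kb/s (default)"

        Dim strDuration = FirstGroup(durationLine, "Duration:\s*([\d:.]+)")
        Dim strCodec = FirstGroup(videoLine, "Video:\s*(\w+)")
        Dim strAudCodec = FirstGroup(audioLine, "Audio:\s*(\w+)")

        ' The leading comma keeps the pattern away from hex tags such as 0x31637661
        Dim res = Regex.Match(videoLine, ",\s*(\d+)x(\d+)")
        Dim width, height As Integer
        If res.Success Then
            width = Integer.Parse(res.Groups(1).Value)
            height = Integer.Parse(res.Groups(2).Value)
        End If

        ' Invariant culture so that "29.97" parses the same under any regional settings;
        ' a value stays 0 when its pattern does not match
        Dim fps, audFreq, audQual As Double
        Double.TryParse(FirstGroup(videoLine, "([\d.]+)\s*fps"), NumberStyles.Float, CultureInfo.InvariantCulture, fps)
        Double.TryParse(FirstGroup(audioLine, "(\d+)\s*Hz"), NumberStyles.Float, CultureInfo.InvariantCulture, audFreq)
        Double.TryParse(FirstGroup(audioLine, "(\d+)\s*kb/s"), NumberStyles.Float, CultureInfo.InvariantCulture, audQual)

        Console.WriteLine("Duration: {0}", strDuration)
        Console.WriteLine("Codec: {0}", strCodec)
        Console.WriteLine("Resolution: {0}x{1}", width, height)
        Console.WriteLine("Frame Rate: {0} fps", fps)
        Console.WriteLine("Audio Codec: {0}", strAudCodec)
        Console.WriteLine("Audio Sampling Freq: {0} Hz", audFreq)
        Console.WriteLine("Audio Quality: {0} kb/s", audQual)
    End Sub
End Module
```

In the real program the three sample strings would be the matching lines read from the log file; the rest stays the same.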

Please let me know if you have any suggestions for improvement.


Solution

  • "Please let me know if you have any suggestions for improvement."

    There is a more direct way to get the info than the old Shell command, and a slightly more organized way to access it. The following reads the result from ffprobe directly from standard output. To do this, use the shiny new .NET Process class instead of the legacy Shell.

    ffprobe also supports JSON output (and XML), which allows you to query a JObject to fetch params by "name". The command line argument for this is -print_format json.

    The following will get the same properties you show but will put them in a spiffy ListView rather than a MsgBox. Because of the encoding, the references can be a bit long, but I'll show how to shorten them. Most importantly, there is no string parsing involved, which makes it more direct. You will need Newtonsoft's JSON.NET for this:

    Imports Newtonsoft.Json
    Imports Newtonsoft.Json.Linq
    '...
    ' the part to run ffprobe and read the result to a string:
    
    Dim Quote = Convert.ToChar(34)
    Dim json As String
    Using p As New Process
        p.StartInfo.FileName = "...ffprobe.exe"  ' use your path
        p.StartInfo.Arguments = String.Format(" -v quiet -print_format json -show_streams {0}{1}{0}",
                                              Quote, theFileName)
        p.StartInfo.RedirectStandardOutput = True
        p.StartInfo.CreateNoWindow = True
        p.StartInfo.UseShellExecute = False
    
        p.Start()
    
        json = p.StandardOutput.ReadToEnd()
    End Using
    

    The resulting json looks like this (partial display!):

    {
        "streams": [{
            "index": 0,
            "codec_name": "h264",
            "codec_long_name": "H.264 / AVC / MPEG-4 AVC / MPEG-4 part 10",
            "profile": "High",
            "codec_type": "video",
            "codec_time_base": "50/2997",
            "codec_tag_string": "avc1",
            "codec_tag": "0x31637661",
            "width": 720,
            "height": 400,
            ...
    

    "streams" is a 4 element array for my test file; apparently Video is element 0 and Audio is 1, though that order is not guaranteed, so check each element's "codec_type" if in doubt. I have no idea what the other 2 are. Each of those can be deserialized to a Dictionary of Name-Value pairs representing the media properties.

    After parsing it, the video codec would be jobj("streams")(0)("codec_name").ToString(), where jobj is the parsed JObject:

    • ("streams")(0) references the 0th streams array element
    • ("codec_name") is the key or name of that property. Most values are string, but a few are numeric.

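    Since the position of the video and audio streams can differ from file to file, here is a minimal sketch that picks a stream by its "codec_type" rather than by index. FindStream and the sample JSON are hypothetical: the JSON is hand-written in the shape shown above (with the audio stream deliberately first), not captured ffprobe output.

    ```vbnet
    Imports Newtonsoft.Json.Linq

    Module StreamLookupSketch
        ' Hypothetical helper: first element of "streams" whose codec_type matches, or Nothing
        Function FindStream(root As JObject, codecType As String) As JToken
            Dim streams = root("streams")
            If streams Is Nothing Then Return Nothing
            For Each s As JToken In streams
                If s.Value(Of String)("codec_type") = codecType Then Return s
            Next
            Return Nothing
        End Function

        Sub Main()
            ' Hand-written sample; only the keys used below are included
            Dim json = "{""streams"":[" &
                       "{""index"":0,""codec_name"":""aac"",""codec_type"":""audio""}," &
                       "{""index"":1,""codec_name"":""h264"",""codec_type"":""video"",""width"":720,""height"":400}]}"
            Dim root = JObject.Parse(json)

            Dim video = FindStream(root, "video")
            Dim audio = FindStream(root, "audio")

            If video IsNot Nothing Then
                Console.WriteLine("Video codec: {0}", video.Value(Of String)("codec_name"))
                Console.WriteLine("Frame size: {0}x{1}",
                                  video.Value(Of Integer)("width"),
                                  video.Value(Of Integer)("height"))
            End If
            If audio IsNot Nothing Then
                Console.WriteLine("Audio codec: {0}", audio.Value(Of String)("codec_name"))
            End If
        End Sub
    End Module
    ```

    The same lookup works on the deserialized list of dictionaries as well, by testing each dictionary's "codec_type" entry instead of assuming positions 0 and 1.
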
    The following code (continued from above), will shorten those references by creating a reference to the desired property set. As I said, I am putting these in a ListView and using Groups.

    ' parse to a temp JObject
    Dim jobj = JObject.Parse(json)
    ' get a list/array of dictionary of prop name/value pairs
    Dim medprops = JsonConvert.DeserializeObject( _
                Of List(Of Dictionary(Of String, Object)))(jobj("streams").ToString)
    
    ' get Video list:
    Dim Props = medprops(0)
    
    AddNewLVItem("Codec", Props("codec_name").ToString, "Video")
    
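    ' parse with the invariant culture so "274.41" is read correctly under any regional settings
    Dim secs As Double = Convert.ToDouble(Props("duration").ToString, System.Globalization.CultureInfo.InvariantCulture)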
    Dim ts = TimeSpan.FromSeconds(secs)
    AddNewLVItem("Duration", ts.ToString("hh\:mm\:ss"), "Video")
    
    AddNewLVItem("Fr Width", Props("width").ToString, "Video")
    AddNewLVItem("Fr Height", Props("height").ToString, "Video")
    
    ' get avg fr rate text
    Dim afr = Props("avg_frame_rate").ToString
    Dim AvgFrRate As String = "???"
    ' split on "/"
    If afr.Contains("/") Then
        Dim split = afr.Split("/"c)
        ' calc by dividing (0) by (1)
        AvgFrRate = (Convert.ToDouble(split(0)) / Convert.ToDouble(split(1))).ToString
    End If
    AddNewLVItem("Avg Frame Rate", AvgFrRate, "Video")
    
    ' NB: audio stream values come from element (1):
    Props = medprops(1)
    AddNewLVItem("Audio Codec", Props("codec_name").ToString, "Audio")
    AddNewLVItem("Audio Sample Rate", Props("sample_rate").ToString, "Audio")
    
    ' avg bit rate is apparently decimal (1000) not binary (1024)
    Dim abr As Integer = Convert.ToInt32(Props("bit_rate").ToString)
    abr = Convert.ToInt32(abr / 1000)
    AddNewLVItem("Audio Bit Rate", abr.ToString, "Audio")
    

    Almost everything is returned as a string, and many values need some calculation or formatting. For instance, avg frame rate comes back as "2997/100", so you can see where I split it, convert both parts to Double and divide. Also, the audio bit rate appears to be decimal (units of 1000), not binary (1024), and can be something like 127997.
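
    The two conversions just described can be wrapped in small helpers. This is only a sketch: ParseRational and ToKilobits are hypothetical names, and returning Nothing for a zero denominator or unparsable text is an assumption about how you would want such values handled.

    ```vbnet
    Imports System.Globalization

    Module RateSketch
        ' Hypothetical helper: turns "num/den" text into a Double.
        ' Returns Nothing (no value) for empty text, a zero denominator or unparsable parts.
        Function ParseRational(text As String) As Double?
            If String.IsNullOrEmpty(text) Then Return Nothing
            Dim parts = text.Split("/"c)
            If parts.Length <> 2 Then Return Nothing

            Dim num, den As Double
            If Not Double.TryParse(parts(0), NumberStyles.Float, CultureInfo.InvariantCulture, num) Then Return Nothing
            If Not Double.TryParse(parts(1), NumberStyles.Float, CultureInfo.InvariantCulture, den) Then Return Nothing
            If den = 0 Then Return Nothing

            Return num / den
        End Function

        ' Hypothetical helper: bits per second to kb/s, using the decimal factor of 1000
        Function ToKilobits(bitsPerSecond As Integer) As Integer
            Return CInt(Math.Round(bitsPerSecond / 1000.0))
        End Function

        Sub Main()
            ' Sample values taken from the text above
            Dim fps = ParseRational("2997/100")
            Dim fpsText = If(fps.HasValue,
                             fps.Value.ToString("0.###", CultureInfo.InvariantCulture),
                             "???")
            Console.WriteLine("Avg Frame Rate: {0}", fpsText)
            Console.WriteLine("Audio Bit Rate: {0} kb/s", ToKilobits(127997))
        End Sub
    End Module
    ```

    Keeping "???" for the no-value case mirrors the default used for AvgFrRate in the code above.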

    You will be able to see the various keys and values in the debugger to locate other values. You can also paste the resulting json string to jsonlint to better understand the structure and read the keys.
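
    One such other value is the duration of the file as a whole, which is what the "Duration:" line in the question's log reports. Assuming -show_format is added to the ffprobe arguments alongside -show_streams, the JSON should gain a "format" object with its own "duration" key. The sketch below reads it from a hand-written sample in that shape; the sample values are illustrative, not captured output.

    ```vbnet
    Imports System.Globalization
    Imports Newtonsoft.Json.Linq

    Module FormatDurationSketch
        Sub Main()
            ' Hand-written sample of the "format" section; only the key used below is included
            Dim json = "{""format"":{""duration"":""274.410000""}}"
            Dim root = JObject.Parse(json)

            Dim fmt = root("format")
            If fmt Is Nothing Then
                Console.WriteLine("No format section; was -show_format passed?")
                Return
            End If

            ' The value arrives as text, so parse it with the invariant culture
            Dim secs = Double.Parse(fmt.Value(Of String)("duration"), CultureInfo.InvariantCulture)
            Dim ts = TimeSpan.FromSeconds(secs)
            Console.WriteLine("Duration: {0}", ts.ToString("hh\:mm\:ss"))
        End Sub
    End Module
    ```

    In the answer's code this would replace the per-stream Props("duration") lookup when the whole-file duration is wanted.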

    The LV helper is simple:

    Private Sub AddNewLVItem(text As String, value As String, g As String)
        Dim LVI = New ListViewItem
        LVI.Text = text
        LVI.SubItems.Add(value)
        LVI.Group = myLV.Groups(g)
        myLV.Items.Add(LVI)
    End Sub
    

    The references can be dense or "wordy" but it seems far less tedious and involved than chopping up strings. The result:

    [two screenshots of the resulting ListView]


    "I find MediaInfo.dll often fails to return the frame rate, and almost always provides an inaccurate frame count."

    Compared to what? I did a quick test on 6 files and MediaInfo matched both Explorer and ffprobe. It too has multiple entries and there is some issue with CBR/VBR.