I have a Hikvision NVR that stores security camera footage that I need to display on a website. I know that Hikvision uses a proprietary variant of H.264 that makes its footage impossible to play coherently in popular video players such as VLC, unless you install that codec everywhere you play it.
My plan was to transcode the video with ffmpeg to regular H.264 video and AAC audio, but the produced file has the same issues as the original: no audio when playing and badly disrupted video. So the question is, does ffmpeg support transcoding from the Hikvision video/audio codecs? Or should I perhaps try converting to different web-capable codecs using ffmpeg? My ffmpeg command looks like this:
ffmpeg -i C:\1.mp4 -c:v libx264 -preset fast -crf 30 -b:v 200k -c:a aac -strict experimental -movflags faststart -threads 0 C:\2.mp4
EDIT: What's interesting is that ffplay.exe opens and plays the original video files with no problem whatsoever, even on a computer where the Hikvision codecs are not installed, so I figured conversion should be possible as well?
Mediainfo output of the video file in question:
General
CompleteName : C:\DownLoad\1.mp4
Format : MPEG-PS
FileSize/String : 8.60 MiB
Duration/String : 2 h 7 min
OverallBitRate/String : 9 395 b/s
FileExtension_Invalid : mpeg mpg m2p vob pss evo
Video
ID/String : 224 (0xE0)
Format : AVC
Format/Info : Advanced Video Codec
Format_Profile : Baseline@L4
Format_Settings : 1 Ref Frames
Format_Settings_CABAC/String : No
Format_Settings_RefFrames/String : 1 frame
Format_Settings_GOP : M=1, N=30
Duration/String : 2 min 0 s
Width/String : 1 920 pixels
Height/String : 1 080 pixels
DisplayAspectRatio/String : 16:9
FrameRate_Mode/String : Variable
ColorSpace : YUV
ChromaSubsampling/String : 4:2:0
BitDepth/String : 8 bits
ScanType/String : Progressive
Audio
ID/String : 192 (0xC0)
Format : MPEG Audio
Duration/String : 2 h 7 min
Compression_Mode/String : Lossy
Video_Delay/String : -33 min 40 s
Output of ffmpeg:
C:\ffmpeg\bin>ffmpeg -i C:\DownLoad\1.mp4 -c:v libx264 -preset fast -crf 30 -b:v 75k -c:a aac -strict experimental -movflags faststart -threads 0 C:\DownLoad\2.mp4
ffmpeg version N-86537-gae6f6d4 Copyright (c) 2000-2017 the FFmpeg developers
built with gcc 7.1.0 (GCC)
configuration: --enable-gpl --enable-version3 --enable-cuda --enable-cuvid --enable-d3d11va --enable-dxva2 --enable-libmfx --enable-nvenc --enable-avisynth --enable-bzlib --enable-fontconfig --enable-frei0r --enable-gnutls --enable-iconv --enable-libass --enable-libbluray --enable-libbs2b --enable-libcaca --enable-libfreetype --enable-libgme --enable-libgsm --enable-libilbc --enable-libmodplug --enable-libmp3lame --enable-libopencore-amrnb --enable-libopencore-amrwb --enable-libopenh264 --enable-libopenjpeg --enable-libopus --enable-librtmp --enable-libsnappy --enable-libsoxr --enable-libspeex --enable-libtheora --enable-libtwolame --enable-libvidstab --enable-libvo-amrwbenc --enable-libvorbis --enable-libvpx --enable-libwavpack --enable-libwebp --enable-libx264 --enable-libx265 --enable-libxavs --enable-libxvid --enable-libzimg --enable-lzma --enable-zlib
libavutil 55. 66.100 / 55. 66.100
libavcodec 57. 99.100 / 57. 99.100
libavformat 57. 73.100 / 57. 73.100
libavdevice 57. 7.100 / 57. 7.100
libavfilter 6. 94.100 / 6. 94.100
libswscale 4. 7.101 / 4. 7.101
libswresample 2. 8.100 / 2. 8.100
libpostproc 54. 6.100 / 54. 6.100
Input #0, mpeg, from 'C:\DownLoad\1.mp4':
Duration: 02:07:57.93, start: 789.820800, bitrate: 9 kb/s
Stream #0:0[0x1e0]: Video: h264 (Baseline), yuv420p(progressive), 1920x1080, 25 fps, 25 tbr, 90k tbn, 50 tbc
Stream #0:1[0x1c0]: Audio: pcm_mulaw, 8000 Hz, mono, s16, 64 kb/s
File 'C:\DownLoad\2.mp4' already exists. Overwrite ? [y/N] y
Stream mapping:
Stream #0:0 -> #0:0 (h264 (native) -> h264 (libx264))
Stream #0:1 -> #0:1 (pcm_mulaw (native) -> aac (native))
Press [q] to stop, [?] for help
[aac @ 0000000002cd0280] Too many bits 8832.000000 > 6144 per frame requested, clamping to max
[libx264 @ 0000000002514c80] using cpu capabilities: MMX2 SSE2Fast LZCNT SSSE3 SSE4.2 AVX XOP FMA4
[libx264 @ 0000000002514c80] profile High, level 4.0
[libx264 @ 0000000002514c80] 264 - core 150 r2833 df79067 - H.264/MPEG-4 AVC codec - Copyleft 2003-2017 - http://www.videolan.org/x264.html - options: cabac=1 ref=2 deblock=1:0:0 analyse=0x3:0x113 me=hex subme=6 psy=1 psy_rd=1.00:0.00 mixed_ref=1 me_range=16 chroma_me=1 trellis=1 8x8dct=1 cqm=0 deadzone=21,11 fast_pskip=1 chroma_qp_offset=-2 threads=12 lookahead_threads=2 sliced_threads=0 nr=0 decimate=1 interlaced=0 bluray_compat=0 constrained_intra=0 bframes=3 b_pyramid=2 b_adapt=1 b_bias=0 direct=1 weightb=1 open_gop=0 weightp=1 keyint=250 keyint_min=25 scenecut=40 intra_refresh=0 rc_lookahead=30 rc=crf mbtree=1 crf=30.0 qcomp=0.60 qpmin=0 qpmax=69 qpstep=4 ip_ratio=1.40 aq=1:1.00
Output #0, mp4, to 'C:\DownLoad\2.mp4':
Metadata:
encoder : Lavf57.73.100
Stream #0:0: Video: h264 (libx264) ([33][0][0][0] / 0x0021), yuv420p, 1920x1080, q=-1--1, 75 kb/s, 25 fps, 12800 tbn, 25 tbc
Metadata:
encoder : Lavc57.99.100 libx264
Side data:
cpb: bitrate max/min/avg: 0/0/75000 buffer size: 0 vbv_delay: -1
Stream #0:1: Audio: aac (LC) ([64][0][0][0] / 0x0040), 8000 Hz, mono, fltp, 48 kb/s
Metadata:
encoder : Lavc57.99.100 aac
[mp4 @ 00000000010e9e00] Starting second pass: moving the moov atom to the beginning of the file speed= 116x
frame= 3269 fps= 66 q=-1.0 Lsize= 11086kB time=01:34:24.38 bitrate= 16.0kbits/s dup=269 drop=0 speed= 115x
video:10429kB audio:592kB subtitle:0kB other streams:0kB global headers:0kB muxing overhead: 0.594114%
[libx264 @ 0000000002514c80] frame I:14 Avg QP:21.86 size: 59795
[libx264 @ 0000000002514c80] frame P:833 Avg QP:24.81 size: 8993
[libx264 @ 0000000002514c80] frame B:2422 Avg QP:28.70 size: 970
[libx264 @ 0000000002514c80] consecutive B-frames: 1.0% 0.2% 1.4% 97.4%
[libx264 @ 0000000002514c80] mb I I16..4: 18.9% 66.3% 14.8%
[libx264 @ 0000000002514c80] mb P I16..4: 4.0% 7.7% 0.4% P16..4: 16.2% 2.0% 0.6% 0.0% 0.0% skip:69.1%
[libx264 @ 0000000002514c80] mb B I16..4: 0.6% 0.2% 0.0% B16..8: 5.5% 0.1% 0.0% direct: 0.7% skip:92.9% L0:44.0% L1:55.0% BI: 1.0%
[libx264 @ 0000000002514c80] 8x8 transform intra:59.0% inter:83.3%
[libx264 @ 0000000002514c80] coded y,uvDC,uvAC intra: 25.3% 36.1% 7.7% inter: 1.0% 2.3% 0.1%
[libx264 @ 0000000002514c80] i16 v,h,dc,p: 23% 24% 43% 10%
[libx264 @ 0000000002514c80] i8 v,h,dc,ddl,ddr,vr,hd,vl,hu: 37% 26% 23% 2% 2% 3% 2% 3% 3%
[libx264 @ 0000000002514c80] i4 v,h,dc,ddl,ddr,vr,hd,vl,hu: 43% 23% 12% 4% 4% 5% 4% 4% 2%
[libx264 @ 0000000002514c80] i8c dc,h,v,p: 81% 7% 9% 3%
[libx264 @ 0000000002514c80] Weighted P-Frames: Y:1.0% UV:0.0%
[libx264 @ 0000000002514c80] ref P L0: 73.6% 26.4%
[libx264 @ 0000000002514c80] ref B L0: 80.9% 19.1%
[libx264 @ 0000000002514c80] ref B L1: 90.0% 10.0%
[libx264 @ 0000000002514c80] kb/s:653.30
[aac @ 0000000002cd0280] Qavg: 64512.656
C:\ffmpeg\bin>
Download link to sample:
https://www.dropbox.com/s/9ccptsuiqk2ntsv/1.zip?dl=0
This sample is exactly 2 minutes long, but VLC will tell you otherwise.
I was able to produce a normalized video file by doing the following:
re-encoding the audio with ffmpeg using -acodec aac,
copying the video stream with ffmpeg using -c:v copy,
and using the -t option to specify the actual duration of the video.
The result is a file that is playable in any video player. Tested on VLC and MPC-HC.
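As a rough illustration of the three options above, here is a minimal Python sketch that only assembles the ffmpeg argument list. The function name, file names and the 120-second duration are assumptions for the example (the sample is said to be exactly 2 minutes long); the flags themselves are the ones named above.

```python
def build_remux_command(src, dst, actual_seconds):
    """Build an ffmpeg argument list (hypothetical helper, not from the original post).

    Copies the video stream untouched, re-encodes the audio to AAC and
    cuts the output at the known real duration with -t.
    """
    return [
        "ffmpeg", "-y",
        "-i", src,
        "-c:v", "copy",            # keep the H.264 video stream as-is
        "-c:a", "aac",             # same as -acodec aac
        "-t", str(actual_seconds), # real duration of the clip, in seconds
        dst,
    ]


# Example values are assumptions: the sample clip is 2 minutes long.
command = build_remux_command("1.mp4", "2.mp4", 120)
print(" ".join(command))
```

The list can be passed to subprocess.run(command, check=True) on a machine where ffmpeg is on the PATH.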
edit:20180730
Since then I have had multiple other issues with the same video sources, and in the end I decided to re-encode both the video and audio tracks to get a normalized output. One of the main problems was a difference between the durations of the video and audio tracks once I separated them from the original file: sometimes the audio would be 7-15 seconds longer than the video and sometimes it would be shorter. And sometimes the video would have extra time of unknown duration appended to it for no apparent reason. To solve this I had to re-encode the audio or the video track, depending on which one needed correction. (Note: I knew the real duration of the video, since I would manually request the exact chunks that I needed from the Hikvision NVR using its Web interface.) So here is the logic of the C# code that I came up with:
Split the input.mp4 file into video and audio tracks using ffmpeg:
ffmpeg -y -i 1.mp4 -vn -c:a libmp3lame -ar 44100 -aq 0 2-a.mp3
ffmpeg -y -i 1.mp4 -an -c:v copy 2-v.mp4
Note: I encode the audio to MP3 with libmp3lame since Hikvision devices use G.711 PCM audio (ffmpeg reports it as pcm_mulaw) in their .mp4 files, and that was not suitable for me.
Get the durations of the video and audio tracks as ffmpeg identifies them using ffprobe:
ffprobe -show_entries stream=duration -of compact -v 0 2-a.mp3
ffprobe -show_entries stream=duration -of compact -v 0 2-v.mp4
The durations are shown in the output of these two commands and I capture this output and filter it to get that particular string. Alternatively you can just manually take note of it if you do not plan to automate this whole process.
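The capture-and-filter step could look roughly like the Python sketch below. It assumes the compact writer prints a line of the form stream|duration=39.760000, which is what the commands above normally produce; the function names are hypothetical and the original implementation was in C#, so this is only an illustration.

```python
import re
import subprocess


def parse_stream_duration(ffprobe_output):
    """Return the first stream duration in seconds, or None if absent or N/A.

    Assumes ffprobe '-of compact' text such as 'stream|duration=39.760000'.
    """
    match = re.search(r"duration=([0-9]+(?:\.[0-9]+)?)", ffprobe_output)
    if match is None:
        return None
    return float(match.group(1))


def probe_duration(path):
    """Run the ffprobe command from the text and parse its output (needs ffprobe on PATH)."""
    result = subprocess.run(
        ["ffprobe", "-show_entries", "stream=duration",
         "-of", "compact", "-v", "0", path],
        capture_output=True, text=True, check=True,
    )
    return parse_stream_duration(result.stdout)


print(parse_stream_duration("stream|duration=39.760000\n"))
```

Returning None for a missing or N/A duration lets the caller decide whether to fall back to the known real duration.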
Compare these durations to the actual duration and act accordingly:
If the audio duration matches the actual one but the video duration is bigger, shrink the video track using ffmpeg and the setpts filter like this:
ffmpeg -y -i 2-v.mp4 -filter:v setpts=RATIO*PTS 2-v-edit.mp4
Where RATIO is the number you get by dividing the audio track's duration by the video track's duration. For example, if the video duration is 45.11 seconds and the audio duration is 39.76 seconds, then RATIO = 39.76 / 45.11 = 0.8814010197.
PTS is the current presentation timestamp of the video track, which ffmpeg supplies itself; this string is part of the command and not something you need to change.
If the video duration matches the actual one but the audio is shorter OR longer, then I re-encode the audio using ffmpeg's atempo filter like this:
ffmpeg -y -i 2-a.mp3 -acodec libmp3lame -filter:a atempo=RATIO 2-a-edit.mp3
Where RATIO is audio duration / video duration.
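The comparison logic from the two cases above can be sketched in Python as follows. The function name, the returned tuple and the 0.5-second tolerance for "matches the actual duration" are assumptions made for this example; the file names and filter expressions are the ones used in the commands above.

```python
def plan_correction(actual, video, audio, tolerance=0.5):
    """Pick the track to re-time and build the matching ffmpeg arguments.

    All durations are in seconds. 'tolerance' is an assumed threshold for
    treating a track duration as equal to the real one.
    Returns (kind, ratio, args) where kind is 'none', 'video', 'audio'
    or 'unhandled'.
    """
    ratio = audio / video  # same RATIO for both filters
    video_ok = abs(video - actual) <= tolerance
    audio_ok = abs(audio - actual) <= tolerance

    if video_ok and audio_ok:
        return ("none", ratio, [])
    if audio_ok:
        # Audio is right, video is off: rescale the video timestamps.
        return ("video", ratio, [
            "ffmpeg", "-y", "-i", "2-v.mp4",
            "-filter:v", f"setpts={ratio:.10f}*PTS", "2-v-edit.mp4",
        ])
    if video_ok:
        # Video is right, audio is off: change the audio tempo.
        return ("audio", ratio, [
            "ffmpeg", "-y", "-i", "2-a.mp3", "-acodec", "libmp3lame",
            "-filter:a", f"atempo={ratio:.10f}", "2-a-edit.mp3",
        ])
    # Neither track matches; the steps above do not cover this case.
    return ("unhandled", ratio, [])


kind, ratio, args = plan_correction(actual=39.76, video=45.11, audio=39.76)
print(kind, round(ratio, 10))
print(" ".join(args))
```

With the example durations from the text (video 45.11 s, audio 39.76 s), this selects the video branch with a ratio of about 0.8814.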
After this I get normalized video and audio tracks that I can merge using ffmpeg like this for example:
ffmpeg -i 2-v-edit.mp4 -i 2-a-edit.mp3 -c copy 2.mp4
If given a choice, I would never work with another Hikvision device in my life.