Tags: c#, ffmpeg, vlc, accord.net

c# how to capture audio from nvlc and raise Accord.Audio.NewFrameEventArgs


I'm working on an application in C# that records video streams from IP cameras.

I'm using Accord.Video.FFMPEG.VideoFileWriter and the nVLC C# wrapper. I have a class that captures audio from the stream using nVLC and implements the IAudioSource interface, so I've used CustomAudioRenderer to capture the sound data and then raised the NewFrame event, which carries the Signal object. The problem is that when the signal is saved to the video file, the sound is badly distorted (choppy) when recording from an RTSP stream, but comes out in good quality when recording from the local microphone (on the laptop). Here is the code that raises the event:

    public void Start()
    {
        _mFactory = new MediaPlayerFactory();
        _mPlayer = _mFactory.CreatePlayer<IAudioPlayer>();
        _mMedia = _mFactory.CreateMedia<IMedia>(Source);
        _mPlayer.Open(_mMedia);

        // Register callbacks so nVLC hands us the stream format and raw PCM buffers.
        var fc = new Func<SoundFormat, SoundFormat>(SoundFormatCallback);
        _mPlayer.CustomAudioRenderer.SetFormatCallback(fc);
        var ac = new AudioCallbacks { SoundCallback = SoundCallback };
        _mPlayer.CustomAudioRenderer.SetCallbacks(ac);

        _mPlayer.Play();
    }

    private void SoundCallback(Sound newSound)
    {
        // Copy the unmanaged sample buffer into a managed array.
        var data = new byte[newSound.SamplesSize];
        Marshal.Copy(newSound.SamplesData, data, 0, (int)newSound.SamplesSize);

        NewFrame(this, new Accord.Audio.NewFrameEventArgs(
            new Signal(data, Channels, data.Length, SampleRate, Format)));
    }

    private SoundFormat SoundFormatCallback(SoundFormat arg)
    {
        // Cache the stream format so SoundCallback can build Signal objects.
        Channels = arg.Channels;
        SampleRate = arg.Rate;
        BitPerSample = arg.BitsPerSample;

        return arg;
    }

And here is the code that captures the event:

    private void source_NewFrame(object sender, NewFrameEventArgs eventArgs)
    {
        Signal sig = eventArgs.Signal;

        duration += eventArgs.Signal.Duration;
        if (videoFileWrite == null)
        {
            // Configure the writer from the first signal we receive.
            videoFileWrite = new VideoFileWriter();
            videoFileWrite.AudioBitRate = sig.NumberOfSamples * sig.NumberOfChannels * sig.SampleSize;
            videoFileWrite.SampleRate = sig.SampleRate;
            videoFileWrite.FrameSize = sig.NumberOfSamples / sig.NumberOfFrames;

            videoFileWrite.Open("d:\\output.mp4");
        }
        if (isStartRecord)
        {
            DoneWriting = false;

            // Round-trip the signal through a WAV encode/decode before writing it out.
            using (MemoryStream ms = new MemoryStream())
            {
                encoder = new WaveEncoder(ms);
                encoder.Encode(eventArgs.Signal);
                ms.Seek(0, SeekOrigin.Begin);
                decoder = new WaveDecoder(ms);
                Signal s = decoder.Decode();
                videoFileWrite.WriteAudioFrame(s);

                encoder.Close();
                decoder.Close();
            }
            DoneWriting = true;
        }
    }

Solution

  • I solved this problem by taking only one channel from the Sound object "newSound" in the SoundCallback method, creating a Signal from that array of bytes, and then raising the NewFrame event. The key observation behind this solution is that when the audio stream contains more than one channel, the SamplesData property of the Sound object in SoundCallback contains the bytes for all channels interleaved. Assuming two-byte samples, if the data for one channel is A1A2 B1B2 C1C2 ..., then for two channels SamplesData is laid out as A1A2A1A2 B1B2B1B2 C1C2C1C2 ... (one sample per channel, frame by frame).

    Hope that helps.
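The de-interleaving step described above can be sketched as follows. This is a minimal illustration under the layout assumptions stated in the solution (interleaved PCM, a fixed number of bytes per sample); `ExtractChannel` is a hypothetical helper, not part of nVLC or Accord:

```csharp
using System;

static class Deinterleave
{
    // Copies every sample belonging to one channel out of an interleaved
    // PCM buffer. For 16-bit two-channel audio the input layout is
    // A1A2A1A2 B1B2B1B2 ... (one sample per channel, frame by frame).
    public static byte[] ExtractChannel(byte[] interleaved, int channels,
                                        int bytesPerSample, int channelIndex)
    {
        int frameSize = channels * bytesPerSample; // bytes per frame (one sample per channel)
        int frames = interleaved.Length / frameSize;
        var mono = new byte[frames * bytesPerSample];

        for (int f = 0; f < frames; f++)
        {
            // Offset of this channel's sample inside the current frame.
            int src = f * frameSize + channelIndex * bytesPerSample;
            Buffer.BlockCopy(interleaved, src, mono, f * bytesPerSample, bytesPerSample);
        }
        return mono;
    }
}
```

In SoundCallback this would sit between the Marshal.Copy and the NewFrame call: extract channel 0 from `data` (using `BitPerSample / 8` bytes per sample), then construct the Signal with a channel count of 1 before raising the event.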