How to convert sample rate from AV_SAMPLE_FMT_FLTP to AV_SAMPLE_FMT_S16?

I found two resampling functions in FFmpeg. Their performance may be better.

avresample_convert() http://libav.org/doxygen/master/group__lavr.html
swr_convert() http://spirton.com/svn/MPlayer-SB/ffmpeg/libswresample/swresample_test.c

Thanks Reuben for a solution to this. I did find that some of the sample values were slightly off when compared with a straight ffmpeg -i file.wav. It seems that in the conversion, they use a round() on the value.
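To illustrate why the values can differ by one, here is a minimal sketch comparing a truncating cast with a rounding conversion. The helper names float_to_s16_trunc and float_to_s16_round are made up for this example, and the rounding is done by hand (add 0.5, then truncate) as a stand-in for round(); it assumes the sample has already been clamped to the range -1.0 to +1.0.

```c
#include <stdint.h>

/* Truncating conversion: a plain cast rounds toward zero. */
static int16_t float_to_s16_trunc(float sample)
{
    return (int16_t)(sample * 32767.0f);
}

/* Rounding conversion: push the value half a step away from zero,
 * then truncate. Assumes sample is already within [-1.0, 1.0]. */
static int16_t float_to_s16_round(float sample)
{
    float scaled = sample * 32767.0f;
    return (int16_t)(scaled >= 0.0f ? scaled + 0.5f : scaled - 0.5f);
}
```

For a sample of 0.25f the scaled value is 8191.75, so the truncating version yields 8191 while the rounding version yields 8192.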

To do the conversion, I did what you did, with a bit of modification so that it works for any number of channels:

if (audioCodecContext->sample_fmt == AV_SAMPLE_FMT_FLTP)
{
    int nb_samples = decoded_frame->nb_samples;
    int channels = decoded_frame->channels;
    int outputBufferLen = nb_samples * channels * 2;       // size in bytes: 2 bytes per S16 sample
    short* outputBuffer = new short[outputBufferLen / 2];  // one short per sample per channel

    for (int i = 0; i < nb_samples; i++)
    {
         for (int c = 0; c < channels; c++)
         {
             float* extended_data = (float*)decoded_frame->extended_data[c];
             float sample = extended_data[i];
             if (sample < -1.0f) sample = -1.0f;
             else if (sample > 1.0f) sample = 1.0f;
             outputBuffer[i * channels + c] = (short)round(sample * 32767.0f);
         }
    }

    // Do what you want with the data etc.

}

I went from ffmpeg 0.11.1 -> 1.1.3 and found the change of sample format annoying. I looked at setting the request_sample_fmt to AV_SAMPLE_FMT_S16 but it seems the aac decoder doesn't support anything other than AV_SAMPLE_FMT_FLTP anyway.

EDIT 9th April 2013: Worked out how to use libswresample to do this... much faster!

At some point in the last 2-3 years, the output format of FFmpeg's AAC decoder changed from AV_SAMPLE_FMT_S16 to AV_SAMPLE_FMT_FLTP. This means that each audio channel has its own buffer (planar layout), and each sample is a 32-bit floating point value scaled from -1.0 to +1.0.

With AV_SAMPLE_FMT_S16, by contrast, the data is in a single buffer with the samples interleaved, and each sample is a signed 16-bit integer from -32768 to +32767.
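The difference between the two layouts can be sketched without any FFmpeg types. The function below is a hypothetical helper (interleave_planar is not an FFmpeg call) that copies planar float data into a single interleaved float buffer, so only the layout changes and not the sample format.

```c
#include <stddef.h>

/* Planar input: planes[c][i] is sample i of channel c.
 * Interleaved output: sample i of channel c lands at out[i * channels + c]. */
static void interleave_planar(const float *const *planes, int channels,
                              int nb_samples, float *out)
{
    for (int i = 0; i < nb_samples; i++) {
        for (int c = 0; c < channels; c++) {
            out[(size_t)i * channels + c] = planes[c][i];
        }
    }
}
```

With stereo input the output order is L0 R0 L1 R1 and so on, which is the order an interleaved format such as AV_SAMPLE_FMT_S16 uses within its single buffer.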

And if you really need your audio as AV_SAMPLE_FMT_S16, then you have to do the conversion yourself. I figured out two ways to do it:

1. Use libswresample (recommended)

#include "libavutil/opt.h"             // av_opt_set_int(), av_opt_set_sample_fmt()
#include "libswresample/swresample.h"

...

SwrContext *swr;

...

// Set up SWR context once you've got codec information
swr = swr_alloc();
av_opt_set_int(swr, "in_channel_layout",  audioCodec->channel_layout, 0);
av_opt_set_int(swr, "out_channel_layout", audioCodec->channel_layout,  0);
av_opt_set_int(swr, "in_sample_rate",     audioCodec->sample_rate, 0);
av_opt_set_int(swr, "out_sample_rate",    audioCodec->sample_rate, 0);
av_opt_set_sample_fmt(swr, "in_sample_fmt",  AV_SAMPLE_FMT_FLTP, 0);
av_opt_set_sample_fmt(swr, "out_sample_fmt", AV_SAMPLE_FMT_S16,  0);
swr_init(swr);

...

// In your decoder loop, after decoding an audio frame:
AVFrame *audioFrame = ...;
int16_t* outputBuffer = ...;
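// The SwrContext is the first argument; output and input are passed as arrays of byte pointers.
swr_convert(swr, (uint8_t**)&outputBuffer, audioFrame->nb_samples, (const uint8_t**)audioFrame->extended_data, audioFrame->nb_samples);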

And that's all you have to do!

2. Do it by hand in C (original answer, not recommended)

So in your decode loop, when you've got an audio packet you decode it like this:

AVCodecContext *audioCodec;   // init'd elsewhere
AVFrame *audioFrame;          // init'd elsewhere
AVPacket packet;              // init'd elsewhere
int16_t* outputBuffer;        // init'd elsewhere
int got_frame = 0;            // set to non-zero by the decoder when a full frame was decoded
...
int len = avcodec_decode_audio4(audioCodec, audioFrame, &got_frame, &packet);

And then, if you've got a full frame of audio, you can convert it fairly easily:

    // Convert from AV_SAMPLE_FMT_FLTP to AV_SAMPLE_FMT_S16
    int in_samples = audioFrame->nb_samples;
    int in_linesize = audioFrame->linesize[0];
    int i=0;
    float* inputChannel0 = (float*)audioFrame->extended_data[0];
    // Mono
    if (audioFrame->channels==1) {
        for (i=0 ; i<in_samples ; i++) {
            float sample = *inputChannel0++;
            if (sample<-1.0f) sample=-1.0f; else if (sample>1.0f) sample=1.0f;
            outputBuffer[i] = (int16_t) (sample * 32767.0f);
        }
    }
    // Stereo
    else {
        float* inputChannel1 = (float*)audioFrame->extended_data[1];
        for (i=0 ; i<in_samples ; i++) {
             outputBuffer[i*2] = (int16_t) ((*inputChannel0++) * 32767.0f);
             outputBuffer[i*2+1] = (int16_t) ((*inputChannel1++) * 32767.0f);
        }
    }
    // outputBuffer now contains 16-bit PCM!

I've left a couple of things out for clarity... the clamping in the mono path should ideally be duplicated in the stereo path. And the code can be easily optimized.
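One way to get the clamping into every channel path is to move it into a small helper and loop over the channels. This is only a sketch: clamp_to_s16 and planar_float_to_s16 are names invented here, the input is passed as plain float pointers rather than an AVFrame, and it keeps the truncating cast from the code above.

```c
#include <stddef.h>
#include <stdint.h>

/* Clamp to [-1.0, 1.0], then scale and truncate to 16 bits. */
static int16_t clamp_to_s16(float sample)
{
    if (sample < -1.0f) sample = -1.0f;
    else if (sample > 1.0f) sample = 1.0f;
    return (int16_t)(sample * 32767.0f);
}

/* Convert planar float input (one buffer per channel) to interleaved S16.
 * out must have room for nb_samples * channels values. */
static void planar_float_to_s16(const float *const *planes, int channels,
                                int nb_samples, int16_t *out)
{
    for (int c = 0; c < channels; c++) {
        const float *in = planes[c];
        for (int i = 0; i < nb_samples; i++) {
            out[(size_t)i * channels + c] = clamp_to_s16(in[i]);
        }
    }
}
```

With an AVFrame, the planes argument would correspond to audioFrame->extended_data cast to float pointers, and the same function then covers mono, stereo and higher channel counts.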
