Audio spectrum extraction from audio file by python

I think your question has three separate parts:

How to load audio files into python?
How to calculate spectrum in python?
What to do with the spectrum?

1. How to load audio files in python?

You are probably best off by using scipy, as it provides a lot of signal processing functions. For loading audio files:

import scipy.io.wavfile

samplerate, data = scipy.io.wavfile.read("mywav.wav")

Now you have the sample rate (samples/s) in samplerate and data as a numpy.array in data. You may want to transform the data into floating point, depending on your application.

There is also a standard python module wave for loading wav-files, but numpy/scipy offers a simpler interface and more options for signal processing.

2. How to calculate the spectrum

Brief answer: Use FFT. For more words of wisdom, see:

Analyze audio using Fast Fourier Transform

Longer answer is quite long. Windowing is very important, otherwise you'll have strange spectra.

3. What to do with the spectrum

This is a bit more difficult. Filtering is often performed in time domain for longer signals. Maybe if you tell us what you want to accomplish, you'll receive a good answer for this one. Calculating the frequency spectrum is one thing, getting meaningful results with it in signal processing is a bit more complicated.

(I know you did not ask this one, but I see it coming with a probability >> 0. Of course, it may be that you have good knowledge on audio signal processing, in which case this is irrelevant.)

You can compute and visualize the spectrum and the spectrogram this using scipy, for this test i used this audio file: vignesh.wav

from scipy.io import wavfile # scipy library to read wav files
import numpy as np

AudioName = "vignesh.wav" # Audio File
fs, Audiodata = wavfile.read(AudioName)

# Plot the audio signal in time
import matplotlib.pyplot as plt
plt.plot(Audiodata)
plt.title('Audio signal in time',size=16)

# spectrum
from scipy.fftpack import fft # fourier transform
n = len(Audiodata) 
AudioFreq = fft(Audiodata)
AudioFreq = AudioFreq[0:int(np.ceil((n+1)/2.0))] #Half of the spectrum
MagFreq = np.abs(AudioFreq) # Magnitude
MagFreq = MagFreq / float(n)
# power spectrum
MagFreq = MagFreq**2
if n % 2 > 0: # ffte odd 
    MagFreq[1:len(MagFreq)] = MagFreq[1:len(MagFreq)] * 2
else:# fft even
    MagFreq[1:len(MagFreq) -1] = MagFreq[1:len(MagFreq) - 1] * 2 

plt.figure()
freqAxis = np.arange(0,int(np.ceil((n+1)/2.0)), 1.0) * (fs / n);
plt.plot(freqAxis/1000.0, 10*np.log10(MagFreq)) #Power spectrum
plt.xlabel('Frequency (kHz)'); plt.ylabel('Power spectrum (dB)');


#Spectrogram
from scipy import signal
N = 512 #Number of point in the fft
f, t, Sxx = signal.spectrogram(Audiodata, fs,window = signal.blackman(N),nfft=N)
plt.figure()
plt.pcolormesh(t, f,10*np.log10(Sxx)) # dB spectrogram
#plt.pcolormesh(t, f,Sxx) # Lineal spectrogram
plt.ylabel('Frequency [Hz]')
plt.xlabel('Time [seg]')
plt.title('Spectrogram with scipy.signal',size=16);

plt.show()

i tested all the code and it works, you need, numpy, matplotlib and scipy.

cheers

Audio spectrum extraction from audio file by python

Tags:

Python

Audio

Beat Detection

Related

Recent Posts