Audio spectrum extraction from an audio file in Python
I think your question has three separate parts:
- How to load audio files into Python?
- How to calculate the spectrum in Python?
- What to do with the spectrum?
1. How to load audio files in Python?
You are probably best off using scipy, as it provides a lot of signal processing functions. For loading audio files:
import scipy.io.wavfile
samplerate, data = scipy.io.wavfile.read("mywav.wav")
Now you have the sample rate (samples/s) in samplerate and the samples as a numpy.array in data. You may want to convert the data to floating point, depending on your application.
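The floating-point conversion mentioned above could look roughly like the sketch below. The helper name to_float is hypothetical (it is not part of scipy), and it assumes the array holds either unsigned 8-bit, signed integer, or already-float samples, which is what scipy.io.wavfile.read returns for ordinary PCM and float WAV files.

```python
import numpy as np

def to_float(data):
    """Convert PCM samples to float64, scaled to roughly [-1.0, 1.0)."""
    data = np.asarray(data)
    if np.issubdtype(data.dtype, np.floating):
        # Float WAV files are already in floating point
        return data.astype(np.float64)
    if data.dtype == np.uint8:
        # 8-bit WAV data is unsigned with its midpoint at 128
        return (data.astype(np.float64) - 128.0) / 128.0
    # Signed integer PCM: divide by the magnitude of the most negative value
    full_scale = abs(float(np.iinfo(data.dtype).min))
    return data.astype(np.float64) / full_scale

example = to_float(np.array([0, 16384, -32768], dtype=np.int16))
```

Scaling to a common range like this makes later steps (FFT magnitudes, dB values) independent of the bit depth of the file.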
There is also a standard Python module, wave, for loading WAV files, but numpy/scipy offers a simpler interface and more options for signal processing.
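For comparison, here is a minimal sketch of reading a file with the standard wave module. The function name read_wav_16bit is made up for illustration, and the sketch assumes 16-bit little-endian PCM; other sample widths would need their own handling.

```python
import wave
import numpy as np

def read_wav_16bit(path):
    """Read a 16-bit PCM WAV file with the standard-library wave module."""
    with wave.open(path, "rb") as wf:
        if wf.getsampwidth() != 2:
            raise ValueError("this sketch only handles 16-bit PCM")
        samplerate = wf.getframerate()
        channels = wf.getnchannels()
        frames = wf.readframes(wf.getnframes())
    # Raw bytes -> little-endian signed 16-bit integers
    data = np.frombuffer(frames, dtype="<i2")
    if channels > 1:
        # Interleaved samples -> one column per channel
        data = data.reshape(-1, channels)
    return samplerate, data
```

As the extra bookkeeping shows, wave hands you raw bytes, so the decoding that scipy.io.wavfile.read does for you has to be written by hand.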
2. How to calculate the spectrum
Brief answer: Use FFT. For more words of wisdom, see:
Analyze audio using Fast Fourier Transform
The longer answer is quite long. Windowing is very important; without it, spectral leakage will give you strange-looking spectra.
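To illustrate why windowing matters, here is a small sketch that computes the magnitude spectrum of a synthetic tone with and without a Hann window. The function name magnitude_spectrum, the 8000 Hz sample rate, and the 440.5 Hz test tone are all assumptions chosen for the example; a real file would be loaded as shown earlier.

```python
import numpy as np

def magnitude_spectrum(x, samplerate, use_window=True):
    """Return (freqs_hz, magnitude) for a real-valued 1-D signal."""
    x = np.asarray(x, dtype=np.float64)
    # Hann window if requested, otherwise a rectangular (all-ones) window
    window = np.hanning(len(x)) if use_window else np.ones(len(x))
    spectrum = np.fft.rfft(x * window)
    freqs = np.fft.rfftfreq(len(x), d=1.0 / samplerate)
    # Normalize by the window sum to compensate for the window's gain
    return freqs, np.abs(spectrum) / window.sum()

samplerate = 8000
t = np.arange(samplerate) / samplerate      # one second of samples
tone = np.sin(2 * np.pi * 440.5 * t)        # falls between two FFT bins

freqs, mag_hann = magnitude_spectrum(tone, samplerate, use_window=True)
_, mag_rect = magnitude_spectrum(tone, samplerate, use_window=False)

peak_hz = freqs[np.argmax(mag_hann)]
far = np.argmin(np.abs(freqs - 2000.0))     # a bin far away from the tone
print(peak_hz, mag_rect[far], mag_hann[far])
```

Because the tone does not land exactly on an FFT bin, the unwindowed spectrum smears energy across distant bins, while the Hann-windowed one keeps it concentrated around the peak.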
3. What to do with the spectrum
This is a bit more difficult. Filtering is often performed in the time domain for longer signals. If you tell us what you want to accomplish, you'll probably receive a good answer for this one. Calculating the frequency spectrum is one thing; getting meaningful results from it in signal processing is a bit more complicated.
(I know you did not ask this one, but I see it coming with a probability >> 0. Of course, it may be that you have good knowledge of audio signal processing, in which case this is irrelevant.)
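As one possible example of filtering in the time domain, here is a sketch of a low-pass filter built with scipy.signal. The helper name lowpass, the Butterworth design, the filter order, and the 500 Hz cutoff are assumptions picked for illustration, not a recommendation for any particular application.

```python
import numpy as np
from scipy import signal

def lowpass(data, samplerate, cutoff_hz, order=4):
    """Low-pass filter a 1-D signal sample by sample (Butterworth design)."""
    # Second-order sections are numerically more robust than (b, a) coefficients
    sos = signal.butter(order, cutoff_hz, btype="lowpass",
                        fs=samplerate, output="sos")
    return signal.sosfilt(sos, data)

samplerate = 8000
t = np.arange(samplerate) / samplerate
# Test signal: a 100 Hz tone we want to keep plus a 3000 Hz tone to remove
mixed = np.sin(2 * np.pi * 100 * t) + np.sin(2 * np.pi * 3000 * t)
filtered = lowpass(mixed, samplerate, cutoff_hz=500)
```

The filter runs directly on the samples, so it works on signals of any length without computing one huge FFT.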
You can compute and visualize the spectrum and the spectrogram using scipy. For this test I used this audio file: vignesh.wav
from scipy.io import wavfile # scipy library to read wav files
import numpy as np
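AudioName = "vignesh.wav" # Audio File
fs, Audiodata = wavfile.read(AudioName)
# The code below assumes a mono file; for stereo, pick one channel first,
# e.g. Audiodata = Audiodata[:, 0]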
# Plot the audio signal in time
import matplotlib.pyplot as plt
plt.plot(Audiodata)
plt.title('Audio signal in time',size=16)
# spectrum
from scipy.fftpack import fft # fourier transform
n = len(Audiodata)
AudioFreq = fft(Audiodata)
AudioFreq = AudioFreq[0:int(np.ceil((n+1)/2.0))] #Half of the spectrum
MagFreq = np.abs(AudioFreq) # Magnitude
MagFreq = MagFreq / float(n)
# power spectrum
MagFreq = MagFreq**2
if n % 2 > 0: # odd number of FFT points
    MagFreq[1:len(MagFreq)] = MagFreq[1:len(MagFreq)] * 2
else: # even number of FFT points: do not double the Nyquist bin
    MagFreq[1:len(MagFreq) - 1] = MagFreq[1:len(MagFreq) - 1] * 2
plt.figure()
freqAxis = np.arange(0, int(np.ceil((n+1)/2.0)), 1.0) * (float(fs) / n)
plt.plot(freqAxis/1000.0, 10*np.log10(MagFreq)) # Power spectrum
plt.xlabel('Frequency (kHz)')
plt.ylabel('Power spectrum (dB)')
#Spectrogram
from scipy import signal
N = 512 # Number of points in the FFT
# Window functions live in scipy.signal.windows in current SciPy releases
f, t, Sxx = signal.spectrogram(Audiodata, fs, window=signal.windows.blackman(N), nfft=N)
plt.figure()
plt.pcolormesh(t, f, 10*np.log10(Sxx)) # dB spectrogram
#plt.pcolormesh(t, f, Sxx) # Linear spectrogram
plt.ylabel('Frequency [Hz]')
plt.xlabel('Time [s]')
plt.title('Spectrogram with scipy.signal', size=16)
plt.show()
I tested all the code and it works; you need numpy, matplotlib and scipy.
cheers