Python Wave byte data
To understand what a 'frame' is, you will have to read the WAVE file format specification. For instance: https://web.archive.org/web/20140221054954/http://home.roadrunner.com/~jgglatt/tech/wave.htm
From that document:
The sample points that are meant to be "played" ie, sent to a Digital to Analog Converter(DAC) simultaneously are collectively called a sample frame. In the example of our stereo waveform, every two sample points makes up another sample frame. This is illustrated below for that stereo example.
   sample      sample              sample
   frame 0     frame 1             frame N
 _____ _____ _____ _____         _____ _____
| ch1 | ch2 | ch1 | ch2 | . . . | ch1 | ch2 |
|_____|_____|_____|_____|       |_____|_____|

 _____
|     | = one sample point
|_____|
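To make the frame layout concrete, here is a minimal sketch that splits raw frame bytes into one list of samples per channel. It assumes 16-bit signed little-endian PCM, and the helper name `split_channels` is made up for illustration; it is not part of the `wave` module.

```python
import struct

def split_channels(frames, nchannels, sampwidth=2):
    """Split interleaved PCM bytes into one list of samples per channel.

    Assumes 16-bit signed little-endian samples (sampwidth == 2).
    """
    if sampwidth != 2:
        raise ValueError("this sketch only handles 16-bit samples")
    frame_size = nchannels * sampwidth          # bytes per sample frame
    nframes = len(frames) // frame_size
    samples = struct.unpack("<%dh" % (nframes * nchannels),
                            frames[:nframes * frame_size])
    # Sample point of channel c in frame i sits at index i * nchannels + c.
    return [list(samples[c::nchannels]) for c in range(nchannels)]

# Two stereo frames: (1, -1) and (2, -2)
raw = struct.pack("<4h", 1, -1, 2, -2)
left, right = split_channels(raw, nchannels=2)
print(left, right)   # [1, 2] [-1, -2]
```

The bytes passed in would normally come from `wr.readframes(...)`, with `nchannels` and `sampwidth` taken from `wr.getparams()`.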
To convert to mono you could do something like this (Python 3, assuming 16-bit stereo PCM):
import struct
import wave

def stereo_to_mono(left, right):
    """Average two signed 16-bit samples."""
    return (left + right) // 2

with wave.open('piano2.wav', 'rb') as wr:
    nchannels, sampwidth, framerate, nframes, comptype, compname = wr.getparams()
    frames = wr.readframes(nframes)  # read every frame, including the last one

# Each sample point is a little-endian signed short, so decode whole
# 16-bit values rather than averaging individual bytes.
samples = struct.unpack('<%dh' % (len(frames) // 2), frames)
mono = [stereo_to_mono(s1, s2) for s1, s2 in zip(samples[0::2], samples[1::2])]

with wave.open('piano_mono.wav', 'wb') as ww:
    ww.setparams((1, sampwidth, framerate, nframes, comptype, compname))
    ww.writeframes(struct.pack('<%dh' % len(mono), *mono))
There is no clear-cut way to go from stereo to mono. You could just drop one channel. Above, I am averaging the channels. It all depends on your application.
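As a rough sketch of the "drop one channel" option: because no arithmetic is involved, the samples do not need to be decoded at all, only the first sample point of each frame is copied. The function name `keep_left_channel` is hypothetical, and the sketch assumes the input really is stereo.

```python
def keep_left_channel(frames, sampwidth=2):
    """Return mono frame bytes holding only the first channel of stereo input."""
    frame_size = 2 * sampwidth  # stereo: two sample points per frame
    out = bytearray()
    for i in range(0, len(frames) - frame_size + 1, frame_size):
        out += frames[i:i + sampwidth]   # first sample point of this frame
    return bytes(out)

# Two 16-bit stereo frames: left samples are 01 00 and 02 00
stereo = bytes([1, 0, 9, 9, 2, 0, 8, 8])
mono = keep_left_channel(stereo)
print(mono)   # b'\x01\x00\x02\x00'
```

The result can be passed straight to `writeframes` on a writer whose channel count is set to 1.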
For wav file IO I prefer to use scipy. It is perhaps overkill for reading a wav file, but generally after reading the wav it is easier to do downstream processing.
import scipy.io.wavfile
fs1, y1 = scipy.io.wavfile.read(filename)
From here, the data y1 will have N rows (one per sample frame) and Z columns, where each column corresponds to a channel (for a mono file it is one-dimensional). You don't say how you'd like to convert to mono; you can take the average of the channels, or whatever else suits your application. To average, use
monoChannel = y1.mean(axis=1)
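One thing to watch for: `mean` returns a float64 array, so if the result is meant to go back into an integer PCM file it should be cast to the original dtype first. A minimal sketch, assuming NumPy is available; the helper name `to_mono` is made up for illustration.

```python
import numpy as np

def to_mono(y):
    """Average the channels of an (N, Z) array, keeping its dtype."""
    if y.ndim == 1:          # already mono
        return y
    # mean() yields float64; round and cast back to the input dtype
    return np.round(y.mean(axis=1)).astype(y.dtype)

# Two stereo frames of 16-bit samples
y = np.array([[100, 200], [-300, 100]], dtype=np.int16)
mono = to_mono(y)
print(mono, mono.dtype)   # [ 150 -100] int16
```

The result could then be saved with `scipy.io.wavfile.write("mono.wav", fs1, mono)`, where the output filename is just a placeholder.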
As a direct answer to your question: two bytes make one 16-bit integer value in the "usual" way (little-endian, low byte first), given by the explicit formula value = ord(data[0]) + 256 * ord(data[1]). In Python 3, indexing a bytes object already yields integers, so the ord() calls are not needed. But using the struct module is a better way to decode (and later re-encode) such multibyte integers:
import struct
# "<" forces little-endian byte order, which is what .wav files use
print(struct.unpack("<HH", b"\x00\x00\x00\x00"))
# -> gives a 2-tuple of integers, here (0, 0)
or, if we want signed 16-bit integers (which is the case for 16-bit PCM samples in .wav files), use the format character "h" instead of "H". (I leave to you the task of figuring out how exactly two bytes can encode an integer value from -32768 to 32767 :-)
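For readers who want to check their answer to that exercise, here is a small sketch of the two's complement rule, compared against `struct`. It assumes Python 3 (where indexing bytes gives integers) and little-endian data; `decode_int16_le` is a made-up name.

```python
import struct

def decode_int16_le(data):
    """Decode two little-endian bytes as a signed 16-bit integer."""
    value = data[0] + 256 * data[1]      # unsigned result, 0..65535
    if value >= 32768:                   # top bit set means negative
        value -= 65536
    return value

for raw in (b"\x00\x00", b"\xff\x7f", b"\x00\x80", b"\xff\xff"):
    print(decode_int16_le(raw), struct.unpack("<h", raw)[0])
# 0 0
# 32767 32767
# -32768 -32768
# -1 -1
```

`int.from_bytes(raw, "little", signed=True)` gives the same values without `struct`.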